Hi Marcin,

Thanks for sharing the problem. On the mailing list, I've seen a couple of error reports around Jackson library version conflicts.
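In those reports the usual culprit is hadoop-aws (via the AWS SDK) pulling in a newer Jackson than the jackson-module-scala version bundled with Spark. One workaround people have mentioned is excluding the Jackson artifacts from the added dependency so Spark's own copy wins, e.g. with the %dep interpreter. This is a sketch from memory; please double-check the exclusion syntax against the Zeppelin dependency-loading docs:

```scala
%dep
z.reset()
// Load hadoop-aws but keep its transitive Jackson artifacts off the
// classpath, so the interpreter keeps the Jackson version Spark ships with.
// The exclusion pattern below is an assumption — verify it in the docs.
z.load("org.apache.hadoop:hadoop-aws:2.6.0").exclude("com.fasterxml.jackson.core:*")
```

The same effect should be achievable through the interpreter dependency settings in the Zeppelin UI, if you prefer not to use %dep.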
I would like to investigate the problem and share the result here.

Thanks,
moon

On Wed, Mar 30, 2016 at 7:40 PM Marcin Pilarczyk <marcin.pilarc...@interia.pl> wrote:

> Hi all,
>
> in one of the previous threads I've described some problems with configuring
> Zeppelin + Spark on EC2. Now I'm a step further. On both servers I have Spark
> 1.6.1 (just updated to this version, but the version itself is not the problem).
>
> What happens...
>
> 1. Pure environment, Zeppelin connects to remote Spark, some small piece
> of code is executed:
>
> val NUM_SAMPLES = 10000000
>
> val count = sc.parallelize(1 to NUM_SAMPLES).map { i =>
>   val x = Math.random()
>   val y = Math.random()
>   if (x*x + y*y < 1) 1 else 0
> }.reduce(_ + _)
> println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)
>
> 2. No problem at all. One step further: I need to read data from Amazon S3.
>
> sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "xxx")
> sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "xxx")
> val rddFull = sc.textFile("xxx").zipWithIndex()
>
> And...
>
> java.io.IOException: No FileSystem for scheme: s3n
>   at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
>   at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>   ...
>
> 3. Clear: hadoop-aws is missing from the classpath. I'm adding the following
> dependency: org.apache.hadoop:hadoop-aws:2.6.0
> 4.
> Executing the piece of code again:
>
> java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>   at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<init>(ScalaNumberDeserializersModule.scala:49)
>   at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<clinit>(ScalaNumberDeserializersModule.scala)
>   at com.fasterxml.jackson.module.scala.deser.ScalaNumberDeserializersModule$class.$init$(ScalaNumberDeserializersModule.scala:61)
>   at com.fasterxml.jackson.module.scala.DefaultScalaModule.<init>(DefaultScalaModule.scala:19)
>   at com.fasterxml.jackson.module.scala.DefaultScalaModule$.<init>(DefaultScalaModule.scala:35)
>   at com.fasterxml.jackson.module.scala.DefaultScalaModule$.<clinit>(DefaultScalaModule.scala)
>   at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:81)
>   at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
>   at org.apache.spark.SparkContext.withScope(SparkContext.scala:714)
>   at org.apache.spark.SparkContext.textFile(SparkContext.scala:830)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:37)
>   at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:39)
>   at $iwC$$iwC$$iwC$$iwC.<init>(<console>:41)
>   at $iwC$$iwC$$iwC.<init>(<console>:43)
>   at $iwC$$iwC.<init>(<console>:45)
>   at $iwC.<init>(<console>:47)
>   at <init>(<console>:49)
>   at .<init>(<console>:53)
>   at .<clinit>(<console>)
>   at .<init>(<console>:7)
>   at .<clinit>(<console>)
>   at $print(<console>)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>   at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
>   at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>   at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:813)
>   at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:756)
>   at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:748)
>   at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:331)
>   at org.apache.zeppelin.scheduler.Job.run(Job.java:171)
>   at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
>
> 5. A bit of googling shows that it's all about incompatible versions of the
> FasterXML (Jackson) libraries. And the question: how to fix it?
>
> What I've already done to try to fix it:
> 1. tried Spark 1.5.2 with Hadoop 2.6 - same error
> 2. tried Spark 1.5.2 with Hadoop 2.4 - same error
> 3.
> tried to recompile Spark with an upgraded FasterXML (Jackson) version (2.6) - same error
>
> Moreover, if I log in on the Spark server and use the spark-shell, I'm
> able to execute the piece of code without any problems. After a few seconds
> I've got my file read, and the basic count() shows the correct result.
>
> Regards,
> Marcin
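P.S. One quick diagnostic while I look into this: you can check which jar each side of the conflict is actually loaded from, using the standard JVM CodeSource lookup in a notebook paragraph (nothing Zeppelin-specific; the class names below are taken from your stack trace). If the two locations point at jars with different Jackson versions, that confirms the conflict:

```scala
// Print the jar each conflicting class was loaded from. Differing Jackson
// versions in the two locations would confirm the classpath conflict.
Seq(
  "com.fasterxml.jackson.databind.ObjectMapper",
  "com.fasterxml.jackson.module.scala.DefaultScalaModule"
).foreach { name =>
  val loc = Class.forName(name).getProtectionDomain.getCodeSource.getLocation
  println(s"$name -> $loc")
}
```

Running the same snippet in the spark-shell on the server (where the code works) and comparing the output against the Zeppelin run should show exactly which jar Zeppelin is adding.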