Hello,

I’m getting a JsonMappingException whenever I try any cluster map/reduce
operation in Zeppelin with Apache Spark running on Mesos. Could somebody
provide guidance? As far as I can tell my configuration matches the
documentation I’ve read, and I can’t figure out why the reduce commands are
failing.

Here are three (Scala) commands I tried in a fresh Zeppelin notebook; the
first two work fine and the third fails:

> sc.getConf.getAll
res0: Array[(String, String)] = Array(
  (spark.submit.pyArchives,pyspark.zip:py4j-0.8.2.1-src.zip),
  (spark.home,/cluster/spark),
  (spark.executor.memory,512m),
  (spark.files,file:/cluster/spark/python/lib/pyspark.zip,file:/cluster/spark/python/lib/py4j-0.8.2.1-src.zip),
  (spark.repl.class.uri,http://<IP_HIDDEN>:<PORT_HIDDEN>),
  (args,""),
  (zeppelin.spark.concurrentSQL,false),
  (spark.fileserver.uri,http://<IP_HIDDEN>:<PORT_HIDDEN>),
  (zeppelin.pyspark.python,python),
  (spark.scheduler.mode,FAIR),
  (zeppelin.spark.maxResult,1000),
  (spark.executor.id,driver),
  (spark.driver.port,<PORT_HIDDEN>),
  (zeppelin.dep.localrepo,local-repo),
  (spark.app.id,20151007-143704-2255525248-5050-29909-0013),
  (spark.externalBlockStore.folderName,spark-e7edd394-1618-4c18-b76d-9...

> (1 to 10).reduce(_ + _)
res5: Int = 55

> sc.parallelize(1 to 10).reduce(_ + _)
com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
 at [Source: {"id":"1","name":"parallelize"}; line: 1, column: 1]
  at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
  at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
  at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533)
  at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:220)
  at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:143)
  at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:409)
  at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:358)
  at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:265)
  at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:245)
  at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:143)
  at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:439)
  at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3666)
  at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3558)
  at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578)
  at org.apache.spark.rdd.RDDOperationScope$.fromJson(RDDOperationScope.scala:82)
  at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1490)
  at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1490)
  at scala.Option.map(Option.scala:145)
  at org.apache.spark.rdd.RDD.<init>(RDD.scala:1490)
  at org.apache.spark.rdd.ParallelCollectionRDD.<init>(ParallelCollectionRDD.scala:85)
  at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:697)
  at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:695)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
  at org.apache.spark.SparkContext.withScope(SparkContext.scala:681)
  at org.apache.spark.SparkContext.parallelize(SparkContext.scala:695)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
  at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
  at $iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
  at $iwC$$iwC$$iwC.<init>(<console>:37)
  at $iwC$$iwC.<init>(<console>:39)
  at $iwC.<init>(<console>:41)
  at <init>(<console>:43)
  at .<init>(<console>:47)
  at .<clinit>(<console>)
  at .<init>(<console>:7)
  at .<clinit>(<console>)
  at $print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
  at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
  at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
  at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:610)
  at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:586)
  at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:579)
  at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
  at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
  at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
  at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
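
Since the failure comes out of jackson-databind while it deserializes 
RDDOperationScope, my working guess (unconfirmed) is a Jackson version 
conflict between the Zeppelin build and the Spark 1.4.1 assembly. A notebook 
paragraph like the one below should show which Jackson jar and version the 
interpreter actually loaded, if that helps narrow things down:

> // Diagnostic sketch, plain JDK/Jackson reflection: where ObjectMapper
> // was loaded from on the driver classpath, and its version.
> classOf[com.fasterxml.jackson.databind.ObjectMapper].getProtectionDomain.getCodeSource.getLocation
> com.fasterxml.jackson.databind.cfg.PackageVersion.VERSION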

My configuration is as follows:
--------------------------------
Spark 1.4.1
Mesos 0.21.0
Zeppelin 0.5.0-incubating
Hadoop 2.4.0
Cluster: 4 nodes, CentOS 6.7

zeppelin-env.sh:
-----------------
export MASTER=mesos://master:5050
export SPARK_MASTER=mesos://master:5050
export MESOS_NATIVE_JAVA_LIBRARY=/cluster/mesos/build/src/.libs/libmesos.so
export MESOS_NATIVE_LIBRARY=/cluster/mesos/build/src/.libs/libmesos.so
export SPARK_EXECUTOR_URI=hdfs:///spark/spark-1.4.1-bin-hadoop2.4.tgz
export ZEPPELIN_JAVA_OPTS="-Dspark.executor.uri=hdfs:///spark/spark-1.4.1-bin-hadoop2.4.tgz"
export SPARK_PID_DIR=/tmp
export SPARK_LOCAL_DIRS=/cluster/spark/spark_tmp
export HADOOP_CONF_DIR=/cluster/hadoop-2.4.0/etc/hadoop
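
Since the executor URI is set twice above (via SPARK_EXECUTOR_URI and via 
-Dspark.executor.uri in ZEPPELIN_JAVA_OPTS), and the getAll output earlier 
was truncated before that key would have appeared, one quick sanity check 
from a notebook paragraph is:

> // Should print Some(hdfs:///spark/spark-1.4.1-bin-hadoop2.4.tgz)
> // if the setting actually reached the Spark conf.
> sc.getConf.getOption("spark.executor.uri")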

Note: Zeppelin was built from source using:
mvn clean package -Pspark-1.4 -Dhadoop.version=2.4.0 -Phadoop-2.4 -DskipTests


Thanks very much,
---
Rishi Verma
NASA Jet Propulsion Laboratory
California Institute of Technology
