Hello.

If you use CDH for Hadoop, can you try a build command like 'mvn clean package
-Pvendor-repo -DskipTests -Pspark-1.5 -Dspark.version=1.5.2
-Dhadoop.version=2.6.0-mr1-cdh5.4.8'?
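
If the vendor build still hits the JsonMappingException, it usually means two
different Jackson versions end up on the interpreter classpath. As a quick
sanity check (just a sketch; run it in a %spark paragraph), you can print
where jackson-databind is being loaded from:

// prints the jar that provides jackson-databind in the Spark interpreter
println(classOf[com.fasterxml.jackson.databind.ObjectMapper]
  .getProtectionDomain.getCodeSource.getLocation)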

I hope this helps.

2015-11-22 6:55 GMT+09:00 Timur Shenkao <t...@timshenkao.su>:

> Hi!
>
> I use CentOS 6.7 + Spark 1.5.2 Standalone + Cloudera Hadoop 5.4.8 on the
> same cluster. I can't use Mesos or Spark on YARN.
> I decided to try Zeppelin. I tried the binaries and also tried to build from
> source with different parameters.
> Finally, I built version 0.6.0 like this:
> mvn clean package -DskipTests -Pspark-1.5 -Phadoop-2.6 -Pyarn -Ppyspark
> -Pbuild-distr
>
> But I constantly get this error:
>
> com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope) at [Source: {"id":"0","name":"parallelize"}; line: 1, column: 1]
>     at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
>     at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
>     at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533)
>     at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:220)
>     at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:143)
>     at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:409)
>     at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:358)
>     at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:265)
>     at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:245)
>     at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:143)
>     at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:439)
>     at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3666)
>     at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3558)
>     at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578)
>     at org.apache.spark.rdd.RDDOperationScope$.fromJson(RDDOperationScope.scala:82)
>     at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1603)
>     at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1603)
>     at scala.Option.map(Option.scala:145)
>     at org.apache.spark.rdd.RDD.<init>(RDD.scala:1603)
>     at org.apache.spark.rdd.ParallelCollectionRDD.<init>(ParallelCollectionRDD.scala:85)
>     at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:725)
>     at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:723)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>     at org.apache.spark.SparkContext.withScope(SparkContext.scala:709)
>     at org.apache.spark.SparkContext.parallelize(SparkContext.scala:723)
>     at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
>     at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
>     at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
>     at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
>     at $iwC$$iwC$$iwC$$iwC$$i
> ...
> and so on.
>
> My code is:
> %spark
> import org.apache.spark.sql._
> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
>
> case class Contact(name: String, phone: String)
> case class Person(name: String, age: Int, contacts: Seq[Contact])
>
> val records = (1 to 100).map { i =>
>   Person(s"name_$i", i, (0 to 1).map { m => Contact(s"contact_$m", s"phone_$m") })
> }
>
> Then it fails on the following line:
> sc.parallelize(records).toDF().write.format("orc").save("people")
>
> In spark-shell, this code works perfectly, so the problem is in Zeppelin.
>
> By the way, your own tutorial gives the same error:
>
> import org.apache.commons.io.IOUtils
> import java.net.URL
> import java.nio.charset.Charset
>
> // load bank data
> val bankText = sc.parallelize(
>     IOUtils.toString(
>         new URL("https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv"),
>         Charset.forName("utf8")).split("\n"))
>
> case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)
>
> val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
>     s => Bank(s(0).toInt,
>             s(1).replaceAll("\"", ""),
>             s(2).replaceAll("\"", ""),
>             s(3).replaceAll("\"", ""),
>             s(5).replaceAll("\"", "").toInt
>         )
> ).toDF()
> bank.registerTempTable("bank")
>
>
> How can I fix it? Should I change some dependency in pom.xml?
>



-- 


NFLabs Inc.  |  Content Service Team  |  Team Lead Hyungsung Shim

E. hsshim@nflabs.com <c...@nflabs.com>

T. 02-3458-9650  M. 010-4282-1230

A. 2F Harim Building, 216-2 Nonhyeon-dong, Gangnam-gu, Seoul, NFLABS
