Can you use https://maven.apache.org/plugins/maven-shade-plugin/
to shade the dependencies unique to your project?

On Mon, Oct 19, 2015 at 7:47 AM, YiZhi Liu <javeli...@gmail.com> wrote:

> Hi Ted,
>
> Unfortunately, these two options cause the following failure in my
> environment:
>
> (java.lang.RuntimeException: class
> org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback not
> org.apache.hadoop.security.GroupMappingServiceProvider,java.lang.RuntimeException:
> java.lang.RuntimeException: class
> org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback not
> org.apache.hadoop.security.GroupMappingServiceProvider)
>
> 2015-10-19 22:23 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
> > Have you tried the following options?
> >
> > --conf spark.driver.userClassPathFirst=true --conf
> > spark.executor.userClassPathFirst=true
> >
> > Cheers
> >
> > On Mon, Oct 19, 2015 at 5:07 AM, YiZhi Liu <javeli...@gmail.com> wrote:
> >>
> >> I'm trying to read a Thrift object from a SequenceFile, using
> >> elephant-bird's ThriftWritable. My code looks like this:
> >>
> >> val rawData = sc.sequenceFile[BooleanWritable,
> >>   ThriftWritable[TrainingSample]](input)
> >> val samples = rawData.map { case (key, value) => {
> >>   value.setConverter(classOf[TrainingSample])
> >>   val conversion = if (key.get) 1 else 0
> >>   val sample = value.get
> >>   (conversion, sample)
> >> }}
> >>
> >> When I spark-submit in local mode, it fails with
> >>
> >> (Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times,
> >> most recent failure: Lost task 0.0 in stage 1.0 (TID 2, localhost):
> >> java.lang.AbstractMethodError:
> >> org.apache.thrift.TUnion.standardSchemeReadValue(Lorg/apache/thrift/protocol/TProtocol;Lorg/apache/thrift/protocol/TField;)Ljava/lang/Object;
> >> ... ...
> >>
> >> I'm pretty sure it is caused by a libthrift conflict: I use
> >> thrift-0.6.1 while Spark uses 0.9.2, which requires a TUnion object to
> >> implement the abstract 'standardSchemeReadValue' method.
> >>
> >> But when I set spark.files.userClassPathFirst=true, it failed even
> >> earlier:
> >>
> >> (Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times,
> >> most recent failure: Lost task 1.0 in stage 0.0 (TID 1, localhost):
> >> java.lang.ClassCastException: cannot assign instance of scala.None$ to
> >> field org.apache.spark.scheduler.Task.metrics of type scala.Option in
> >> instance of org.apache.spark.scheduler.ResultTask
> >>     at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2089)
> >>     at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1261)
> >>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2006)
> >>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
> >>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
> >>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
> >>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
> >>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
> >>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
> >>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
> >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>     at java.lang.Thread.run(Thread.java:745)
> >>
> >> It seems I introduced more conflicts, but I couldn't figure out which
> >> one caused this failure.
> >>
> >> Interestingly, when I ran mvn test in my project, which tests the Spark
> >> job in local mode, everything worked fine.
> >>
> >> So what is the right way to give user jars precedence over Spark jars?
> >>
> >> --
> >> Yizhi Liu
> >> Senior Software Engineer / Data Mining
> >> www.mvad.com, Shanghai, China
> >
> --
> Yizhi Liu
> Senior Software Engineer / Data Mining
> www.mvad.com, Shanghai, China
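
For reference, the shading suggested at the top of this thread could be sketched as the pom.xml fragment below. It is only an illustration under stated assumptions, not a tested configuration: the plugin version (2.4.1) and the relocated package prefix (shaded.org.apache.thrift) are placeholders chosen here, and it presumes that libthrift 0.6.1, elephant-bird and the generated Thrift classes are all bundled into the application jar so that their references get rewritten together.

```xml
<!-- Sketch only: goes under <build><plugins> in the application's pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Assumed version; use whichever release your build standardizes on. -->
  <version>2.4.1</version>
  <executions>
    <execution>
      <!-- Build the shaded (uber) jar during the package phase. -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Move the bundled Thrift classes into a private namespace
                 so they no longer collide with the libthrift on Spark's
                 classpath. -->
            <pattern>org.apache.thrift</pattern>
            <!-- Hypothetical target prefix; any otherwise unused package works. -->
            <shadedPattern>shaded.org.apache.thrift</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, the classes packed into the shaded jar refer to shaded.org.apache.thrift instead of org.apache.thrift, so the userClassPathFirst settings should no longer be needed for this particular conflict. Spark itself would normally be declared with provided scope so that it is not bundled into the jar.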