Cool, didn't notice that, thanks Josh!

On Tue, Sep 2, 2014 at 11:55 AM, Josh Rosen <rosenvi...@gmail.com> wrote:

> SPARK_PREPEND_CLASSES is documented on the Spark Wiki (which could
> probably be easier to find):
> https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools
>
>
> On September 2, 2014 at 11:53:49 AM, Cheng Lian (lian.cs....@gmail.com)
> wrote:
>
> Yea, SSD + SPARK_PREPEND_CLASSES totally changed my life :)
>
> Maybe we should add a "developer notes" page to document all this useful
> black magic.
>
>
> On Tue, Sep 2, 2014 at 10:54 AM, Reynold Xin <r...@databricks.com> wrote:
>
> > Having an SSD helps tremendously with assembly time.
> >
> > Without that, you can do the following in order for Spark to pick up
> > the compiled classes before the assembly at runtime:
> >
> > export SPARK_PREPEND_CLASSES=true
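
(For anyone finding this thread later: the resulting dev loop looks roughly
like the sketch below. It assumes the 2014-era sbt build; the exact targets
are on the wiki page Josh linked.)

    # Build the full assembly once so spark-shell/spark-submit can start:
    sbt/sbt assembly

    # Tell the launch scripts to put freshly compiled classes ahead of the
    # assembly jar on the classpath:
    export SPARK_PREPEND_CLASSES=true

    # After each source change, recompile only; no reassembly needed:
    sbt/sbt compile

    # Now runs against the classes you just compiled:
    ./bin/spark-shell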
> >
> > On Tue, Sep 2, 2014 at 9:10 AM, Sandy Ryza <sandy.r...@cloudera.com>
> > wrote:
> >
> > > This doesn't help for every dependency, but Spark provides an option
> > > to build the assembly jar without Hadoop and its dependencies. We
> > > make use of this in CDH packaging.
> > >
> > > -Sandy
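
(For reference, the option Sandy mentions is a Maven profile. A rough
sketch follows; the profile name is from the Spark 1.x build and may differ
in your version, so check the building-spark docs before relying on it.)

    # Sketch: build an assembly that leaves out Hadoop and its
    # dependencies, expecting the cluster to provide them at runtime:
    mvn -Phadoop-provided -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests package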
> > >
> > > On Tue, Sep 2, 2014 at 2:12 AM, scwf <wangf...@huawei.com> wrote:
> > >
> > > > Hi Sean Owen,
> > > > Here are some problems I hit when using the assembly jar.
> > > >
> > > > 1. I put spark-assembly-*.jar in the lib directory of my
> > > > application, and it throws a compile error:
> > > >
> > > > Error:scalac: Error: class scala.reflect.BeanInfo not found.
> > > > scala.tools.nsc.MissingRequirementError: class scala.reflect.BeanInfo not found.
> > > >   at scala.tools.nsc.symtab.Definitions$definitions$.getModuleOrClass(Definitions.scala:655)
> > > >   at scala.tools.nsc.symtab.Definitions$definitions$.getClass(Definitions.scala:608)
> > > >   at scala.tools.nsc.backend.jvm.GenJVM$BytecodeGenerator.<init>(GenJVM.scala:127)
> > > >   at scala.tools.nsc.backend.jvm.GenJVM$JvmPhase.run(GenJVM.scala:85)
> > > >   at scala.tools.nsc.Global$Run.compileSources(Global.scala:953)
> > > >   at scala.tools.nsc.Global$Run.compile(Global.scala:1041)
> > > >   at xsbt.CachedCompiler0.run(CompilerInterface.scala:126)
> > > >   at xsbt.CachedCompiler0.liftedTree1$1(CompilerInterface.scala:102)
> > > >   at xsbt.CachedCompiler0.run(CompilerInterface.scala:102)
> > > >   at xsbt.CompilerInterface.run(CompilerInterface.scala:27)
> > > >   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >   at java.lang.reflect.Method.invoke(Method.java:597)
> > > >   at sbt.compiler.AnalyzingCompiler.call(AnalyzingCompiler.scala:102)
> > > >   at sbt.compiler.AnalyzingCompiler.compile(AnalyzingCompiler.scala:48)
> > > >   at sbt.compiler.AnalyzingCompiler.compile(AnalyzingCompiler.scala:41)
> > > >   at org.jetbrains.jps.incremental.scala.local.IdeaIncrementalCompiler.compile(IdeaIncrementalCompiler.scala:28)
> > > >   at org.jetbrains.jps.incremental.scala.local.LocalServer.compile(LocalServer.scala:25)
> > > >   at org.jetbrains.jps.incremental.scala.remote.Main$.make(Main.scala:58)
> > > >   at org.jetbrains.jps.incremental.scala.remote.Main$.nailMain(Main.scala:21)
> > > >   at org.jetbrains.jps.incremental.scala.remote.Main.nailMain(Main.scala)
> > > >   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >   at java.lang.reflect.Method.invoke(Method.java:597)
> > > >   at com.martiansoftware.nailgun.NGSession.run(NGSession.java:319)
> > > >
> > > > 2. I tested my branch, which updates the Hive version to
> > > > org.apache.hive 0.13.1. It runs successfully with a bag of
> > > > third-party jars as dependencies, but throws an error with the
> > > > assembly jar; the assembly jar seems to lead to a conflict:
> > > >
> > > > ERROR DDLTask: java.lang.NoSuchFieldError: doubleTypeInfo
> > > >   at org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getObjectInspector(ArrayWritableObjectInspector.java:66)
> > > >   at org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.<init>(ArrayWritableObjectInspector.java:59)
> > > >   at org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.initialize(ParquetHiveSerDe.java:113)
> > > >   at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:339)
> > > >   at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:283)
> > > >   at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:189)
> > > >   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:597)
> > > >   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
> > > >   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
> > > >   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
> > > >   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> > > >
> > > >
> > > > On 2014/9/2 16:45, Sean Owen wrote:
> > > >
> > > > > Hm, are you suggesting that the Spark distribution be a bag of 100
> > > > > JARs? It doesn't quite seem reasonable. It does not remove version
> > > > > conflicts, just pushes them to run-time, which isn't good. The
> > > > > assembly is also necessary because that's where shading happens. In
> > > > > development, you want to run against exactly what will be used in a
> > > > > real Spark distro.
> > > > >
> > > > > On Tue, Sep 2, 2014 at 9:39 AM, scwf <wangf...@huawei.com> wrote:
> > > > >
> > > > > > Hi all,
> > > > > > I suggest that Spark not use the assembly jar as the default
> > > > > > run-time dependency (spark-submit/spark-class depend on the
> > > > > > assembly jar); a library directory of all the third-party
> > > > > > dependency jars, the way hadoop/hive/hbase do it, would be more
> > > > > > reasonable.
> > > > > >
> > > > > > 1. The assembly jar packages all third-party jars into one big
> > > > > > jar, so we need to rebuild it whenever we want to update the
> > > > > > version of some component (such as Hadoop).
> > > > > > 2. In our practice with Spark we sometimes hit jar compatibility
> > > > > > issues, and it is hard to diagnose a compatibility issue with the
> > > > > > assembly jar.
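
(On the "hard to diagnose" point: one trick that helps with errors like the
NoSuchFieldError above is to check which jar a suspect class actually came
from instead of guessing. A rough sketch; the assembly path and the class
name are illustrative:)

    # Check whether the suspect class is bundled into the assembly:
    unzip -l lib/spark-assembly-*.jar | grep 'hive/serde2/typeinfo/TypeInfoFactory'

    # At runtime, ask the JVM which jar the class was loaded from, e.g. in
    # spark-shell:
    #   classOf[org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory]
    #     .getProtectionDomain.getCodeSource.getLocation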