[ https://issues.apache.org/jira/browse/SPARK-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209231#comment-14209231 ]
Matthew Daniel commented on SPARK-4267:
---------------------------------------

I rebuilt the assembly using {{./sbt/sbt -Dhadoop.version=2.4.0 -Pyarn assembly}}, moved the old jars out of {{lib}} in the unpacked distribution directory, and moved the new {{assembly/target/scala-2.10/spark-assembly-1.1.0-hadoop2.4.0.jar}} into their place. Running {{bin/spark-shell}} yields the same output and, as one might expect, the same NPE. I've tried to trim the log output to be contextual without being verbose, so please let me know if there are more details that would help with the diagnosis.

{noformat}
14/11/13 03:57:13 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, PROXY_HOST=10.166.39.198,PROXY_URI_BASE=http://10.166.39.198:9046/proxy/application_1415840940647_0002, /proxy/application_1415840940647_0002
14/11/13 03:57:13 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
14/11/13 03:57:13 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
         appMasterRpcPort: 0
         appStartTime: 1415851029508
         yarnAppState: RUNNING
14/11/13 03:57:18 ERROR cluster.YarnClientSchedulerBackend: Yarn application already ended: FINISHED
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/metrics/json,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null}
14/11/13 03:57:18 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null}
14/11/13 03:57:18 INFO ui.SparkUI: Stopped Spark web UI at http://ip-10-166-39-198.ec2.internal:4040
14/11/13 03:57:18 INFO scheduler.DAGScheduler: Stopping DAGScheduler
14/11/13 03:57:18 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
14/11/13 03:57:18 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
14/11/13 03:57:18 INFO cluster.YarnClientSchedulerBackend: Stopped
14/11/13 03:57:19 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
14/11/13 03:57:19 INFO network.ConnectionManager: Selector thread was interrupted!
14/11/13 03:57:19 INFO network.ConnectionManager: ConnectionManager stopped
14/11/13 03:57:19 INFO storage.MemoryStore: MemoryStore cleared
14/11/13 03:57:19 INFO storage.BlockManager: BlockManager stopped
14/11/13 03:57:19 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
14/11/13 03:57:19 INFO spark.SparkContext: Successfully stopped SparkContext
14/11/13 03:57:19 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
14/11/13 03:57:19 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
14/11/13 03:57:19 INFO Remoting: Remoting shut down
14/11/13 03:57:19 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
14/11/13 03:57:37 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
14/11/13 03:57:37 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.
{noformat}
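The timeline in the log explains the NPE: the YARN backend tears the SparkContext down at 03:57:18-19, yet at 03:57:37 the REPL still reports "Spark context available as sc", so the user is handed an already-stopped context. In the Spark 1.1 sources, {{SparkContext.stop()}} nulls the {{taskScheduler}} field that {{defaultParallelism}} delegates to (the trace below pins {{defaultParallelism}} at SparkContext.scala:1284). A minimal standalone sketch of that failure mode follows; the names mirror Spark's, but this is an illustration of the mechanism under that assumption, not the actual Spark code:

{code}
object StoppedContextSketch {
  class TaskScheduler {
    def defaultParallelism: Int = 2
  }

  // Mirrors the relevant shape of SparkContext in 1.1: a mutable
  // scheduler reference that stop() clears.
  class Context {
    var taskScheduler: TaskScheduler = new TaskScheduler

    // YarnClientSchedulerBackend stops the context once YARN reports the
    // application FINISHED; see the ERROR line in the log above.
    def stop(): Unit = { taskScheduler = null }

    def defaultParallelism: Int = taskScheduler.defaultParallelism   // SparkContext.scala:1284
    def defaultMinPartitions: Int = math.min(defaultParallelism, 2)  // SparkContext.scala:1291
  }

  def main(args: Array[String]): Unit = {
    val sc = new Context
    sc.stop()                // the backend has already torn the context down,
    sc.defaultMinPartitions  // so the first RDD call throws java.lang.NullPointerException
  }
}
{code}

That call chain ({{defaultMinPartitions}} to {{defaultParallelism}} to the nulled {{taskScheduler}}) matches the top frames of the trace in the quoted issue description below.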
> Failing to launch jobs on Spark on YARN with Hadoop 2.5.0 or later
> ------------------------------------------------------------------
>
>                 Key: SPARK-4267
>                 URL: https://issues.apache.org/jira/browse/SPARK-4267
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Tsuyoshi OZAWA
>
> Currently we're trying Spark on YARN included in Hadoop 2.5.1. Hadoop 2.5 uses protobuf 2.5.0, so I compiled with protobuf 2.5.0 like this:
> {code}
> ./make-distribution.sh --name spark-1.1.1 --tgz -Pyarn -Dhadoop.version=2.5.1 -Dprotobuf.version=2.5.0
> {code}
> Then Spark on YARN fails to launch jobs with an NPE:
> {code}
> $ bin/spark-shell --master yarn-client
> scala> sc.textFile("hdfs:///user/ozawa/wordcountInput20G").flatMap(line => line.split(" ")).map(word => (word, 1)).persist().reduceByKey((a, b) => a + b, 16).saveAsTextFile("hdfs:///user/ozawa/sparkWordcountOutNew2");
> java.lang.NullPointerException
>         at org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1284)
>         at org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1291)
>         at org.apache.spark.SparkContext.textFile$default$2(SparkContext.scala:480)
>         at $iwC$$iwC$$iwC$$iwC.<init>(<console>:13)
>         at $iwC$$iwC$$iwC.<init>(<console>:18)
>         at $iwC$$iwC.<init>(<console>:20)
>         at $iwC.<init>(<console>:22)
>         at <init>(<console>:24)
>         at .<init>(<console>:28)
>         at .<clinit>(<console>)
>         at .<init>(<console>:7)
>         at .<clinit>(<console>)
>         at $print(<console>)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:789)
>         at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1062)
>         at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:615)
>         at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:646)
>         at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:610)
>         at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:823)
>         at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:868)
>         at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:780)
>         at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:625)
>         at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:633)
>         at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:638)
>         at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:963)
>         at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:911)
>         at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:911)
>         at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>         at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:911)
>         at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1006)
>         at org.apache.spark.repl.Main$.main(Main.scala:31)
>         at org.apache.spark.repl.Main.main(Main.scala)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:329)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)