debug standalone Spark jobs?

Nan Zhu Sun, 05 Jan 2014 07:06:30 -0800

Hi, all  

I’m trying to run a standalone job on a Spark cluster on EC2.

There is evidently a bug in my code: after running for several minutes, the
job fails with an exception:

Loading /usr/share/sbt/bin/sbt-launch-lib.bash  
[info] Set current project to rec_system (in build file:/home/ubuntu/rec_sys/)
[info] Running general.NetflixRecommender algorithm.SparkALS -b 20 -i 20 -l 0.005 -m spark://172.31.32.76:7077 --moviepath s3n://trainingset/netflix/training_set/* -o s3n://training_set/netflix/training_set/output.txt --rank 20 -r s3n://trainingset/netflix/training_set/mv_*
log4j:WARN No appenders could be found for logger (akka.event.slf4j.Slf4jEventHandler).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
failed to init the engine class
org.apache.spark.SparkException: Job aborted: Task 43.0:9 failed more than 4 times
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:827)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:825)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:825)
    at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:440)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:502)
    at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:157)




However, this output tells me nothing about the actual cause of the failure.
How can I print the detailed log information to the console?

I’m not sure what causes those WARNs from log4j. I get the same warnings when
I run spark-shell, but there I can still see detailed information, such as
which task is running.
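
For context, the "No appenders could be found" message usually means log4j
found no configuration on the job's classpath, so nothing is printed. Below is
a minimal sketch of a console configuration, assuming log4j 1.2 (as the FAQ
link in the warning suggests) and assuming the file is saved as
log4j.properties somewhere on the classpath, for example under
src/main/resources/ in an sbt project. The file location and the INFO level
are assumptions, not something confirmed for this setup.

```properties
# Send everything at INFO and above to an appender named "console"
log4j.rootCategory=INFO, console

# The "console" appender writes to stderr
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err

# Timestamp, level, short logger name, message
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

This only affects what the driver prints. The reason a task "failed more than
4 times" is more likely to show up in the executors' own stderr logs on the
worker nodes.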

Best,

--  
Nan Zhu
