You can set the log level to INFO; it looks like Spark logs application errors at INFO. When I have errors that I can only reproduce on live data, I run a Spark shell with my job on its classpath, then debug and tweak things to find out what happens.
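For example (a minimal sketch along the lines of the conf/log4j.properties.template that ships with Spark; adjust paths and levels for your deployment), copy the template to conf/log4j.properties, or put an equivalent file on your job's classpath:

    # log4j.properties - send everything at INFO and above to the console
    log4j.rootCategory=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

Once log4j actually finds a configuration file, the "log4j:WARN No appenders could be found" messages you are seeing should go away as well.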
2014/1/5 Nan Zhu <[email protected]>

> Yes, but my problem only appears when on a large dataset, anyway, Thanks for the reply
>
> Best,
>
> --
> Nan Zhu
>
> On Sunday, January 5, 2014 at 11:09 AM, Archit Thakur wrote:
>
> You can run your spark application locally by setting SPARK_MASTER="local" and then debug the launched jvm in your IDE.
>
> On Sun, Jan 5, 2014 at 9:04 PM, Nan Zhu <[email protected]> wrote:
>
> Ah, yes, I think application logs really help
>
> Thank you
>
> --
> Nan Zhu
>
> On Sunday, January 5, 2014 at 10:13 AM, Sriram Ramachandrasekaran wrote:
>
> Did you get to look at the spark worker logs? They would be at SPARK_HOME/logs/
> Also, you should look at the application logs itself. They would be under SPARK_HOME/work/APP_ID
>
> On Sun, Jan 5, 2014 at 8:36 PM, Nan Zhu <[email protected]> wrote:
>
> Hi, all
>
> I’m trying to run a standalone job in a Spark cluster on EC2,
>
> obviously there is some bug in my code, after the job runs for several minutes, it failed with an exception
>
> Loading /usr/share/sbt/bin/sbt-launch-lib.bash
> [info] Set current project to rec_system (in build file:/home/ubuntu/rec_sys/)
> [info] Running general.NetflixRecommender algorithm.SparkALS -b 20 -i 20 -l 0.005 -m spark://172.31.32.76:7077 --moviepath s3n://trainingset/netflix/training_set/* -o s3n://training_set/netflix/training_set/output.txt --rank 20 -r s3n://trainingset/netflix/training_set/mv_*
> log4j:WARN No appenders could be found for logger (akka.event.slf4j.Slf4jEventHandler).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> failed to init the engine class
> org.apache.spark.SparkException: Job aborted: Task 43.0:9 failed more than 4 times
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:827)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:825)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:825)
>     at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:440)
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:502)
>     at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:157)
>
> However, this information does not mean anything to me, how can I print out the detailed log information in console
>
> I’m not sure about the reasons of those WARNs from log4j, I received the same WARNING when I run spark-shell, while in there, I can see detailed information like which task is running, etc.
>
> Best,
>
> --
> Nan Zhu
>
> --
> It's just about how deep your longing is!
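To expand on Archit's suggestion in the quoted thread, one way to make the job debuggable in an IDE is to pick the master at runtime instead of hard-coding the cluster URL. A rough sketch only; the environment variable name and app name here are illustrative, not taken from your code:

    import org.apache.spark.SparkContext

    // Illustrative sketch: fall back to a local, in-process master when no
    // cluster URL is supplied, so the whole job runs in one JVM and
    // breakpoints in your code are hit directly in the IDE.
    val master = sys.env.getOrElse("SPARK_MASTER", "local[2]")
    val sc = new SparkContext(master, "NetflixRecommender")

Running with "local[2]" won't reproduce problems that only show up on the full dataset or on the cluster, but it usually shrinks the search space before you go back to the worker and application logs under SPARK_HOME/logs/ and SPARK_HOME/work/APP_ID.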
