Re: Where can I find logs set inside RDD processing functions?
yarn.nodemanager.remote-app-log-dir is set to /tmp/logs.

On Fri, Feb 6, 2015 at 4:14 PM, Ted Yu wrote:
> To add to what Petar said, when YARN log aggregation is enabled, consider
> specifying yarn.nodemanager.remote-app-log-dir, which is where aggregated
> logs are saved.
>
> Cheers
Re: Where can I find logs set inside RDD processing functions?
YARN log aggregation is enabled, and the logs I get through "yarn logs -applicationId <applicationId>" are no different from what I see in the YARN application tracking URL. They still don't contain the log statements above.

On Fri, Feb 6, 2015 at 3:36 PM, Petar Zecevic wrote:
> You can enable YARN log aggregation (set yarn.log-aggregation-enable to true)
> and run
>
> yarn logs -applicationId <applicationId>
>
> after your application finishes.
>
> Or you can look at the logs directly in HDFS under
> /tmp/logs/<user>/logs/<applicationId>/.
Re: Where can I find logs set inside RDD processing functions?
To add to what Petar said: when YARN log aggregation is enabled, consider specifying yarn.nodemanager.remote-app-log-dir, which is where aggregated logs are saved.

Cheers

On Fri, Feb 6, 2015 at 12:36 PM, Petar Zecevic wrote:
> You can enable YARN log aggregation (set yarn.log-aggregation-enable to true)
> and run "yarn logs -applicationId <applicationId>" after your application
> finishes.
>
> Or you can look at the logs directly in HDFS under
> /tmp/logs/<user>/logs/<applicationId>/.
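For reference, the two settings mentioned in this thread live in yarn-site.xml. A minimal sketch of such a fragment, assuming the values discussed above (/tmp/logs is also the stock default for the aggregation directory):

```xml
<!-- Hypothetical yarn-site.xml fragment. The property names are real YARN
     settings; the values shown are just the ones discussed in this thread. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/tmp/logs</value>
</property>
```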
Re: Where can I find logs set inside RDD processing functions?
You can enable YARN log aggregation (set yarn.log-aggregation-enable to true) and run

yarn logs -applicationId <applicationId>

after your application finishes.

Or you can look at the logs directly in HDFS under /tmp/logs/<user>/logs/<applicationId>/.

On 6.2.2015. 19:50, nitinkak001 wrote:
> I am trying to debug my mapPartitions function. [...]
Where can I find logs set inside RDD processing functions?
I am trying to debug my mapPartitions function. Here is the code. I am trying to log in two ways, with log.info() or println(). I am running in yarn-cluster mode. While I can see the logs from the driver code, I am not able to see the logs from the map/mapPartitions functions in the application tracking URL. Where can I find the logs?

var outputRDD = partitionedRDD.mapPartitions(p => {
  val outputList = new ArrayList[scala.Tuple3[Long, Long, Int]]
  p.map({ case (key, value) => {
    log.info("Inside map")
    println("Inside map")
    // outputTuples is defined elsewhere in my code
    for (i <- 0 until outputTuples.size()) {
      val outputRecord = outputTuples.get(i)
      if (outputRecord != null) {
        outputList.add((outputRecord.getCurrRecordProfileID(),
          outputRecord.getWindowRecordProfileID, outputRecord.getScore()))
      }
    }
  }})
  outputList.iterator()
})

Here is my log4j.properties:

log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Where-can-I-find-logs-set-inside-RDD-processing-functions-tp21537.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
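Aside from where YARN puts the logs, one likely reason these statements never show up anywhere: Scala's Iterator.map is lazy, and the snippet above discards the iterator that p.map returns, so the closure body (including log.info and println) may never execute at all. A minimal sketch in plain Scala (no Spark required) showing the same iterator semantics:

```scala
import java.util.ArrayList

// Stand-in for a partition iterator, like the `p` in the question.
val outputList = new ArrayList[(Long, Long, Int)]
val p = Iterator((1L, 10), (2L, 20))

// Pattern from the question: side effects inside p.map, result thrown away.
// Iterator.map is lazy, so this closure never runs.
p.map { case (key, value) =>
  println("Inside map")              // never printed: the mapped iterator is not consumed
  outputList.add((key, key, value))
}
assert(outputList.size() == 0)       // nothing was added, nothing was logged

// A common fix: build the output by transforming the iterator itself,
// so consuming the result forces the closure to run.
val result = p.map { case (key, value) => (key, key, value) }.toList
assert(result.size == 2)
```

The same reasoning applies inside mapPartitions: returning a transformed iterator (rather than a side-effected list) ensures the logging statements actually run when Spark consumes the partition.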