Re: Where can I find logs set inside RDD processing functions?

2015-02-06 Thread Nitin kak
yarn.nodemanager.remote-app-log-dir is set to /tmp/logs

On Fri, Feb 6, 2015 at 4:14 PM, Ted Yu wrote:

> To add to what Petar said: when YARN log aggregation is enabled, consider
> specifying yarn.nodemanager.remote-app-log-dir, which is where aggregated
> logs are saved.
>
> Cheers
>


Re: Where can I find logs set inside RDD processing functions?

2015-02-06 Thread Nitin kak
YARN log aggregation is enabled, and the logs I get through "yarn logs
-applicationId <applicationId>" are no different from what I get through the
YARN application tracking URL. They still don't have the above logs.

On Fri, Feb 6, 2015 at 3:36 PM, Petar Zecevic wrote:

>
> You can enable YARN log aggregation (yarn.log-aggregation-enable to true)
> and execute the command
> yarn logs -applicationId <applicationId>
> after your application finishes.
>
> Or you can look at them directly in HDFS in
> /tmp/logs/<user>/logs/<applicationId>/
>


Re: Where can I find logs set inside RDD processing functions?

2015-02-06 Thread Ted Yu
To add to what Petar said: when YARN log aggregation is enabled, consider
specifying yarn.nodemanager.remote-app-log-dir, which is where aggregated
logs are saved.

Cheers

On Fri, Feb 6, 2015 at 12:36 PM, Petar Zecevic wrote:

>
> You can enable YARN log aggregation (yarn.log-aggregation-enable to true)
> and execute the command
> yarn logs -applicationId <applicationId>
> after your application finishes.
>
> Or you can look at them directly in HDFS in
> /tmp/logs/<user>/logs/<applicationId>/
>


Re: Where can I find logs set inside RDD processing functions?

2015-02-06 Thread Petar Zecevic


You can enable YARN log aggregation (set yarn.log-aggregation-enable to
true) and execute the command

yarn logs -applicationId <applicationId>

after your application finishes.

Or you can look at them directly in HDFS in
/tmp/logs/<user>/logs/<applicationId>/
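The two properties discussed in this thread live in yarn-site.xml on the cluster nodes. A minimal sketch of both settings (the /tmp/logs value is only an example; your cluster may use a different aggregation directory):

```xml
<configuration>
  <!-- Collect container logs into HDFS after each application finishes -->
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <!-- HDFS directory under which aggregated container logs are stored -->
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>
</configuration>
```

The NodeManagers need a restart for changes here to take effect.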




Where can I find logs set inside RDD processing functions?

2015-02-06 Thread nitinkak001
I am trying to debug my mapPartitions function. Here is the code. There are
two ways I am trying to log: log.info() and println(). I am running in
yarn-cluster mode. While I can see the logs from the driver code, I am not
able to see the logs from the map and mapPartitions functions in the
application tracking URL. Where can I find those logs?

var outputRDD = partitionedRDD.mapPartitions(p => {
  val outputList = new ArrayList[scala.Tuple3[Long, Long, Int]]
  p.map({ case (key, value) => {
    log.info("Inside map")
    println("Inside map")
    for (i <- 0 until outputTuples.size()) {
      val outputRecord = outputTuples.get(i)
      if (outputRecord != null) {
        outputList.add((outputRecord.getCurrRecordProfileID(),
          outputRecord.getWindowRecordProfileID, outputRecord.getScore()))
      }
    }
  }})
  outputList.iterator()
})
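Apart from where the output goes, there is a second reason the statements above may never appear: the result of p.map is discarded, so the lazy iterator is never consumed and its body (including log.info/println) never executes. A minimal sketch of the eager version using foreach, with hypothetical sample tuples standing in for the original outputTuples logic:

```scala
import java.util.ArrayList
import scala.collection.JavaConverters._

object MapPartitionsSketch {
  // Stand-in for the partition iterator `p` that Spark hands to
  // mapPartitions; the (key, value) pairs here are just sample data.
  def processPartition(p: Iterator[(Long, Long)]): Iterator[(Long, Long, Int)] = {
    val outputList = new ArrayList[(Long, Long, Int)]()
    // foreach consumes the iterator eagerly; a p.map whose result is
    // thrown away is never evaluated, so nothing inside it ever runs.
    p.foreach { case (key, value) =>
      println(s"Inside map: $key") // goes to the executor's stdout, not the driver's
      outputList.add((key, value, 1)) // add one Tuple3, not three separate arguments
    }
    outputList.iterator().asScala
  }
}
```

In yarn-cluster mode, any println output that does run ends up in each executor container's stdout file rather than in the driver log.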

Here is my log4j.properties

log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
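Note that a log4j.properties like the one above only takes effect on the driver unless it is also shipped to the executor JVMs. In yarn-cluster mode that is commonly done at submit time, roughly like this (the path, class name, and jar below are placeholders):

```
# Ship the log4j config into each container and point both JVMs at it
spark-submit \
  --master yarn-cluster \
  --files /path/to/log4j.properties \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.example.MyApp myapp.jar
```

--files distributes the file to each container's working directory, which is why the -D flag can reference it by bare name.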




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Where-can-I-find-logs-set-inside-RDD-processing-functions-tp21537.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org