Re: Review Request 13035: SQOOP-744: log4j configuration for generated mapreduce job

Jarek Cecho Wed, 21 Aug 2013 22:21:44 -0700


> On Aug. 14, 2013, 11:57 p.m., Jarek Cecho wrote:
> > execution/mapreduce/src/main/resources/META-INF/log4j.properties, lines 
> > 20-23
> > <https://reviews.apache.org/r/13035/diff/2/?file=330782#file330782line20>
> >
> >     I've tried the patch on a real cluster and got the following output 
> > (please accept my apologies for the really long text):
> >     
> >     Task Logs: 'attempt_201308141631_0001_m_000001_0'
> >     
> >     
> >     stdout logs
> >     
> >     
> >     stderr logs
> >     2660 [OutputFormatLoader-consumer] INFO  
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - 
> > SqoopOutputFormatLoadExecutor consumer thread is starting
> >     2748 [OutputFormatLoader-consumer] INFO  
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Running loader 
> > class org.apache.sqoop.job.etl.HdfsTextImportLoader
> >     2752 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Starting 
> > progress service
> >     2777 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Running 
> > extractor class org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> >     2782 [pool-2-thread-1] DEBUG org.apache.sqoop.job.mr.ProgressRunnable  
> > - Auto-progress thread reporting progress
> >     3969 [main] INFO  
> > org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor  - Using query: 
> > SELECT * FROM text WHERE 100001 <= id AND id < 200001
> >     32122 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Extractor has 
> > finished
> >     32129 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Stopping 
> > progress service
> >     32136 [main] INFO  
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - 
> > SqoopOutputFormatLoadExecutor::SqoopRecordWriter is about to be closed
> >     34002 [OutputFormatLoader-consumer] INFO  
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Loader has finished
> >     34002 [main] INFO  
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - 
> > SqoopOutputFormatLoadExecutor::SqoopRecordWriter is closed
> >     
> >     
> >     syslog logs
> >     2013-08-14 16:47:17,950 WARN mapreduce.Counters: Group 
> > org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
> > org.apache.hadoop.mapreduce.TaskCounter instead
> >     2013-08-14 16:47:19,451 WARN org.apache.hadoop.conf.Configuration: 
> > session.id is deprecated. Instead, use dfs.metrics.session-id
> >     2013-08-14 16:47:19,452 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: 
> > Initializing JVM Metrics with processName=MAP, sessionId=
> >     2013-08-14 16:47:20,160 INFO org.apache.hadoop.util.ProcessTree: setsid 
> > exited with exit code 0
> >     2013-08-14 16:47:20,172 INFO org.apache.hadoop.mapred.Task:  Using 
> > ResourceCalculatorPlugin : 
> > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@549b6220
> >     2013-08-14 16:47:20,585 INFO org.apache.hadoop.mapred.MapTask: 
> > Processing split: org.apache.sqoop.job.mr.SqoopSplit@23de4dd8
> >     2013-08-14 16:47:20,610 INFO 
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: 
> > SqoopOutputFormatLoadExecutor consumer thread is starting
> >     2013-08-14 16:47:20,698 INFO 
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: Running loader class 
> > org.apache.sqoop.job.etl.HdfsTextImportLoader
> >     2013-08-14 16:47:20,702 INFO org.apache.sqoop.job.mr.SqoopMapper: 
> > Starting progress service
> >     2013-08-14 16:47:20,727 INFO org.apache.sqoop.job.mr.SqoopMapper: 
> > Running extractor class 
> > org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> >     2013-08-14 16:47:20,732 DEBUG org.apache.sqoop.job.mr.ProgressRunnable: 
> > Auto-progress thread reporting progress
> >     2013-08-14 16:47:21,919 INFO 
> > org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor: Using query: 
> > SELECT * FROM text WHERE 100001 <= id AND id < 200001
> >     2013-08-14 16:47:50,072 INFO org.apache.sqoop.job.mr.SqoopMapper: 
> > Extractor has finished
> >     2013-08-14 16:47:50,079 INFO org.apache.sqoop.job.mr.SqoopMapper: 
> > Stopping progress service
> >     2013-08-14 16:47:50,086 INFO 
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: 
> > SqoopOutputFormatLoadExecutor::SqoopRecordWriter is about to be closed
> >     2013-08-14 16:47:51,952 INFO 
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: Loader has finished
> >     2013-08-14 16:47:51,952 INFO 
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: 
> > SqoopOutputFormatLoadExecutor::SqoopRecordWriter is closed
> >     2013-08-14 16:47:51,952 INFO org.apache.hadoop.mapred.Task: 
> > Task:attempt_201308141631_0001_m_000001_0 is done. And is in the process of 
> > commiting
> >     2013-08-14 16:47:53,201 INFO org.apache.hadoop.mapred.Task: Task 
> > attempt_201308141631_0001_m_000001_0 is allowed to commit now
> >     2013-08-14 16:47:53,282 INFO 
> > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of 
> > task 'attempt_201308141631_0001_m_000001_0' to /user/root/text
> >     2013-08-14 16:47:53,289 INFO org.apache.hadoop.mapred.Task: Task 
> > 'attempt_201308141631_0001_m_000001_0' done.
> >     2013-08-14 16:47:53,295 INFO 
> > org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater 
> > with mapRetainSize=-1 and reduceRetainSize=-1
> >     
> >     It seems that all the messages are printed twice, once to the 
> > syslog and once to the error (stderr) log. As our configuration only 
> > configures the error log, I'm assuming that a different log4j 
> > configuration is being loaded as well (I would expect one from Hadoop for 
> > the MR-related log lines). Considering that we do have all logs in the 
> > normal syslog, I'm wondering if the JIRA is still valid. What do you 
> > think, Raghav?
> 
> Raghav Gautam wrote:
>     Syslog is where all the logging goes and I think this is controlled by 
> $HADOOP_CONF_DIR/log4j.properties. The issue with this is that it has logs 
> that are completely unrelated to our Sqoop jobs. And since 
> $HADOOP_CONF_DIR/log4j.properties is not under Sqoop's control there is not 
> much we can do there.
>     
>     The patch allows the Sqoop job to have its own log4j.properties. This 
> allows it to have its own appender and conversion pattern and to print 
> Sqoop's logs to stderr exclusively. This would be useful in situations where 
> we need to pull these logs from Hadoop and show them to the user to help 
> with debugging.
> 
> Jarek Cecho wrote:
>     Thank you for the feedback Raghav! Your explanation makes complete sense 
> to me. Would you be open to changing the time from "number of milliseconds 
> since start" to a normal timestamp? As Hadoop won't be starting all tasks at 
> the same time, it will be hard to correlate multiple mappers without 
> timestamps (assuming that the clocks are synchronized across the Hadoop 
> cluster).
> 
> Raghav Gautam wrote:
>     Shall I use %d{ISO8601}? It would make the timestamp look like:
>     2013-08-21 13:50:56,421
>     (sorry for the late reply, I have been busy with some other stuff)


No worries Raghav, thank you for working on this and helping with Sqoop!

The %d{ISO8601} format for the timestamp part seems fine to me.
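
For reference, a rough sketch of how the pattern might look in a log4j 1.x properties file. The appender name "stderr", the DEBUG level, and the rest of the pattern are assumptions chosen to resemble the task output quoted above; they are not taken from the patch itself.

```properties
# Hypothetical sketch only: appender name, level, and pattern are illustrative
# assumptions, not the contents of the patch under review.
log4j.rootLogger=DEBUG, stderr

# Send everything to the task's stderr log.
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.Target=System.err
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout

# %d{ISO8601} prints an absolute timestamp such as 2013-08-21 13:50:56,421
# in place of %r (milliseconds elapsed since the JVM's logging started).
log4j.appender.stderr.layout.ConversionPattern=%d{ISO8601} [%t] %-5p %c %x - %m%n
```

With an absolute timestamp in each line, stderr logs from mappers that started at different times can be lined up against each other and against the syslog entries.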


- Jarek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13035/#review25193
-----------------------------------------------------------


On July 30, 2013, 7:37 p.m., Raghav Gautam wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13035/
> -----------------------------------------------------------
> 
> (Updated July 30, 2013, 7:37 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-744
>     https://issues.apache.org/jira/browse/SQOOP-744
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> Adding log4j.properties for the generated job.
> 
> 
> Diffs
> -----
> 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/ConfigurationUtils.java f5f6d8e 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 59cf391 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java b31161c 
>   execution/mapreduce/src/main/resources/META-INF/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13035/diff/
> 
> 
> Testing
> -------
> 
> Manually tested.
> 
> 
> Thanks,
> 
> Raghav Gautam
> 
>
