> On Aug. 14, 2013, 11:57 p.m., Jarek Cecho wrote:
> > execution/mapreduce/src/main/resources/META-INF/log4j.properties, lines
> > 20-23
> > <https://reviews.apache.org/r/13035/diff/2/?file=330782#file330782line20>
> >
> > I've tried the patch on a real cluster and got the following output (please
> > accept my apologies for the really long text):
> >
> > Task Logs: 'attempt_201308141631_0001_m_000001_0'
> >
> >
> > stdout logs
> >
> >
> > stderr logs
> > 2660 [OutputFormatLoader-consumer] INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor -
> > SqoopOutputFormatLoadExecutor consumer thread is starting
> > 2748 [OutputFormatLoader-consumer] INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor - Running loader
> > class org.apache.sqoop.job.etl.HdfsTextImportLoader
> > 2752 [main] INFO org.apache.sqoop.job.mr.SqoopMapper - Starting
> > progress service
> > 2777 [main] INFO org.apache.sqoop.job.mr.SqoopMapper - Running
> > extractor class org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> > 2782 [pool-2-thread-1] DEBUG org.apache.sqoop.job.mr.ProgressRunnable
> > - Auto-progress thread reporting progress
> > 3969 [main] INFO
> > org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor - Using query:
> > SELECT * FROM text WHERE 100001 <= id AND id < 200001
> > 32122 [main] INFO org.apache.sqoop.job.mr.SqoopMapper - Extractor has
> > finished
> > 32129 [main] INFO org.apache.sqoop.job.mr.SqoopMapper - Stopping
> > progress service
> > 32136 [main] INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor -
> > SqoopOutputFormatLoadExecutor::SqoopRecordWriter is about to be closed
> > 34002 [OutputFormatLoader-consumer] INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor - Loader has finished
> > 34002 [main] INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor -
> > SqoopOutputFormatLoadExecutor::SqoopRecordWriter is closed
> >
> >
> > syslog logs
> > 2013-08-14 16:47:17,950 WARN mapreduce.Counters: Group
> > org.apache.hadoop.mapred.Task$Counter is deprecated. Use
> > org.apache.hadoop.mapreduce.TaskCounter instead
> > 2013-08-14 16:47:19,451 WARN org.apache.hadoop.conf.Configuration:
> > session.id is deprecated. Instead, use dfs.metrics.session-id
> > 2013-08-14 16:47:19,452 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> > Initializing JVM Metrics with processName=MAP, sessionId=
> > 2013-08-14 16:47:20,160 INFO org.apache.hadoop.util.ProcessTree: setsid
> > exited with exit code 0
> > 2013-08-14 16:47:20,172 INFO org.apache.hadoop.mapred.Task: Using
> > ResourceCalculatorPlugin :
> > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@549b6220
> > 2013-08-14 16:47:20,585 INFO org.apache.hadoop.mapred.MapTask:
> > Processing split: org.apache.sqoop.job.mr.SqoopSplit@23de4dd8
> > 2013-08-14 16:47:20,610 INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor:
> > SqoopOutputFormatLoadExecutor consumer thread is starting
> > 2013-08-14 16:47:20,698 INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: Running loader class
> > org.apache.sqoop.job.etl.HdfsTextImportLoader
> > 2013-08-14 16:47:20,702 INFO org.apache.sqoop.job.mr.SqoopMapper:
> > Starting progress service
> > 2013-08-14 16:47:20,727 INFO org.apache.sqoop.job.mr.SqoopMapper:
> > Running extractor class
> > org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> > 2013-08-14 16:47:20,732 DEBUG org.apache.sqoop.job.mr.ProgressRunnable:
> > Auto-progress thread reporting progress
> > 2013-08-14 16:47:21,919 INFO
> > org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor: Using query:
> > SELECT * FROM text WHERE 100001 <= id AND id < 200001
> > 2013-08-14 16:47:50,072 INFO org.apache.sqoop.job.mr.SqoopMapper:
> > Extractor has finished
> > 2013-08-14 16:47:50,079 INFO org.apache.sqoop.job.mr.SqoopMapper:
> > Stopping progress service
> > 2013-08-14 16:47:50,086 INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor:
> > SqoopOutputFormatLoadExecutor::SqoopRecordWriter is about to be closed
> > 2013-08-14 16:47:51,952 INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: Loader has finished
> > 2013-08-14 16:47:51,952 INFO
> > org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor:
> > SqoopOutputFormatLoadExecutor::SqoopRecordWriter is closed
> > 2013-08-14 16:47:51,952 INFO org.apache.hadoop.mapred.Task:
> > Task:attempt_201308141631_0001_m_000001_0 is done. And is in the process of
> > commiting
> > 2013-08-14 16:47:53,201 INFO org.apache.hadoop.mapred.Task: Task
> > attempt_201308141631_0001_m_000001_0 is allowed to commit now
> > 2013-08-14 16:47:53,282 INFO
> > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of
> > task 'attempt_201308141631_0001_m_000001_0' to /user/root/text
> > 2013-08-14 16:47:53,289 INFO org.apache.hadoop.mapred.Task: Task
> > 'attempt_201308141631_0001_m_000001_0' done.
> > 2013-08-14 16:47:53,295 INFO
> > org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater
> > with mapRetainSize=-1 and reduceRetainSize=-1
> >
> > It seems that all the messages are printed twice, once to the syslog
> > and once to the stderr log. As our configuration only configures the
> > stderr log, I'm assuming that a different log4j configuration is being
> > loaded as well (I would expect one from Hadoop for the MR-related log
> > lines). Considering that we already have all the logs in the normal
> > syslog, I'm wondering whether the JIRA is still valid. What do you think,
> > Raghav?
>
> Raghav Gautam wrote:
> Syslog is where all the logging goes and I think this is controlled by
> $HADOOP_CONF_DIR/log4j.properties. The issue with this is that it has logs
> that are completely unrelated to our Sqoop jobs. And since
> $HADOOP_CONF_DIR/log4j.properties is not under Sqoop's control there is not
> much we can do there.
>
> The patch allows a Sqoop job to have its own log4j.properties. This allows
> it to have its own appender and conversion pattern and to print Sqoop's logs
> to stderr exclusively. This would be useful in situations where we need to
> pull these logs from Hadoop and show them to the user to help with debugging.
>
> Jarek Cecho wrote:
> Thank you for the feedback Raghav! Your explanation makes complete sense
> to me. Would you be open to changing the time from "number of milliseconds
> after start" to a normal timestamp? As Hadoop won't be starting all tasks at
> the same time, it will be hard to correlate multiple mappers without
> timestamps (assuming that the clocks are in sync across the Hadoop cluster).
>
> Raghav Gautam wrote:
> Shall I use %d{ISO8601}? It would make the timestamp look like:
> 2013-08-21 13:50:56,421
> (Sorry for the late reply, I have been busy with some other stuff.)
No worries Raghav, thank you for working on this and helping with Sqoop!
The %d{ISO8601} format for the timestamp part seems fine to me.
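
For illustration, a configuration along the following lines would route the
job's log lines to stderr with that timestamp. This is only a sketch assuming
log4j 1.2 property syntax; the appender name "stderr", the INFO root level and
the rest of the pattern are placeholders, not necessarily what the patch
contains.

```properties
# Hypothetical sketch, not the contents of the patch under review.
# Root logger at INFO (assumed level), attached to one appender named "stderr".
log4j.rootLogger=INFO, stderr

# Console appender writing to System.err so the output lands in the
# task's stderr log rather than stdout.
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.Target=System.err

# %d{ISO8601} prints an absolute timestamp (e.g. 2013-08-21 13:50:56,421)
# in place of %r, the number of milliseconds since the JVM started logging.
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.ConversionPattern=%d{ISO8601} [%t] %p %c - %m%n
```

With a pattern like this, a line that currently starts with "2752 [main] INFO"
would instead start with something like "2013-08-14 16:47:20,702 [main] INFO",
which should make it easier to line up output from different mappers.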
- Jarek
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13035/#review25193
-----------------------------------------------------------
On July 30, 2013, 7:37 p.m., Raghav Gautam wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13035/
> -----------------------------------------------------------
>
> (Updated July 30, 2013, 7:37 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-744
> https://issues.apache.org/jira/browse/SQOOP-744
>
>
> Repository: sqoop-sqoop2
>
>
> Description
> -------
>
> Adding log4j.properties for the generated job.
>
>
> Diffs
> -----
>
>
> execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/ConfigurationUtils.java
> f5f6d8e
> execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java
> 59cf391
> execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java
> b31161c
> execution/mapreduce/src/main/resources/META-INF/log4j.properties
> PRE-CREATION
>
> Diff: https://reviews.apache.org/r/13035/diff/
>
>
> Testing
> -------
>
> Manually tested.
>
>
> Thanks,
>
> Raghav Gautam
>
>