[
https://issues.apache.org/jira/browse/KYLIN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728112#comment-16728112
]
Zhong Yanghong commented on KYLIN-3021:
---------------------------------------
With this fix, the job output includes much more detailed error information. For example:
{code}
Counters: 33
  File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=1203010
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=2261
    HDFS: Number of bytes written=4661
    HDFS: Number of read operations=58
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=18
  Job Counters
    Failed map tasks=4
    Killed map tasks=1
    Launched map tasks=11
    Other local map tasks=9
    Rack-local map tasks=2
    Total time spent by all maps in occupied slots (ms)=1038159
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=346053
    Total vcore-seconds taken by all map tasks=346053
    Total megabyte-seconds taken by all map tasks=974485248
  Map-Reduce Framework
    Map input records=6
    Map output records=0
    Input split bytes=1484
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=598
    CPU time spent (ms)=26830
    Physical memory (bytes) snapshot=3199647744
    Virtual memory (bytes) snapshot=25723592704
    Total committed heap usage (bytes)=8977383424
  File Input Format Counters
    Bytes Read=732
  File Output Format Counters
    Bytes Written=720
Job Diagnostics:Task failed task_1544857205985_80511_m_000007
Job failed as tasks failed. failedMaps:1 failedReduces:0
Failure task Diagnostics:
Error: java.lang.IllegalStateException: Table snapshot should be no greater than 300 MB, but
...
...
...
at org.apache.kylin.dict.lookup.SnapshotManager.checkBeforeBuild(SnapshotManager.java:141)
at org.apache.kylin.dict.lookup.SnapshotManager.buildSnapshotOnly(SnapshotManager.java:166)
at org.apache.kylin.engine.mr.steps.BuildDictionaryMapper.buildSnapshot(BuildDictionaryMapper.java:290)
at org.apache.kylin.engine.mr.steps.BuildDictionaryMapper.doCleanup(BuildDictionaryMapper.java:191)
at org.apache.kylin.engine.mr.KylinMapper.cleanup(KylinMapper.java:71)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:149)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
{code}
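
For the email notification, it may help to lift the failed task ID and the first error message out of this output instead of sending the whole dump. The following is only a minimal sketch of that idea, not the code from the fix: the class JobDiagnosticsSummary and its methods are hypothetical names, and the rule for re-joining wrapped message lines is an assumption based on the sample above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Kylin): condenses job diagnostics text
// such as the sample above into a short summary for a notification email.
final class JobDiagnosticsSummary {
    // Matches e.g. "Task failed task_1544857205985_80511_m_000007".
    private static final Pattern FAILED_TASK =
            Pattern.compile("Task failed (task_\\d+_\\d+_[mr]_\\d+)");
    private static final String ERROR_PREFIX = "Error: ";

    // Returns the first failed task ID named in the output, or null if none.
    static String failedTaskId(String output) {
        Matcher m = FAILED_TASK.matcher(output);
        return m.find() ? m.group(1) : null;
    }

    // Returns the text after the first "Error: " prefix, or null if none.
    // Assumption: following lines belong to the same message until a blank
    // line, an elision ("..."), or a stack frame ("at" / "at ...") appears.
    static String firstError(String output) {
        StringBuilder message = null;
        for (String raw : output.split("\\R")) {
            String line = raw.trim();
            if (message == null) {
                if (line.startsWith(ERROR_PREFIX)) {
                    message = new StringBuilder(line.substring(ERROR_PREFIX.length()));
                }
            } else if (line.isEmpty() || line.equals("...")
                    || line.equals("at") || line.startsWith("at ")) {
                break;
            } else {
                message.append(' ').append(line);
            }
        }
        return message == null ? null : message.toString();
    }

    // Builds a two-line summary, or a fixed notice when nothing was found.
    static String summarize(String output) {
        String task = failedTaskId(output);
        String error = firstError(output);
        if (task == null && error == null) {
            return "No failure diagnostics found";
        }
        return "Failed task: " + (task == null ? "unknown" : task)
                + "\nError: " + (error == null ? "unknown" : error);
    }

    public static void main(String[] args) {
        String sample = "Job Diagnostics:Task failed task_1544857205985_80511_m_000007\n"
                + "Failure task Diagnostics:\n"
                + "Error: java.lang.IllegalStateException: Table snapshot should be no greater\n"
                + "than 300 MB, but\n"
                + "...\n";
        System.out.println(summarize(sample));
    }
}
```

The stop condition is deliberately crude: a message whose continuation line happens to begin with "at " would be cut short, so the full diagnostics should still be kept in kylin.log.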
> Check MapReduce job failed reason and include the diagnostics into email
> notification
> -------------------------------------------------------------------------------------
>
> Key: KYLIN-3021
> URL: https://issues.apache.org/jira/browse/KYLIN-3021
> Project: Kylin
> Issue Type: Improvement
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Priority: Major
> Fix For: v2.6.0
>
>
> In the current kylin.log and the failed-job email notification, there is no
> detailed error information explaining why a MapReduce job failed. We only log
> "no counters for job" or "Counters: 0".
>
> 2017-08-03 18:24:10,197 WARN [pool-10-thread-17] common.HadoopCmdOutput:90 : no counters for job job_1497957612021_709431
>
> 2017-08-03 15:08:02,351 DEBUG [pool-10-thread-3] common.HadoopCmdOutput:95 : Counters: 0
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)