[
https://issues.apache.org/jira/browse/HADOOP-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Runping Qi updated HADOOP-2978:
-------------------------------
Description:
For the lines like:
Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820"
JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0"
FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters
.Launched reduce tasks=15Map-Reduce Framework.Map input
records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce
Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output
bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce
Framework.Combine output records=0,Map-Reduce Framework.Reduce input
groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce
Framework.Reduce output records=2894276"
The extracted value for COUNTERS is
Job Counters .Launched map tasks
which is clearly wrong.
was:
For the lines like:
Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820"
JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0"
FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters
.Launched reduce tasks=15Map-Reduce Framework.Map input
records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce
Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output
bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce
Framework.Combine output records=0,Map-Reduce Framework.Reduce input
groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce
Framework.Reduce output records=2894276"
The extracted value for COUNTERS is
Job Counters .Launched map tasks
which is clearly wrong.
Summary: JobHistory log format for COUNTER is ambiguous (was:
JobHistory parser cannot extract the value for COUNTERS)
Each item in a job history log line is a key and a value separated by "=".
However, the value for the item "COUNTERS" itself contains "=", which causes
the parser to misbehave.
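To illustrate the failure mode, here is a minimal sketch. It is not the actual JobHistory parser code; the class HistoryItemSplitDemo and its two methods are hypothetical, and I am assuming the parser splits an item on every "=".

```java
// Hypothetical illustration only: not the real JobHistory parser.
class HistoryItemSplitDemo {

    // Assumed behavior: split the item on every "=" and keep the second piece.
    static String naiveValue(String item) {
        String[] parts = item.split("=");
        return parts[1].replace("\"", "");
    }

    // Alternative: split on the first "=" only, then strip the surrounding quotes.
    static String firstEqualsValue(String item) {
        int idx = item.indexOf('=');
        String raw = item.substring(idx + 1);
        return raw.substring(1, raw.length() - 1);
    }

    public static void main(String[] args) {
        String item = "COUNTERS=\"Job Counters .Launched map tasks=24,"
                + "Job Counters .Launched reduce tasks=15\"";
        // prints: Job Counters .Launched map tasks
        System.out.println(naiveValue(item));
        // prints: Job Counters .Launched map tasks=24,Job Counters .Launched reduce tasks=15
        System.out.println(firstEqualsValue(item));
    }
}
```

The first method gives exactly the truncated value reported here; the second shows that the full value is recoverable if only the first "=" is treated as the separator.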
For the lines like:
Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820"
JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0"
FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters
.Launched reduce tasks=15Map-Reduce Framework.Map input
records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce
Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output
bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce
Framework.Combine output records=0,Map-Reduce Framework.Reduce input
groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce
Framework.Reduce output records=2894276"
The extracted value for COUNTERS is
Job Counters .Launched map tasks
which is clearly wrong.
The expected value is:
Job Counters .Launched map tasks=24,Job Counters .Launched reduce
tasks=15Map-Reduce Framework.Map input records=2894276,Map-Reduce Framework.Map
output records=2894276,Map-Reduce Framework.Map input
bytes=3227015845,Map-Reduce Framework.Map output bytes=3232268034,Map-Reduce
Framework.Combine input records=0,Map-Reduce Framework.Combine output
records=0,Map-Reduce Framework.Reduce input groups=2526981,Map-Reduce
Framework.Reduce input records=2894276,Map-Reduce Framework.Reduce output
records=2894276
Clearly, the "=" chars in the value caused the confusion.
The "=" chars in the value are added by the makeCompactString method of the
Counters class, as separators between a counter name and its value.
I suggest we use the colon char (":") as the separator instead.
I'll attach a patch shortly.
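A rough sketch of the suggested format follows. This is not the patch itself: ColonCompactStringDemo is a hypothetical stand-in for the Counters class, and its makeCompactString takes a plain map instead of the real counter groups. It also assumes that counter names never contain ":" or ","; otherwise the format would still be ambiguous.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in for Counters: not the real Hadoop class.
class ColonCompactStringDemo {

    // Joins name/value pairs with ":" and separates the pairs with ",".
    static String makeCompactString(Map<String, Long> counters) {
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, Long> entry : counters.entrySet()) {
            joiner.add(entry.getKey() + ":" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("Job Counters .Launched map tasks", 24L);
        counters.put("Job Counters .Launched reduce tasks", 15L);

        String item = "COUNTERS=\"" + makeCompactString(counters) + "\"";
        // prints: COUNTERS="Job Counters .Launched map tasks:24,Job Counters .Launched reduce tasks:15"
        System.out.println(item);
        // The item now contains a single "=", so splitting on "=" gives 2 pieces.
        System.out.println(item.split("=").length);
    }
}
```

With ":" inside the value, the only "=" left in the item is the one between the key and the quoted value, so a parser that splits on "=" sees the whole counter string.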
> JobHistory log format for COUNTER is ambiguous
> -----------------------------------------------
>
> Key: HADOOP-2978
> URL: https://issues.apache.org/jira/browse/HADOOP-2978
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.16.0
> Reporter: Runping Qi
>
> For the lines like:
> Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820"
> JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0"
> FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters
> .Launched reduce tasks=15Map-Reduce Framework.Map input
> records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce
> Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output
> bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce
> Framework.Combine output records=0,Map-Reduce Framework.Reduce input
> groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce
> Framework.Reduce output records=2894276"
> The extracted value for COUNTERS is
> Job Counters .Launched map tasks
> which is clearly wrong.