[ 
https://issues.apache.org/jira/browse/HADOOP-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Runping Qi updated HADOOP-2978:
-------------------------------

    Description: 

For the lines like: 

Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820" 
JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0" 
FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters 
.Launched reduce tasks=15Map-Reduce Framework.Map input 
records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce 
Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output 
bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce 
Framework.Combine output records=0,Map-Reduce Framework.Reduce input 
groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce 
Framework.Reduce output records=2894276"

The extracted value for COUNTERS is 

Job Counters .Launched map tasks


which is clearly wrong.



        Summary: JobHistory log format for COUNTER is ambiguous   (was: 
JobHistory parser cannot extract the value for COUNTERS)


Each item in a job history log line is a key/value pair in which the key and 
the value are separated by "=".
However, the value of the item "COUNTERS" itself contains "=", which causes the 
parser to misbehave.

For the lines like: 

Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820" 
JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0" 
FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters 
.Launched reduce tasks=15Map-Reduce Framework.Map input 
records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce 
Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output 
bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce 
Framework.Combine output records=0,Map-Reduce Framework.Reduce input 
groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce 
Framework.Reduce output records=2894276"

The extracted value for COUNTERS is 

Job Counters .Launched map tasks


which is clearly wrong.

The expected value is:

Job Counters .Launched map tasks=24,Job Counters .Launched reduce 
tasks=15Map-Reduce Framework.Map input records=2894276,Map-Reduce Framework.Map 
output records=2894276,Map-Reduce Framework.Map input 
bytes=3227015845,Map-Reduce Framework.Map output bytes=3232268034,Map-Reduce 
Framework.Combine input records=0,Map-Reduce Framework.Combine output 
records=0,Map-Reduce Framework.Reduce input groups=2526981,Map-Reduce 
Framework.Reduce input records=2894276,Map-Reduce Framework.Reduce output 
records=2894276

Clearly, the "=" characters in the value caused the confusion.
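
To make the failure concrete, here is a minimal Java sketch of how a parser that 
splits each item on every "=" ends up with the truncated value. It is not the 
actual JobHistory parser; the class NaiveHistoryLineParser, its two methods, and 
the ITEM pattern are hypothetical names used only for illustration. The second 
method shows that taking the whole quoted value yields the expected result.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT the real JobHistory parser.
class NaiveHistoryLineParser {
    // Matches KEY="value" items; the value may hold anything except a double quote.
    private static final Pattern ITEM = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    // Splits each item on every "=" and keeps only the second piece,
    // so a value that contains "=" is cut off at its first "=".
    static Map<String, String> parseNaive(String line) {
        Map<String, String> items = new LinkedHashMap<>();
        Matcher m = ITEM.matcher(line);
        while (m.find()) {
            String[] parts = m.group(0).split("=");
            items.put(parts[0], parts[1].replace("\"", ""));
        }
        return items;
    }

    // Takes everything between the quotes as the value, "=" included.
    static Map<String, String> parseQuotedValue(String line) {
        Map<String, String> items = new LinkedHashMap<>();
        Matcher m = ITEM.matcher(line);
        while (m.find()) {
            items.put(m.group(1), m.group(2));
        }
        return items;
    }

    public static void main(String[] args) {
        String line = "Job JOBID=\"job_200803072233_0001\" COUNTERS=\"Job Counters "
                + ".Launched map tasks=24,Job Counters .Launched reduce tasks=15\"";
        System.out.println(parseNaive(line).get("COUNTERS"));
        System.out.println(parseQuotedValue(line).get("COUNTERS"));
    }
}
```

With the shortened sample line in main, the first call returns only 
"Job Counters .Launched map tasks", while the second returns the full counter 
string.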

The "=" characters in the value are added by the makeCompactString method of the 
Counters class, 
which uses them as separators between a counter name and its value.

I suggest we use the colon character (":") as the separator instead.
I'll attach a patch shortly.
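
As a rough sketch of the idea (not the patch itself, and not the real 
Counters.makeCompactString), the following Java snippet joins counter names and 
values with a configurable separator. The class CompactCounterString and its 
makeCompact method are hypothetical, and the counters are modeled as a plain 
Map<String, Long> as an assumption for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration; not the real Counters class.
class CompactCounterString {
    // Emits "name<separator>value" pairs joined by commas, in map iteration order.
    static String makeCompact(Map<String, Long> counters, char separator) {
        StringJoiner joined = new StringJoiner(",");
        for (Map.Entry<String, Long> entry : counters.entrySet()) {
            joined.add(entry.getKey() + separator + entry.getValue());
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("Job Counters .Launched map tasks", 24L);
        counters.put("Job Counters .Launched reduce tasks", 15L);
        // With ':' the compact string no longer contains "=".
        System.out.println(makeCompact(counters, ':'));
    }
}
```

With ':' as the separator, the resulting value contains no "=", so splitting a 
log item on "=" no longer cuts it short. A counter name that itself contained 
":" or "," would still be ambiguous under this scheme.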
 

> JobHistory log format for COUNTER is ambiguous
> ----------------------------------------------
>
>                 Key: HADOOP-2978
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2978
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Runping Qi
>
> For the lines like: 
> Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820" 
> JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0" 
> FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters 
> .Launched reduce tasks=15Map-Reduce Framework.Map input 
> records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce 
> Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output 
> bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce 
> Framework.Combine output records=0,Map-Reduce Framework.Reduce input 
> groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce 
> Framework.Reduce output records=2894276"
> The extracted value for COUNTERS is 
> Job Counters .Launched map tasks
> which is clearly wrong.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
