[ https://issues.apache.org/jira/browse/HADOOP-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644637#action_12644637 ]
Matei Zaharia commented on HADOOP-3970:
---------------------------------------

Will this be a standardized format for counters in future Hadoop releases, or are there other issues you guys think will need to be fixed? I've been working on some scripts for parsing job history logs to compute statistics about how a Hadoop cluster is being used (see https://issues.apache.org/jira/browse/HADOOP-3708) and I've seen small things being different in different versions of Hadoop. I think it would be beneficial to choose a format and make it standard for Hadoop 1.0, because I imagine others are also interested in being able to parse out info from the job history logs.

> Counters written to the job history cannot be recovered back
> ------------------------------------------------------------
>
>                 Key: HADOOP-3970
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3970
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3970-v1.patch, HADOOP-3970-v2.patch, HADOOP-3970-v3.patch, HADOOP-3970-v4.1.patch, HADOOP-3970-v4.patch
>
>
> Counters that are written to the JobHistory are stringified using
> {{Counters.makeCompactString()}}. The format in which this api converts the
> counter into a string is _groupname.countername:value_. The problem is that
> _groupname_ and _countername_ can contain a '.' and hence recovering the
> counter becomes difficult. Since JobHistory can be used for various purposes,
> reconstructing the counter object back might be useful. One such usecase is
> HADOOP-3245. There should be some way to recover the counter object back from
> its string representation and also to keep the string version readable.
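
To make the ambiguity described in the issue concrete, here is a minimal Python sketch of what a log-parsing script runs into. It only imitates the groupname.countername:value layout; the helper functions and the group and counter names are hypothetical, and this is neither the actual Counters.makeCompactString() implementation nor the format introduced by the attached patches.

```python
def make_compact(group, counter, value):
    """Hypothetical stand-in for the groupname.countername:value layout."""
    return f"{group}.{counter}:{value}"


def candidate_parses(entry):
    """Return every (group, counter, value) split consistent with the entry."""
    # The value is assumed to be the integer after the last ':'.
    names, _, value = entry.rpartition(":")
    parses = []
    for i, ch in enumerate(names):
        # Any '.' with text on both sides could be the group/counter separator.
        if ch == "." and 0 < i < len(names) - 1:
            parses.append((names[:i], names[i + 1:], int(value)))
    return parses


# Made-up names: a dotted group name and a dotted counter name.
entry = make_compact("org.example.MyCounters", "records.read", 42)
parses = candidate_parses(entry)

for group, counter, value in parses:
    print(f"group={group!r} counter={counter!r} value={value}")
```

Every dot in the name part is a possible boundary, so the original (group, counter) pair is only one of several equally valid readings. A recoverable format would have to escape or delimit the names so that exactly one split is possible.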