[ https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753542#action_12753542 ]
Sharad Agarwal commented on MAPREDUCE-157: ------------------------------------------ Some comments: - Responsibilities of JobHistory and JobHistoryWriter are not clear. I think these should be merged in single class and name it say JobHistoryManager - Base interface HistoryEvent should be public - Getters in event classes should be public - HISTORY_VERSION should be 1.0 instead of '0.21' - JobHistory#initDone is swallowing the exception - Code for localizing of job conf should be moved out of history - JobHistory has two logEvents methods. That it confusing. Better would be to just have one logEvent method, and no state check in logEvents. JobInProgress will explicitly create/close job history writers - HistoryCleaner thread should have a check as lastRan != 0 and not lastRan == 0. Also it should use doneDirFs - Using: if (jobName == null || jobName.length() == 0) { jobName = "NA"; } is not a nice thing. This was perhaps done for some reason in the old format but in new one not required. we can have empty string in json - file name encoding not required. JobHistoryUtil class can go away - TaskInProgress and JobInProgress do a jobHistory != null check everywhere. This is not elegant. - test case is missed from TestJobHistory#testDoneFolderOnHDFS - EventReader -> better msg in readOneGroup - > throw new IOException("Internal error"); - EventReader -> getHistoryEventType need not be static. It can use the instance parser handle - History Viewer command line has changed. It takes file name instead of output folder. JobClient#displayUsage and commands manual documentation should be corrected. - All public APIs including constructors must be documentated > Job History log file format is not friendly for external tools. > --------------------------------------------------------------- > > Key: MAPREDUCE-157 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-157 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Affects Versions: 0.20.1 > Reporter: Owen O'Malley > Assignee: Jothi Padmanabhan > Fix For: 0.21.0 > > Attachments: mapred-157-10Sep.patch, mapred-157-4Sep.patch, > mapred-157-7Sep-v1.patch, mapred-157-7Sep.patch, mapred-157-prelim.patch, > MAPREDUCE-157-avro.patch > > > Currently, parsing the job history logs with external tools is very difficult > because of the format. The most critical problem is that newlines aren't > escaped in the strings. That makes using tools like grep, sed, and awk very > tricky. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.