[ 
https://issues.apache.org/jira/browse/HADOOP-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621734#action_12621734
 ] 

Amar Kamat commented on HADOOP-2403:
------------------------------------

I think we should fix the general problems with history parsing, which are:
1) Detect whether a record is complete. The client can fail while writing 
to the history, and the failure can fall exactly on a key-val boundary.
2) Detect whether the key-val pairs are well formed. The error message can 
contain tabs and other characters like {{"}} that break history parsing. 
Currently a tab is the delimiter for records and a {{"}} is used for value 
encapsulation. Other strings in the history, such as counter names, can 
contain these characters too.
These problems can also cause HADOOP-3245 to fail.

I would go for having a record _delimiter_ like a {{.}} (dot) to detect whether 
a record is complete. Incomplete records should not be parsed and 
should be ignored. We also need to make sure that the characters used 
as delimiters ({{.}}, {{"}}, tab) do not occur in a _value_.
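
A minimal sketch of this idea, not the attached EncodeDecode.java or the patch. It assumes that key-val pairs are separated by tabs, that a record ends with a tab followed by a {{.}}, and that delimiter characters inside a value are backslash-escaped. The class name, the {{RECORD_TYPE}} key, and the choice to escape newlines as well are hypothetical and only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not the JobHistory implementation.
class HistoryRecordSketch {
    // Assumed end-of-record marker: a tab followed by a dot.
    static final String TERMINATOR = "\t.";

    // Backslash-escape every character that has a structural meaning,
    // so a raw tab, quote, dot or newline never appears inside a value.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\\': case '"': case '.': sb.append('\\').append(c); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Inverse of escape().
    static String unescape(String raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(++i);
                switch (next) {
                    case 't': sb.append('\t'); break;
                    case 'n': sb.append('\n'); break;
                    case 'r': sb.append('\r'); break;
                    default: sb.append(next); // escaped \, " or .
                }
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Write one record: TYPE <tab> KEY="value" ... <tab> .
    static String format(String type, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder(type);
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append('\t').append(e.getKey())
              .append("=\"").append(escape(e.getValue())).append('"');
        }
        return sb.append(TERMINATOR).toString();
    }

    // Returns empty for incomplete or malformed records so callers can skip them.
    static Optional<Map<String, String>> parse(String line) {
        if (!line.endsWith(TERMINATOR)) {
            return Optional.empty(); // writer died before finishing the record
        }
        String body = line.substring(0, line.length() - TERMINATOR.length());
        String[] parts = body.split("\t");
        Map<String, String> result = new LinkedHashMap<>();
        result.put("RECORD_TYPE", parts[0]); // hypothetical key for the record type
        for (int i = 1; i < parts.length; i++) {
            String field = parts[i];
            int eq = field.indexOf('=');
            boolean wellFormed = eq > 0 && field.length() >= eq + 3
                    && field.charAt(eq + 1) == '"' && field.endsWith("\"");
            if (!wellFormed) {
                return Optional.empty();
            }
            String raw = field.substring(eq + 2, field.length() - 1);
            result.put(field.substring(0, eq), unescape(raw));
        }
        return Optional.of(result);
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("TASK_STATUS", "FAILED");
        fields.put("ERROR", "bad \"value\"\n\t<Root>.");
        String line = format("MapAttempt", fields);
        System.out.println(parse(line).isPresent());
        System.out.println(parse(line.substring(0, line.length() - 5)).isPresent());
    }
}
```

Because escaped values never contain a raw tab, a line that ends with tab-dot can only be a fully written record, so any truncated prefix is rejected rather than half-parsed.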
----
Thoughts?

> JobHistory log files contain data that cannot be parsed by 
> org.apache.hadoop.mapred.JobHistory
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2403
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2403
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Critical
>             Fix For: 0.19.0
>
>         Attachments: EncodeDecode.java, patch-2403.txt
>
>
> When a task fails, the job tracker writes a line to the history file 
> with the error message.
> However, the error message may break the history file format, choking 
> the history parser. Here is an example:
> MapAttempt TASK_TYPE="MAP" TASKID="tip_200712102254_0001_m_000090" 
> TASK_ATTEMPT_ID="task_200712102254_0001_m_000090_0" TASK_STATUS="FAILED" 
> FINISH_TIME="1197327293253" HOSTNAME="XXXX:50050" 
> ERROR="java.lang.IllegalArgumentException: Trouble to get key or value (<,> 
> substituted by null 
> . Key XML-Ori:
>         <Root>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
