[ https://issues.apache.org/jira/browse/HADOOP-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-2403:
--------------------------------------------
Attachment: patch-2403.txt
The existing history parsing code already handles newlines inside values.
The parsing problem occurs when the character *"* is followed by *\n*, because
a value is not allowed to contain *"*. Since JobHistory looks for the
*KEY="VALUE"* pattern when parsing keys and values, parsing fails if a value
contains *"* and *=*.
The attached patch escapes *"* and *=* in the value before logging it. The
regular expression for VALUE is modified to allow any character other than a
quote, while still allowing escaped quotes. After the value is parsed, both
*"* and *=* are unescaped and the result is returned.
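For readers without the patch at hand, the escape/parse/unescape round trip described above could look roughly like the sketch below. This is not the code in patch-2403.txt: the class name HistoryEscapeSketch, the method names, and the exact regular expression are made up for illustration, and the sketch also escapes the backslash itself, which is an assumption added so that unescaping stays unambiguous.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual JobHistory code or the patch.
class HistoryEscapeSketch {
    // KEY="VALUE", where VALUE is any run of characters that are neither a
    // quote nor a backslash, or a backslash followed by any single character
    // (so \" and \= are accepted inside the value). Newlines are allowed.
    private static final Pattern PAIR =
        Pattern.compile("(\\w+)=\"((?:[^\"\\\\]|\\\\.)*)\"", Pattern.DOTALL);

    // Escape the value before it is written to the history file.
    // Backslash is escaped first (an assumption of this sketch).
    static String escape(String value) {
        return value.replace("\\", "\\\\")
                    .replace("\"", "\\\"")
                    .replace("=", "\\=");
    }

    // Reverse escape(): drop each escaping backslash and keep the next char.
    static String unescape(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\' && i + 1 < value.length()) {
                out.append(value.charAt(++i));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Extract every KEY="VALUE" pair from one logged record.
    static Map<String, String> parse(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            result.put(m.group(1), unescape(m.group(2)));
        }
        return result;
    }
}
```

With this shape, a writer would log ERROR="..." using escape(message), and parse(record).get("ERROR") would return the original message, including any quotes, equals signs, and newlines it contained.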
> JobHistory log files contain data that cannot be parsed by
> org.apache.hadoop.mapred.JobHistory
> ----------------------------------------------------------------------------------------------
>
> Key: HADOOP-2403
> URL: https://issues.apache.org/jira/browse/HADOOP-2403
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
> Assignee: Amareshwari Sriramadasu
> Priority: Critical
> Fix For: 0.19.0
>
> Attachments: EncodeDecode.java, patch-2403.txt, patch-2403.txt
>
>
> When some tasks fail, the job tracker writes a line to the history file
> with the error message.
> However, the error message may break the history file format, choking
> the history parser. Here is an example:
> MapAttempt TASK_TYPE="MAP" TASKID="tip_200712102254_0001_m_000090"
> TASK_ATTEMPT_ID="task_200712102254_0001_m_000090_0" TASK_STATUS="FAILED"
> FINISH_TIME="1197327293253" HOSTNAME="XXXX:50050"
> ERROR="java.lang.IllegalArgumentException: Trouble to get key or value (<,>
> substituted by null
> . Key XML-Ori:
> <Root>