[ https://issues.apache.org/jira/browse/HADOOP-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676183#action_12676183 ]
Amar Kamat commented on HADOOP-5306:
------------------------------------
Tasks running on a lost tracker move to the KILLED_UNCLEAN state. In this
state, a task-level cleanup attempt (with the same id) is launched on a
different tracker. Once this cleanup attempt returns, the attempt is marked
KILLED_CLEAN (i.e. KILLED), and the jobtracker then tries to kill the attempt
using the lost tracker's information. That information has already been
deleted, so the port information is missing. The port information is of no
use to the jobtracker anyway, since the tracker is lost, so we can ignore this
case. HADOOP-4638 can take care of this.
> Job History file can have empty string as http port after JobTracker Restart
> in case of lost TT, which can result in NumberFormatException when JT is
> restarted 2nd time
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-5306
> URL: https://issues.apache.org/jira/browse/HADOOP-5306
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Ravi Gummadi
> Priority: Minor
> Fix For: 0.21.0
>
>
> HTTP_PORT="" is seen in the job history file after JT recovery in case of a
> lost TT. The .recover file of TestJobTrackerRestartWithLostTracker has an
> empty string as HTTP_PORT. If the JT is restarted again, reads this history
> line, and tries to createTaskAttempt, it gets a NumberFormatException from
> Integer.parseInt(httpPort). We need to either log a legal value as HTTP_PORT
> in the history file or catch the exception and take proper action.
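
For illustration, the "catch the exception" option described above could look
roughly like the sketch below. This is not the Hadoop code or the committed
fix. The class HttpPortParser, the method parseHttpPort, and the UNKNOWN_PORT
sentinel are hypothetical names made up for this example; only
Integer.parseInt and NumberFormatException come from the issue text.

```java
// Hypothetical helper, not part of Hadoop: tolerates an empty or malformed
// HTTP_PORT value read back from a job history line.
final class HttpPortParser {
    // Assumed sentinel meaning "port unknown" (e.g. the tracker was lost).
    static final int UNKNOWN_PORT = -1;

    private HttpPortParser() {}

    static int parseHttpPort(String httpPort) {
        // An empty string is what the history file contains in the lost-TT case.
        if (httpPort == null || httpPort.trim().isEmpty()) {
            return UNKNOWN_PORT;
        }
        try {
            return Integer.parseInt(httpPort.trim());
        } catch (NumberFormatException e) {
            // Any other malformed value is treated the same way instead of
            // aborting recovery.
            return UNKNOWN_PORT;
        }
    }
}
```

With this sketch, parseHttpPort("50060") returns 50060, while
parseHttpPort("") returns UNKNOWN_PORT instead of throwing. Whether a sentinel
is acceptable to the recovery code is a separate design question; the
alternative from the description is to write a legal value when logging.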