[ 
https://issues.apache.org/jira/browse/HADOOP-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695342#action_12695342
 ] 

Amar Kamat commented on HADOOP-5394:
------------------------------------

TestSocketFactory tests whether clients can connect to the server through a 
socket factory. It does so as follows:
# Define a socket factory that uses (_port_ - 10) instead of _port_.
# Start the server
# Configure a client conf to use this socket factory implementation and the 
server URL as _hostname:port+10_
# At the client, the socket factory does a (-10) and thus is able to connect to 
the server.
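
As a rough illustration of the steps above (not the actual test code), a 
port-shifting factory could be sketched in Java as follows. The class name 
OffsetSocketFactory and the OFFSET constant are hypothetical; only 
javax.net.SocketFactory and java.net.Socket are real JDK APIs.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.SocketFactory;

/**
 * Hypothetical sketch: a socket factory that connects to (port - 10)
 * instead of the port it was asked for. A client configured with
 * hostname:port+10 therefore ends up on the server's real port.
 */
class OffsetSocketFactory extends SocketFactory {
    // Assumed offset, matching the +10 / -10 described in the comment.
    static final int OFFSET = 10;

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        // Undo the +10 that was added to the advertised server address.
        return new Socket(host, port - OFFSET);
    }

    @Override
    public Socket createSocket(String host, int port,
                               InetAddress localHost, int localPort)
            throws IOException {
        return new Socket(host, port - OFFSET, localHost, localPort);
    }

    @Override
    public Socket createSocket(InetAddress host, int port) throws IOException {
        return new Socket(host, port - OFFSET);
    }

    @Override
    public Socket createSocket(InetAddress address, int port,
                               InetAddress localAddress, int localPort)
            throws IOException {
        return new Socket(address, port - OFFSET, localAddress, localPort);
    }
}
```

With such a factory, any address that has not been shifted by +10 (for 
example, a DataNode address handed out by the NameNode) resolves to the wrong 
port.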

This doesn't work with the current patch because the JobTracker tries to create 
a file on the DataNode using the socket factory, but the DataNode info passed to 
the JobTracker is correct (i.e. no +10 is applied), so the factory's -10 points 
at the wrong port. The DataNode information can't be changed, as it is obtained 
from the NameNode. Hence this patch starts the JobTracker with the correct conf 
and not the modified conf. The JobTracker-to-NameNode connection need not be 
checked, since the DFSClient-to-NameNode connection is already checked and, for 
the NameNode, the JobTracker is just a client. 

> JobTracker might schedule 2 attempts of the same task with the same attempt 
> id across restarts
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5394
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5394
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>            Priority: Critical
>         Attachments: HADOOP-5394-v1.10.patch, HADOOP-5394-v1.2.patch, 
> HADOOP-5394-v1.5.patch, HADOOP-5394-v1.9.1.patch
>
>
> This can happen when the jobtracker gets restarted more than once. In such 
> cases, the jobtracker depends on the jobhistory file for the next restart 
> count. If the new restart-count is not flushed to the file then there is a 
> fair chance that upon next restart, the jobtracker might schedule a new 
> attempt with an existing id. This can cause problems not only with the 
> side-effect files but also can cause the jobtracker to be in an inconsistent 
> state.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
