[ 
https://issues.apache.org/jira/browse/HADOOP-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646237#action_12646237
 ] 

Steve Loughran commented on HADOOP-4016:
----------------------------------------

Assuming Amar changed the DNS entry for the job tracker, that alone won't be enough:

- the JVM caches hostnames forever unless you tell it otherwise

"Otherwise" means setting the JVM security properties
networkaddress.cache.ttl and networkaddress.cache.negative.ttl

http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html
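As a rough illustration of the properties above, here is a minimal sketch of setting them programmatically. The class name DnsCacheTtl and the TTL values are made up for the example; the property names are the ones documented on the InetAddress page. As far as I know the JVM reads these once, when the address cache policy is initialised, so this would have to run before the first hostname lookup in the process.

```java
import java.security.Security;

// Hypothetical helper, not part of Hadoop: caps the JVM's DNS cache lifetimes.
public class DnsCacheTtl {

    /**
     * Sets the positive and negative DNS cache TTLs, in seconds.
     * Assumption: called early in startup, before any InetAddress lookup.
     */
    public static void configure(int positiveTtlSeconds, int negativeTtlSeconds) {
        // Lifetime of successful lookups; -1 would mean "cache forever".
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        // Lifetime of failed lookups.
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) {
        configure(60, 10); // illustrative values only
        System.out.println("networkaddress.cache.ttl="
                + Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

The same properties can also be set in the JRE's java.security file, which avoids touching code at all.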

- That's regardless of any caching done in process. The task tracker reads in 
"mapred.job.tracker" from the configuration on startup only.

To do failover of the job tracker you'd need to change the JVM to not cache the 
addresses forever, which will have other consequences, good and bad, and then 
change TaskTracker to redo the nslookup when the job tracker heartbeat has 
failed. 
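To make the "redo the lookup on heartbeat failure" idea concrete, here is a minimal sketch. ReResolvingAddress and onHeartbeatFailure are hypothetical names, not anything in TaskTracker today. It relies on the fact that constructing an InetSocketAddress from a hostname attempts a resolution at construction time; whether that returns a new IP still depends on the JVM cache TTL discussed above.

```java
import java.net.InetSocketAddress;

// Hypothetical sketch: keeps the configured host name and re-resolves on demand.
public class ReResolvingAddress {
    private final String host;
    private final int port;
    private InetSocketAddress current;

    public ReResolvingAddress(String host, int port) {
        this.host = host;
        this.port = port;
        // The constructor attempts to resolve the host name immediately.
        this.current = new InetSocketAddress(host, port);
    }

    /** The address the next heartbeat should be sent to. */
    public synchronized InetSocketAddress get() {
        return current;
    }

    /**
     * Intended to be called after a failed heartbeat: performs a new lookup
     * and returns true if the resolved address differs from the old one.
     */
    public synchronized boolean onHeartbeatFailure() {
        InetSocketAddress fresh = new InetSocketAddress(host, port);
        if (fresh.isUnresolved()) {
            return false; // lookup failed; keep the previous address
        }
        boolean changed = !fresh.equals(current);
        current = fresh;
        return changed;
    }
}
```

The caller would rebuild its RPC connection whenever onHeartbeatFailure() returns true, and simply retry otherwise.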

This will be a fun test to automate. You could do it in-VM by starting a second 
job tracker on a different port of localhost, then stopping the original 
tracker and checking that the tasktracker failed its heartbeat, reread the 
config and picked up the new (host,port) setting. This would not test DNS 
caching, but would show the tasktracker was rereading its configuration. DNS 
caching tests are hard outside of a VMWare/Xen cluster. 
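The shape of that in-VM test can be sketched without any Hadoop classes. Everything here is a stand-in: plain ServerSockets play the two job trackers, a TCP connect plays the heartbeat, and a Properties object plays the configuration. Only the "mapred.job.tracker" key comes from the discussion above; FailoverSketch and its methods are invented for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Properties;

// Hypothetical stand-in for the in-VM failover test; no Hadoop code involved.
public class FailoverSketch {

    /** Parses a "host:port" value such as the one under mapred.job.tracker. */
    static InetSocketAddress parse(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        return new InetSocketAddress(hostPort.substring(0, colon),
                Integer.parseInt(hostPort.substring(colon + 1)));
    }

    /** Stand-in heartbeat: succeeds if a TCP connection can be opened. */
    static boolean heartbeat(InetSocketAddress target) {
        try (Socket s = new Socket()) {
            s.connect(target, 1000);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns {heartbeat before stop, after stop, after rereading config}. */
    public static boolean[] run() {
        Properties conf = new Properties();
        try (ServerSocket second = new ServerSocket(0)) {
            ServerSocket original = new ServerSocket(0);
            conf.setProperty("mapred.job.tracker",
                    "127.0.0.1:" + original.getLocalPort());
            InetSocketAddress target = parse(conf.getProperty("mapred.job.tracker"));
            boolean before = heartbeat(target);

            original.close(); // stop the original "tracker"
            boolean afterStop = heartbeat(target);

            // Point the config at the second "tracker" and reread it.
            conf.setProperty("mapred.job.tracker",
                    "127.0.0.1:" + second.getLocalPort());
            target = parse(conf.getProperty("mapred.job.tracker"));
            boolean afterReread = heartbeat(target);

            return new boolean[] {before, afterStop, afterReread};
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

A real test would replace the sockets with two job tracker instances and assert on the tasktracker's state, but the sequence of checks would be the same.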



> TaskTrackers never (re)connect back to the JobTracker if the JobTracker 
> node/machine is changed
> -----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4016
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4016
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>
> I tried the following 
> 1) Started a hadoop cluster.
> 2) Killed the JT
> 3) Selected a new node for starting JT. 
> 4) Changed the entry on the tasktracker to reflect the new (old) hostname to 
> (new) ip mapping. Checked if the tracker node correctly resolves the hostname 
> to the new ip.
> 5) Start the JT on the new node
> The tasktracker fails to connect to the new jobtracker. It seems that the 
> hostname resolution remains stale and is never updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.