[
https://issues.apache.org/jira/browse/HADOOP-17504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aihua Xu updated HADOOP-17504:
------------------------------
Attachment: HADOOP-17504.patch
> New connection requires a retry to refresh NameNode IP changes
> --------------------------------------------------------------
>
> Key: HADOOP-17504
> URL: https://issues.apache.org/jira/browse/HADOOP-17504
> Project: Hadoop Common
> Issue Type: Improvement
> Components: common
> Affects Versions: 2.8.0
> Reporter: Aihua Xu
> Assignee: Aihua Xu
> Priority: Major
> Attachments: HADOOP-17504.patch
>
>
> HADOOP-17068 handles the case of NameNode IP address changes: the HDFS
> client updates the IP address after a connection failure. DataNodes use the
> same logic to refresh the IP address for their connections.
> Such a connection is reused with a default idle time of 10 seconds (set by
> ipc.client.connection.maxidletime). Once the connection is closed, the
> DataNode will connect with the old NameNode IP address and only refresh to
> the new IP address after the first failure.
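>
> For reference, a minimal sketch of how that idle time can be read from the
> client Configuration. The key name and its 10000 ms default come from
> hadoop-common; the class and main method below are only for illustration.
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class IdleTimeExample {
>   public static void main(String[] args) {
>     Configuration conf = new Configuration();
>     // How long an idle IPC connection is kept before the client closes it.
>     // Defaults to 10000 ms (10 seconds) unless overridden in core-site.xml.
>     long maxIdleMs = conf.getLong("ipc.client.connection.maxidletime", 10000);
>     System.out.println("ipc.client.connection.maxidletime = " + maxIdleMs + " ms");
>   }
> }
> {code}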
> The problem with the refresh logic in org.apache.hadoop.ipc.Client is that
> the refreshed server value is not propagated to remoteId.address, while the
> next connection creation uses remoteId.address:
> {{if (!server.equals(currentAddr)) {}}
> {{  LOG.warn("Address change detected. Old: " + server.toString() +}}
> {{      " New: " + currentAddr.toString());}}
> {{  server = currentAddr;}}
> {{}}}
>
> In a big cluster, such a retry will cause random "BLOCK*
> blk_16987635027_18010098516 is COMMITTED but not COMPLETE(numNodes= 0 <
> minimum = 1) in file" errors if all three replicas take one retry to
> read/write the block.
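>
> One possible direction (a sketch only, not necessarily what the attached
> HADOOP-17504.patch does) is to propagate the refreshed address to the
> ConnectionId as well, so the next connection setup does not go back to the
> stale address. The remoteId.setAddress(...) call below is assumed for
> illustration; the snippet is a fragment of Client.Connection, not a
> standalone class.
> {code:java}
> // Fragment of org.apache.hadoop.ipc.Client.Connection, for illustration only.
> private synchronized boolean updateAddress() throws IOException {
>   // Re-resolve the NameNode host name to pick up a changed IP.
>   InetSocketAddress currentAddr = NetUtils.createSocketAddrForHost(
>       server.getHostName(), server.getPort());
>   if (!server.equals(currentAddr)) {
>     LOG.warn("Address change detected. Old: " + server + " New: " + currentAddr);
>     server = currentAddr;
>     // Keep the ConnectionId in sync so that the next connection creation
>     // uses the refreshed address instead of the stale one.
>     remoteId.setAddress(currentAddr); // assumed setter, for illustration
>     return true;
>   }
>   return false;
> }
> {code}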
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]