[ 
https://issues.apache.org/jira/browse/HDFS-11094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716886#comment-15716886
 ] 

Mingliang Liu commented on HDFS-11094:
--------------------------------------

For protocol changes, end-to-end tests are very helpful. Starting a mini 
DFS cluster is not very expensive; I can usually start and shut down an 
empty mini cluster in 3~5 seconds on my dev machine.

The first heartbeat will bypass the large interval, so: 1) Choosing 
{{HAServiceStateProto}} instead of {{HAServiceStateProto}} makes sense, as 
{{lastActiveClaimTxId}} will be updated in a timely manner and we can avoid the 
complexity of updating it in this patch; 2) Unfortunately, the current methods 
(e.g. setting a large {{DFS_HEARTBEAT_INTERVAL_KEY}} value, or calling 
{{DataNode#setHeartbeatsDisabledForTests()}}) do not work without changes 
for testing this patch. I can accept that the existing tests in the patch are 
reasonably adequate, so this will not block the progress of this patch.
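
To illustrate why a large heartbeat interval does not help here, below is a toy model in plain Java. It is only a sketch of the behavior described above (the first heartbeat is due immediately, later ones wait for the interval); it is not the DataNode scheduler, and the class and method names are hypothetical.

```java
/** Toy model only; not HDFS code. All names here are hypothetical. */
public class HeartbeatSchedulerSketch {
    private final long intervalMs;
    private long nextHeartbeatMs;

    public HeartbeatSchedulerSketch(long intervalMs, long startMs) {
        this.intervalMs = intervalMs;
        // Assumption taken from the comment above: the first heartbeat is
        // scheduled for "now", regardless of how large the interval is.
        this.nextHeartbeatMs = startMs;
    }

    /** Returns true if a heartbeat should be sent at time nowMs. */
    public boolean isHeartbeatDue(long nowMs) {
        return nowMs >= nextHeartbeatMs;
    }

    /** Records that a heartbeat was sent and schedules the next one. */
    public void markHeartbeatSent(long nowMs) {
        nextHeartbeatMs = nowMs + intervalMs;
    }

    public static void main(String[] args) {
        long oneHourMs = 60L * 60L * 1000L;
        HeartbeatSchedulerSketch s = new HeartbeatSchedulerSketch(oneHourMs, 0L);
        // The first heartbeat is due immediately despite the one-hour interval.
        System.out.println(s.isHeartbeatDue(0L));
        s.markHeartbeatSent(0L);
        // Only subsequent heartbeats are delayed by the interval.
        System.out.println(s.isHeartbeatDue(1000L));
    }
}
```

In this model a test cannot observe the "registered but no heartbeat yet" window just by raising the interval, which matches point 2 above.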

Thanks,

> Send back HAState along with NamespaceInfo during a versionRequest as an 
> optional parameter
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11094
>                 URL: https://issues.apache.org/jira/browse/HDFS-11094
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>         Attachments: HDFS-11094.001.patch, HDFS-11094.002.patch, 
> HDFS-11094.003.patch, HDFS-11094.004.patch, HDFS-11094.005.patch, 
> HDFS-11094.006.patch, HDFS-11094.007.patch, HDFS-11094.008.patch, 
> HDFS-11094.009.patch
>
>
> The datanode should know which NN is active when it is connecting/registering 
> to the NN. Currently, it only figures this out during its first (and 
> subsequent) heartbeat(s) and so there is a period of time where the datanode 
> is alive and registered, but can't actually do anything because it doesn't 
> know which NN is active. A byproduct of this is that the MiniDFSCluster will 
> become active before it knows what NN is active, which can lead to NPEs when 
> calling getActiveNN(). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
