[ 
https://issues.apache.org/jira/browse/HDFS-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379616#comment-16379616
 ] 

Konstantin Shvachko edited comment on HDFS-12975 at 2/28/18 1:31 AM:
---------------------------------------------------------------------

Chao, the state part of the patch looks really good to me. I think we can 
commit it if you can separate it from the rest, after some minor cleanup - see 
below.

*_Configuration needs more thinking_.* I missed to describe the cluster startup 
process in the design doc. May be we should have a separate doc or I can update 
the design. The main thing is that we should configure NNs altogether the same 
way as it is currently done:
{code:java}
dfs.ha.namenodes.mycluster = nn1,nn2,nn3
dfs.namenode.rpc-address.mycluster.nn1 = machine1.example.com:8020
dfs.namenode.rpc-address.mycluster.nn2 = machine2.example.com:8020
dfs.namenode.rpc-address.mycluster.nn3 = machine3.example.com:8020
{code}
Currently all nodes start in standby state and admins should transition to 
active one of them, e.g. nn1.
 We have two choices for observers.
 # Use {{haadmin transitionToObserver}}, which will bring say nn3 from standby 
to observer state.
 # Or add configuration parameter, which allows nn3 to start in observer state
{code:java}
dfs.namenode.observer.mycluster.nn3 = true // false by default
{code}
In the end we will probably need both.

_*Comments on the rest fo the patch*_
 # {{allowStaleStandbyReads}} for observers should be true - ignore the config
 # NameNodeStatusMXBean methods should not be in this patch
 ** {{NameNode.getRole()}} looks like incompatible change
 ** {{NameNode.getState()}} shouldn't change. You need {{synchronized}} to 
access the {{state}} member.
 # {{HAState}} adds only an unused import
 # {{HAUtil.getConfForOtherNodes()}} the refactoring (otherNn -> 
otherNameNodes) should go into trunk. Better avoid renaming even though the 
original choice of the variable name is bad.
 # {{StandbyState.checkOperation()}} should not change since 
{{allowStaleReads()}} for observer is true
 # A think Command {{-namenodes}} should return all namenodes including 
standbys and observers. As it does now. We can postpone this decision and do it 
in the next jira if you want.
 # Even in tests we should not count observers separately from NNs. They are 
all NNs.
 # I see some Javadoc spelling corrections, white spaces, and long lines
 ** If formatting or spelling corrections of the existing code are needed they 
should go into trunk
 ** New ones should be fixed in the patch


was (Author: shv):
Chao, the state part of the patch looks really good to me. I think we can 
commit it if you can separate it from the rest, after some minor cleanup - see 
below.

*_Configuration needs more thinking_.* I missed to describe the cluster startup 
process in the design doc. May be we should have a separate doc or I can update 
the design. The main thing is that we should configure NNs altogether the same 
way as it is currently done:
{code:java}
dfs.ha.namenodes.mycluster = nn1,nn2,nn3
dfs.namenode.rpc-address.mycluster.nn1 = machine1.example.com:8020
dfs.namenode.rpc-address.mycluster.nn2 = machine2.example.com:8020
dfs.namenode.rpc-address.mycluster.nn3 = machine3.example.com:8020
{code}
Currently all nodes start in standby state and admins should transition to 
active one of them, e.g. nn1.
 We have two choices for observers.
 # Use {{haadmin transitionToObserver}}, which will bring say nn3 from standby 
to observer state.
 # Or add configuration parameter, which allows nn3 to start in observer state
{code:java}
dfs.namenode.observer.mycluster.nn3 = true // false by default
{code}
In the end we will probably need both.

_*Comments on the rest fo the patch*_
 # {{allowStaleStandbyReads}} for observers should be true - ignore the config
 # NameNodeStatusMXBean methods should not be in this patch
 ** {{NameNode.getRole()}} looks like incompatible change
 ** {{NameNode.getState()}} shouldn't change. You need {{synchronized}} to 
access the {{state}} member.
 # {{HAState}} adds only an unused import
 # {{HAUtil.getConfForOtherNodes()}} the refactoring (otherNn -> 
otherNameNodes) should go into trunk. Better avoid renaming even though the 
original choice of the variable name is bad.
 # {{StandbyState.checkOperation()}} should not change since 
{{allowStaleReads()}} for observer is true
 # A think Command {{-namenodes}} should return all namenodes including 
standbys and observers. As it does now. We can postpone this decision and do it 
in the next jira if you want.
 # Even in tests we should not count observers separately from NNs. They are 
all NNs.
 # I see some Javadoc spelling corrections, white spaces, and long lines
 ** If formatting or spelling corrections of the existing code are needed they 
should go into trunk
 ** New ones should be fixed in the

> Changes to the NameNode to support reads from standby
> -----------------------------------------------------
>
>                 Key: HDFS-12975
>                 URL: https://issues.apache.org/jira/browse/HDFS-12975
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Konstantin Shvachko
>            Assignee: Chao Sun
>            Priority: Major
>         Attachments: HDFS-12975.000.patch, HDFS-12975.001.patch
>
>
> In order to support reads from standby NameNode needs changes to add Observer 
> role, turn off checkpointing and such.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to