[
https://issues.apache.org/jira/browse/HDFS-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379616#comment-16379616
]
Konstantin Shvachko edited comment on HDFS-12975 at 2/28/18 1:31 AM:
---------------------------------------------------------------------
Chao, the state part of the patch looks really good to me. I think we can
commit it if you can separate it from the rest, after some minor cleanup - see
below.
*_Configuration needs more thinking_.* I missed to describe the cluster startup
process in the design doc. May be we should have a separate doc or I can update
the design. The main thing is that we should configure NNs altogether the same
way as it is currently done:
{code:java}
dfs.ha.namenodes.mycluster = nn1,nn2,nn3
dfs.namenode.rpc-address.mycluster.nn1 = machine1.example.com:8020
dfs.namenode.rpc-address.mycluster.nn2 = machine2.example.com:8020
dfs.namenode.rpc-address.mycluster.nn3 = machine3.example.com:8020
{code}
Currently all nodes start in standby state and admins should transition to
active one of them, e.g. nn1.
We have two choices for observers.
# Use {{haadmin transitionToObserver}}, which will bring say nn3 from standby
to observer state.
# Or add configuration parameter, which allows nn3 to start in observer state
{code:java}
dfs.namenode.observer.mycluster.nn3 = true // false by default
{code}
In the end we will probably need both.
_*Comments on the rest fo the patch*_
# {{allowStaleStandbyReads}} for observers should be true - ignore the config
# NameNodeStatusMXBean methods should not be in this patch
** {{NameNode.getRole()}} looks like incompatible change
** {{NameNode.getState()}} shouldn't change. You need {{synchronized}} to
access the {{state}} member.
# {{HAState}} adds only an unused import
# {{HAUtil.getConfForOtherNodes()}} the refactoring (otherNn ->
otherNameNodes) should go into trunk. Better avoid renaming even though the
original choice of the variable name is bad.
# {{StandbyState.checkOperation()}} should not change since
{{allowStaleReads()}} for observer is true
# A think Command {{-namenodes}} should return all namenodes including
standbys and observers. As it does now. We can postpone this decision and do it
in the next jira if you want.
# Even in tests we should not count observers separately from NNs. They are
all NNs.
# I see some Javadoc spelling corrections, white spaces, and long lines
** If formatting or spelling corrections of the existing code are needed they
should go into trunk
** New ones should be fixed in the patch
was (Author: shv):
Chao, the state part of the patch looks really good to me. I think we can
commit it if you can separate it from the rest, after some minor cleanup - see
below.
*_Configuration needs more thinking_.* I missed to describe the cluster startup
process in the design doc. May be we should have a separate doc or I can update
the design. The main thing is that we should configure NNs altogether the same
way as it is currently done:
{code:java}
dfs.ha.namenodes.mycluster = nn1,nn2,nn3
dfs.namenode.rpc-address.mycluster.nn1 = machine1.example.com:8020
dfs.namenode.rpc-address.mycluster.nn2 = machine2.example.com:8020
dfs.namenode.rpc-address.mycluster.nn3 = machine3.example.com:8020
{code}
Currently all nodes start in standby state and admins should transition to
active one of them, e.g. nn1.
We have two choices for observers.
# Use {{haadmin transitionToObserver}}, which will bring say nn3 from standby
to observer state.
# Or add configuration parameter, which allows nn3 to start in observer state
{code:java}
dfs.namenode.observer.mycluster.nn3 = true // false by default
{code}
In the end we will probably need both.
_*Comments on the rest fo the patch*_
# {{allowStaleStandbyReads}} for observers should be true - ignore the config
# NameNodeStatusMXBean methods should not be in this patch
** {{NameNode.getRole()}} looks like incompatible change
** {{NameNode.getState()}} shouldn't change. You need {{synchronized}} to
access the {{state}} member.
# {{HAState}} adds only an unused import
# {{HAUtil.getConfForOtherNodes()}} the refactoring (otherNn ->
otherNameNodes) should go into trunk. Better avoid renaming even though the
original choice of the variable name is bad.
# {{StandbyState.checkOperation()}} should not change since
{{allowStaleReads()}} for observer is true
# A think Command {{-namenodes}} should return all namenodes including
standbys and observers. As it does now. We can postpone this decision and do it
in the next jira if you want.
# Even in tests we should not count observers separately from NNs. They are
all NNs.
# I see some Javadoc spelling corrections, white spaces, and long lines
** If formatting or spelling corrections of the existing code are needed they
should go into trunk
** New ones should be fixed in the
> Changes to the NameNode to support reads from standby
> -----------------------------------------------------
>
> Key: HDFS-12975
> URL: https://issues.apache.org/jira/browse/HDFS-12975
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Konstantin Shvachko
> Assignee: Chao Sun
> Priority: Major
> Attachments: HDFS-12975.000.patch, HDFS-12975.001.patch
>
>
> In order to support reads from standby NameNode needs changes to add Observer
> role, turn off checkpointing and such.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]