[
https://issues.apache.org/jira/browse/HDFS-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428200#comment-13428200
]
Raju commented on HDFS-3744:
----------------------------
Oh, yesterday was a bad day at work for me :(
Let me correct my second opinion first:
{quote}2. I would like to with your first opinion with STANDBY check in
replication(or move replication to Active service).
{quote}
By "first opinion" I meant Uma Maheshwara Rao's opinion.
I accept that the DFSAdmin command needs to be sent to both NNs. This can solve
the problem here.
That is what you meant as well, Arron.
I would also like to add a Standby check in the replication monitor to avoid
extra load on the cluster.
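
To make the Standby check concrete, here is a minimal sketch of the idea. The
class, enum and method names are hypothetical and are not the actual HDFS
replication monitor code; the only point is that a monitor pass becomes a no-op
while the NN is in standby state.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; these are not the real HDFS classes. */
public class ReplicationMonitorSketch {
    public enum HAState { ACTIVE, STANDBY }

    private volatile HAState state;
    private final List<String> scheduled = new ArrayList<>();

    public ReplicationMonitorSketch(HAState initial) {
        this.state = initial;
    }

    /** Called on failover (assumed hook, for illustration only). */
    public void transitionTo(HAState next) {
        this.state = next;
    }

    /** One monitor pass; returns how many blocks were scheduled. */
    public int runOnce(List<String> underReplicatedBlocks) {
        if (state != HAState.ACTIVE) {
            // Standby NN: do not schedule replication work.
            return 0;
        }
        scheduled.addAll(underReplicatedBlocks);
        return underReplicatedBlocks.size();
    }

    public List<String> scheduledBlocks() {
        return scheduled;
    }
}
```

After a failover, transitionTo(HAState.ACTIVE) would let the same monitor start
scheduling work again.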
My first opinion:
To support my opinion of persisting the decommissioned state of a node,
consider scenarios where
{quote}
Decommission command is given by admin but Standby NN is down ...etc
Scenarios where Standby NN is not available when issuing command.
{quote}
DFSAdmin will just create a DFS client and call refreshNodes() (the client will
retry the NN 10 times by default, not forever, if I am not wrong). In this case
the Standby NN will never learn of the decommission request. What happens when
the Standby comes up and switches to active?
I guess it will treat the decommissioned node as part of the cluster as well.
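
The gap in this scenario can be sketched as follows. This is only an
illustration under assumptions: AdminEndpoint is a hypothetical stand-in for an
admin proxy to one NN, refreshAll() is not an existing DFSAdmin method, and the
retry limit of 10 is taken from my guess above, not from a confirmed default.
With bounded retries, an NN that is down simply misses the request, so the best
a fan-out can do is report which NNs were not reached.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RefreshNodesFanOut {
    /** Hypothetical stand-in for an admin RPC proxy to one NameNode. */
    public interface AdminEndpoint {
        void refreshNodes() throws IOException;
    }

    /**
     * Calls refreshNodes() on every NN with a bounded number of attempts.
     * Returns the names of the NNs that never acknowledged the request.
     */
    public static List<String> refreshAll(Map<String, AdminEndpoint> nameNodes,
                                          int maxAttempts) {
        List<String> missed = new ArrayList<>();
        for (Map.Entry<String, AdminEndpoint> e : nameNodes.entrySet()) {
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                try {
                    e.getValue().refreshNodes();
                    ok = true;
                } catch (IOException ex) {
                    // Bounded retry: fall through to the next attempt.
                }
            }
            if (!ok) {
                missed.add(e.getKey());
            }
        }
        return missed;
    }

    public static void main(String[] args) {
        Map<String, AdminEndpoint> nns = new LinkedHashMap<>();
        nns.put("nn1", () -> { /* active NN accepts the request */ });
        nns.put("nn2", () -> { throw new IOException("connection refused"); });
        // prints: NameNodes that missed the refresh: [nn2]
        System.out.println("NameNodes that missed the refresh: "
                + refreshAll(nns, 10));
    }
}
```

Any NN in the returned list still has the old view of the exclude list, which
is the state that becomes visible after a later failover.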
{quote}
I'm hesitant to go with this suggestion. How would differences be rectified
between what's persisted in the edit log and what's present in the excluded
hosts file?
{quote}
Initially I thought the same thing, since we would be persisting the same
information in two different forms. But consider:
{quote}
Admin could have just configured the exclude nodes in the file, but he wouldn't
have issued refreshnode command.
{quote}
By persisting into the edit log we can be sure which DNs are decommissioned,
not only on the Standby NN but also when a standalone NN restarts.
Or is there any way the NN can identify a decommissioned node without
persisting?
Here I mean persisting all the stages of recommission and decommission too.
Please correct me if I am wrong.
I would be very glad to hear more opinions on persisting.
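
To illustrate what I mean by persisting all the stages, here is a minimal
sketch. It assumes a simple append-only journal standing in for the edit log;
the Record type, the AdminState enum and replay() are hypothetical names for
this example and are not existing edit log ops. Replaying the records in order
gives a restarted or Standby NN the last known state of every DN.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DecommissionJournalSketch {
    /** Hypothetical admin states for a DataNode. */
    public enum AdminState { NORMAL, DECOMMISSION_INPROGRESS, DECOMMISSIONED }

    /** One journal record: a datanode moved to the given state. */
    public static final class Record {
        public final String datanode;
        public final AdminState state;

        public Record(String datanode, AdminState state) {
            this.datanode = datanode;
            this.state = state;
        }
    }

    /** Replays records in order; the last record for each DN wins. */
    public static Map<String, AdminState> replay(List<Record> journal) {
        Map<String, AdminState> states = new HashMap<>();
        for (Record r : journal) {
            states.put(r.datanode, r.state);
        }
        return states;
    }

    public static void main(String[] args) {
        List<Record> journal = new ArrayList<>(); // stands in for the edit log
        journal.add(new Record("dn1", AdminState.DECOMMISSION_INPROGRESS));
        journal.add(new Record("dn2", AdminState.DECOMMISSION_INPROGRESS));
        journal.add(new Record("dn1", AdminState.DECOMMISSIONED));
        journal.add(new Record("dn2", AdminState.NORMAL)); // recommissioned

        Map<String, AdminState> states = replay(journal);
        System.out.println("dn1=" + states.get("dn1")); // prints dn1=DECOMMISSIONED
        System.out.println("dn2=" + states.get("dn2")); // prints dn2=NORMAL
    }
}
```

This does not answer how such records would be reconciled with the exclude
hosts file; it only shows how the state could be recovered from the journal
alone.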
> Decommissioned nodes are included in cluster after switch which is not
> expected
> -------------------------------------------------------------------------------
>
> Key: HDFS-3744
> URL: https://issues.apache.org/jira/browse/HDFS-3744
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: ha
> Affects Versions: 2.0.0-alpha, 2.1.0-alpha, 2.0.1-alpha
> Reporter: Brahma Reddy Battula
>
> Scenario:
> =========
> Start ANN and SNN with three DN's
> Exclude DN1 from cluster by using decommission feature
> (./hdfs dfsadmin -fs hdfs://ANNIP:8020 -refreshNodes)
> After decommission successful,do switch such that SNN will become Active.
> Here exclude node(DN1) is included in cluster.Able to write files to excluded
> node since it's not excluded.
> Checked SNN(Which Active before switch) UI decommissioned=1 and ANN UI
> decommissioned=0
> One more Observation:
> ====================
> All dfsadmin commands will create proxy only on nn1 irrespective of Active or
> standby.I think this also we need to re-look once..
> I am not getting , why we are not given HA for dfsadmin commands..?
> Please correct me,,If I am wrong.