[ 
https://issues.apache.org/jira/browse/HDFS-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428200#comment-13428200
 ] 

Raju commented on HDFS-3744:
----------------------------

Oh, yesterday was a bad day at work for me :(

Let me correct my second opinion first 

{quote}2. I would like to with your first opinion with STANDBY check in 
replication(or move replication to Active service).
{quote}

By "first opinion" here I meant Uma Maheshwara Rao's opinion.
I agree that the DFSAdmin command needs to be sent to both NNs. This can solve 
the problem here.

That is what you meant as well, Arron.
I would also like to add a Standby check in the replication monitor to avoid 
extra load on the cluster.
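
To make the Standby check concrete, here is a minimal sketch of the idea. It is not the HDFS replication monitor code; HaState, ReplicationMonitorSketch, transitionTo() and runOnce() are hypothetical names used only for illustration.

```java
// Hypothetical HA states; illustrative only, not the Hadoop enum.
enum HaState { ACTIVE, STANDBY }

// Hypothetical stand-in for a replication monitor, not the HDFS class.
class ReplicationMonitorSketch {
    private volatile HaState state;

    ReplicationMonitorSketch(HaState initial) {
        this.state = initial;
    }

    // Called on failover so the monitor knows which role this NN has.
    void transitionTo(HaState newState) {
        this.state = newState;
    }

    // One monitor pass. Returns how many replication tasks were scheduled.
    int runOnce(int underReplicatedBlocks) {
        if (state != HaState.ACTIVE) {
            // Standby check: a Standby NN schedules nothing, so it adds no load.
            return 0;
        }
        return underReplicatedBlocks;
    }
}
```

With this check, a pass on the Standby schedules nothing, and the same pass schedules work once the NN has transitioned to active.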

My First opinion

Before I argue for persisting the decommissioned state of a node, consider the 
scenarios where 
{quote}
 Decommission command is given by admin but Standby NN is down ...etc  
Scenarios where Standby NN is not available when issuing command.
{quote}

DFSAdmin just creates a DFS client and calls refreshNodes() (the client retries 
the NN 10 times by default, not forever, if I am not wrong). In this case the 
Standby NN never learns of the decommission request. What happens when the 
Standby comes back up and switches to active?

I guess it will treat the decommissioned node as part of the cluster as well.
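
The scenario can be sketched roughly as follows. This is only an illustration of sending the refresh to every NN with a bounded number of attempts and reporting the ones that never received it; NameNodeAdmin, RefreshAllNameNodes and refreshAll() are hypothetical names, not the DFSAdmin or Hadoop client API, and maxAttempts is an assumed parameter standing in for the client retry limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the admin call to one NameNode; not the Hadoop API.
interface NameNodeAdmin {
    void refreshNodes() throws Exception;
}

class RefreshAllNameNodes {
    // Calls refreshNodes() on every NN, trying each at most maxAttempts times.
    // Returns the ids of the NNs that never received the request.
    static List<String> refreshAll(Map<String, NameNodeAdmin> nameNodes, int maxAttempts) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, NameNodeAdmin> entry : nameNodes.entrySet()) {
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                try {
                    entry.getValue().refreshNodes();
                    ok = true;
                } catch (Exception ex) {
                    // NN unreachable on this attempt; retry until the bound is hit.
                }
            }
            if (!ok) {
                // This NN missed the decommission request (e.g. Standby was down).
                failed.add(entry.getKey());
            }
        }
        return failed;
    }
}
```

Any NN in the returned list is exactly the case I am worried about: it comes up later without ever having seen the request.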

{quote}
I'm hesitant to go with this suggestion. How would differences be rectified 
between what's persisted in the edit log and what's present in the excluded 
hosts file?
{quote}

Initially I thought the same thing, since we would be persisting the 
information in two different forms. But consider 

{quote}
Admin could have just configured the exclude nodes in the file, but he wouldn't 
have issued refreshnode command. 
{quote}

By persisting into the edit logs we can be sure which DN is decommissioned, not 
only on the Standby NN but also when a standalone NN restarts. 

Or is there any way the NN can identify a decommissioned node without 
persisting?

Here I mean persisting all the stages of recommission and decommission too.
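
A rough sketch of what I have in mind, under the assumption that every stage is written as its own record. AdminState, AdminStateOp and AdminStateLog are hypothetical names for illustration; this is an in-memory toy, not the HDFS edit log format or API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical admin states; the names are illustrative only.
enum AdminState { IN_SERVICE, DECOMMISSION_IN_PROGRESS, DECOMMISSIONED }

// One hypothetical log record: "datanode X moved to state S".
class AdminStateOp {
    final String datanode;
    final AdminState state;

    AdminStateOp(String datanode, AdminState state) {
        this.datanode = datanode;
        this.state = state;
    }
}

class AdminStateLog {
    private final List<AdminStateOp> ops = new ArrayList<>();

    // The Active NN appends a record for every decommission or recommission stage.
    void log(String datanode, AdminState state) {
        ops.add(new AdminStateOp(datanode, state));
    }

    // A Standby NN, or a restarted standalone NN, replays the records in order;
    // the last record for each datanode wins.
    Map<String, AdminState> replay() {
        Map<String, AdminState> current = new HashMap<>();
        for (AdminStateOp op : ops) {
            current.put(op.datanode, op.state);
        }
        return current;
    }
}
```

Replaying such a log gives the same per-DN state on whichever NN reads it, independent of whether that NN was up when the admin issued the command.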

Please correct me if I am wrong.
I would be very glad to hear more feedback on my opinion of persisting.
                
> Decommissioned nodes are included in cluster after switch which is not 
> expected
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-3744
>                 URL: https://issues.apache.org/jira/browse/HDFS-3744
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.0.0-alpha, 2.1.0-alpha, 2.0.1-alpha
>            Reporter: Brahma Reddy Battula
>
> Scenario:
> =========
> Start ANN and SNN with three DN's
> Exclude DN1 from cluster by using decommission feature 
> (./hdfs dfsadmin -fs hdfs://ANNIP:8020 -refreshNodes)
> After decommission successful,do switch such that SNN will become Active.
> Here exclude node(DN1) is included in cluster.Able to write files to excluded 
> node since it's not excluded.
> Checked SNN(Which Active before switch) UI decommissioned=1 and ANN UI 
> decommissioned=0
> One more Observation:
> ====================
> All dfsadmin commands will create proxy only on nn1 irrespective of Active or 
> standby.I think this also we need to re-look once..
> I am not getting , why we are not given HA for dfsadmin commands..?
> Please correct me,,If I am wrong.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
