[ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461275#comment-16461275
 ] 

Erik Krogen edited comment on HDFS-13286 at 5/2/18 5:28 PM:
------------------------------------------------------------

Hey [~csun], sorry for taking a while to get back to you on this. Your new 
{{HAServiceState}} changes prompted me to take a more comprehensive look at how 
those states are used, and I noticed a few things:
* Take a look at {{BPServiceActor}} L905-909. I haven't looked at HDFS-9917 
yet, but judging from the comment it seems we probably want to change this 
if-condition to {{state == HAServiceState.STANDBY || state == 
HAServiceState.OBSERVER}}. Can you see if you agree?
* The constructor of {{StandbyState}} needs to be modified to 
{{super(isObserver ? HAServiceState.OBSERVER : HAServiceState.STANDBY);}} 
instead of just always passing {{STANDBY}}, else things which use 
{{HAState#getServiceState()}} will be wrong (see for example 
{{NameNode#getServiceStatus()}}).
* It looks like we also need to update the enums in 
{{FederationNamenodeServiceState}} and {{NNHAStatusHeartbeatProto}}, and their 
associated usages.
* We should be able to remove the TODO on {{NameNode}} L1866?
* I think this section of {{FailoverController#preFailoverChecks()}} may need 
some work:
{code}
    if (!toSvcStatus.getState().equals(HAServiceState.STANDBY)) {
      throw new FailoverFailedException(
          "Can't failover to an active service");
    }
    
    if (!toSvcStatus.isReadyToBecomeActive()) {
      String notReadyReason = toSvcStatus.getNotReadyReason();
      if (!forceActive) {
        throw new FailoverFailedException(
            target + " is not ready to become active: " +
            notReadyReason);
      } else {
        LOG.warn("Service is not ready to become active, but forcing: {}",
            notReadyReason);
      }
    }
{code}
It seems the first if-condition assumes there are only two possible states, 
so if the state is not STANDBY, it must be ACTIVE. I think we should update 
this to explicitly check for ACTIVE. Next, if the service is in OBSERVER state, 
{{isReadyToBecomeActive()}} will be false. In this case, 
{{FailoverController#preFailoverChecks()}} will still allow this operation if 
{{forceActive}} is true. I don't think we want to allow {{forceActive}} to 
attempt to fail over to an observer, right?
* For all three usages of {{FSNamesystem#isInStandbyState()}}, it seems to me 
that they should apply if it is in either observer or standby state. Can you 
double check and, if so, update accordingly?
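
To make the {{preFailoverChecks()}} suggestion above concrete, here is a rough, self-contained sketch of the check ordering I have in mind. It is not a patch: {{State}}, {{FailoverFailedException}} and {{isStandbyOrObserver()}} are local stand-ins written for illustration (assumptions, not the real Hadoop classes), and the method takes plain arguments instead of an {{HAServiceStatus}}. The wording of the observer error message is also made up.

```java
// Illustrative sketch only; every type here is a local stand-in, not Hadoop code.
class FailoverCheckSketch {
  /** Stand-in for HAServiceState, assuming OBSERVER has been added. */
  enum State { INITIALIZING, ACTIVE, STANDBY, OBSERVER, STOPPING }

  /** Stand-in for Hadoop's FailoverFailedException. */
  static class FailoverFailedException extends Exception {
    FailoverFailedException(String msg) { super(msg); }
  }

  /** Hypothetical helper for call sites that should treat standby and observer alike. */
  static boolean isStandbyOrObserver(State state) {
    return state == State.STANDBY || state == State.OBSERVER;
  }

  /**
   * Pre-failover checks with ACTIVE tested explicitly and OBSERVER rejected
   * before forceActive is ever consulted.
   */
  static void preFailoverChecks(State toState, boolean readyToBecomeActive,
      String notReadyReason, boolean forceActive)
      throws FailoverFailedException {
    // Explicit check instead of "anything that is not STANDBY must be ACTIVE".
    if (toState == State.ACTIVE) {
      throw new FailoverFailedException("Can't failover to an active service");
    }
    // An observer is never a failover target, even when forceActive is set.
    if (toState == State.OBSERVER) {
      throw new FailoverFailedException("Can't failover to an observer service");
    }
    if (!readyToBecomeActive) {
      if (!forceActive) {
        throw new FailoverFailedException(
            "Target is not ready to become active: " + notReadyReason);
      }
      // Same behavior as today: warn and continue when forcing.
      System.err.println(
          "Service is not ready to become active, but forcing: " + notReadyReason);
    }
  }
}
```

With this ordering, an OBSERVER target fails fast regardless of {{forceActive}}, while a STANDBY target that is not ready can still be forced as it can today. Whether INITIALIZING and STOPPING should also be rejected up front is left open here.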


> Add haadmin commands to transition between standby and observer
> ---------------------------------------------------------------
>
>                 Key: HDFS-13286
>                 URL: https://issues.apache.org/jira/browse/HDFS-13286
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>            Priority: Major
>         Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.


