[jira] [Comment Edited] (HDDS-801) Quasi close the container when close is not executed via Ratis

Shashikant Banerjee (JIRA) Wed, 14 Nov 2018 06:25:35 -0800


    [ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686506#comment-16686506
 ]


Shashikant Banerjee edited comment on HDDS-801 at 11/14/18 2:24 PM:
--------------------------------------------------------------------

Thanks [~nandakumar131] for working on this. In addition to Mukul's comments 
,some more comments :

1.KeyValueHandler.java : 865  -> update the comment to be container getting 
"quasi closed" rather than getting closed.

2.KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.Java. With SCMCLi as well, the close container can be 
invoked where a client can directly close (closeContainer in 
ContainerOperationClient). In such cases, a container in may be in just open 
state and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
      INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
      container.getContainerData().getContainerID() + " while in " +
      state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should triggerICR.In that case, 
closeContainer/quasiCloseContainer call should call updateContainerState 
internally to send ICR instead of executing individually.

4. There can be a case where let's say the SCM gets network separated from a 
follower before sending a closeCommand but Ratis ring is opeartional. In such 
case, leader will execute the closeContainer transaction  successfully and 
follower will try to replicate the same but it will fail as the container was 
never put into closing state in follower before as it was not communicating 
with SCM. The assumption that container will be in closing state before 
closeContainer is called may not be necessarily true always.

5. KeyValueContainer.java : 310 ->

The comments look misleading here. The first comment specifies the compaction 
should be done asynchronously as otherwise it will be lot slower . The next 
comment says it is ok if the operation is slow. Can you please check?
{code:java}
@Override
public void close() throws StorageContainerException {

  //TODO: writing .container file and compaction can be done
  // asynchronously, otherwise rpc call for this will take a lot of time to
  // complete this action
  ContainerDataProto.State oldState = null;
  try {
    writeLock();
    oldState = containerData.getState();
    containerData.closeContainer();
    File containerFile = getContainerFile();
    // update the new container data to .container File
    updateContainerFile(containerFile);

  } catch (StorageContainerException ex) {
    if (oldState != null) {
      // Failed to update .container file. Reset the state to CLOSING
      containerData.setState(oldState);
    }
    throw ex;
  } finally {
    writeUnlock();
  }
  // It is ok if this operation takes a bit of time.
  // Close container is not expected to be instantaneous.
  compactDB();
}

{code}


was (Author: shashikant):
Thanks [~nandakumar131] for working on this. In addition to Mukul's comments 
,some more comments :

1.KeyValueHandler.java : 865  -> update the comment to be container getting 
"quasi closed" rather than getting closed.

2.KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.Java. With SCMCLi as well, the close container can be 
invoked where a client can directly close (closeContainer in 
ContainerOperationClient). In such cases, a container in may be in just open 
state and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
      INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
      container.getContainerData().getContainerID() + " while in " +
      state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should triggerICR.In that case, 
closeContainer/quasiCloseContainer call should call updateContainerState 
internally to send ICR instead of executing individually.

4. There can be a case where let's say the SCM gets network separated from a 
follower before sending a closeCommand but Ratis ring is opeartional. In such 
case, leader will execute the closeContainer transaction  successfully and 
follower will try to replicate the same but it will fail as the container was 
never put into closing state in follower before as it was not communicating 
with SCM. The assumption that container will be in closing state before 
closeContainer is called may not be necessarily true always.

5. KeyValueContainer.java : 310 ->

The comments look misleading here. The first comment specifies we compaction 
should be done asynchronously as otherwise it will be lot slower . The next 
comment says it is ok if thhe opeartion is slow. Can you please check?
{code:java}
@Override
public void close() throws StorageContainerException {

  //TODO: writing .container file and compaction can be done
  // asynchronously, otherwise rpc call for this will take a lot of time to
  // complete this action
  ContainerDataProto.State oldState = null;
  try {
    writeLock();
    oldState = containerData.getState();
    containerData.closeContainer();
    File containerFile = getContainerFile();
    // update the new container data to .container File
    updateContainerFile(containerFile);

  } catch (StorageContainerException ex) {
    if (oldState != null) {
      // Failed to update .container file. Reset the state to CLOSING
      containerData.setState(oldState);
    }
    throw ex;
  } finally {
    writeUnlock();
  }
  // It is ok if this operation takes a bit of time.
  // Close container is not expected to be instantaneous.
  compactDB();
}

{code}

> Quasi close the container when close is not executed via Ratis
> --------------------------------------------------------------
>
>                 Key: HDDS-801
>                 URL: https://issues.apache.org/jira/browse/HDDS-801
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Datanode
>    Affects Versions: 0.3.0
>            Reporter: Nanda kumar
>            Assignee: Nanda kumar
>            Priority: Major
>         Attachments: HDDS-801.000.patch, HDDS-801.001.patch, 
> HDDS-801.002.patch
>
>
> When datanode received CloseContainerCommand and the replication type is not 
> RATIS, we should QUASI close the container. After quasi-closing the container 
> an ICR has to be sent to SCM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (HDDS-801) Quasi close the container when close is not executed via Ratis

Reply via email to