[ 
https://issues.apache.org/jira/browse/HDDS-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-13976:
-------------------------------
    Description: 
This is simply an idea.

Currently, we mainly use BCSID and originNodeId to check whether we should 
close a container. However, these two pieces of information are sometimes not 
enough to ensure that quasi-closed containers can be safely closed, which 
leaves containers stuck in the quasi-closed state.

It might be possible to use the last leader ID together with the Raft leader 
guarantees (i.e. highest committed index) to add more cases where a container 
can still be closed if the origin node ID of the replicas is equal to the last 
leader ID (despite the <= 3 unique origin node IDs). This is because a Raft 
leader needs to apply the transaction before replying to the client, whereas a 
Raft follower in the majority quorum is only required to commit it, not 
necessarily apply it. Additionally, we might need to ensure that the leader is 
ready (i.e. it has already applied all the logs up to its term's startup log 
entry).
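
To make the idea concrete, here is a minimal sketch of the extra check, written 
under stated assumptions. The class, the Replica type, the method name, and the 
leaderWasReady flag are all hypothetical and are not existing Ozone or Ratis 
APIs. The sketch also assumes that a single replica originating from the last 
leader is enough, which the description leaves open.

```java
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Hypothetical sketch only: none of these names exist in Ozone or Ratis.
class QuasiClosedCheckSketch {

    // Simplified stand-in for a container replica as seen by SCM.
    static final class Replica {
        final UUID originNodeId;
        final long bcsId;

        Replica(UUID originNodeId, long bcsId) {
            this.originNodeId = originNodeId;
            this.bcsId = bcsId;
        }
    }

    // Returns true when the proposed extra rule would allow closing:
    // the last pipeline leader is known, it was ready (it had applied all
    // logs up to its term's startup entry), and at least one replica
    // originated on that leader (assumption: one such replica suffices).
    static boolean canCloseViaLastLeader(List<Replica> replicas,
                                         UUID lastLeaderId,
                                         boolean leaderWasReady) {
        if (lastLeaderId == null || !leaderWasReady) {
            return false;
        }
        for (Replica replica : replicas) {
            if (lastLeaderId.equals(replica.originNodeId)) {
                return true;
            }
        }
        return false;
    }
}
```

Such a check would only supplement the existing BCSID and originNodeId 
checks, not replace them. Recording the last leader ID and its readiness is a 
separate problem that this sketch does not address.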

 

  was:
This is simply an idea.

Currently, we mainly use BCSID and originNodeId to check whether we should 
close a container. However, these two pieces of information are sometimes not 
enough to ensure that quasi-closed containers can be safely closed, which 
leaves containers stuck in the quasi-closed state.

It might be possible to use the last leader ID together with the Raft leader 
guarantees (i.e. highest committed index) to add more cases where a container 
can still be closed if the origin node ID of the replicas is equal to the last 
leader ID (despite the <= 3 unique origin node IDs). We might need to ensure 
that the leader is ready (i.e. it has already applied all the logs up to its 
term's startup log entry).

 


> Use last pipeline leader ID to handle quasi closed stuck containers
> -------------------------------------------------------------------
>
>                 Key: HDDS-13976
>                 URL: https://issues.apache.org/jira/browse/HDDS-13976
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ivan Andika
>            Priority: Major
>
> This is simply an idea.
> Currently, we mainly use BCSID and originNodeId to check whether we should 
> close a container. However, these two pieces of information are sometimes 
> not enough to ensure that quasi-closed containers can be safely closed, 
> which leaves containers stuck in the quasi-closed state.
> It might be possible to use the last leader ID together with the Raft leader 
> guarantees (i.e. highest committed index) to add more cases where a 
> container can still be closed if the origin node ID of the replicas is equal 
> to the last leader ID (despite the <= 3 unique origin node IDs). This is 
> because a Raft leader needs to apply the transaction before replying to the 
> client, whereas a Raft follower in the majority quorum is only required to 
> commit it, not necessarily apply it. Additionally, we might need to ensure 
> that the leader is ready (i.e. it has already applied all the logs up to its 
> term's startup log entry).
>  


