[ 
https://issues.apache.org/jira/browse/HDDS-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988936#comment-16988936
 ] 

Marton Elek commented on HDDS-2679:
-----------------------------------

Discussed earlier with [~msingh] and [~arp]

> Ratis ring creation might be failed with async pipeline creation 
> -----------------------------------------------------------------
>
>                 Key: HDDS-2679
>                 URL: https://issues.apache.org/jira/browse/HDDS-2679
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Datanode, SCM
>            Reporter: Marton Elek
>            Priority: Blocker
>
> The problem was introduced with async pipeline creation:
>  
>  # Suppose the SCM has received registrations from three datanodes.
>  # A Ratis/THREE pipeline is created on the SCM.
>  # With its next heartbeat (HB), Datanode1 (DN1) receives the CreatePipeline 
> command.
>  # DN1 starts the Ratis server, which tries to get votes from DN2 and DN3.
>  # If DN2 has not yet received the CreatePipeline command (which is likely 
> with a 30-second HB interval), it refuses to vote for DN1.
>  # DN1 requests a pipeline close from the SCM, as it received no votes from 
> DN2 and DN3.
>  # The pipeline is closed on the SCM side, but in the meantime DN2 (finally) 
> receives the pipeline creation command and tries to get votes. By then, 
> however, DN1 has a newer group/pipeline id.
>  # And so on.
> If we are lucky, after a while all DNs will receive the container 
> creation at more or less the same time; if not, the SCM cannot create an 
> open Ratis pipeline.
>  
> Possible solutions:
>  * At the very beginning, a datanode can trust its peers and learn the group 
> id from them (but this doesn't cover the case where one pipeline has been 
> closed on DN1 *and* a new pipeline has been created, while DN2 still has the 
> old pipeline).
>  * We can use bidirectional gRPC streaming for datanode-SCM communication 
> (a good idea anyway, as it makes the communication faster). However, the 
> problem remains if there is a network blip between the SCM and DN1.
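
The race in the numbered steps above can be illustrated with a small toy simulation. This is only a sketch under simplifying assumptions, not Ozone or Ratis code: the function name `simulate`, the 5-second `vote_timeout`, the rule that a datanode asks for a close when no peer shares its group id by the deadline, and the SCM allocating a replacement pipeline immediately are all hypothetical choices made for illustration. Only the 30-second heartbeat interval comes from the description.

```python
def simulate(offsets, hb_interval=30, vote_timeout=5, max_time=300):
    """Toy model: return the second at which all datanodes agree on one
    pipeline id, or None if that never happens within max_time."""
    current = 1                       # pipeline id the SCM currently wants (assumed counter)
    known = [None] * len(offsets)     # pipeline id each datanode was last told about
    deadline = [None] * len(offsets)  # when each datanode gives up waiting for votes
    for t in range(max_time):
        for dn, offset in enumerate(offsets):
            # Heartbeat: the datanode picks up the pending CreatePipeline command.
            on_heartbeat = t >= offset and (t - offset) % hb_interval == 0
            if on_heartbeat and known[dn] != current:
                known[dn] = current
                deadline[dn] = t + vote_timeout
        if all(k == current for k in known):
            return t                  # every peer knows the group, so votes succeed
        for dn in range(len(offsets)):
            if deadline[dn] == t and known[dn] == current:
                peers = [known[p] for p in range(len(offsets)) if p != dn]
                if known[dn] not in peers:
                    # No peer voted: request a close; the SCM (by assumption)
                    # immediately allocates a new pipeline id.
                    current += 1
                deadline[dn] = None
    return None


print(simulate((0, 1, 2)))    # heartbeats almost aligned -> 2
print(simulate((0, 10, 20)))  # heartbeats spread out -> None
```

In this model, heartbeats that arrive within the vote timeout of each other converge on the first pipeline, while evenly spread heartbeats keep closing and recreating pipelines indefinitely, which matches the "and so on" step in the description.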



