ChenSammi commented on a change in pull request #1469: HDDS-2034. Async RATIS 
pipeline creation and destroy through heartbea…
URL: https://github.com/apache/hadoop/pull/1469#discussion_r329335596
 
 

 ##########
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/BlockManagerImpl.java
 ##########
 @@ -188,6 +208,15 @@ public AllocatedBlock allocateBlock(final long size, 
ReplicationType type,
           // TODO: #CLUTIL Remove creation logic when all replication types and
           // factors are handled by pipeline creator
           pipeline = pipelineManager.createPipeline(type, factor);
+          // wait until pipeline is ready
+          long current = System.currentTimeMillis();
+          while (!pipeline.isOpen() && System.currentTimeMillis() <
+              (current + pipelineCreateWaitTimeout)) {
+            try {
+              Thread.sleep(1000);
+            } catch (InterruptedException e) {
+            }
+          }
 
 Review comment:
   Creating a pipeline in the block allocation path is debatable.  An 
existing code comment says "TODO: #CLUTIL Remove creation logic when all 
replication types and factors are handled by pipeline creator". 
   In detail, the "ALLOCATED" state will be handled in task HDDS-2177, "Add a 
scrubber thread to detect creation failure pipelines in ALLOCATED state".  
Currently pipelineCreateWaitTimeout is calculated from 
"hdds.command.status.report.interval" and "hdds.heartbeat.interval", on the 
assumption that the connection between the Datanode and SCM is healthy.  What 
if the pipeline is created successfully, but the connection to SCM breaks and 
is restored only after a while?  Should we wait a little longer before 
deciding whether pipeline creation succeeded or failed?  So in HDDS-2177, I 
plan to add a configurable property for the pipeline creation timeout.  Every 
ALLOCATED pipeline that exceeds the creation timeout will be declared failed 
and garbage collected. 
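   To make the scrubber idea concrete, here is a rough sketch of the expiry 
check only.  All names in it (AllocatedPipelineScrubber, PipelineInfo, 
findExpired) are hypothetical and are not the HDDS API; the real pipeline 
state and timestamps would come from PipelineManager, and the timeout from the 
new configuration property. 

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not part of the HDDS code base. */
final class AllocatedPipelineScrubber {

  /** Simplified stand-in for the real pipeline states. */
  enum State { ALLOCATED, OPEN, CLOSED }

  /** Minimal stand-in for a pipeline record: id, state, creation time. */
  static final class PipelineInfo {
    final String id;
    final State state;
    final long creationTimeMillis;

    PipelineInfo(String id, State state, long creationTimeMillis) {
      this.id = id;
      this.state = state;
      this.creationTimeMillis = creationTimeMillis;
    }
  }

  private AllocatedPipelineScrubber() { }

  /**
   * Returns the ids of pipelines that are still ALLOCATED and whose age
   * exceeds timeoutMillis. The caller would then mark them failed and
   * garbage collect them.
   */
  static List<String> findExpired(List<PipelineInfo> pipelines,
      long nowMillis, long timeoutMillis) {
    List<String> expired = new ArrayList<>();
    for (PipelineInfo p : pipelines) {
      boolean tooOld = nowMillis - p.creationTimeMillis > timeoutMillis;
      if (p.state == State.ALLOCATED && tooOld) {
        expired.add(p.id);
      }
    }
    return expired;
  }
}
```

   A periodic background thread would call something like findExpired with 
the current time, so the decision is driven by one configurable timeout 
rather than by the heartbeat intervals. 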
   Whether we use CompletableFuture or a while loop, we need a timeout either 
way.  This is on the block allocation path; how much latency can a synchronous 
API tolerate?  Maybe the best approach is not to create a pipeline here at 
all, if we can make sure there are enough pipelines to use after exiting safe 
mode.
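
   For reference, a minimal sketch of the CompletableFuture variant with a 
bounded wait.  It assumes some future that completes when the pipeline is 
reported open; that future, and the names PipelineWait and awaitOpen, are 
hypothetical and do not exist in the current code.  Unlike the loop in the 
diff, it restores the interrupt flag instead of swallowing the 
InterruptedException. 

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; not part of the HDDS code base. */
final class PipelineWait {

  private PipelineWait() { }

  /**
   * Waits at most timeoutMillis for the (assumed) future that completes
   * when the pipeline is reported open. Returns true only if the future
   * completed normally within the timeout.
   */
  static boolean awaitOpen(CompletableFuture<Void> opened,
      long timeoutMillis) {
    try {
      opened.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      // Still not open: the caller decides whether to fail the request.
      return false;
    } catch (ExecutionException e) {
      // Pipeline creation itself failed.
      return false;
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can still observe it.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

   Either way the caller still blocks for up to the timeout in the worst 
case, so the latency question above applies to both variants. 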

