RexXiong commented on code in PR #2535:
URL: https://github.com/apache/celeborn/pull/2535#discussion_r1623707056


##########
master/src/main/java/org/apache/celeborn/service/deploy/master/clustermeta/ha/HAMasterMetaManager.java:
##########
@@ -365,6 +365,11 @@ public void handleWorkerEvent(
     }
   }
 
+  @Override
+  public void handleReportWorkerDecommission(List<WorkerInfo> workers, String requestId) {

Review Comment:
   This should submit the request to the Ratis server.
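
   To illustrate the point, here is a minimal sketch of the pattern, not the actual Celeborn or Ratis API. Every type below (`DecommissionSketch`, `ReplicatedLog`, `InMemoryLog`, `MetaState`, `HaMetaManager`, `ReportDecommissionRequest`) is a hypothetical stand-in, and worker ids are plain strings instead of `WorkerInfo`. The idea is that the HA manager does not mutate its local metadata directly; it wraps the change in a request and submits it to the replicated log, and the state is updated only when the request is applied on each master.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types are real Celeborn or Ratis classes.
final class DecommissionSketch {

  /** Hypothetical request that would be replicated to every master. */
  static final class ReportDecommissionRequest {
    final String requestId;
    final List<String> workerIds;

    ReportDecommissionRequest(String requestId, List<String> workerIds) {
      this.requestId = requestId;
      this.workerIds = List.copyOf(workerIds);
    }
  }

  /** Stand-in for a replicated log such as a Ratis server. */
  interface ReplicatedLog {
    void submit(ReportDecommissionRequest request);
  }

  /** Per-master metadata; changed only when a submitted request is applied. */
  static final class MetaState {
    final Set<String> decommissionWorkers = new HashSet<>();

    void apply(ReportDecommissionRequest request) {
      synchronized (decommissionWorkers) {
        decommissionWorkers.addAll(request.workerIds);
      }
    }
  }

  /** Toy log that "commits" at once by applying the request to every replica. */
  static final class InMemoryLog implements ReplicatedLog {
    private final List<MetaState> replicas = new ArrayList<>();

    void addReplica(MetaState state) {
      replicas.add(state);
    }

    @Override
    public void submit(ReportDecommissionRequest request) {
      for (MetaState state : replicas) {
        state.apply(request);
      }
    }
  }

  /** HA manager: never touches MetaState itself, it only submits to the log. */
  static final class HaMetaManager {
    private final ReplicatedLog log;

    HaMetaManager(ReplicatedLog log) {
      this.log = log;
    }

    void handleReportWorkerDecommission(List<String> workerIds, String requestId) {
      log.submit(new ReportDecommissionRequest(requestId, workerIds));
    }
  }
}
```

   With this shape, every master ends up with the same `decommissionWorkers` set, which would not hold if the handler only updated the local copy.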



##########
master/src/main/java/org/apache/celeborn/service/deploy/master/clustermeta/AbstractMetaManager.java:
##########
@@ -436,6 +446,12 @@ public void updateWorkerEventMeta(int workerEventTypeValue, List<WorkerInfo> wor
     }
   }
 
+  public void updateMetaByReportWorkerDecommission(List<WorkerInfo> workers) {
+    synchronized (this.workers) {
+      decommissionWorkers.addAll(workers);

Review Comment:
   Before this change, a decommissioning worker would send ReportWorkerUnavailable,
   which indicates that the worker is about to shut down, so the client would
   quickly tell the worker to commit the associated partitions. If we change
   shutdownWorkers to decommissionWorkers, decommission may take longer than before.



##########
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/Worker.scala:
##########
@@ -971,6 +986,14 @@ private[celeborn] class Worker(
     }
     serverBootstraps
   }
+
+  private def isDecommissioning: Int = {
+    if (shutdown.get() && workerStatusManager.exitEventType == WorkerEventType.Decommission) {

Review Comment:
   We need to use currentWorkerStatus.getState to check the worker status.
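
   As a rough sketch of what that could look like (written in Java for illustration; `WorkerStatusSketch`, its `State` enum, and the state names are hypothetical and not the real Celeborn worker status type), the gauge value is derived from a single state field rather than from a shutdown flag combined with the exit event type:

```java
// Hypothetical sketch: the enum values and class are illustrative only.
final class WorkerStatusSketch {

  /** Assumed lifecycle states; the real state set may differ. */
  enum State { NORMAL, IN_DECOMMISSION, IN_GRACEFUL, IN_EXIT, EXIT }

  // Single source of truth for the worker's current status.
  private volatile State state = State.NORMAL;

  State getState() {
    return state;
  }

  void transitionTo(State next) {
    state = next;
  }

  /** Returns 1 while the worker is decommissioning, otherwise 0. */
  int isDecommissioning() {
    return getState() == State.IN_DECOMMISSION ? 1 : 0;
  }
}
```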



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
