slfan1989 commented on code in PR #6473:
URL: https://github.com/apache/hadoop/pull/6473#discussion_r1479929985


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-globalpolicygenerator/src/main/java/org/apache/hadoop/yarn/server/globalpolicygenerator/applicationcleaner/DefaultApplicationCleaner.java:
##########
@@ -46,47 +49,37 @@ public void run() {
     LOG.info("Application cleaner run at time {}", now);
 
     FederationStateStoreFacade facade = getGPGContext().getStateStoreFacade();
+
     try {
-      // Get the candidate list from StateStore before calling router
-      Set<ApplicationId> allStateStoreApps = new HashSet<>();
-      List<ApplicationHomeSubCluster> response =
+      // Step1. Get the candidate list from StateStore before calling router
+      List<ApplicationHomeSubCluster> applicationHomeSubClusters =
           facade.getApplicationsHomeSubCluster();
-      for (ApplicationHomeSubCluster app : response) {
-        allStateStoreApps.add(app.getApplicationId());
-      }
-      LOG.info("{} app entries in FederationStateStore", allStateStoreApps.size());
-
-      // Get the candidate list from Registry before calling router
-      List<String> allRegistryApps = getRegistryClient().getAllApplications();
-      LOG.info("{} app entries in FederationRegistry", allStateStoreApps.size());
-
-      // Get the list of known apps from Router
-      Set<ApplicationId> routerApps = getRouterKnownApplications();
-      LOG.info("{} known applications from Router", routerApps.size());
+      LOG.info("FederationStateStore has {} applications.", applicationHomeSubClusters.size());
 
-      // Clean up StateStore entries
-      Set<ApplicationId> toDelete =
-          Sets.difference(allStateStoreApps, routerApps);
-

Review Comment:
   Step 1: Retrieve all applications stored in the StateStore, which represents 
all applications submitted to the Router.
   
   Step 2: Use the Router's REST API to fetch all running applications. This API 
queries the applications of every active SubCluster.
   
   Step 3: Compare the results of `Step1` and `Step2` to identify applications 
that exist in `Step1` but not in `Step2`.  Delete these applications.
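
   As a minimal sketch of the comparison in `Step3`, assuming application ids are plain strings rather than `ApplicationId` objects (the class and method names below are hypothetical and are not part of this PR):

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-alone sketch; not the DefaultApplicationCleaner code. */
class CleanerStepsSketch {

  /** Step 3: ids present in the StateStore result but absent from the Router result. */
  static Set<String> findDeletionCandidates(Set<String> stateStoreApps,
      Set<String> routerApps) {
    Set<String> candidates = new TreeSet<>(stateStoreApps);
    candidates.removeAll(routerApps);
    return candidates;
  }

  public static void main(String[] args) {
    // Step 1: ids read from the StateStore (made-up sample data).
    Set<String> stateStoreApps = Set.of("app1", "app2", "app3");
    // Step 2: ids the Router reports as still running (made-up sample data).
    Set<String> routerApps = Set.of("app2");
    System.out.println(findDeletionCandidates(stateStoreApps, routerApps));
  }
}
```

   The result is only as trustworthy as the `Step2` input: any application missing from the Router's answer becomes a deletion candidate.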
   
   There is a potential issue with this approach. If a SubCluster is undergoing 
maintenance, such as an RM restart, `Step2` cannot fetch the complete list of 
running applications, because the applications of that SubCluster are missing 
from the result. During the comparison in `Step3`, there is then a risk of 
mistakenly deleting applications that are still running.
   
   
   For example, suppose we have three SubClusters: `subClusterA`, `subClusterB`, 
and `subClusterC`, with an equal allocation ratio of 1:1:1.
   
   We submit six applications through `routerA`.
   
   - `app1` and `app2` are allocated to `subClusterA`.
   - `app3` and `app4` are allocated to `subClusterB`.
   - `app5` and `app6` are allocated to `subClusterC`.
   
   Among these, `app1`, `app3`, and `app5` have completed their execution, and 
we expect to retain `app2`, `app4`, and `app6` in the StateStore.
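
   To illustrate the risk with this scenario, here is a small simulation. It assumes, purely for illustration, that `subClusterB` is the one that is unreachable during the cleaner run; the class and method names are hypothetical, and application ids are plain strings instead of `ApplicationId`.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical simulation of the scenario above; not code from this PR. */
class SubClusterOutageSketch {

  /** Union of the running apps of every SubCluster that the Router can reach. */
  static Set<String> routerView(Map<String, Set<String>> runningBySubCluster,
      Set<String> reachableSubClusters) {
    Set<String> view = new TreeSet<>();
    for (String subCluster : reachableSubClusters) {
      view.addAll(runningBySubCluster.getOrDefault(subCluster, Set.of()));
    }
    return view;
  }

  /** StateStore entries that the Router did not report. */
  static Set<String> toDelete(Set<String> stateStoreApps, Set<String> routerApps) {
    Set<String> result = new TreeSet<>(stateStoreApps);
    result.removeAll(routerApps);
    return result;
  }

  public static void main(String[] args) {
    // app1, app3 and app5 have finished; only these are still running.
    Map<String, Set<String>> running = Map.of(
        "subClusterA", Set.of("app2"),
        "subClusterB", Set.of("app4"),
        "subClusterC", Set.of("app6"));
    Set<String> stateStore =
        Set.of("app1", "app2", "app3", "app4", "app5", "app6");

    // All SubClusters reachable: only the finished apps are removed.
    Set<String> all = Set.of("subClusterA", "subClusterB", "subClusterC");
    System.out.println(toDelete(stateStore, routerView(running, all)));
    // prints [app1, app3, app5]

    // Assumed outage of subClusterB: the running app4 is removed as well.
    Set<String> withoutB = Set.of("subClusterA", "subClusterC");
    System.out.println(toDelete(stateStore, routerView(running, withoutB)));
    // prints [app1, app3, app4, app5]
  }
}
```

   In the second case `app4` is still running but is deleted from the StateStore, which is the mistaken deletion described above.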
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

