Will-Lo commented on code in PR #4030:
URL: https://github.com/apache/gobblin/pull/4030#discussion_r1757344912


##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnHelixUtils.java:
##########
@@ -203,16 +203,36 @@ public static void setYarnClassPath(Config config, Configuration yarnConfigurati
     }
   }
 
-  public static Path getJarPathCacheAndCleanIfNeeded(Config config, FileSystem fs) throws IOException {
+  /**
+   * Calculate the path of a jar cache on HDFS, which is retained on a monthly basis.
+   * @param config
+   * @return
+   * @throws IOException
+   */
+  public static Path calculateJarCachePath(Config config) throws IOException {
     Path jarsCacheDirMonthly = new Path(config.getString(GobblinYarnConfigurationKeys.JAR_CACHE_DIR));
     String monthSuffix = new SimpleDateFormat("yyyy-MM").format(config.getLong(GobblinYarnConfigurationKeys.YARN_APPLICATION_LAUNCHER_START_TIME_KEY));
+    return new Path(jarsCacheDirMonthly, monthSuffix);
+
+  }
+
+  /**
+   * Retain the latest k jar cache paths that are children of the parent cache path.
+   * @param parentCachePath
+   * @param k the number of latest jar cache paths to retain
+   * @param fs
+   * @return
+   * @throws IllegalAccessException
+   * @throws IOException
+   */
+  public static boolean retainKLatestJarCachePaths(Path parentCachePath, int k, FileSystem fs) throws IOException {
     // Cleanup old cache if necessary
-    List<FileStatus> jarDirs = Arrays.stream(fs.exists(jarsCacheDirMonthly)
-        ? fs.listStatus(jarsCacheDirMonthly) : new FileStatus[0]).sorted().collect(Collectors.toList());
-    if (jarDirs.size() > 2) {
-      fs.delete(jarDirs.get(0).getPath(), true);
+    List<FileStatus> jarDirs =
+        Arrays.stream(fs.exists(parentCachePath) ? fs.listStatus(parentCachePath) : new FileStatus[0]).sorted().collect(Collectors.toList());
+    if (jarDirs.size() > k) {
+      return fs.delete(jarDirs.get(0).getPath(), true);

Review Comment:
   Changed it to delete in a loop until only the k latest paths remain, so the behavior is consistent with the method name.
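
   For illustration, here is a minimal sketch of what a loop-based cleanup could look like. It is not the code in this PR: it uses `java.nio.file` instead of Hadoop's `FileSystem` so that it runs standalone, and the class name `JarCacheRetention` and method name `retainKLatest` are hypothetical. It assumes child directories are named with a `yyyy-MM` suffix, so sorting by name matches chronological order.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for the Hadoop-based helper discussed in this review.
public class JarCacheRetention {

  /**
   * Deletes the oldest children of parentCachePath until at most k remain.
   * Children are ordered by name, which is chronological for "yyyy-MM" names.
   * Returns true only if every deletion succeeded.
   */
  public static boolean retainKLatest(Path parentCachePath, int k) throws IOException {
    if (k < 0) {
      throw new IllegalArgumentException("k must be non-negative, got " + k);
    }
    if (!Files.exists(parentCachePath)) {
      return true; // nothing to clean up
    }
    List<Path> children;
    try (Stream<Path> stream = Files.list(parentCachePath)) {
      children = stream
          .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
          .collect(Collectors.toList());
    }
    boolean allDeleted = true;
    // Loop rather than a single delete, so every directory beyond the k latest is removed.
    for (int i = 0; i < children.size() - k; i++) {
      allDeleted &= deleteRecursively(children.get(i));
    }
    return allDeleted;
  }

  // Deletes a directory tree, visiting children before their parents.
  private static boolean deleteRecursively(Path root) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(root)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    boolean ok = true;
    for (Path p : paths) {
      ok &= Files.deleteIfExists(p);
    }
    return ok;
  }
}
```

   With five monthly directories and `k = 3`, the two oldest are removed in one call, whereas a single `delete` of the first entry would leave four behind.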


