yihua commented on code in PR #7612:
URL: https://github.com/apache/hudi/pull/7612#discussion_r1068539565


##########
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##########
@@ -358,23 +364,27 @@ public static String createNewFileId(String idPfx, int 
id) {
    * Get the file extension from the log file.
    */
   public static String getFileExtensionFromLog(Path logPath) {
-    Matcher matcher = LOG_FILE_PATTERN.matcher(logPath.getName());
+    boolean isArchivedLog = logPath.getName().contains(ARCHIVED_LOG_PREFIX);
+    Matcher matcher =  isArchivedLog ? 
ARCHIVED_LOG_FILE_PATTERN.matcher(logPath.getName()) :
+        LOG_FILE_PATTERN.matcher(logPath.getName());
     if (!matcher.find()) {
       throw new InvalidHoodiePathException(logPath, "LogFile");
     }
-    return matcher.group(3);
+    return isArchivedLog ? ARCHIVE_STR : matcher.group(3);

Review Comment:
   Any reason to handle archived log separately?  Can't the `LOG_FILE_PATTERN` 
be adjusted using OR operand (`log|archive`) to match `.log` or `.archive`?



##########
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##########
@@ -358,23 +364,27 @@ public static String createNewFileId(String idPfx, int 
id) {
    * Get the file extension from the log file.
    */
   public static String getFileExtensionFromLog(Path logPath) {
-    Matcher matcher = LOG_FILE_PATTERN.matcher(logPath.getName());
+    boolean isArchivedLog = logPath.getName().contains(ARCHIVED_LOG_PREFIX);
+    Matcher matcher =  isArchivedLog ? 
ARCHIVED_LOG_FILE_PATTERN.matcher(logPath.getName()) :
+        LOG_FILE_PATTERN.matcher(logPath.getName());
     if (!matcher.find()) {
       throw new InvalidHoodiePathException(logPath, "LogFile");
     }
-    return matcher.group(3);
+    return isArchivedLog ? ARCHIVE_STR : matcher.group(3);
   }
 
   /**
    * Get the first part of the file name in the log file. That will be the 
fileId. Log file do not have instantTime in
    * the file name.
    */
   public static String getFileIdFromLogPath(Path path) {
-    Matcher matcher = LOG_FILE_PATTERN.matcher(path.getName());
+    boolean isArchivedLog = path.getName().contains(ARCHIVED_LOG_PREFIX);
+    Matcher matcher =  isArchivedLog ? 
ARCHIVED_LOG_FILE_PATTERN.matcher(path.getName())
+        : LOG_FILE_PATTERN.matcher(path.getName());
     if (!matcher.find()) {
       throw new InvalidHoodiePathException(path, "LogFile");
     }
-    return matcher.group(1);
+    return isArchivedLog ? COMMITS_STR : matcher.group(1);

Review Comment:
   Similar here and the rest.  I'm not in favor of special handling unless 
necessary.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to