Hexiaoqiao commented on code in PR #6384:
URL: https://github.com/apache/hudi/pull/6384#discussion_r945567984
##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieLogFile.java:
##########
@@ -42,63 +45,98 @@ public class HoodieLogFile implements Serializable {
private static final Comparator<HoodieLogFile> LOG_FILE_COMPARATOR = new LogFileComparator();
private static final Comparator<HoodieLogFile> LOG_FILE_COMPARATOR_REVERSED = new LogFileComparator().reversed();
+ // Log files are of this pattern - .b5068208-e1a4-11e6-bf01-fe55135034f3_20170101134598.log.1
+ public static final Pattern LOG_FILE_PATTERN =
+ Pattern.compile("\\.(.*)_(.*)\\.(.*)\\.([0-9]*)(_(([0-9]*)-([0-9]*)-([0-9]*)))?");
private transient FileStatus fileStatus;
private final String pathStr;
private long fileLen;
+ private transient Path path;
+
+ private String baseCommitTime;
+ private int fileVersion;
+ private String logWriteToken;
+ private String fileExtension;
+ private String fileId;
Review Comment:
This should be very helpful for performance when there are a lot of log
files whose names need to be resolved with the regex. I am a little concerned
that the extra fields will increase the heap footprint. Still, in my opinion
the performance improvement outweighs the memory overhead.
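To make the trade-off concrete, here is a minimal sketch of the idea: match the file name against the pattern once and keep the parsed parts as fields, at the cost of a few extra references per object. The class `CachedLogFileName` and its accessors are hypothetical and are not the code in this PR. The mapping of capture groups to fields is an assumption based on the example name in the comment above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual HoodieLogFile implementation.
final class CachedLogFileName {
  // Same expression as LOG_FILE_PATTERN in the hunk above.
  static final Pattern LOG_FILE_PATTERN =
      Pattern.compile("\\.(.*)_(.*)\\.(.*)\\.([0-9]*)(_(([0-9]*)-([0-9]*)-([0-9]*)))?");

  // Parsed once in the constructor, then served from memory.
  private final String fileId;
  private final String baseCommitTime;
  private final String fileExtension;
  private final int fileVersion;
  private final String logWriteToken; // null when the name carries no write token

  CachedLogFileName(String fileName) {
    Matcher matcher = LOG_FILE_PATTERN.matcher(fileName);
    if (!matcher.matches()) {
      throw new IllegalArgumentException("Not a log file name: " + fileName);
    }
    // Assumed group layout: 1 = file id, 2 = base commit time,
    // 3 = extension, 4 = version, 6 = optional write token.
    this.fileId = matcher.group(1);
    this.baseCommitTime = matcher.group(2);
    this.fileExtension = matcher.group(3);
    this.fileVersion = Integer.parseInt(matcher.group(4));
    this.logWriteToken = matcher.group(6);
  }

  String getFileId() {
    return fileId;
  }

  String getBaseCommitTime() {
    return baseCommitTime;
  }

  String getFileExtension() {
    return fileExtension;
  }

  int getFileVersion() {
    return fileVersion;
  }

  String getLogWriteToken() {
    return logWriteToken;
  }
}
```

With this shape, repeated calls such as `getFileId()` or `getFileVersion()` (for example from a comparator sorting many log files) no longer run the regex; the extra memory is the handful of fields per instance.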
##########
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##########
@@ -64,19 +64,17 @@
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.regex.Matcher;
-import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
+import static org.apache.hudi.common.model.HoodieLogFile.LOG_FILE_PATTERN;
+
/**
* Utility functions related to accessing the file storage.
*/
public class FSUtils {
private static final Logger LOG = LogManager.getLogger(FSUtils.class);
- // Log files are of this pattern - .b5068208-e1a4-11e6-bf01-fe55135034f3_20170101134598.log.1
- private static final Pattern LOG_FILE_PATTERN =
- Pattern.compile("\\.(.*)_(.*)\\.(.*)\\.([0-9]*)(_(([0-9]*)-([0-9]*)-([0-9]*)))?");
private static final String LOG_FILE_PREFIX = ".";
Review Comment:
I think we should also move `LOG_FILE_PREFIX` to `HoodieLogFile.java` and
make it `public`.