vinothchandar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r357181383
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
 ##########
 @@ -414,23 +436,41 @@ public String getCommitActionType() {
   /**
    * Helper method to scan all hoodie-instant metafiles and construct HoodieInstant objects.
    *
-   * @param fs FileSystem
-   * @param metaPath Meta Path where hoodie instants are present
    * @param includedExtensions Included hoodie extensions
+   * @param applyLayoutVersionFilters Depending on Timeline layout version, if there are multiple states for the same
+   * action instant, only include the highest state
    * @return List of Hoodie Instants generated
    * @throws IOException in case of failure
    */
-  public static List<HoodieInstant> scanHoodieInstantsFromFileSystem(FileSystem fs, Path metaPath,
-      Set<String> includedExtensions) throws IOException {
-    return Arrays.stream(HoodieTableMetaClient.scanFiles(fs, metaPath, path -> {
-      // Include only the meta files with extensions that needs to be included
-      String extension = FSUtils.getFileExtension(path.getName());
-      return includedExtensions.contains(extension);
-    })).sorted(Comparator.comparing(
-        // Sort the meta-data by the instant time (first part of the file name)
-        fileStatus -> FSUtils.getInstantTime(fileStatus.getPath().getName())))
-        // create HoodieInstantMarkers from FileStatus, which extracts properties
-        .map(HoodieInstant::new).collect(Collectors.toList());
+  public List<HoodieInstant> scanHoodieInstantsFromFileSystem(Set<String> includedExtensions,
+      boolean applyLayoutVersionFilters) throws IOException {
+    return scanHoodieInstantsFromFileSystem(new Path(metaPath), includedExtensions, applyLayoutVersionFilters);
+  }
+
+  /**
+   * Helper method to scan all hoodie-instant metafiles and construct HoodieInstant objects.
+   *
+   * @param timelinePath MetaPath where instant files are stored
+   * @param includedExtensions Included hoodie extensions
+   * @param applyLayoutVersionFilters Depending on Timeline layout version, if there are multiple states for the same
+   * action instant, only include the highest state
+   * @return List of Hoodie Instants generated
+   * @throws IOException in case of failure
+   */
+  public List<HoodieInstant> scanHoodieInstantsFromFileSystem(Path timelinePath, Set<String> includedExtensions,
+      boolean applyLayoutVersionFilters) throws IOException {
+    Stream<HoodieInstant> instantStream = Arrays.stream(
+        HoodieTableMetaClient
+            .scanFiles(getFs(), timelinePath, path -> {
+              // Include only the meta files with extensions that needs to be included
+              String extension = FSUtils.getFileExtension(path.getName());
+              return includedExtensions.contains(extension);
+            })).map(HoodieInstant::new);
+
+    if (applyLayoutVersionFilters) {
+      instantStream = TimelineLayout.getLayout(getTimelineLayoutVersion()).filterHoodieInstants(instantStream);
 
 Review comment:
   It seems `applyLayoutVersionFilters` is set selectively, depending on which 
HoodieActiveTimeline constructor is invoked. Would this be fragile? Thinking 
out loud: applying the filters on V0 has no effect, since there is nothing to 
get rid of. The only thing that could go wrong is not filtering on V1... hmm.
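
   To make the concern concrete, here is a rough sketch of the asymmetry. It 
does not use the real Hudi classes: `LayoutFilterSketch`, `Instant`, `State`, 
`filterV0`, `filterV1` and `scan` are all hypothetical stand-ins, and it 
assumes that V0 leaves a single metafile per instant while V1 keeps one 
metafile per state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only; this is NOT Hudi's HoodieInstant or TimelineLayout.
final class LayoutFilterSketch {

  // Declaration order doubles as the ranking: COMPLETED is the highest state.
  enum State { REQUESTED, INFLIGHT, COMPLETED }

  static final class Instant {
    final String timestamp;
    final String action;
    final State state;

    Instant(String timestamp, String action, State state) {
      this.timestamp = timestamp;
      this.action = action;
      this.state = state;
    }
  }

  // Assumed V0 behaviour: one metafile per instant, so there is nothing to drop.
  static List<Instant> filterV0(List<Instant> instants) {
    return new ArrayList<>(instants);
  }

  // Assumed V1 behaviour: one metafile per state, so keep only the highest
  // state seen for each (timestamp, action) pair.
  static List<Instant> filterV1(List<Instant> instants) {
    Map<String, Instant> highest = new LinkedHashMap<>();
    for (Instant instant : instants) {
      String key = instant.timestamp + "/" + instant.action;
      highest.merge(key, instant, (a, b) -> a.state.compareTo(b.state) >= 0 ? a : b);
    }
    return new ArrayList<>(highest.values());
  }

  // Mirrors the shape of the flag under review: skipping the filter is
  // harmless for V0 but leaks lower states for V1.
  static List<Instant> scan(List<Instant> files, int layoutVersion, boolean applyLayoutVersionFilters) {
    if (!applyLayoutVersionFilters) {
      return new ArrayList<>(files);
    }
    return layoutVersion == 0 ? filterV0(files) : filterV1(files);
  }
}
```

   Under these assumptions, filtering unconditionally would be safe for both 
layouts, which is why a caller-controlled flag looks like the riskier option.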
