umehrot2 commented on a change in pull request #1876:
URL: https://github.com/apache/hudi/pull/1876#discussion_r463904220



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
##########
@@ -516,4 +528,73 @@ public static Configuration registerFileSystem(Path file, Configuration conf) {
     return returnConf;
   }
 
+  /**
+   * Get the FS implementation for this table.
+   * @param path  Path String
+   * @param hadoopConf  Serializable Hadoop Configuration
+   * @param consistencyGuardConfig Consistency Guard Config
+   * @return HoodieWrapperFileSystem
+   */
+  public static HoodieWrapperFileSystem getFs(String path, SerializableConfiguration hadoopConf,
+      ConsistencyGuardConfig consistencyGuardConfig) {
+    FileSystem fileSystem = FSUtils.getFs(path, hadoopConf.newCopy());
+    //Preconditions.checkArgument(!(fileSystem instanceof HoodieWrapperFileSystem),
+    //    "File System not expected to be that of HoodieWrapperFileSystem");
+    return new HoodieWrapperFileSystem(fileSystem,
+        consistencyGuardConfig.isConsistencyCheckEnabled()
+            ? new FailSafeConsistencyGuard(fileSystem, consistencyGuardConfig)
+            : new NoOpConsistencyGuard());
+  }
+
+  /**
+   * Returns leaf folders with files under a path.
+   * @param fs  File System
+   * @param basePathStr Base Path to look for leaf folders
+   * @param filePathFilter  Filters to skip directories/paths
+   * @return list of partition paths with files under them.
+   * @throws IOException
+   */
+  public static List<Pair<String, List<HoodieFileStatus>>> getAllLeafFoldersWithFiles(FileSystem fs, String basePathStr,
+      PathFilter filePathFilter) throws IOException {

Review comment:
       Can we move this to `hudi-spark` or `hudi-client` instead, i.e. a module
that already has Spark as a dependency? For
https://issues.apache.org/jira/browse/HUDI-999 I am parallelizing this listing
using the Spark context, and as we discussed earlier, we do not want a Spark
dependency in `hudi-common`.
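
To make the concern concrete, here is a rough sketch of the shape such a listing could take. It is not the code from this PR or from HUDI-999. It assumes plain `java.nio.file` in place of the Hadoop `FileSystem`, uses `parallelStream()` as a stand-in for the Spark context, omits the `PathFilter`, and reports every directory that directly contains files. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of Hudi.
class LeafFolderLister {

  // Lists the direct children of one directory. This is the independent
  // unit of work that a Spark job could distribute across executors.
  static List<Path> listChildren(Path dir) {
    try (Stream<Path> children = Files.list(dir)) {
      return children.sorted().collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns a map from each directory (relative to basePath) that directly
  // contains files to the names of those files.
  public static Map<String, List<String>> foldersWithFiles(Path basePath) {
    Map<String, List<String>> result = new TreeMap<>();
    List<Path> level = Collections.singletonList(basePath);
    while (!level.isEmpty()) {
      // Directories at the same depth are listed in parallel
      // (local threads here, standing in for Spark tasks).
      Map<Path, List<Path>> listed = level.parallelStream()
          .collect(Collectors.toMap(dir -> dir, LeafFolderLister::listChildren));
      List<Path> next = new ArrayList<>();
      for (Map.Entry<Path, List<Path>> entry : listed.entrySet()) {
        List<String> files = new ArrayList<>();
        for (Path child : entry.getValue()) {
          if (Files.isDirectory(child)) {
            next.add(child); // descend into it on the next pass
          } else {
            files.add(child.getFileName().toString());
          }
        }
        if (!files.isEmpty()) {
          result.put(basePath.relativize(entry.getKey()).toString(), files);
        }
      }
      level = next;
    }
    return result;
  }
}
```

The point of the sketch is only that the parallel step needs an execution engine. With Spark as that engine, the method has to live in a module that is allowed to depend on Spark.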



