imbajin commented on code in PR #571:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/571#discussion_r1474264588


##########
hugegraph-loader/src/main/java/org/apache/hugegraph/loader/reader/hdfs/HDFSFileReader.java:
##########
@@ -62,7 +65,20 @@ public HDFSFileReader(HDFSSource source) {
         } catch (IOException e) {
             throw new LoadException("Failed to create HDFS file system", e);
         }
-        Path path = new Path(source.path());
+        String input = source.path();
+        if (input.contains("*")) {
+            int lastSlashIndex = input.lastIndexOf('/');
+            if (lastSlashIndex != -1) {
+                input_path = input.substring(0, lastSlashIndex);
+                prefix = input.substring(lastSlashIndex + 1, input.length() - 
1);

Review Comment:
   If we meet `/a/*b/c`, it seems this may fail, since the `*` is not in the last path segment?
   
   It may also fail with `/a/*b/*c`.
   
   If we want to support it, maybe we could convert the wildcard to a regex here and use it in the `PathFilter`, like this:
   ```java
       if (input.contains("*")) {
           int lastSlashIndex = input.lastIndexOf('/');
           if (lastSlashIndex != -1) {
               input_path = input.substring(0, lastSlashIndex);
               String wildcard = input.substring(lastSlashIndex + 1);
               // replace the wildcard with a regex fragment
               // (note: other regex metacharacters such as '.' are not escaped here)
               prefix = wildcard.replace("*", "[^/]*");
           }
   
           FileStatus[] statuses;
           if (prefix == null || prefix.isEmpty()) {
               statuses = this.hdfs.listStatus(path);
           } else {
               // then match it in filter
               PathFilter prefixFilter = scanPath -> scanPath.getName().matches(prefix);
               statuses = this.hdfs.listStatus(path, prefixFilter);
           }
   ```
   
   Note: it would be better to add some test cases for this, or to test it locally first 🔢 
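   
   To make the test cases easier to write, the wildcard-to-regex step could be pulled into a small helper that does not need HDFS at all. The following is only a rough sketch: `WildcardNames`, `toPattern` and `matches` are hypothetical names, not part of this PR, and it only handles a `*` inside a single file name (the last path segment). Unlike a plain `replace("*", "[^/]*")`, it quotes the literal parts, so a `.` in `*.csv` matches only a real dot.
   
   ```java
   import java.util.regex.Pattern;
   
   // Hypothetical helper, not part of the PR: matches a file name against
   // a single-segment wildcard such as "part-*" or "*.csv".
   final class WildcardNames {
   
       // Build a regex in which '*' matches any run of non-'/' characters
       // and every other character is matched literally.
       static Pattern toPattern(String wildcard) {
           StringBuilder regex = new StringBuilder();
           // limit -1 keeps trailing empty strings, so a trailing '*' is preserved
           String[] literals = wildcard.split("\\*", -1);
           for (int i = 0; i < literals.length; i++) {
               if (i > 0) {
                   regex.append("[^/]*");
               }
               if (!literals[i].isEmpty()) {
                   regex.append(Pattern.quote(literals[i]));
               }
           }
           return Pattern.compile(regex.toString());
       }
   
       static boolean matches(String wildcard, String fileName) {
           return toPattern(wildcard).matcher(fileName).matches();
       }
   
       public static void main(String[] args) {
           System.out.println(matches("part-*", "part-00000"));
           System.out.println(matches("*.csv", "vertex.csv"));
           System.out.println(matches("*.csv", "vertexXcsv"));
       }
   }
   ```
   
   In the reader, the filter could then be something like `scanPath -> WildcardNames.matches(wildcard, scanPath.getName())`, assuming the wildcard string is kept around instead of the regex string. Wildcards in directory segments (`/a/*b/c`) are still not covered by this sketch.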



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

