onlywangyh commented on code in PR #7323:
URL: https://github.com/apache/hudi/pull/7323#discussion_r1035597947


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java:
##########
@@ -146,7 +146,7 @@ protected Option<HoodieRecord<HoodieMetadataPayload>> getRecordByKey(String key,
   @Override
   public List<String> getPartitionPathsWithPrefixes(List<String> prefixes) throws IOException {
     return getAllPartitionPaths().stream()
-        .filter(p -> prefixes.stream().anyMatch(p::startsWith))
+        .filter(p -> prefixes.stream().anyMatch(queryPaths -> p.startsWith(queryPaths + "/") || queryPaths.equals(p)))

Review Comment:
   This method returns the partition paths in Hudi that match the given 
prefixes. Suppose the table has the partitions [/inc_day=20221120/opcode=501, 
/inc_day=20221120/opcode=50, /inc_day=20221120-back/opcode=5000]. The query 
path prefix can take one of the following forms:
   1) an empty path such as "";  
   2) part of a partition path, such as "/inc_day=20221120";  
   3) a complete partition path, such as "/inc_day=20221120/opcode=50";
   
   If we filter with startsWith alone, we return a list that includes 
unnecessary partition paths. For example:
   
   prefixes="/inc_day=20221120/opcode=50"
   matchedPartitionPaths=[/inc_day=20221120/opcode=50, 
/inc_day=20221120/opcode=501]
   
   or
   
   prefixes="/inc_day=20221120"
   matchedPartitionPaths=[/inc_day=20221120/opcode=50, 
/inc_day=20221120/opcode=501, /inc_day=20221120-back/opcode=5000]
   
   In most scenarios it is harmless for matchedPartitionPaths to contain these 
unnecessary partition paths. In Hive, however, they cause a 
java.lang.RuntimeException: Invalid input path. So we want 
matchedPartitionPaths to exclude them.
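
   To make the difference concrete, here is a minimal stand-alone sketch of 
the boundary-aware match on the example partitions above. The class and method 
names are hypothetical and not part of Hudi, and treating an empty prefix as 
"match everything" is an assumption of this sketch rather than something the 
diff spells out.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-alone helper for illustration; not part of Hudi's API.
class PartitionPrefixMatch {

  // Keeps a partition path only if it equals a prefix or lies below it at a
  // directory boundary ("/"). An empty prefix matches every path, which is an
  // assumption of this sketch.
  static List<String> matchByPathBoundary(List<String> partitionPaths, List<String> prefixes) {
    return partitionPaths.stream()
        .filter(p -> prefixes.stream().anyMatch(prefix ->
            prefix.isEmpty() || p.equals(prefix) || p.startsWith(prefix + "/")))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> partitions = List.of(
        "/inc_day=20221120/opcode=501",
        "/inc_day=20221120/opcode=50",
        "/inc_day=20221120-back/opcode=5000");

    // Prints [/inc_day=20221120/opcode=50]
    System.out.println(matchByPathBoundary(partitions, List.of("/inc_day=20221120/opcode=50")));

    // Prints [/inc_day=20221120/opcode=501, /inc_day=20221120/opcode=50]
    System.out.println(matchByPathBoundary(partitions, List.of("/inc_day=20221120")));
  }
}
```

   With a plain startsWith check, the first call would also return 
/inc_day=20221120/opcode=501 and the second would also return 
/inc_day=20221120-back/opcode=5000.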


