the-other-tim-brown commented on code in PR #11219:
URL: https://github.com/apache/hudi/pull/11219#discussion_r1601667084
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -2000,16 +2000,15 @@ public DirectoryInfo(String relativePath, List<StoragePathInfo> pathInfos, Strin
// Pre-allocate with the maximum length possible
filenameToSizeMap = new HashMap<>(pathInfos.size());
+ // Presence of partition meta file implies this is a HUDI partition
+ isHoodiePartition = pathInfos.stream().anyMatch(status -> status.getPath().getName().startsWith(HoodiePartitionMetadata.HOODIE_PARTITION_METAFILE_PREFIX));
Review Comment:
If you expose your Hudi table as a Delta Lake table with XTable, you will
have parquet files in `_delta_log`, and this will lead to a parsing issue.
In my opinion, this is the proper way to fix the issue: the intention of this
code is to only add files that are in directories with a partition marker file.
I'm worried that changing `isDataFile` may lead to some unintended side
effects.
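
To illustrate the behavior described above, here is a minimal, self-contained sketch. It is not the actual `DirectoryInfo` code. `FileEntry` is a hypothetical stand-in for `StoragePathInfo`, `collectFiles` is an invented helper, and the prefix value `.hoodie_partition_metadata` is an assumption about what `HoodiePartitionMetadata.HOODIE_PARTITION_METAFILE_PREFIX` holds. The idea is only that a directory without a partition marker file (such as `_delta_log`) contributes no files, regardless of their extension.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DirectoryListingSketch {
  // Assumed value of HoodiePartitionMetadata.HOODIE_PARTITION_METAFILE_PREFIX.
  static final String PARTITION_METAFILE_PREFIX = ".hoodie_partition_metadata";

  // Hypothetical stand-in for StoragePathInfo: only a file name and a length.
  static final class FileEntry {
    final String name;
    final long length;

    FileEntry(String name, long length) {
      this.name = name;
      this.length = length;
    }
  }

  // A directory counts as a Hudi partition only if it holds a marker file.
  static boolean isHoodiePartition(List<FileEntry> entries) {
    return entries.stream().anyMatch(e -> e.name.startsWith(PARTITION_METAFILE_PREFIX));
  }

  // Collect file sizes only for directories that carry the marker file.
  static Map<String, Long> collectFiles(List<FileEntry> entries) {
    Map<String, Long> filenameToSizeMap = new HashMap<>();
    if (!isHoodiePartition(entries)) {
      // For example, _delta_log: parquet files here are never inspected.
      return filenameToSizeMap;
    }
    for (FileEntry e : entries) {
      // The marker file itself is skipped in this sketch.
      if (!e.name.startsWith(PARTITION_METAFILE_PREFIX)) {
        filenameToSizeMap.put(e.name, e.length);
      }
    }
    return filenameToSizeMap;
  }

  public static void main(String[] args) {
    List<FileEntry> deltaLog = Arrays.asList(
        new FileEntry("00000000000000000010.checkpoint.parquet", 1024L));
    List<FileEntry> partition = Arrays.asList(
        new FileEntry(".hoodie_partition_metadata", 96L),
        new FileEntry("abc.parquet", 2048L));
    System.out.println(collectFiles(deltaLog));
    System.out.println(collectFiles(partition));
  }
}
```

With this shape, the file-name check (`isDataFile` in the real code) does not need to change, since files in unmarked directories are never passed to it.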