prashantwason opened a new issue, #18046: URL: https://github.com/apache/hudi/issues/18046
### Describe the problem you faced

When loading files in `AbstractTableFileSystemView.listPartition()`, all files in a partition are loaded without validating that they are valid HUDI data or log files. This can cause validation exceptions later in the code when stray files (temporary files, hidden files, or files with corrupted names) are processed, since HUDI requires file names to have specific formats.

### Expected behavior

Only valid HUDI data files (base files and log files) should be loaded when listing partitions. Files that don't match the expected HUDI file name patterns should be filtered out.

### Environment Description

* Hudi version: master
* Spark version: N/A
* Storage: Any

### Additional context

The fix adds a `filterValidDataFiles()` method that uses `FSUtils.isDataFile()` to validate files before they are added to the file system view. This filters files at two entry points:

1. `getAllFilesInPartition()`
2. `ensurePartitionsLoadedCorrectly()` where `listPartitions()` is called
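
To illustrate the kind of filtering described above, here is a minimal, self-contained sketch. It is not the actual patch and does not call into Hudi: the class `DataFileFilterSketch`, the plain `String` file names, and both regular expressions are assumptions made for illustration. In particular, the patterns are simplified approximations of base and log file names; the real check is delegated to `FSUtils.isDataFile()`.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Illustrative stand-in for the filtering described in this issue (hypothetical class). */
class DataFileFilterSketch {

  // ASSUMPTION: simplified base file shape <fileId>_<writeToken>_<instantTime>.<ext>.
  private static final Pattern BASE_FILE =
      Pattern.compile("^[^._][^_]*_\\d+-\\d+-\\d+_\\d+\\.(parquet|orc|hfile)$");

  // ASSUMPTION: simplified log file shape .<fileId>_<instantTime>.log.<version>[_<writeToken>].
  private static final Pattern LOG_FILE =
      Pattern.compile("^\\.[^_]+_\\d+\\.log\\.\\d+(_\\d+-\\d+-\\d+)?$");

  /** Hypothetical stand-in for FSUtils.isDataFile(): true for base or log file names. */
  static boolean isDataFile(String fileName) {
    if (fileName == null) {
      return false;
    }
    return BASE_FILE.matcher(fileName).matches()
        || LOG_FILE.matcher(fileName).matches();
  }

  /** Keeps only names that look like data files, preserving the listing order. */
  static List<String> filterValidDataFiles(List<String> fileNames) {
    return fileNames.stream()
        .filter(DataFileFilterSketch::isDataFile)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> listing = List.of(
        "a1b2c3d4-0_1-0-1_20240101120000.parquet", // base file shape
        ".a1b2c3d4-0_20240101120000.log.1_1-0-1",  // log file shape
        "_SUCCESS",                                // stray marker file
        "part-0000.parquet.tmp");                  // temporary file
    // Only the first two names pass the filter.
    System.out.println(filterValidDataFiles(listing));
  }
}
```

Applying a filter like this at the point where a partition is listed means stray names never reach the code that parses file IDs and instant times, which is where the validation exceptions described above would otherwise surface.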
