DanielCarter-stack commented on issue #10369: URL: https://github.com/apache/seatunnel/issues/10369#issuecomment-3770842400
<!-- code-pr-reviewer -->
Thanks for reporting this. I can confirm this is a valid bug that affects multiple file formats.

**Root cause:** When the file list is empty (e.g., empty directory or all files filtered by `sync_mode=update`), calling `fileNames.get(0)` triggers `IndexOutOfBoundsException` in schema inference methods.

**Affected files:**
- `.../file/source/reader/TextReadStrategy.java` (lines 276, 306, 346)
- `.../file/source/reader/AbstractReadStrategy.java` (line 139 - affects all subclasses)
- `.../file/source/reader/CsvReadStrategy.java` (3 locations)
- `.../file/source/reader/ExcelReadStrategy.java` (2 locations)
- `.../file/source/reader/XmlReadStrategy.java` (2 locations)

**Questions to reproduce:**
1. What file format are you using (Text/CSV/JSON/Excel/XML)?
2. Is the directory truly empty or were files filtered out?
3. Can you provide the full exception stack trace?

**Workaround:** Ensure the source directory is not empty, or verify that files match the filter conditions when using `sync_mode=update`.

**Suggested fix:** Add `fileNames.isEmpty()` checks before calling `fileNames.get(0)` in `getSeaTunnelRowTypeInfo()` and `setCatalogTable()` methods. When empty, return the base schema without partition info instead of throwing an exception.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
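
The suggested fix in the comment above can be sketched roughly as follows. This is a minimal, standalone illustration, not the actual SeaTunnel code: the class `EmptyFileListGuard`, the methods `inferFieldNames` and `partitionKeys`, and the use of a plain `List<String>` as the "schema" are all hypothetical stand-ins, and the `key=value` partition parsing is an assumption made only so the example runs. The only part taken from the comment is the guard itself: check `fileNames.isEmpty()` before `fileNames.get(0)` and fall back to the base schema without partition info.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class EmptyFileListGuard {

    // Hypothetical stand-in for schema inference; a List<String> of field
    // names plays the role of the row type. Not SeaTunnel's real signature.
    static List<String> inferFieldNames(List<String> baseFields, List<String> fileNames) {
        // Guard from the suggested fix: with no files there is no path to
        // derive partition info from, so return the base schema as-is
        // instead of calling fileNames.get(0) and throwing.
        if (fileNames == null || fileNames.isEmpty()) {
            return new ArrayList<>(baseFields);
        }
        List<String> fields = new ArrayList<>(baseFields);
        fields.addAll(partitionKeys(fileNames.get(0)));
        return fields;
    }

    // Hypothetical helper (assumption): treats "key=value" path segments
    // as partition columns and returns the keys in path order.
    static List<String> partitionKeys(String path) {
        List<String> keys = new ArrayList<>();
        for (String segment : path.split("/")) {
            int eq = segment.indexOf('=');
            if (eq > 0) {
                keys.add(segment.substring(0, eq));
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> base = Arrays.asList("id", "name");
        // Empty file list: base schema only, no exception.
        System.out.println(inferFieldNames(base, Collections.<String>emptyList()));
        // Non-empty file list: partition key appended from the first path.
        System.out.println(inferFieldNames(base, Arrays.asList("/data/dt=2024-01-01/part-0.txt")));
    }
}
```

Returning the base schema rather than throwing keeps an empty directory, or a run where `sync_mode=update` filters out every file, as a no-op read instead of a job failure. Whether that is the right behaviour for each affected read strategy would still need to be confirmed against the real code.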
