DanielCarter-stack commented on issue #10369:
URL: https://github.com/apache/seatunnel/issues/10369#issuecomment-3770842400

   Thanks for reporting this. I can confirm this is a valid bug that affects 
multiple file formats.
   
   **Root cause:** When the file list is empty (e.g., an empty directory, or all 
files filtered out by `sync_mode=update`), calling `fileNames.get(0)` throws an 
`IndexOutOfBoundsException` in the schema inference methods.
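
   A minimal standalone illustration of this failure mode follows. It is not 
the SeaTunnel code: `EmptyListRepro` and `firstFile` are hypothetical names 
standing in for a schema inference step that reads the first file name 
unconditionally.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyListRepro {
    // Hypothetical stand-in for a method that assumes at least one file exists.
    public static String firstFile(List<String> fileNames) {
        // Throws IndexOutOfBoundsException when fileNames is empty.
        return fileNames.get(0);
    }

    public static void main(String[] args) {
        // Simulates an empty directory, or a listing where every file was filtered out.
        List<String> fileNames = new ArrayList<>();
        try {
            firstFile(fileNames);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getClass().getSimpleName());
        }
    }
}
```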
   
   **Affected files:**
   - `.../file/source/reader/TextReadStrategy.java` (lines 276, 306, 346)
   - `.../file/source/reader/AbstractReadStrategy.java` (line 139 - affects all 
subclasses)
   - `.../file/source/reader/CsvReadStrategy.java` (3 locations)
   - `.../file/source/reader/ExcelReadStrategy.java` (2 locations)
   - `.../file/source/reader/XmlReadStrategy.java` (2 locations)
   
   **Questions to help reproduce:**
   1. What file format are you using (Text/CSV/JSON/Excel/XML)?
   2. Is the directory truly empty or were files filtered out?
   3. Can you provide the full exception stack trace?
   
   **Workaround:** Ensure the source directory is not empty, or verify that 
files match the filter conditions when using `sync_mode=update`.
   
   **Suggested fix:** Add `fileNames.isEmpty()` checks before calling 
`fileNames.get(0)` in the `getSeaTunnelRowTypeInfo()` and `setCatalogTable()` 
methods. When the list is empty, return the base schema without partition info 
instead of throwing an exception.
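
   The guard could look roughly like the sketch below. It is a standalone 
illustration under stated assumptions, not the actual SeaTunnel code: 
`EmptyFileListGuard`, `resolveSchema`, `BASE_SCHEMA`, and 
`inferPartitionFields` are hypothetical names, and a schema is modeled as a 
plain list of field names rather than the project's own row type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class EmptyFileListGuard {
    // Hypothetical base schema (assumption: the real code builds a row type instead).
    public static final List<String> BASE_SCHEMA =
            Collections.unmodifiableList(Arrays.asList("id", "name"));

    // Hypothetical helper: collects partition keys from "key=value" path segments.
    public static List<String> inferPartitionFields(String path) {
        List<String> fields = new ArrayList<>();
        for (String segment : path.split("/")) {
            int eq = segment.indexOf('=');
            if (eq > 0) {
                fields.add(segment.substring(0, eq));
            }
        }
        return fields;
    }

    // Returns the base schema alone when there is no file to inspect,
    // otherwise appends partition fields inferred from the first file path.
    public static List<String> resolveSchema(List<String> fileNames) {
        if (fileNames == null || fileNames.isEmpty()) {
            return BASE_SCHEMA;
        }
        List<String> schema = new ArrayList<>(BASE_SCHEMA);
        schema.addAll(inferPartitionFields(fileNames.get(0)));
        return schema;
    }

    public static void main(String[] args) {
        System.out.println(resolveSchema(Collections.<String>emptyList()));
        System.out.println(resolveSchema(Arrays.asList("/data/dt=2024-01-01/part-0.txt")));
    }
}
```

   Returning the base schema keeps an empty listing from failing the job at 
schema inference time; whether an empty source should instead be logged or 
rejected elsewhere is a separate design decision.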


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
