Re: [PR] [Fix][Connector-file] Fix the new schema cannot be fetched when the parquet file is read [seatunnel]

via GitHub Thu, 22 Jan 2026 06:18:40 -0800


JeremyXin commented on code in PR #10378:
URL: https://github.com/apache/seatunnel/pull/10378#discussion_r2717102574



##########
seatunnel-connectors-v2/connector-file/connector-file-base/src/main/java/org/apache/seatunnel/connectors/seatunnel/file/source/reader/ParquetReadStrategy.java:
##########
@@ -140,7 +138,8 @@ public void read(FileSourceSplit split, Collector<SeaTunnelRow> output)
                     fields = new Object[fieldsCount];
                 }
                 for (int i = 0; i < fieldsCount; i++) {
-                    Object data = record.get(indexes[i]);

Review Comment:
   The index lookup is a problem because records read from old files may not 
match the field indexes of the new schema. For example, an old file contains 
the columns [id, name, age], but the schema obtained from the new file is 
[id, name, sex, age]. If we keep reading from the record by index, index 2 
returns the "age" value from the old file instead of the "sex" field of the 
target table. It can also cause index out-of-bounds errors, since index 3 
does not exist in a record from the old file.
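
   To make the example concrete, here is a minimal sketch of resolving fields 
by name rather than by position. It is not the code from this PR: the class 
`SchemaAlignmentSketch` and the helper `alignByName` are hypothetical, and 
the record is modeled as a plain `Object[]` instead of the reader's real 
record type. It also assumes that a column missing from the old file should 
be filled with null.

```java
import java.util.Arrays;
import java.util.List;

public class SchemaAlignmentSketch {

    // Hypothetical helper: maps one record from an old file onto the target
    // schema by column name. Columns absent from the file become null
    // (an assumption for this sketch, not confirmed behavior of the PR).
    static Object[] alignByName(
            List<String> fileColumns, Object[] record, List<String> targetColumns) {
        Object[] fields = new Object[targetColumns.size()];
        for (int i = 0; i < targetColumns.size(); i++) {
            // Look up the target column's position in the file's own schema.
            int pos = fileColumns.indexOf(targetColumns.get(i));
            fields[i] = pos >= 0 ? record[pos] : null;
        }
        return fields;
    }

    public static void main(String[] args) {
        List<String> oldFile = Arrays.asList("id", "name", "age");
        List<String> target = Arrays.asList("id", "name", "sex", "age");
        Object[] record = {1, "alice", 30};

        // Reading record[2] for target index 2 would return 30 ("age"), and
        // record[3] would throw ArrayIndexOutOfBoundsException.
        // prints [1, alice, null, 30]
        System.out.println(Arrays.toString(alignByName(oldFile, record, target)));
    }
}
```

   With name-based resolution, "sex" is left empty for rows from the old file 
and "age" still lands in the correct target position.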



