RussellSpitzer commented on issue #1735:
URL: https://github.com/apache/iceberg/issues/1735#issuecomment-723369468
Pretty sure the specific error is here
```
Set<Integer> projectedIds = TypeUtil.getProjectedIds(expectedSchema);
```
Given an expected schema
```
< 0: Status, 1: snapshot_id, 3: Sequence_number, 2: data_file >
```
The code returns
```
<0 , 1, 3>
```
So our pruned schema, from `Schema prunedSchema = AvroSchemaUtil.pruneColumns(newFileSchema, projectedIds, nameMapping);`, looks like
```
0 status, 1 snapshot_id
```
This is missing the requested data_file column.
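To make the mismatch concrete, here is a rough sketch in plain Java, not the actual Iceberg code. `PruneByIdSketch` and `pruneById` are hypothetical stand-ins for pruning by field ID, and the file schema contents (IDs 0, 1, 2 only) are an assumption inferred from the output above. It only shows that if the projected ID set does not contain 2, data_file cannot survive the prune.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class PruneByIdSketch {
    // Hypothetical stand-in for pruning by field ID: keep only the fields
    // whose ID is in projectedIds. This is NOT the Iceberg implementation.
    static Map<Integer, String> pruneById(Map<Integer, String> fileSchema,
                                          Set<Integer> projectedIds) {
        Map<Integer, String> pruned = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> field : fileSchema.entrySet()) {
            if (projectedIds.contains(field.getKey())) {
                pruned.put(field.getKey(), field.getValue());
            }
        }
        return pruned;
    }

    public static void main(String[] args) {
        // Assumed file schema: ID -> column name (illustrative only).
        Map<Integer, String> fileSchema = new LinkedHashMap<>();
        fileSchema.put(0, "status");
        fileSchema.put(1, "snapshot_id");
        fileSchema.put(2, "data_file");

        // The ID set reported above: 3 is present, 2 (data_file) is not.
        Set<Integer> projectedIds = new TreeSet<>(Arrays.asList(0, 1, 3));

        // data_file is dropped because its ID was never projected.
        System.out.println(pruneById(fileSchema, projectedIds));
    }
}
```

Under those assumptions, only status and snapshot_id come back, which matches the pruned schema seen on the executors.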
This all seems to happen on the executors. Since we shouldn't be reading
those other columns in the first place, it seems to me that the projection is
not being serialized correctly to the executors, which would be a separate
problem from the bad mapping for projectedIds.