RussellSpitzer edited a comment on issue #1735:
URL: https://github.com/apache/iceberg/issues/1735#issuecomment-723373179
From what I can tell this actually shouldn't be working on 0.9.1 — it only works because of a fluke.

When running through `pruneColumns` we check whether or not our field possesses a selected ID:
```java
Schema fieldSchema = fields.get(field.pos());
// All primitives are selected by selecting the field, but map and list
// types can be selected by projecting the keys, values, or elements.
// This creates two conditions where the field should be selected: if the
// id is selected or if the result of the field is non-null. The only
// case where the converted field is non-null is when a map or list is
// selected by lower IDs.
if (selectedIds.contains(fieldId)) {
  filteredFields.add(copyField(field, field.schema(), fieldId));
} else if (fieldSchema != null) {
  hasChange = true;
  filteredFields.add(copyField(field, fieldSchema, fieldId));
}
```
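As a minimal sketch (illustrative names only, not the real Iceberg code), the two selection branches behave like this — a field survives pruning either because its ID was selected directly, or because pruning its children produced a non-null converted schema:

```java
import java.util.Set;

// Hypothetical sketch of the selection logic above (not the real Iceberg
// classes). A field is kept if its ID was selected directly, or if the
// converted child schema is non-null (a map/list/struct selected by lower
// IDs). The second branch is the one the 0.9.1 fluke rides on.
public class SelectionBranches {

  static String decide(Set<Integer> selectedIds, int fieldId, Object prunedChildSchema) {
    if (selectedIds.contains(fieldId)) {
      return "copy: id selected directly";
    } else if (prunedChildSchema != null) {
      return "copy: selected via children";
    }
    return "dropped";
  }

  public static void main(String[] args) {
    Set<Integer> selected = Set.of(0, 1, 3);
    System.out.println(decide(selected, 1, null));          // id selected directly
    System.out.println(decide(selected, 2, new Object()));  // selected via children
    System.out.println(decide(selected, 7, null));          // dropped
  }
}
```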
In the 0.9.1 table this correctly passes through the selected field IDs 0, 1, and 3. The `field.pos()` for `data_file` is still 2, so its ID doesn't match. BUT `fields.get(2)` returns the partition schema in "r2" (`partition type:Record pos:0`).

Now we aren't actually looking for that field, we are looking for the `data_file` field. BUT since `fieldSchema != null`, we follow the secondary pruning pathway above. This means we add a copy where `data_file` gets the "r2" record schema along with the `fieldId` we expect. Luckily Spark prunes this out later, but we are reading the wrong column data here.
The 0.8.0 table doesn't have the "r2" record placed in `fields`, so the lookup instead just gets null, leading to the error at the beginning of this ticket.
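The two failure modes can be sketched together (illustrative names, not the real Iceberg API): a purely positional lookup like `fields.get(field.pos())` either silently returns whatever happens to sit at that index, or returns null when nothing does:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch (not the real Iceberg classes): looking a field up by
// position instead of by field ID either returns the wrong column (0.9.1
// behavior) or null (0.8.0 behavior), depending on what the converted
// field list happens to contain at that index.
public class PruneByPositionSketch {

  // Simulates `fields.get(field.pos())`: a purely positional lookup.
  static String lookupByPos(List<String> convertedFields, int pos) {
    return pos < convertedFields.size() ? convertedFields.get(pos) : null;
  }

  public static void main(String[] args) {
    // data_file sits at position 2 in the manifest-entry schema.
    int dataFilePos = 2;

    // 0.9.1-style list: the "r2" partition record happens to occupy
    // index 2, so the lookup "succeeds" with the wrong schema.
    List<String> v091 = Arrays.asList("status", "snapshot_id", "partition-r2");
    System.out.println(lookupByPos(v091, dataFilePos)); // wrong column, no error

    // 0.8.0-style list: nothing at index 2, so we get null and the
    // error reported at the top of this ticket.
    List<String> v080 = Arrays.asList("status", "snapshot_id");
    System.out.println(lookupByPos(v080, dataFilePos)); // null
  }
}
```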