leaves12138 commented on PR #7934:
URL: https://github.com/apache/paimon/pull/7934#issuecomment-4517026750
I re-reviewed the latest head (`0ca020c63c05`) and found one remaining
correctness blocker around nested projection through collection types.
`NestedProjectedRow` now handles nested `ROW` projection when the projected
field itself is a `ROW`, but `getArray()` and `getMap()` still return the
underlying values as-is. At the same time,
`FormatReaderMapping.pruneDataType()` can recursively prune through `ARRAY` and
`MAP`, so the row format can be asked to read a projected type such as
`ARRAY<ROW<b INT>>` from stored data `ARRAY<ROW<a INT, b INT>>`.
A minimal reproducer is:
```java
// Stored element type: ROW<a INT, b INT>
RowType elementType = new RowType(Arrays.asList(
        new DataField(10, "a", new IntType()),
        new DataField(11, "b", new IntType())));
RowType dataSchema = new RowType(Collections.singletonList(
        new DataField(0, "arr", new ArrayType(elementType))));

// Projected element type: ROW<b INT>
RowType projectedElementType = new RowType(Collections.singletonList(
        new DataField(11, "b", new IntType())));
RowType projectedSchema = new RowType(Collections.singletonList(
        new DataField(0, "arr", new ArrayType(projectedElementType))));

// Write: arr = [ROW<a=1, b=100>]
// Read with projectedSchema; `row` is the row returned by that read:
InternalArray array = row.getArray(0);
assertThat(array.getRow(0, 1).getInt(0)).isEqualTo(100);
```
This currently returns `1` instead of `100`, because the array element row
is still the stored full row and no recursive projection is applied inside the
array. The same problem likely applies to `MAP` keys and values that contain
pruned `ROW`s.
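To make the positional mismatch concrete, here is a minimal self-contained sketch. It deliberately does not use Paimon's classes: `ProjectedRow`, `ProjectedArray`, and the `int[]` mapping are hypothetical stand-ins, and rows are modeled as plain `Object[]`. It only illustrates what "recursive projection inside the array" could look like, not how it must be implemented in this PR.

```java
import java.util.Collections;
import java.util.List;

/** Simplified, hypothetical model; not Paimon's InternalRow/InternalArray API. */
final class ProjectedArraySketch {

    /** View over a stored row that remaps projected positions to stored positions. */
    static final class ProjectedRow {
        private final Object[] stored;
        private final int[] mapping; // mapping[projectedPos] = storedPos

        ProjectedRow(Object[] stored, int[] mapping) {
            this.stored = stored;
            this.mapping = mapping;
        }

        int getInt(int projectedPos) {
            return (Integer) stored[mapping[projectedPos]];
        }
    }

    /** Wraps a stored array of rows and applies the element projection on access. */
    static final class ProjectedArray {
        private final List<Object[]> storedElements;
        private final int[] elementMapping;

        ProjectedArray(List<Object[]> storedElements, int[] elementMapping) {
            this.storedElements = storedElements;
            this.elementMapping = elementMapping;
        }

        int size() {
            return storedElements.size();
        }

        ProjectedRow getRow(int pos) {
            return new ProjectedRow(storedElements.get(pos), elementMapping);
        }
    }

    public static void main(String[] args) {
        // Stored data: arr = [ROW<a=1, b=100>]
        List<Object[]> stored = Collections.singletonList(new Object[] {1, 100});
        // Projected element type ROW<b INT>: projected position 0 -> stored position 1.
        int[] mapping = {1};

        // Unwrapped read: position 0 of the stored row is field a.
        int unwrapped = (Integer) stored.get(0)[0];
        // Wrapped read: position 0 of the projected row is field b.
        int wrapped = new ProjectedArray(stored, mapping).getRow(0).getInt(0);

        System.out.println(unwrapped + " " + wrapped); // prints: 1 100
    }
}
```

The unwrapped read is what the reproducer observes today; the wrapped read is the behavior the projected schema promises. A `MAP` wrapper would presumably apply the same remapping to its key and value arrays.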
Could you either make nested projection preserve the same semantics through
`ARRAY`/`MAP`, or avoid recursively pruning collection element/value types for
this format until those projected wrappers are supported?
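For the second option, a rough sketch of "prune rows, but stop at collection types" might look like the following. The type classes and the `prune` helper here are a hypothetical mini model written for illustration; they are not Paimon's `DataType` hierarchy and not the actual signature of `FormatReaderMapping.pruneDataType()`. Only `ARRAY` is modeled; `MAP` would be treated the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical mini type model; not Paimon's DataType classes. */
final class ConservativePruneSketch {

    interface Type {}

    static final class IntType implements Type {
        @Override public String toString() { return "INT"; }
    }

    static final class ArrayType implements Type {
        final Type element;
        ArrayType(Type element) { this.element = element; }
        @Override public String toString() { return "ARRAY<" + element + ">"; }
    }

    static final class Field {
        final int id;
        final String name;
        final Type type;
        Field(int id, String name, Type type) {
            this.id = id;
            this.name = name;
            this.type = type;
        }
    }

    static final class RowType implements Type {
        final List<Field> fields;
        RowType(Field... fields) { this.fields = Arrays.asList(fields); }

        Field byId(int id) {
            for (Field f : fields) {
                if (f.id == id) return f;
            }
            return null;
        }

        @Override public String toString() {
            StringBuilder sb = new StringBuilder("ROW<");
            for (int i = 0; i < fields.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(fields.get(i).name).append(' ').append(fields.get(i).type);
            }
            return sb.append('>').toString();
        }
    }

    /**
     * Returns the type to request from the reader. Rows are pruned field by field
     * (matched by field id); anything else, including ARRAY, keeps its stored shape.
     */
    static Type prune(Type requested, Type stored) {
        if (requested instanceof RowType && stored instanceof RowType) {
            List<Field> kept = new ArrayList<>();
            for (Field r : ((RowType) requested).fields) {
                Field s = ((RowType) stored).byId(r.id);
                if (s != null) kept.add(new Field(s.id, s.name, prune(r.type, s.type)));
            }
            return new RowType(kept.toArray(new Field[0]));
        }
        // Collection types: do not descend, so element rows are never reshaped.
        return stored;
    }

    public static void main(String[] args) {
        RowType element = new RowType(
                new Field(10, "a", new IntType()), new Field(11, "b", new IntType()));
        RowType stored = new RowType(
                new Field(0, "arr", new ArrayType(element)), new Field(1, "c", new IntType()));
        RowType requested = new RowType(
                new Field(0, "arr", new ArrayType(new RowType(new Field(11, "b", new IntType())))));

        System.out.println(prune(requested, stored));
        // prints: ROW<arr ARRAY<ROW<a INT, b INT>>>
    }
}
```

In this sketch the top-level field `c` is still pruned, but the array keeps its full element type, so positional access inside the array stays consistent with the stored rows. The trade-off would be reading element fields that are not needed, and the caller would have to be handed the unpruned element type.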