rdblue opened a new pull request #3309: URL: https://github.com/apache/iceberg/pull/3309
This fixes the Parquet map projection bug introduced by https://github.com/apache/parquet-mr/pull/798

The projection code in Iceberg created map projections using the Parquet `Types.map` builder. That change altered the type this builder creates by renaming the repeated key-value group from `map` to `key_value`, so the projection was no longer valid for Parquet files. As a result, Parquet would not project the `map` column, and loading it would fail with an error like this:

```
Caused by: java.lang.IllegalArgumentException: [mapCol, map, key] required binary key (STRING) = 2 is not in the store: [] 1000
	at org.apache.iceberg.shaded.org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:272)
	at org.apache.iceberg.parquet.ParquetValueReaders$PrimitiveReader.setPageSource(ParquetValueReaders.java:185)
	at org.apache.iceberg.parquet.ParquetValueReaders$RepeatedKeyValueReader.setPageSource(ParquetValueReaders.java:529)
	at org.apache.iceberg.parquet.ParquetValueReaders$StructReader.setPageSource(ParquetValueReaders.java:685)
```

The solution is to copy the map structure and ensure that the names are preserved rather than generated.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
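
To illustrate why preserving names matters, here is a minimal, self-contained sketch. It does not use the Parquet or Iceberg APIs: `Field`, `rebuildWithGeneratedName`, `copyPreservingNames`, and `containsPath` are hypothetical stand-ins, and the schema layout (`mapCol` containing a repeated group `map` with `key` and `value`) is assumed from the column path in the error above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MapProjectionSketch {
  // Hypothetical stand-in for a Parquet type: a name plus child fields.
  static final class Field {
    final String name;
    final List<Field> children;

    Field(String name, List<Field> children) {
      this.name = name;
      this.children = children;
    }

    static Field leaf(String name) {
      return new Field(name, Collections.<Field>emptyList());
    }

    static Field group(String name, Field... children) {
      return new Field(name, Arrays.asList(children));
    }
  }

  // Rebuilds the map and lets the "builder" choose the repeated group's name.
  static Field rebuildWithGeneratedName(Field map, String generatedName) {
    Field repeated = map.children.get(0);
    return Field.group(map.name, new Field(generatedName, repeated.children));
  }

  // Copies the map and keeps whatever name the file schema already uses.
  static Field copyPreservingNames(Field map) {
    Field repeated = map.children.get(0);
    return Field.group(map.name, new Field(repeated.name, repeated.children));
  }

  // Returns true if the column path resolves by name, starting at root.
  static boolean containsPath(Field root, String... path) {
    if (path.length == 0 || !root.name.equals(path[0])) {
      return false;
    }
    Field current = root;
    for (int i = 1; i < path.length; i++) {
      Field next = null;
      for (Field child : current.children) {
        if (child.name.equals(path[i])) {
          next = child;
          break;
        }
      }
      if (next == null) {
        return false;
      }
      current = next;
    }
    return true;
  }

  public static void main(String[] args) {
    // Assumed file schema: mapCol -> map -> (key, value)
    Field fileMap = Field.group("mapCol",
        Field.group("map", Field.leaf("key"), Field.leaf("value")));

    Field regenerated = rebuildWithGeneratedName(fileMap, "key_value");
    Field copied = copyPreservingNames(fileMap);

    // The reader asks for [mapCol, map, key], as in the stack trace.
    System.out.println(containsPath(regenerated, "mapCol", "map", "key")); // prints false
    System.out.println(containsPath(copied, "mapCol", "map", "key"));      // prints true
  }
}
```

In this toy model, the regenerated projection no longer contains the path the reader requests, while the copied projection does. The actual fix operates on Parquet schema types rather than on this simplified `Field` class.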
