rdblue opened a new pull request #3309:
URL: https://github.com/apache/iceberg/pull/3309


   This fixes the Parquet map projection bug introduced by 
https://github.com/apache/parquet-mr/pull/798
   
   The projection code in Iceberg created map projections with the 
Parquet `Types.map` builder. After that change, the builder names the repeated 
key-value group `key_value` instead of `map`, so the generated projection no 
longer matched Parquet files whose key-value group is named `map`. As a 
result, Parquet would not project the map column, and loading it would fail 
with an error like this:
   
   ```
   Caused by: java.lang.IllegalArgumentException: [mapCol, map, key] required binary key (STRING) = 2 is not in the store: [] 1000
           at org.apache.iceberg.shaded.org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:272)
           at org.apache.iceberg.parquet.ParquetValueReaders$PrimitiveReader.setPageSource(ParquetValueReaders.java:185)
           at org.apache.iceberg.parquet.ParquetValueReaders$RepeatedKeyValueReader.setPageSource(ParquetValueReaders.java:529)
           at org.apache.iceberg.parquet.ParquetValueReaders$StructReader.setPageSource(ParquetValueReaders.java:685)
   ```
   
   The solution is to copy the map structure and ensure that the names are 
preserved rather than generated.
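
   To illustrate why copying matters, here is a minimal, self-contained sketch. It does not use the Parquet or Iceberg classes: `Node`, `rebuildWithGeneratedName`, and `copyPreservingNames` are hypothetical names invented for this example, and the real change in this PR may be structured differently. It only models the idea that a projection whose intermediate group name is generated can point at column paths that are not in the file, while a projection copied from the file's own structure cannot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified model of a schema tree. Node is a hypothetical stand-in,
// not parquet-mr's Type or GroupType.
final class MapProjectionSketch {
  static final class Node {
    final String name;
    final List<Node> children;

    Node(String name, Node... children) {
      this.name = name;
      this.children = Collections.unmodifiableList(Arrays.asList(children));
    }

    // Copy this node with new children, keeping its own name.
    Node withNewChildren(Node... newChildren) {
      return new Node(name, newChildren);
    }
  }

  // Collect the dotted paths of all leaf columns under root.
  static List<String> leafPaths(Node root) {
    List<String> paths = new ArrayList<>();
    collect(root, "", paths);
    return paths;
  }

  private static void collect(Node node, String prefix, List<String> out) {
    String path = prefix.isEmpty() ? node.name : prefix + "." + node.name;
    if (node.children.isEmpty()) {
      out.add(path);
    } else {
      for (Node child : node.children) {
        collect(child, path, out);
      }
    }
  }

  // Models a builder that hard-codes the repeated group's name.
  static Node rebuildWithGeneratedName(Node map, Node key, Node value) {
    return new Node(map.name, new Node("key_value", key, value));
  }

  // Models the fix: reuse the file's own repeated group, so its name is preserved.
  static Node copyPreservingNames(Node map, Node key, Node value) {
    Node repeated = map.children.get(0);
    return map.withNewChildren(repeated.withNewChildren(key, value));
  }

  public static void main(String[] args) {
    // A file schema whose key-value group is named "map", as in the error above.
    Node fileMap = new Node("mapCol", new Node("map", new Node("key"), new Node("value")));
    Node key = fileMap.children.get(0).children.get(0);
    Node value = fileMap.children.get(0).children.get(1);
    List<String> inFile = leafPaths(fileMap);

    // Does every projected column path exist in the file?
    System.out.println(inFile.containsAll(leafPaths(rebuildWithGeneratedName(fileMap, key, value))));
    System.out.println(inFile.containsAll(leafPaths(copyPreservingNames(fileMap, key, value))));
  }
}
```

   In this model the rebuilt projection asks for `mapCol.key_value.key`, which the file does not contain, so the first check is false. The copied projection asks for `mapCol.map.key`, so the second check is true.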

