samarthjain commented on pull request #2740:
URL: https://github.com/apache/iceberg/pull/2740#issuecomment-868719981
@RussellSpitzer - I am hoping we can find a better solution here. I am
generally not a fan of catching NPEs :)
There are a few other approaches possible here:
1) Parquet V2 isn't that well tested yet, but later versions of Trino have
started writing Parquet files in the V2 format. We encountered this issue in
Iceberg vectorized reads when we upgraded our Presto clusters to the Trino
350 release. We worked around it by reintroducing the older Parquet write
path in Trino, which writes Parquet V1 files.
2) To fix this in Iceberg, we should either
- look into supporting vectorized reads for Parquet V2, or
- disable vectorized reads when/if we can detect that the Parquet files are
in the V2 format.
I can take up looking into 2).
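
For the "disable vectorized reads" option under 2), a minimal sketch of the fallback decision could look like the following. All names here (`ReadPathChooser`, `PageVersion`, `ReadMode`) are hypothetical and are not Iceberg or Parquet APIs. How the page versions would actually be detected from a file is left open, so the sketch just takes them as input.

```java
import java.util.List;

// Hypothetical sketch: not part of Iceberg or parquet-mr.
final class ReadPathChooser {
    // Stand-in for the data page format version observed in a Parquet file.
    enum PageVersion { V1, V2 }

    // The two read paths discussed in this thread.
    enum ReadMode { VECTORIZED, ROW_BASED }

    // Uses the vectorized path only when it is enabled and no V2 page was seen;
    // otherwise falls back to the row-based reader instead of failing.
    static ReadMode choose(boolean vectorizationEnabled, List<PageVersion> pageVersions) {
        if (!vectorizationEnabled) {
            return ReadMode.ROW_BASED;
        }
        for (PageVersion version : pageVersions) {
            if (version == PageVersion.V2) {
                return ReadMode.ROW_BASED;
            }
        }
        return ReadMode.VECTORIZED;
    }
}
```

The idea is that the check runs before the reader is built, so a V2 file never reaches the vectorized code path and there is no NPE to catch.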
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]