steveloughran commented on issue #15628: URL: https://github.com/apache/iceberg/issues/15628#issuecomment-4100227084
@RussellSpitzer and here are the results of the Spark reader benchmark I've added to the PR. The longer lines are all the shredded ones, even when filtering or selecting on a shredded column.

[human-readable-output.txt](https://github.com/user-attachments/files/26148519/human-readable-output.txt)

I can think of some more tests there:
* cost of adding rows
* extending the persisted structure (I'd parameterize for a simple vs complex variant)
* and optionally partition by the category:int field, both direct and via the variant.

Suggestions welcome
