alamb commented on PR #7401: URL: https://github.com/apache/arrow-datafusion/pull/7401#issuecomment-1717903189
Update: I think this PR is ready to go, except for figuring out the proper value for the LOW_CARDINALITY cutoff. I believe @tustvold is looking into setting it to zero via https://github.com/apache/arrow-rs/issues/4811.

@JayjeetAtGithub can you look into what the performance threshold is? The idea would be to test the performance of merge on a column at different cardinalities, maybe `4`, `8`, `12`, `20`, `50` and `100`. There may be an existing benchmark in https://github.com/apache/arrow-rs/blob/master/arrow/benches/row_format.rs that could be used.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
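
For reference, the cardinality sweep suggested in the comment could look roughly like the sketch below. It uses only the standard library and a plain two-way merge of sorted `String` columns as a stand-in. It does not exercise the arrow-rs row format or the merge code in this PR, and the helper names (`make_sorted_column`, `merge_sorted`), the row count, and the small LCG used in place of a `rand` dependency are all assumptions made for illustration.

```rust
use std::time::Instant;

/// Hypothetical helper: a sorted column of `len` strings drawn from
/// `cardinality` distinct values. A small LCG stands in for a real RNG
/// so the sketch has no external dependencies.
fn make_sorted_column(len: usize, cardinality: usize, seed: u64) -> Vec<String> {
    let mut state = seed;
    let mut col: Vec<String> = (0..len)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            let idx = (state >> 33) as usize % cardinality;
            format!("value_{idx:04}")
        })
        .collect();
    col.sort();
    col
}

/// Plain two-way merge of two sorted inputs; a stand-in for the merge
/// under test, not the implementation from the PR.
fn merge_sorted(a: &[String], b: &[String]) -> Vec<String> {
    let mut out = Vec::with_capacity(a.len() + b.len());
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i] <= b[j] {
            out.push(a[i].clone());
            i += 1;
        } else {
            out.push(b[j].clone());
            j += 1;
        }
    }
    // One side is exhausted; append whatever remains of the other.
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}

fn main() {
    let len = 100_000; // assumed row count per input
    for cardinality in [4, 8, 12, 20, 50, 100] {
        let a = make_sorted_column(len, cardinality, 1);
        let b = make_sorted_column(len, cardinality, 2);
        let start = Instant::now();
        let merged = merge_sorted(&a, &b);
        let elapsed = start.elapsed();
        println!(
            "cardinality {cardinality:>3}: merged {} rows in {elapsed:?}",
            merged.len()
        );
    }
}
```

A single `Instant` measurement like this is noisy. For numbers that could actually inform the cutoff, the same sweep would presumably be expressed in a proper benchmark harness against the real row-format and merge code.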
