pitrou commented on PR #34323: URL: https://github.com/apache/arrow/pull/34323#issuecomment-1447914280
A couple general comments:

* we should avoid combinatorial explosion of benchmark variations; the longer benchmarks take to run, the rarer it is to run them; you could easily reduce the number of "max-string-length" values for example
* we should test only meaningful or reasonable parameters; if "batch-size:8" means process 8 items at a time, I think the easy answer is "don't do it"; Parquet encoding/decoding need batch sizes in the hundreds or thousands to be efficient
* the memory footprint of each benchmark should be similar and reasonable; if "byte_array_bytes" is the memory footprint then this must be fixed; most of the time the footprint should not be larger than ~10 MB
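
To make the three points above concrete, here is a rough sketch of how a small parameter grid could be derived so that every variation has a similar footprint. It is not code from the PR: the parameter values, the function names, and the choice to budget by worst-case string length are all assumptions made for illustration, and the estimate ignores offsets and other per-value overhead.

```python
from itertools import product

# Target footprint per benchmark, following the "~10 MB" guideline above.
TARGET_FOOTPRINT_BYTES = 10 * 1024 * 1024

# Hypothetical parameter values, kept deliberately few to avoid a
# combinatorial explosion of benchmark variations.
MAX_STRING_LENGTHS = [8, 64, 1024]
BATCH_SIZES = [512, 2048]  # hundreds to thousands, not 8


def num_values_for(max_string_length, target_bytes=TARGET_FOOTPRINT_BYTES):
    """Number of strings whose worst-case total size fits in target_bytes."""
    return max(1, target_bytes // max_string_length)


def benchmark_grid():
    """Build one entry per (max_string_length, batch_size) combination."""
    grid = []
    for max_len, batch_size in product(MAX_STRING_LENGTHS, BATCH_SIZES):
        grid.append(
            {
                "max_string_length": max_len,
                "batch_size": batch_size,
                # Scale the value count down as strings get longer so the
                # worst-case data size stays within the same budget.
                "num_values": num_values_for(max_len),
            }
        )
    return grid


if __name__ == "__main__":
    for params in benchmark_grid():
        worst_case = params["num_values"] * params["max_string_length"]
        print(params, "worst-case bytes:", worst_case)
```

With these assumed values the grid has only six variations, and the value count shrinks as the maximum string length grows, so no variation exceeds the budget.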