pitrou commented on issue #44084:
URL: https://github.com/apache/arrow/issues/44084#issuecomment-2368491413
I ran some initial experiments on this and came to the following
conclusions:
1. The performance is a mixed bag, with some non-negligible speedups on
small input sizes (32k rows in the sort benchmarks) but also apparent slowdowns
on larger inputs (8M rows). This is probably a combination of:
   1) allocation cost, since 16 bytes per input row are allocated for an
`int64_t` pair
   2) increased memory footprint and decreased cache efficiency, both
because of the enlarged indices and the temporary memory area
2. Therefore, further exploration should go towards:
   1) compressing resolved indices to make them fit in 64 bits (e.g. 20 bits
of `chunk_index`, 44 bits of `index_in_chunk`); see the first sketch after
this list
   2) transforming the logical indices to physical _in place_ before merging
the chunks, and transforming them back to logical in place after merging; see
the second sketch after this list
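
For 1), here is a minimal sketch of what a 64-bit packed location could look like. The names (`PackLocation`, `ChunkIndex`, `IndexInChunk`) are hypothetical, not Arrow's actual API; the 20/44 bit split follows the example above:

```cpp
#include <cassert>
#include <cstdint>

constexpr int kIndexInChunkBits = 44;
constexpr uint64_t kIndexInChunkMask = (uint64_t{1} << kIndexInChunkBits) - 1;

// Pack a (chunk_index, index_in_chunk) pair into a single 64-bit word,
// assuming at most 2^20 chunks and 2^44 rows per chunk.
constexpr uint64_t PackLocation(uint64_t chunk_index, uint64_t index_in_chunk) {
  return (chunk_index << kIndexInChunkBits) | index_in_chunk;
}

constexpr uint64_t ChunkIndex(uint64_t packed) {
  return packed >> kIndexInChunkBits;
}

constexpr uint64_t IndexInChunk(uint64_t packed) {
  return packed & kIndexInChunkMask;
}

int main() {
  const uint64_t packed = PackLocation(/*chunk_index=*/3, /*index_in_chunk=*/123456);
  assert(ChunkIndex(packed) == 3);
  assert(IndexInChunk(packed) == 123456);
  return 0;
}
```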
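And for 2), a rough sketch of the in-place round trip, reusing the hypothetical packing helpers above. Since a packed physical location occupies the same 64 bits as a logical row index, no extra buffer is needed; `chunk_offsets` is assumed to hold the absolute starting row of each chunk (a prefix sum over chunk lengths, starting at 0):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Overwrite each 64-bit logical row index with its packed physical location,
// to be applied before the merge step.
void LogicalToPhysicalInPlace(std::vector<uint64_t>& indices,
                              const std::vector<uint64_t>& chunk_offsets) {
  for (uint64_t& index : indices) {
    // Binary-search the chunk containing this logical row.
    auto it = std::upper_bound(chunk_offsets.begin(), chunk_offsets.end(), index);
    const auto chunk = static_cast<uint64_t>(it - chunk_offsets.begin() - 1);
    index = PackLocation(chunk, index - chunk_offsets[chunk]);
  }
}

// Inverse transformation, to be applied after the merge step.
void PhysicalToLogicalInPlace(std::vector<uint64_t>& indices,
                              const std::vector<uint64_t>& chunk_offsets) {
  for (uint64_t& index : indices) {
    index = chunk_offsets[ChunkIndex(index)] + IndexInChunk(index);
  }
}
```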
I might dedicate some time to this.