sahuagin commented on PR #9786: URL: https://github.com/apache/arrow-rs/pull/9786#issuecomment-4299581864
Comments addressed in e302663a6. Full benchmark results on the updated branch against the `upstream` baseline:

### Wins on the bw=0 paths

`binary packed skip single value` (all-same values → bw=0 throughout):

| Type | Change |
|---|---|
| Int8Array | **−22.0%** |
| UInt8Array | **−19.1%** |
| Int16Array | **−17.9%** |
| UInt16Array | **−19.8%** |
| Int32Array | **−21.3%** |
| UInt32Array | **−21.0%** |
| Int64Array | **−18.1%** |
| UInt64Array | **−20.9%** |
| INT32/Decimal128Array | **−9.7%** |
| INT64/Decimal128Array | **−12.4%** |

`binary packed skip increasing value` (fixed stride → bw=0):

| Type | Change |
|---|---|
| Int8Array | **−21.7%** |
| UInt8Array | **−14.8%** |
| Int16Array | **−17.2%** |
| UInt16Array | **−17.0%** |
| Int32Array | **−16.1%** |
| UInt32Array | **−18.0%** |
| Int64Array | **−21.2%** |
| UInt64Array | **−20.4%** |
| INT32/Decimal128Array | **−9.1%** |
| INT64/Decimal128Array | **−12.8%** |

`binary packed skip stepped increasing value` (mixed bw, some miniblocks bw=0):

| Type | Change |
|---|---|
| Int8Array | **−10.4%** |
| UInt8Array | **−4.1%** |
| Int16Array | **−6.3%** |
| UInt16Array | **−5.9%** |
| Int32Array | **−6.4%** |
| UInt32Array | **−7.3%** |
| Int64Array | +1.3% |
| UInt64Array | +2.5% |
| INT32/Decimal128Array | **−4.4%** |
| INT64/Decimal128Array | +2.6% |

The magnitude scales with how much of the column is bw=0 (single-value: every miniblock; increasing-value: every miniblock; stepped-increasing: depends on where the step lands within a miniblock).

### Non-target paths

The `mandatory/optional, no NULLs` and `optional, half NULLs` variants exercise the non-terminal decode path, which is unchanged by this PR. These show uniform +3% to +10% regressions across types. This is consistent with measurement variance on a non-isolated machine: the noise floor is raised, but the signs are mixed in both directions across the two bench runs I did today on the same hardware.

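
As background for the bw=0 labels in the benchmark names above, here is a minimal sketch of how miniblock bit widths fall out of delta binary packing, following the general scheme in the Parquet format spec (deltas between consecutive values, a per-block minimum delta subtracted, each miniblock packed at the width of its largest adjusted delta). It is not the code from this PR or from the `parquet` crate. The function name `miniblock_bit_widths`, the constants, and the 32-values-by-4-miniblocks layout are assumptions chosen for illustration.

```rust
/// Values per miniblock. Hypothetical layout for this sketch only.
const MINIBLOCK: usize = 32;
/// Miniblocks per block. Also an assumption for this sketch.
const MINIBLOCKS_PER_BLOCK: usize = 4;

/// Returns the bit width each miniblock would need for `values`.
/// The first value is treated as living in the page header, so only
/// the deltas between consecutive values are packed.
fn miniblock_bit_widths(values: &[i64]) -> Vec<u32> {
    // Deltas between neighbours; wrapping arithmetic avoids overflow panics.
    let deltas: Vec<i64> = values
        .windows(2)
        .map(|w| w[1].wrapping_sub(w[0]))
        .collect();

    let mut widths = Vec::new();
    for block in deltas.chunks(MINIBLOCK * MINIBLOCKS_PER_BLOCK) {
        // Each block stores its minimum delta once; `chunks` never yields
        // an empty slice, so `min` always returns a value here.
        let min_delta = *block.iter().min().unwrap();
        for mini in block.chunks(MINIBLOCK) {
            // Largest delta in the miniblock after subtracting min_delta.
            let max_adjusted = mini
                .iter()
                .map(|d| d.wrapping_sub(min_delta) as u64)
                .max()
                .unwrap();
            // Bits needed to represent max_adjusted; zero needs 0 bits.
            widths.push(64 - max_adjusted.leading_zeros());
        }
    }
    widths
}
```

Under these assumptions, `miniblock_bit_widths(&vec![7i64; 129])` and a fixed-stride input such as `(0..129).map(|i| i * 3)` both come out as four miniblocks of width 0, because every adjusted delta is zero. An input with a single step, such as `(0..129).map(|i| i / 100)`, gives width 0 for the first three miniblocks and width 1 for the one containing the step, which is the sense in which the stepped case depends on where the step lands.
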
### Measurement conditions

Non-isolated machine, no CPU pinning, browser tabs and background processes active. For the bw=0 target paths the signal (−18% to −22% on single-value) is large enough to read through any plausible noise; for the non-target paths I'd discount the numbers. Happy to rerun on more controlled hardware if useful.
