iemejia commented on PR #55919: URL: https://github.com/apache/spark/pull/55919#issuecomment-4688824001
@LuciferYang All three benchmark runs (JDK 17, 21, and 25) have now completed on AMD EPYC 7763, and the measured speedups are promising:

- **INT64 reads**: 1.8x-3.7x across all JDKs and data patterns
- **INT64 skip**: 2.3x-4.0x
- **Unsigned long encoding** (with the new `byte[]` loop): 7.3x-8.6x
- **INT32 reads**: 1.1x-1.6x (narrowing overhead limits gains)
- **DELTA_BYTE_ARRAY / DELTA_LENGTH_BYTE_ARRAY**: 1.2x-1.9x indirect improvement

I updated the PR description with full JDK 17/21/25 comparison tables and the new workflow run links.

Thank you for all your help and the thorough review suggestions. The `byte[]` loop approach is cleaner and avoids the ByteBuffer abstraction entirely, and moving the scratch buffer allocation to `initFromPage` makes the code more straightforward. I really appreciate the guidance on getting the benchmark workflow right too.

I believe this is ready to go now. Would you be able to merge it when you get a chance?
