viirya opened a new pull request, #56084: URL: https://github.com/apache/spark/pull/56084
### What changes were proposed in this pull request?

Adds a low-level benchmark, `WritableColumnVectorBulkFillBenchmark`, for the constant-value bulk-fill APIs on `WritableColumnVector`:

- `putBooleans(rowId, count, value)`
- `putBytes(rowId, count, value)`
- `putShorts(rowId, count, value)`
- `putInts(rowId, count, value)`
- `putLongs(rowId, count, value)`
- `putNulls(rowId, count)`

The benchmark sweeps `count = 1, 8, 64, 512, 4096, 65536`, covering both the call-overhead dominated regime (small count) and the memory bandwidth dominated regime (large count). Each case runs on both `OnHeapColumnVector` and `OffHeapColumnVector`.

### Why are the changes needed?

There is currently no benchmark covering these constant-value bulk-fill APIs in isolation. SPARK-57024 and SPARK-57036 are optimizing several of them; this benchmark establishes a stable baseline so future optimizations to these methods can be tracked.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

The benchmark itself compiles and runs. No production code change.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.7)
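### Illustrative sketch of the sweep

The following is a minimal, standalone sketch of what a constant-value bulk-fill sweep over the listed `count` values might look like. It is not the benchmark added by this PR and does not use Spark: `BulkFillSketch`, `fillInts`, and `nanosPerCall` are hypothetical names, the column is modelled as a plain `int[]`, and the timing is a naive `System.nanoTime` loop rather than Spark's benchmark harness. Only the `(rowId, count, value)` argument shape and the `count` values come from the description above.

```java
import java.util.Arrays;

// Hypothetical stand-in for a column backed by an int[]; this is NOT
// Spark's WritableColumnVector, only an illustration of the call shape.
final class BulkFillSketch {
    // Sweep values taken from the PR description.
    static final int[] COUNTS = {1, 8, 64, 512, 4096, 65536};

    // Mirrors the argument shape of putInts(rowId, count, value):
    // writes `value` into positions [rowId, rowId + count).
    static void fillInts(int[] data, int rowId, int count, int value) {
        Arrays.fill(data, rowId, rowId + count, value);
    }

    // Runs `iterations` fills of `count` elements and returns the
    // average wall-clock nanoseconds per call (naive timing, no JMH).
    static double nanosPerCall(int[] data, int count, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            fillInts(data, 0, count, i);
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        int[] data = new int[65536];
        for (int count : COUNTS) {
            // Keep the total number of elements written roughly constant.
            int iterations = Math.max(1, (1 << 22) / count);
            nanosPerCall(data, count, iterations); // warm-up pass
            double ns = nanosPerCall(data, count, iterations);
            System.out.printf("count=%6d  %.1f ns/call  %.3f ns/element%n",
                    count, ns, ns / count);
        }
    }
}
```

In a sketch like this, the per-call figure at small `count` mostly reflects fixed call overhead, while the per-element figure at large `count` mostly reflects how fast memory can be written, which is the distinction the two regimes in the description refer to. Absolute numbers from such a loop are not comparable to the real benchmark's results.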
