gf2121 commented on PR #14935: URL: https://github.com/apache/lucene/pull/14935#issuecomment-3064581110
I iterated on this locally and found a way to get similar performance without the `scratch` array (new code in `denseBranchLessParallel`).

```
Benchmark                                        (bitCount)   Mode  Cnt   Score   Error   Units
BitsetToArrayBenchmark.denseBranchLess                    5  thrpt    5  13.582 ± 0.179  ops/us
BitsetToArrayBenchmark.denseBranchLess                   10  thrpt    5  13.574 ± 0.034  ops/us
BitsetToArrayBenchmark.denseBranchLess                   20  thrpt    5  13.543 ± 0.244  ops/us
BitsetToArrayBenchmark.denseBranchLess                   30  thrpt    5  13.398 ± 0.249  ops/us
BitsetToArrayBenchmark.denseBranchLess                   40  thrpt    5  13.524 ± 0.409  ops/us
BitsetToArrayBenchmark.denseBranchLess                   50  thrpt    5  13.522 ± 0.434  ops/us
BitsetToArrayBenchmark.denseBranchLess                   60  thrpt    5  13.575 ± 0.025  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                5  thrpt    5   8.508 ± 0.124  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov               10  thrpt    5   6.374 ± 0.066  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov               20  thrpt    5  11.519 ± 0.182  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov               30  thrpt    5  11.496 ± 0.246  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov               40  thrpt    5  11.521 ± 0.256  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov               50  thrpt    5  11.501 ± 0.285  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov               60  thrpt    5   9.102 ± 0.026  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel            5  thrpt    5  15.238 ± 0.312  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel           10  thrpt    5  15.268 ± 0.112  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel           20  thrpt    5  15.278 ± 0.199  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel           30  thrpt    5  15.269 ± 0.348  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel           40  thrpt    5  15.182 ± 0.458  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel           50  thrpt    5  15.287 ± 0.105  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel           60  thrpt    5  15.312 ± 0.158  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling           5  thrpt    5  15.727 ± 0.294  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling          10  thrpt    5  15.852 ± 0.256  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling          20  thrpt    5  15.815 ± 0.258  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling          30  thrpt    5  15.738 ± 0.620  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling          40  thrpt    5  15.821 ± 0.435  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling          50  thrpt    5  15.806 ± 0.202  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling          60  thrpt    5  15.483 ± 2.151  ops/us
```
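For readers who have not opened the diff, here is a rough sketch of what a scratch-free, branchless word-to-array conversion might look like. This is not the code from the PR. The class name, method names, and the exact fix-up step are my own assumptions for illustration. The idea is to split the 64-bit word into two 32-bit halves, use `Integer.bitCount` on the low half to learn where the high half's output starts, and then fill both regions in the same loop so the two offset chains are independent of each other.

```java
import java.util.Arrays;

// Hypothetical sketch, not the implementation from the PR.
public class BranchlessBitsetSketch {

  /** Plain reference version: one loop iteration per set bit. */
  static int wordToArrayReference(long word, int base, int[] out, int offset) {
    while (word != 0) {
      out[offset++] = base + Long.numberOfTrailingZeros(word);
      word &= word - 1; // clear the lowest set bit
    }
    return offset;
  }

  /**
   * Branchless version without a scratch buffer.
   * Assumption: out.length >= offset + Long.bitCount(word) + 1, because the
   * loop always writes one candidate value past the last valid slot.
   */
  static int wordToArrayBranchless(long word, int base, int[] out, int offset) {
    int lo = (int) word;
    int hi = (int) (word >>> 32);
    int hiStart = offset + Integer.bitCount(lo); // first slot of the high half
    int loOff = offset;
    int hiOff = hiStart;
    for (int i = 0; i < 32; i++) {
      // Always store the candidate; advance only if the bit is set.
      out[loOff] = base + i;
      loOff += (lo >>> i) & 1;
      out[hiOff] = base + 32 + i;
      hiOff += (hi >>> i) & 1;
    }
    // Once the low half is exhausted, its speculative stores land on
    // out[hiStart], so restore that slot from the lowest set bit of hi.
    // If hi == 0 this writes into the slack slot, which is harmless.
    out[hiStart] = base + 32 + Integer.numberOfTrailingZeros(hi);
    return hiOff;
  }

  public static void main(String[] args) {
    long word = 0b1011L | (1L << 40) | (1L << 63); // bits 0, 1, 3, 40, 63
    int[] out = new int[Long.bitCount(word) + 1];  // one slack slot
    int end = wordToArrayBranchless(word, 128, out, 0);
    System.out.println(Arrays.toString(Arrays.copyOf(out, end)));
    // prints [128, 129, 131, 168, 191]
  }
}
```

The one extra slot of slack in the output array is the price of the unconditional stores. Whether this shape matches the numbers above depends on the JIT and the hardware, so treat it as an illustration of the technique only.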