uchenily commented on PR #45918:
URL: https://github.com/apache/arrow/pull/45918#issuecomment-2750256709

   I ran a `hashjoin + hash aggr` test (join type: RIGHT_OUTER, no matching 
keys). With each input batch set to 1<<15, the `probe * build (4096 * 512)` 
scenario took only 17.8s (including data generation time), whereas the original 
serial approach took 471.8s.
   
   Note that for this test I changed `kNumRowsPerScanTask` to `4 * 1024`. 
With the original value of `512 * 1024`, performance remained poor. In fact, 
the test took so long that I could not even measure the runtime.
   
   In other words, `kNumRowsPerScanTask` also has a significant impact on the 
test results. However, since I could not determine a more reasonable value for 
this parameter, I did not modify `kNumRowsPerScanTask` in this PR.
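
   As a rough illustration of why this constant matters, the sketch below 
counts how many scan tasks each value would produce. It rests on two 
assumptions that are not confirmed above: that the hash table scan is split 
into tasks of `kNumRowsPerScanTask` rows each, and that the build side holds 
512 batches of 1<<15 rows. The names `NumScanTasks` and `kAssumedBuildRows` 
are hypothetical and exist only for this example.

```cpp
#include <cstdint>

// Values quoted in the comment above.
constexpr std::int64_t kOriginalRowsPerScanTask = 512 * 1024;
constexpr std::int64_t kModifiedRowsPerScanTask = 4 * 1024;

// Assumption: 512 build-side batches of 1<<15 rows each.
constexpr std::int64_t kAssumedBuildRows =
    std::int64_t{512} * (std::int64_t{1} << 15);

// Hypothetical helper: ceiling division, i.e. the number of fixed-size
// tasks needed to cover num_rows rows.
constexpr std::int64_t NumScanTasks(std::int64_t num_rows,
                                    std::int64_t rows_per_task) {
  return (num_rows + rows_per_task - 1) / rows_per_task;
}

// Under the assumptions above, the original value gives 32 tasks and the
// modified value gives 4096 tasks.
static_assert(NumScanTasks(kAssumedBuildRows, kOriginalRowsPerScanTask) == 32,
              "original value: 32 scan tasks");
static_assert(NumScanTasks(kAssumedBuildRows, kModifiedRowsPerScanTask) == 4096,
              "modified value: 4096 scan tasks");
```

   Under these assumptions the smaller value yields 128 times as many, and 
correspondingly smaller, tasks. Whether that granularity explains the measured 
difference is not established here.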
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
