ZhangHuiGui commented on PR #40998: URL: https://github.com/apache/arrow/pull/40998#issuecomment-2039204161
> I question the merits of reordering the columns in the first place (I never saw benchmarks showing a clear benefit) but the change seems correct to me.

I've tested this, and using the resorted columns looks like a big performance improvement (perf shows that it benefits from easier vectorization).

Original column order: `string_values, i32_values, i64_values, fsb_values, bool_values`.

With the resorted columns:

```shell
GrouperWithResortedColumns/1024/0 68.6 us 68.6 us 10057 bytes_per_second=14.2436Mi/s items_per_second=14.9355M/s null_percent=0 size=1.024k
GrouperWithResortedColumns/2048/0 138 us 138 us 4989 bytes_per_second=14.1325Mi/s items_per_second=14.819M/s null_percent=0 size=2.048k
GrouperWithResortedColumns/4096/0 297 us 297 us 2450 bytes_per_second=13.1419Mi/s items_per_second=13.7803M/s null_percent=0 size=4.096k
GrouperWithResortedColumns/8192/0 635 us 635 us 1026 bytes_per_second=12.3119Mi/s items_per_second=12.91M/s null_percent=0 size=8.192k
GrouperWithResortedColumns/16384/0 1899 us 1899 us 351 bytes_per_second=8.22997Mi/s items_per_second=8.62975M/s null_percent=0 size=16.384k
GrouperWithResortedColumns/32768/10000 4444 us 4441 us 155 bytes_per_second=7.03607Mi/s items_per_second=7.37785M/s null_percent=0.01 size=32.768k
GrouperWithResortedColumns/32768/100 11958 us 11953 us 71 bytes_per_second=2.61451Mi/s items_per_second=2.74151M/s null_percent=1 size=32.768k
GrouperWithResortedColumns/32768/10 29060 us 29048 us 34 bytes_per_second=1.07582Mi/s items_per_second=1.12808M/s null_percent=10 size=32.768k
GrouperWithResortedColumns/32768/2 46366 us 46347 us 30 bytes_per_second=690.451Ki/s items_per_second=707.022k/s null_percent=50 size=32.768k
GrouperWithResortedColumns/32768/1 1089 us 1089 us 644 bytes_per_second=28.699Mi/s items_per_second=30.0931M/s null_percent=100 size=32.768k
GrouperWithResortedColumns/32768/0 4273 us 4271 us 162 bytes_per_second=7.31762Mi/s items_per_second=7.67308M/s null_percent=0 size=32.768k
```

With the original column order:

```shell
GrouperWithResortedColumns/1024/0 6628 us 6625 us 100 bytes_per_second=150.947Ki/s items_per_second=154.57k/s null_percent=0 size=1.024k
GrouperWithResortedColumns/2048/0 13385 us 13376 us 100 bytes_per_second=149.516Ki/s items_per_second=153.105k/s null_percent=0 size=2.048k
GrouperWithResortedColumns/4096/0 28524 us 28510 us 130 bytes_per_second=140.303Ki/s items_per_second=143.67k/s null_percent=0 size=4.096k
GrouperWithResortedColumns/8192/0 33133 us 33114 us 66 bytes_per_second=241.59Ki/s items_per_second=247.388k/s null_percent=0 size=8.192k
GrouperWithResortedColumns/16384/0 40724 us 40705 us 33 bytes_per_second=393.073Ki/s items_per_second=402.506k/s null_percent=0 size=16.384k
GrouperWithResortedColumns/32768/10000 57044 us 57008 us 17 bytes_per_second=561.327Ki/s items_per_second=574.799k/s null_percent=0.01 size=32.768k
GrouperWithResortedColumns/32768/100 56691 us 56668 us 16 bytes_per_second=564.691Ki/s items_per_second=578.244k/s null_percent=1 size=32.768k
GrouperWithResortedColumns/32768/10 56045 us 56024 us 17 bytes_per_second=571.188Ki/s items_per_second=584.897k/s null_percent=10 size=32.768k
GrouperWithResortedColumns/32768/2 53605 us 53577 us 25 bytes_per_second=597.276Ki/s items_per_second=611.61k/s null_percent=50 size=32.768k
GrouperWithResortedColumns/32768/1 1102 us 1102 us 643 bytes_per_second=28.3595Mi/s items_per_second=29.7371M/s null_percent=100 size=32.768k
GrouperWithResortedColumns/32768/0 54583 us 54560 us 17 bytes_per_second=586.508Ki/s items_per_second=600.585k/s null_percent=0 size=32.768k
```

I'll add more tests for this and push the benchmark in another PR.
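To make the comparison between the two runs easier to read, the per-case speedup can be computed from the wall-clock times reported above. The following is only a small sketch: the dictionary names and the `speedups` helper are hypothetical and not part of Arrow or the PR, and the numbers are copied by hand from the "Time" column of each run.

```python
# Wall-clock times in microseconds, copied from the two benchmark runs above.
# Keys are the benchmark arguments "<size>/<null-arg>" (hypothetical labels).
resorted_us = {
    "1024/0": 68.6, "2048/0": 138, "4096/0": 297, "8192/0": 635,
    "16384/0": 1899, "32768/10000": 4444, "32768/100": 11958,
    "32768/10": 29060, "32768/2": 46366, "32768/1": 1089, "32768/0": 4273,
}
original_us = {
    "1024/0": 6628, "2048/0": 13385, "4096/0": 28524, "8192/0": 33133,
    "16384/0": 40724, "32768/10000": 57044, "32768/100": 56691,
    "32768/10": 56045, "32768/2": 53605, "32768/1": 1102, "32768/0": 54583,
}


def speedups(baseline, improved):
    """Return the baseline/improved time ratio for every benchmark case."""
    return {case: baseline[case] / improved[case] for case in improved}


ratios = speedups(original_us, resorted_us)
for case, ratio in ratios.items():
    print(f"{case:>12}: {ratio:6.1f}x")
```

On these figures the ratio is roughly 96x to 97x for the 1024- to 4096-row cases, shrinks as the null percentage grows, and is close to 1x when every value is null (`32768/1`).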
