alamb commented on PR #7513:
URL: https://github.com/apache/arrow-rs/pull/7513#issuecomment-2895290335

   > > It seems regression for Q36/Q37.
   > 
   > Yes, I agree -- I will figure out why
   
   I did some profiling:
   
   ```shell
   samply record target/release/deps/arrow_reader_clickbench-aef15514767c9665 --bench arrow_reader_clickbench/sync/Q36
   ```
   
   Basically, the issue is that calling `slice()` takes a non-trivial amount of the time for Q36.
   
   ![Screenshot 2025-05-20 at 1 23 25 PM](https://github.com/user-attachments/assets/6f64c8f4-c9f8-451e-8dc8-cacbe6fc4e4e)
   
   
   I added some `println!`s, and it seems that 181k rows pass in total, but the number of buffers is huge: about 542k, roughly three per row (I think this is related to concat not compacting the ByteViewArray). Working on this...
   ```
   ByteViewArray::slice offset=8192 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=16384 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=24576 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=32768 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=40960 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=49152 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=57344 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=65536 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=73728 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=81920 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=90112 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=98304 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=106496 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=114688 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=122880 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=131072 length=8192, total_rows: 181198 buffer_count: 542225
   ByteViewArray::slice offset=139264 length=8192, total_rows: 181198 buffer_count: 542225
   ```
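
   To illustrate why the buffer count matters, here is a minimal std-only sketch. It is not the arrow-rs code: `ToyViewArray` and its methods are hypothetical names, and it assumes (as I understand the zero-copy design) that slicing a view array copies only the selected views but clones the handle to every data buffer. Under that assumption, each `slice` call in the log above would clone all 542225 buffer handles to return 8192 rows, whereas compacting first leaves few handles to clone.

   ```rust
   use std::sync::Arc;

   /// Hypothetical stand-in for a view array: each row points into a shared buffer.
   struct ToyViewArray {
       /// One (buffer index, start, len) triple per row.
       views: Vec<(usize, usize, usize)>,
       /// Shared data buffers; cloning an Arc only bumps a reference count.
       buffers: Vec<Arc<Vec<u8>>>,
   }

   impl ToyViewArray {
       /// Zero-copy slice: keeps `length` views but clones every buffer handle,
       /// so the cost grows with the buffer count, not with `length`.
       fn slice(&self, offset: usize, length: usize) -> Self {
           Self {
               views: self.views[offset..offset + length].to_vec(),
               buffers: self.buffers.clone(),
           }
       }

       /// Compaction: copy the referenced bytes into a single fresh buffer.
       fn compact(&self) -> Self {
           let mut data = Vec::new();
           let mut views = Vec::with_capacity(self.views.len());
           for &(buf, start, len) in &self.views {
               let new_start = data.len();
               data.extend_from_slice(&self.buffers[buf][start..start + len]);
               views.push((0, new_start, len));
           }
           Self { views, buffers: vec![Arc::new(data)] }
       }

       /// Bytes of the given row.
       fn value(&self, row: usize) -> &[u8] {
           let (buf, start, len) = self.views[row];
           &self.buffers[buf][start..start + len]
       }
   }

   fn main() {
       // Three rows, each in its own tiny buffer, mimicking an uncompacted concat.
       let array = ToyViewArray {
           views: vec![(0, 0, 1), (1, 0, 1), (2, 0, 1)],
           buffers: vec![
               Arc::new(b"a".to_vec()),
               Arc::new(b"b".to_vec()),
               Arc::new(b"c".to_vec()),
           ],
       };

       let sliced = array.slice(1, 2);
       println!("sliced: rows={} buffers={}", sliced.views.len(), sliced.buffers.len());

       let compacted = array.compact().slice(1, 2);
       println!("compacted: rows={} buffers={}", compacted.views.len(), compacted.buffers.len());
       println!("row 0 after compaction: {:?}", compacted.value(0));
   }
   ```

   The trade-off in this sketch is that `compact` copies the string bytes once, in exchange for making every later `slice` cheap.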


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
