lyang24 commented on PR #9093:
URL: https://github.com/apache/arrow-rs/pull/9093#issuecomment-3737337607

   It looks like it's doing well with large scan queries:
   ```
   arrow_reader_clickbench/async/Q20    1.18    130.2±1.59ms        ? ?/sec    1.00    110.7±0.82ms        ? ?/sec
   arrow_reader_clickbench/async/Q21    1.29    165.5±0.99ms        ? ?/sec    1.00    128.6±0.93ms        ? ?/sec
   arrow_reader_clickbench/async/Q22    1.24   318.7±11.80ms        ? ?/sec    1.00    257.3±6.26ms        ? ?/sec
   ```
   
   Some regressions with high selectivity:
   ```
   arrow_reader_row_filter/int64 > 90/exclude_filter_column/async    1.00      2.6±0.02ms        ? ?/sec    1.31      3.5±0.08ms        ? ?/sec
   arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync     1.00      2.4±0.02ms        ? ?/sec    1.35      3.2±0.03ms        ? ?/sec
   ``` 
   
   Regressions with `FIXED_LEN_BYTE_ARRAY` `Float16Array`:
   ```
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs      1.00     75.9±0.46µs        ? ?/sec    1.56    118.1±0.43µs        ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs     1.00    232.7±2.34µs        ? ?/sec    1.23    285.9±3.00µs        ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs       1.00     80.8±0.46µs        ? ?/sec    1.53    123.7±0.34µs        ? ?/sec
   ```
   
   I had ChatGPT summarize the results as:
   
   # Summary: Pre-allocation vs Main Path

   Your `ViewBuffer::with_capacity` optimization has a **very consistent, architecture-level behavior** across all tested Parquet → Arrow decoding kernels.

   ---

   ## 🟢 Where It Wins

   Kernel Type | Why | Speedup
   --- | --- | ---
   BinaryView / StringView | Avoids repeated realloc of pointer & offset vectors | +5% → +25%
   Dictionary encoded | Index + value indirection benefits from fixed capacity | +6% → +18%
   ByteStreamSplit numeric | Chunked layout breaks streaming writes | +5% → +15%
   Selective row filters (<40% survive) | Output small & unpredictable | +3% → +12%

   These kernels are **flat POD, memory-bandwidth bound**; eager zero-touching destroys cache & streaming behavior.
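
   For context on the "avoids repeated realloc" point, here is a minimal sketch of the general pre-allocation effect. It uses a plain `Vec<u128>` as a stand-in for a view buffer; that stand-in and the helper name `count_reallocations` are assumptions for illustration, not the actual `ViewBuffer` code from this PR.

   ```rust
   /// Counts how many times the Vec's capacity changes while pushing `n` items.
   /// A capacity change means the backing allocation was grown.
   fn count_reallocations(mut buf: Vec<u128>, n: usize) -> usize {
       let mut reallocations = 0;
       let mut last_capacity = buf.capacity();
       for i in 0..n {
           buf.push(i as u128);
           if buf.capacity() != last_capacity {
               reallocations += 1;
               last_capacity = buf.capacity();
           }
       }
       reallocations
   }

   fn main() {
       let n = 8192; // hypothetical batch size
       // Starts empty and grows on demand.
       let grown = count_reallocations(Vec::new(), n);
       // Reserves room for all `n` items up front.
       let preallocated = count_reallocations(Vec::with_capacity(n), n);
       println!("grown: {grown} reallocations, preallocated: {preallocated}");
   }
   ```

   This only shows the allocation count, not the end-to-end cost: reserving capacity up front can still lose when the final output is much smaller than the reserved size, which may be related to the regressions listed above.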


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
