ShiKaiWi commented on issue #5250:
URL: https://github.com/apache/arrow-rs/issues/5250#issuecomment-1871134531

   After running the command `cargo bench --bench arrow_reader --features="arrow test_common experimental" -- StringArray/dictionary`
in the `parquet` source directory, I found that the mentioned problem has
already been fixed on the master branch. With the changeset
https://github.com/ShiKaiWi/arrow-rs/commit/d4e905a6cc337f10c61f47d75f264df82fc97242,
the performance drops instead:
   ```
   arrow_array_reader/StringArray/dictionary encoded, mandatory, no NULLs
                           time:   [152.71 µs 153.63 µs 154.51 µs]
                           change: [+9.9536% +11.737% +13.274%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high severe
   arrow_array_reader/StringArray/dictionary encoded, optional, no NULLs
                           time:   [146.80 µs 147.41 µs 148.10 µs]
                           change: [+14.007% +14.516% +14.970%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   arrow_array_reader/StringArray/dictionary encoded, optional, half NULLs
                           time:   [174.27 µs 176.84 µs 179.28 µs]
                           change: [+5.0935% +6.4985% +7.8113%] (p = 0.00 < 0.05)
                           Performance has regressed.
   ```
   
   Also, the changeset
https://github.com/ShiKaiWi/arrow-rs/commit/d4e905a6cc337f10c61f47d75f264df82fc97242
only works for `parquet` v43.
   
   @tustvold sorry to bother you. :sweat_smile:


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
