alamb opened a new pull request, #20652:
URL: https://github.com/apache/datafusion/pull/20652

   ## Which issue does this PR close?
   
   - Closes https://github.com/apache/datafusion/issues/20524
   
   ## Rationale for this change
   
   
   `push_down_filter_regression.slt` is the sqllogictest that takes the 
longest to run, even after @Tim-53 reduced its time in 
   - https://github.com/apache/datafusion/pull/20586
   
   While reviewing https://github.com/apache/datafusion/pull/20586 and trying 
to make the sqllogictest runs faster, I noticed that a substantial amount of 
the unit test time was spent doing zstd compression/decompression:
   
   <img width="2423" height="841" alt="Screenshot 2026-03-02 at 12 50 24 PM" 
src="https://github.com/user-attachments/assets/75cfe12b-3bb2-4ffa-9c36-63ca00b8c3ff" />
   
   Thus, we can speed up the test by skipping the zstd compression step.
   
   ## What changes are included in this PR?
   
   1. Don't compress the parquet files in the test
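   
   A minimal sketch of one way to express this in a sqllogictest file. It assumes the test writes its Parquet files through the session's default writer settings and that the session-level `datafusion.execution.parquet.compression` option (which otherwise defaults to zstd) is the knob being changed; the actual diff in this PR may set it differently, for example per statement:

   ```sql
   -- Assumption: set once near the top of the .slt file, before the Parquet files are written.
   -- 'uncompressed' disables page compression, so no zstd work is done on write or read.
   SET datafusion.execution.parquet.compression = 'uncompressed';
   ```

   The trade-off is larger scratch files on disk, which is usually acceptable for short-lived test data.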
   
   ## Are these changes tested?
   
   Yes, by CI.
   
   Here are my performance runs using @kosiew's new timing feature:
   ```shell
   cargo test --profile=ci --test sqllogictests  -- --timing-summary top
   ```
   
   Main:
   ```
   Per-file elapsed summary (deterministic):
   1.    4.035s  push_down_filter_regression.slt  <-- takes over 4 seconds
   2.    3.573s  joins.slt
   3.    3.492s  aggregate.slt
   4.    3.316s  imdb.slt
   ```
   
   This PR:
   ```
   Per-file elapsed summary (deterministic):
   1.    3.308s  aggregate.slt 
   2.    3.290s  joins.slt
   3.    3.181s  imdb.slt
   4.    2.914s  push_down_filter_regression.slt   <--- takes less than 3 seconds and is no longer the tallest pole
   ```
   
   ## Are there any user-facing changes?
   
   Faster tests
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

