duongcongtoai commented on issue #17446:
URL: https://github.com/apache/datafusion/issues/17446#issuecomment-3341564409

   ```
    hyperfine "~/proj/rust/build/release-nonlto/datafusion-cli-opt -f report.sql" "~/proj/rust/build/release-nonlto/datafusion-cli-49.0.0 -f report.sql" "~/proj/rust/build/release-nonlto/datafusion-cli-50.0.0 -f report.sql"
   Benchmark 1: ~/proj/rust/build/release-nonlto/datafusion-cli-opt -f report.sql
     Time (mean ± σ):      2.994 s ±  0.053 s    [User: 23.054 s, System: 2.006 s]
     Range (min … max):    2.909 s …  3.056 s    10 runs

   Benchmark 2: ~/proj/rust/build/release-nonlto/datafusion-cli-49.0.0 -f report.sql
     Time (mean ± σ):      6.318 s ±  0.104 s    [User: 29.943 s, System: 2.245 s]
     Range (min … max):    6.183 s …  6.529 s    10 runs

   Benchmark 3: ~/proj/rust/build/release-nonlto/datafusion-cli-50.0.0 -f report.sql
     Time (mean ± σ):      4.456 s ±  0.062 s    [User: 29.947 s, System: 2.012 s]
     Range (min … max):    4.394 s …  4.562 s    10 runs

   Summary
     ~/proj/rust/build/release-nonlto/datafusion-cli-opt -f report.sql ran
       1.49 ± 0.03 times faster than ~/proj/rust/build/release-nonlto/datafusion-cli-50.0.0 -f report.sql
       2.11 ± 0.05 times faster than ~/proj/rust/build/release-nonlto/datafusion-cli-49.0.0 -f report.sql
   ```
   
   I also noticed that `AccumulatorState::size` takes up to 30% of CPU time. The percentages annotated below come from the profile:
   ```
   fn evaluate(&mut self, emit_to: EmitTo) -> Result<ArrayRef> {
       let vec_size_pre = self.states.allocated_size();
   
       let states = emit_to.take_needed(&mut self.states);
   
       let results: Vec<ScalarValue> = states
           .into_iter()
           .map(|mut state| {
               self.free_allocation(state.size()); // <-- 12%
               state.accumulator.evaluate() // ->
           })
           .collect::<Result<_>>()?;
   
   
   fn invoke_per_accumulator
               let values_to_accumulate = slice_and_maybe_filter(
                   &values,
                   opt_filter.as_ref().map(|f| f.as_boolean()),
                   offsets,
               )?;
               f(state.accumulator.as_mut(), &values_to_accumulate)?;
   
               // clear out the state so they are empty for next
               // iteration
               state.indices.clear();
               sizes_post += state.size(); // <-- 6.2%
           }
   
   
   fn state()
           for mut state in states {
               self.free_allocation(state.size());
               let accumulator_state = state.accumulator.state()?; // <-- 11%
               results.resize_with(accumulator_state.len(), Vec::new);
               for (idx, state_val) in accumulator_state.into_iter().enumerate() {
                   results[idx].push(state_val);
               }
           }
   
   
   ```
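
   To make the cost pattern concrete, here is a minimal, self-contained sketch of per-group memory accounting in the style of the excerpts above. It is not DataFusion's code: the `Accumulator` trait, `BufferingAccumulator`, `Adapter`, `update_all`, and the `size_calls` counter are hypothetical stand-ins, and the `sizes_pre` bookkeeping is an assumption, since only `sizes_post` appears in the excerpt. The point it illustrates is that `size()` is recomputed for every group on every pass, so the number of calls grows with the number of groups.

   ```rust
   use std::mem::size_of;

   /// Simplified stand-in for an accumulator (NOT DataFusion's `Accumulator` trait).
   trait Accumulator {
       fn update(&mut self, value: i64);
       /// Bytes used by this accumulator, including its heap buffer.
       fn size(&self) -> usize;
   }

   /// Hypothetical accumulator that buffers every input value.
   struct BufferingAccumulator {
       values: Vec<i64>,
   }

   impl Accumulator for BufferingAccumulator {
       fn update(&mut self, value: i64) {
           self.values.push(value);
       }

       fn size(&self) -> usize {
           size_of::<Self>() + self.values.capacity() * size_of::<i64>()
       }
   }

   /// Per-group state, loosely modeled on the `AccumulatorState` in the excerpts.
   struct AccumulatorState {
       accumulator: Box<dyn Accumulator>,
       indices: Vec<u32>,
   }

   impl AccumulatorState {
       fn new() -> Self {
           Self {
               accumulator: Box::new(BufferingAccumulator { values: Vec::new() }),
               indices: Vec::new(),
           }
       }

       /// Recomputed from scratch on each call: one virtual call plus capacity math.
       fn size(&self) -> usize {
           self.accumulator.size()
               + size_of::<Self>()
               + self.indices.capacity() * size_of::<u32>()
       }
   }

   /// Hypothetical adapter that owns one state per group and tracks total bytes.
   struct Adapter {
       states: Vec<AccumulatorState>,
       allocation_bytes: usize,
       /// Instrumentation for this sketch only: how often `size()` was called.
       size_calls: usize,
   }

   impl Adapter {
       fn new(num_groups: usize) -> Self {
           let states: Vec<AccumulatorState> =
               (0..num_groups).map(|_| AccumulatorState::new()).collect();
           let allocation_bytes: usize = states.iter().map(|s| s.size()).sum();
           Self { states, allocation_bytes, size_calls: 0 }
       }

       /// Feeds one value to every group, measuring each group's size before
       /// and after the update (two `size()` calls per group per pass).
       fn update_all(&mut self, value: i64) {
           let mut sizes_pre = 0;
           let mut sizes_post = 0;
           for state in self.states.iter_mut() {
               sizes_pre += state.size();
               state.accumulator.update(value);
               // Clear the indices so they are empty for the next iteration.
               state.indices.clear();
               sizes_post += state.size();
               self.size_calls += 2;
           }
           // Add before subtracting so the intermediate value cannot underflow.
           self.allocation_bytes = self.allocation_bytes + sizes_post - sizes_pre;
       }
   }
   ```

   With 1000 groups, a single `update_all` pass in this sketch calls `size()` 2000 times. That is consistent with size accounting showing up prominently in a profile of a high-cardinality aggregation, though the real cost depends on what each accumulator's `size()` has to walk.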
   


