2010YOUY01 commented on PR #11943:
URL: https://github.com/apache/datafusion/pull/11943#issuecomment-2287796117

   > > I think I'm not so familiar with the Emit::First and there is no block 
implementation done yet. Could we emit every block size of values we have? 
Something like Emit::First(block size).
   > > We have `emit_early_if_necessary` that do First(n) emission when 
condition met.
   > > ```rust
   > >     fn emit_early_if_necessary(&mut self) -> Result<()> {
   > >         if self.group_values.len() >= self.batch_size
   > >             && matches!(self.group_ordering, GroupOrdering::None)
   > >             && matches!(self.mode, AggregateMode::Partial)
   > >             && self.update_memory_reservation().is_err()
   > >         {
   > >             let n = self.group_values.len() / self.batch_size * 
self.batch_size;
   > >             let batch = self.emit(EmitTo::First(n), false)?;
   > >             self.exec_state = ExecutionState::ProducingOutput(batch);
   > >         }
   > >         Ok(())
   > >     }
   > > ```
   > > 
   > > 
   > >     
   > >       
   > >     
   > > 
   > >       
   > >     
   > > 
   > >     
   > >   
   > > If we emit every block size we accumulated, is it something similar to 
the block approach? If not, what is the difference?
   > > Upd: One difference I can think of is that in block approach, we have 
all the accumulated values, and we can optimize it based on **all the values** 
we have, while in Emit::First mode, we early emit partial values, therefore, we 
loss the change if we want to do optimization based on **all the values** 🤔 ?
   > 
   > Ok, I think I got it now, if we constantly `emit first n` when the group 
len just equal to `batch size`, it is actually equal to blocked approach.
   > 
   > But `emit first n` will be just triggered in some special cases in my 
knowledge:
   > 
   > * Streaming aggr (the really constantly `emit first n` case, and I forced 
to disable blocked mode in this case)
   > * In `Partial` operator  if found the memory exceeded limit 
(`emit_early_if_necessary`)
   > 
   > And in others, we need to poll to end, then `emit all` and use `slice` 
method to split it and return now:
   > 
   > * For example, in the `FinalPartitioned` operator
   > * In the `Partial` operator when memory under the memory limit
   > * other cases...
   > 
   > And in such cases, blocked approach may be effecient for both memory and 
cpu as stated above?
   
   I still don't understand this `FirstBlocks` variant, if partial aggregate 
runs out of memory, output first N blocks and let final aggregation process 
them first looks like won't help much regarding total memory usage (since it's 
high cardinality aggregation)
   Is it related to the spilling implementation? I will check it later.
   
   Also thanks for pushing this forward, I think this approach is promising for 
performance


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to