Rachelint commented on issue #11931:
URL: https://github.com/apache/datafusion/issues/11931#issuecomment-2284775602

   > ## The sketch's detailed design
   > ### 1. When will the blocked method be triggered?
   > 
   >     * It should not be used for streaming aggregation, because streaming depends on 
the exact `EmitTo::First(n)` mode, and that is too expensive to implement in the blocked 
method.
   > 
   >     * The blocked `GroupValues` will be triggered if the 
`GroupValues` impl in use supports it.
   > 
   >     * The blocked `GroupAccumulator` will be triggered only when all the 
`GroupAccumulator`s in use support blocked mode and the `GroupValues` in use supports 
it too.
   > 
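The three rules above can be read as a small decision function. This is a minimal sketch only: `Mode`, `Plan`, `choose_plan`, and the boolean capability flags are hypothetical names made up for illustration, not the API of the sketch PR.

```rust
/// Hypothetical mode marker: `Single` is the existing flat management,
/// `Blocked` is the new block-based management.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Single,
    Blocked,
}

/// Hypothetical result of the decision: one mode per component.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Plan {
    group_values: Mode,
    accumulators: Mode,
}

/// Assumed inputs: whether the aggregation is streaming, whether the
/// `GroupValues` impl supports blocked mode, and one flag per accumulator.
fn choose_plan(
    is_streaming: bool,
    values_support_blocked: bool,
    accumulators_support_blocked: &[bool],
) -> Plan {
    // Rule 1: streaming aggregation needs the exact `First(n)` mode, so it
    // never uses the blocked method.
    // Rule 2: otherwise `GroupValues` goes blocked if its impl supports it.
    let values_blocked = !is_streaming && values_support_blocked;

    // Rule 3: accumulators go blocked only if `GroupValues` is blocked and
    // every accumulator supports it.
    let accumulators_blocked =
        values_blocked && accumulators_support_blocked.iter().all(|s| *s);

    let to_mode = |blocked: bool| if blocked { Mode::Blocked } else { Mode::Single };
    Plan {
        group_values: to_mode(values_blocked),
        accumulators: to_mode(accumulators_blocked),
    }
}
```

Under these assumptions, a single accumulator without blocked support leaves `GroupValues` blocked but forces every accumulator back to `Single`.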
   > 
   > ### 2. Introduce new emit modes used in blocked method
   > 
   > `GroupValues` and `GroupAccumulator`s can now emit multiple blocks:
   > 
   > ```
   > /// Describes how many rows should be emitted during grouping.
   > #[derive(Debug, Clone, Copy)]
   > pub enum EmitTo {
   >     /// Emit all groups
   >     All,
   >     /// Emit only the first `n` groups and shift all existing group
   >     /// indexes down by `n`.
   >     ///
   >     /// For example, if `n=10`, group_index `0, 1, ... 9` are emitted
   >     /// and group indexes `10, 11, 12, ...` become `0, 1, 2, ...`.
   >     First(usize),
   >     /// Emit all groups managed by blocks
   >     AllBlocks,
   >     /// Emit only the first `n` group blocks,
   >     /// similar to `First`, but used in blocked `GroupValues` and
   >     /// `GroupAccumulator`.
   >     ///
   >     /// For example, with `n=3` and `block size=4`, 12 groups will be
   >     /// returned in total.
   >     FirstBlocks(usize),
   > }
   > ```
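To make the arithmetic in the `FirstBlocks` doc comment concrete, here is a small sketch. The `groups_emitted` helper is hypothetical, and clamping to the total number of groups (so that a partially filled last block is allowed) is an assumption rather than something the sketch PR states.

```rust
/// Copy of the enum above, repeated so the example compiles on its own.
#[derive(Debug, Clone, Copy)]
pub enum EmitTo {
    All,
    First(usize),
    AllBlocks,
    FirstBlocks(usize),
}

/// Hypothetical helper: how many groups one emit call returns, given the
/// current number of groups and the block size.
/// ASSUMPTION: results are clamped to `total_groups`.
fn groups_emitted(emit_to: EmitTo, total_groups: usize, block_size: usize) -> usize {
    match emit_to {
        // Both "all" modes drain every group; they differ only in layout.
        EmitTo::All | EmitTo::AllBlocks => total_groups,
        // Flat mode: the first `n` groups.
        EmitTo::First(n) => n.min(total_groups),
        // Blocked mode: the first `n` blocks of `block_size` groups each.
        EmitTo::FirstBlocks(n) => n.saturating_mul(block_size).min(total_groups),
    }
}
```

With `n=3` and a block size of 4, `groups_emitted(EmitTo::FirstBlocks(3), 100, 4)` yields 12, matching the example in the doc comment.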
   > 
   > To allow the blocked method to be developed incrementally across the many 
`GroupValues` and `GroupAccumulator` impls, this sketch PR does a lot of 
compatibility work, and the following combinations are allowed:
   > 
   >     * Single GroupValues + single GroupAccumulator
   > 
   >     * Blocked GroupValues + single GroupAccumulator
   > 
   >     * Blocked GroupValues + blocked GroupAccumulator
   > 
   > 
   > ### 3. Introduce `GroupIndices` to communicate between `GroupValues` 
and `GroupAccumulator`
   > 
   > One problem is how to let a `GroupAccumulator` know whether the group 
indices are flat or blocked. I introduced `GroupIndices` for this, but it 
does lead to an API change for `GroupAccumulator`.
   > 
   > ```
   > pub enum GroupIndices<'a> {
   >     Flat(&'a [u64]),
   >     Blocked(&'a [u64]),
   > }
   > 
   > #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   > pub enum GroupIndicesType {
   >     Flat,
   >     Blocked,
   > }
   > 
   > impl GroupIndicesType {
   >     pub fn typed_group_indices<'a>(&self, indices: &'a [u64]) -> GroupIndices<'a> {
   >         match self {
   >             GroupIndicesType::Flat => GroupIndices::Flat(indices),
   >             GroupIndicesType::Blocked => GroupIndices::Blocked(indices),
   >         }
   >     }
   > }
   > ```
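As an illustration of why an accumulator needs to know which kind of indices it received, the sketch below resolves both variants to a flat slot position. The quoted text does not say how a blocked index is encoded, so the layout used here (block id in the high 32 bits, offset within the block in the low 32 bits) is purely an assumption, and `split_blocked` and `to_flat_positions` are hypothetical helpers.

```rust
/// Copy of the enum above, repeated so the example compiles on its own.
pub enum GroupIndices<'a> {
    Flat(&'a [u64]),
    Blocked(&'a [u64]),
}

/// ASSUMPTION (not stated in the quoted design): a blocked index packs the
/// block id into the high 32 bits and the in-block offset into the low 32.
fn split_blocked(index: u64) -> (usize, usize) {
    ((index >> 32) as usize, (index & 0xFFFF_FFFF) as usize)
}

/// Hypothetical helper: resolve either kind of index to a flat position.
fn to_flat_positions(indices: GroupIndices<'_>, block_size: usize) -> Vec<usize> {
    match indices {
        // Flat indices already are positions.
        GroupIndices::Flat(idx) => idx.iter().map(|&i| i as usize).collect(),
        // Blocked indices must be decoded into (block, offset) first.
        GroupIndices::Blocked(idx) => idx
            .iter()
            .map(|&i| {
                let (block, offset) = split_blocked(i);
                block * block_size + offset
            })
            .collect(),
    }
}
```

The same `&[u64]` slice means different things in the two variants, which is why the type tag has to travel with it from `GroupValues` to `GroupAccumulator`.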
   
   I have finished a draft framework for blocked management; would you mind having a 
quick look? The general design is d


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

