alamb commented on issue #6906: URL: https://github.com/apache/datafusion/issues/6906#issuecomment-2355604402
## Background

(I will make a PR shortly to add this to the actual datafusion docs)

[`GroupsAccumulator`](https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.GroupsAccumulator.html) logically does this:

```
    ┌─────┐
    │  0  │───────────▶   "A"
    ├─────┤
    │  1  │───────────▶   "Z"
    └─────┘
      ...                 ...
    ┌─────┐
    │ N-2 │               "A"
    ├─────┤
    │ N-1 │───────────▶   "Q"
    └─────┘

  Logical group         Current Min/Max
     number             value for that group

GroupsAccumulator to store N aggregate values: logically keeps
a mapping from each group index to the current value
```

Today, String / Binary min/max values are implemented using [`GroupsAccumulatorAdapter`](https://docs.rs/datafusion/latest/datafusion/physical_expr/struct.GroupsAccumulatorAdapter.html), which results in:

```
                                                    Individual String
                                                  (separate allocation)
    ┌─────┐            ┌──────────────────────────┐
    │  0  │───────────▶│ ScalarValue::Utf8("A")   │──────────▶ "A"
    ├─────┤            ├──────────────────────────┤
    │  1  │───────────▶│ ScalarValue::Utf8("Z")   │──────────▶ "Z"
    └─────┘            └──────────────────────────┘
      ...                         ...                          ...
    ┌─────┐            ┌──────────────────────────┐
    │ N-2 │            │ ScalarValue::Utf8("A")   │──────────▶ "A"
    ├─────┤            ├──────────────────────────┤
    │ N-1 │───────────▶│ ScalarValue::Utf8("Q")   │──────────▶ "Q"
    └─────┘            └──────────────────────────┘

  Logical group         Current Min/Max value for that group, stored
     number             as a ScalarValue which points to an
                        individually allocated String

How GroupsAccumulatorAdapter works today: stores each current
min/max as a ScalarValue
```
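
To make the logical mapping above concrete, here is a minimal sketch in plain Rust. It is not DataFusion code: `MinStringPerGroup` and its simplified `update_batch` signature are hypothetical names chosen for illustration, and the real `GroupsAccumulator` trait works on Arrow arrays rather than `&str` slices. It keeps one slot per group index and, like the adapter approach described above, stores each current minimum as its own separately allocated `String`.

```rust
/// Hypothetical, simplified stand-in for a "min" accumulator over strings.
/// Slot `i` holds the current minimum for group `i`, or `None` if that group
/// has not seen a value yet. Every `String` is its own heap allocation.
struct MinStringPerGroup {
    mins: Vec<Option<String>>,
}

impl MinStringPerGroup {
    fn new() -> Self {
        Self { mins: Vec::new() }
    }

    /// `group_indices[row]` is the group that `values[row]` belongs to.
    /// (Assumed, simplified signature; not the DataFusion trait method.)
    fn update_batch(&mut self, values: &[&str], group_indices: &[usize], total_num_groups: usize) {
        assert_eq!(values.len(), group_indices.len());

        // Grow the state so every group index seen so far has a slot.
        if self.mins.len() < total_num_groups {
            self.mins.resize(total_num_groups, None);
        }

        for (value, &group) in values.iter().zip(group_indices) {
            let slot = &mut self.mins[group];
            // Replace the slot if it is empty or the new value is smaller.
            let replace = match slot.as_deref() {
                None => true,
                Some(current) => *value < current,
            };
            if replace {
                // One new allocation per replaced minimum.
                *slot = Some(value.to_string());
            }
        }
    }
}

fn main() {
    let mut acc = MinStringPerGroup::new();
    // Rows 0, 1 and 3 belong to group 0; row 2 belongs to group 1.
    acc.update_batch(&["Q", "A", "Z", "B"], &[0, 0, 1, 0], 2);
    acc.update_batch(&["Y"], &[1], 2);
    println!("{:?}", acc.mins);
}
```

With N groups this layout holds up to N small, independent heap allocations, which is the per-group cost the second diagram illustrates.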