GitHub user mgrenonville closed a discussion: Expose intermediary states in aggregation functions
Hello,

While looking at DataFusion (what an awesome project!), I wondered whether it is possible to expose intermediary aggregation states (i.e., the state before merge_batch) to support what ClickHouse calls the ["-Merge", "-State", "-MergeState"](https://clickhouse.com/docs/sql-reference/aggregate-functions/combinators#-state) combinators. These let ClickHouse persist pre-aggregated data keyed by the grouping key, which compresses the data without losing the ability to filter it.

For example, uniqState returns a statistical structure (a kind of sketch) that can be merged later, at query time. With this, it is easy to store a uniqState per minute and query uniqMerge per hour.

Thanks

GitHub link: https://github.com/apache/datafusion/discussions/16239
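To make the per-minute / per-hour idea concrete, here is a minimal sketch in plain Rust. It does not use DataFusion or ClickHouse APIs: `UniqState`, `uniq_merge`, and `uniq_by_hour` are hypothetical names chosen for illustration, and an exact `HashSet` stands in for the approximate structure a real engine would store.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical intermediate state for a uniq-style aggregate.
/// Assumption: an exact HashSet replaces the approximate sketch
/// a real engine would persist.
#[derive(Clone, Debug, Default)]
pub struct UniqState {
    seen: HashSet<u64>,
}

impl UniqState {
    /// Roughly "-State": fold raw values into a storable state.
    pub fn from_values(values: &[u64]) -> Self {
        Self {
            seen: values.iter().copied().collect(),
        }
    }

    /// Roughly "-MergeState": absorb another state without finalizing.
    pub fn merge(&mut self, other: &UniqState) {
        self.seen.extend(other.seen.iter().copied());
    }

    /// Turn a state into the final aggregate value.
    pub fn finalize(&self) -> usize {
        self.seen.len()
    }
}

/// Roughly "-Merge": merge many stored states, then finalize.
pub fn uniq_merge<'a>(states: impl IntoIterator<Item = &'a UniqState>) -> usize {
    let mut acc = UniqState::default();
    for state in states {
        acc.merge(state);
    }
    acc.finalize()
}

/// Roll per-minute states (keyed by minute number) up to per-hour counts.
pub fn uniq_by_hour(per_minute: &BTreeMap<u64, UniqState>) -> BTreeMap<u64, usize> {
    let mut per_hour: BTreeMap<u64, UniqState> = BTreeMap::new();
    for (minute, state) in per_minute {
        per_hour.entry(minute / 60).or_default().merge(state);
    }
    per_hour
        .into_iter()
        .map(|(hour, state)| (hour, state.finalize()))
        .collect()
}
```

The point of the sketch is that the stored per-minute value is the state itself, not the finalized count, so coarser groupings can be computed later without going back to the raw rows.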
