Following what Ryan did for HLL sketches, I would also add an aggregate
counterpart to the binary union expression.

The expressions that Ryan added are:
hll_sketch_agg
hll_union
hll_union_agg
hll_sketch_estimate

Following the same naming convention, I would probably go for:

theta_sketch_agg(sketch_col)
theta_union(sketch1, sketch2)
theta_union_agg(sketch_col)
theta_difference(sketch1, sketch2)
theta_intersection(sketch1, sketch2)
theta_intersection_agg(sketch_col)
theta_sketch_estimate(sketch)
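
To make the intended semantics concrete, here is a rough Python sketch that
uses exact Python sets as stand-ins for Theta sketches. This is not Spark code
and not the DataSketches API; the function names only mirror the proposal
above, and the sample data (day1, day2, day3) is made up. Real sketches would
return approximate estimates rather than exact counts.

```python
from functools import reduce

# Assumption: an exact frozenset stands in for a Theta sketch.
# Real Theta sketches are approximate; this only models the set semantics.

def theta_sketch_agg(values):
    """Aggregate: build one 'sketch' from a column of raw values."""
    return frozenset(values)

def theta_union(sketch1, sketch2):
    """Binary: items present in either sketch."""
    return sketch1 | sketch2

def theta_intersection(sketch1, sketch2):
    """Binary: items present in both sketches."""
    return sketch1 & sketch2

def theta_difference(sketch1, sketch2):
    """Binary: items in sketch1 that are not in sketch2."""
    return sketch1 - sketch2

def theta_union_agg(sketches):
    """Aggregate: union of a whole column of sketches."""
    return reduce(theta_union, sketches, frozenset())

def theta_intersection_agg(sketches):
    """Aggregate: intersection of a whole column of sketches."""
    sketches = list(sketches)
    if not sketches:
        return frozenset()
    return reduce(theta_intersection, sketches)

def theta_sketch_estimate(sketch):
    """Distinct count (exact here, approximate for a real sketch)."""
    return len(sketch)

# Hypothetical data: one sketch of user ids per day.
day1 = theta_sketch_agg(["a", "b", "c"])
day2 = theta_sketch_agg(["b", "c", "d"])
day3 = theta_sketch_agg(["c", "d", "e"])

seen_any_day = theta_sketch_estimate(theta_union_agg([day1, day2, day3]))
seen_every_day = theta_sketch_estimate(theta_intersection_agg([day1, day2, day3]))
only_day1_vs_day2 = theta_sketch_estimate(theta_difference(day1, day2))
```

The aggregate variants are just the binary operations folded over a column of
sketches, which is why they pair naturally with theta_union and
theta_intersection.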

I do not think an aggregate expression for differences would make sense:
difference is order-dependent (A minus B is not B minus A), so it is not
obvious to me what it would mean over a group of rows.
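
One way to see the concern, again with exact Python sets standing in for
sketches (an assumption for illustration, with made-up rows): folding union or
intersection over a group gives the same answer for every row order, while
folding difference does not, and row order within a group is generally not
guaranteed in an aggregate.

```python
from functools import reduce
from itertools import permutations

# Assumption: frozensets stand in for Theta sketches; rows are hypothetical.
rows = [frozenset("abc"), frozenset("bcd"), frozenset("cde")]

def fold(op, sketches):
    """Apply a binary set operation left to right over a sequence of sketches."""
    return reduce(op, sketches)

# Collect the distinct results obtained over every possible row order.
union_results = {fold(lambda x, y: x | y, p) for p in permutations(rows)}
intersection_results = {fold(lambda x, y: x & y, p) for p in permutations(rows)}
difference_results = {fold(lambda x, y: x - y, p) for p in permutations(rows)}

print(len(union_results), len(intersection_results), len(difference_results))
```

With these rows, union and intersection each produce a single result, while
difference produces a different result depending on which row comes first.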

What do you think?

- Menelaos


> On Jun 3, 2025, at 4:27 PM, Boumalhab, Chris <cboum...@amazon.com> wrote:
> 
> I think something like this could work:
> theta_sketch_agg(col) to build the sketch
> theta_sketch_union(sketch1, sketch2) to union the sketches
> theta_sketch_estimate(sketch) or theta_sketch_estimate_count(sketch) to 
> estimate count
> …
>  
> Something similar can be done for tuple support.
>  
> Let me know what you think.
>  
> Chris
