Re: [External] Re: [E] Re: Choice of Flink vs Spark for using DataSketches with streaming data

2021-04-08 Thread Alex Garland
Thanks Will and Marko. I don’t think we need to decrement/retract values for any reason, and our requirements, were we to use Flink SQL, would not currently involve the OVER syntax. It seems today like we’ve managed to get the DataSketches CPC sketch integrated okay with an aggregate function in

Re: [E] Re: Choice of Flink vs Spark for using DataSketches with streaming data

2021-04-08 Thread Marko Mušnjak
The basic streaming windowed aggregations (in the Java/Scala API, https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/stream/operators/windows.html#aggregatefunction) don't require the retract method, but it looks like the SQL/Table API requires retract support for aggregate
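
To illustrate the add-only shape described above, here is a minimal sketch in Python (chosen for brevity, not the Java/Scala API itself). The method names mirror the four steps of Flink's Java AggregateFunction (createAccumulator, add, getResult, merge); the class name, the aggregate_window helper, and the use of a plain set in place of a CPC sketch are all assumptions made so the example runs without any library.

```python
from typing import Iterable, Set


class DistinctCountAggregate:
    """Add-only distinct-count aggregate (hypothetical name).

    Assumption: a plain set stands in for the CPC sketch, so the
    result here is exact; a real sketch would return an estimate.
    """

    def create_accumulator(self) -> Set[str]:
        # One fresh accumulator per window (or per key and window).
        return set()

    def add(self, value: str, acc: Set[str]) -> Set[str]:
        # Called once per incoming record; values are only ever added.
        acc.add(value)
        return acc

    def merge(self, a: Set[str], b: Set[str]) -> Set[str]:
        # Combines two partial accumulators, e.g. when windows are merged.
        return a | b

    def get_result(self, acc: Set[str]) -> int:
        # Final value emitted for the window.
        return len(acc)


def aggregate_window(agg: DistinctCountAggregate, values: Iterable[str]) -> Set[str]:
    """Hypothetical driver: feeds one window's records through the aggregate."""
    acc = agg.create_accumulator()
    for value in values:
        acc = agg.add(value, acc)
    return acc


agg = DistinctCountAggregate()
w1 = aggregate_window(agg, ["a", "b", "a"])
w2 = aggregate_window(agg, ["b", "c"])
print(agg.get_result(w1))                 # 2
print(agg.get_result(agg.merge(w1, w2)))  # 3
```

There is deliberately no retract step here. The accumulator does not record how many times a value was added, so removing "a" once after it was added twice could not be done correctly; as far as I understand, a CPC sketch has the same limitation, which is why the add-only window API is the easier fit.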

Re: [E] Re: Choice of Flink vs Spark for using DataSketches with streaming data

2021-04-08 Thread Will Lauer
Last time I looked at the Flink API for implementing aggregators, it looked like it required a "decrement" function to remove entries from the aggregate in addition to the standard "aggregate" function to add entries to the aggregate. The documentation was unclear, but it looked like this was a