I can't speak to Spark or Flink, but as a Druid person, I'll put in a
plug for Druid for the "if necessary" case.  It can ingest from Kafka,
aggregate, and build sketches during ingestion.  (It's a whole new
ballpark, though, if you're not already using it.)
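
To make "aggregate and build sketches during ingestion" a bit more concrete, here is a rough, engine-agnostic sketch of the idea in plain Python. It is not Druid's implementation and not the DataSketches API: KmvSketch is a toy K-Minimum-Values distinct counter I'm using as a stand-in, and the names sketch_by_window and WINDOW_SECONDS are made up for illustration. The point is only that each 5-minute window gets its own small sketch as events arrive, and that sketches from different windows can be merged later.

```python
import hashlib
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute tumbling windows (assumed window size)


class KmvSketch:
    """Toy K-Minimum-Values distinct-count sketch (illustration only)."""

    def __init__(self, k=256):
        self.k = k
        self.hashes = set()  # the k smallest hash values seen so far

    @staticmethod
    def _hash(value):
        # Map a value to a float in [0, 1) using a stable 64-bit hash.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def _trim(self):
        # Keep only the k smallest hashes.
        if len(self.hashes) > self.k:
            self.hashes = set(sorted(self.hashes)[: self.k])

    def update(self, value):
        self.hashes.add(self._hash(value))
        self._trim()

    def merge(self, other):
        # Union of two sketches: pool the hashes, keep the k smallest.
        merged = KmvSketch(self.k)
        merged.hashes = self.hashes | other.hashes
        merged._trim()
        return merged

    def estimate(self):
        if len(self.hashes) < self.k:
            return float(len(self.hashes))  # exact below k distinct values
        return (self.k - 1) / max(self.hashes)


def sketch_by_window(events, k=256):
    """events: iterable of (epoch_seconds, value) -> {window_start: sketch}."""
    windows = defaultdict(lambda: KmvSketch(k))
    for ts, value in events:
        window_start = ts - (ts % WINDOW_SECONDS)  # floor to the window
        windows[window_start].update(value)
    return dict(windows)
```

For example, sketch_by_window([(0, "a"), (10, "b"), (300, "c")]) yields one sketch for the window starting at 0 and one for the window starting at 300, and merging the two gives a distinct-count estimate over the full ten minutes without revisiting the raw events.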

On Tue, Apr 6, 2021 at 9:56 AM Alex Garland <agarl...@expediagroup.com>
wrote:

> Hi
>
>
>
> New to DataSketches and looking forward to using it; it seems like a great
> library.
>
>
>
> My team are evaluating it to profile streaming data (in Kafka) in 5-minute
> windows. The obvious options for stream processing (given experience within
> our org) would be either Flink or Spark Streaming.
>
>
>
> Two questions:
>
>    - Would I be right in thinking that there are not existing
>    integrations as libraries for either of these platforms? Absolutely fine if
>    not, just confirming understanding.
>    - Is there any view (from either the maintainers or the wider
>    community) on whether either of those two are easier to integrate with
>    DataSketches? We would also consider other streaming platforms if
>    necessary, but as mentioned wider usage within the org would lean us
>    against that if at all possible.
>
>
>
> Many thanks
>
