Flink and sketches

Flavio Pompermaier Thu, 21 Mar 2019 09:43:34 -0700

Hi to all,
I am looking for approximate count (approx_count) and frequent items
(freq_item) functions in Flink, and I'm not sure which road to follow.
So far I have found two viable options:


   1. Wait for STREAMLINE to release their HLL_DISTINCT_COUNT code [1]
   2. Use the Yahoo DataSketches library [2], following the example of
   Tobias Lindener [3][4] (and maybe release a better, reusable
   third-party library for Flink)

What would you advise? Is there any other ongoing effort on approximate
statistics?

Best,
Flavio

[1] https://h2020-streamline-project.eu/wp-content/uploads/2018/10/Streamline-D5.5-Final.pdf
[2] https://datasketches.github.io
[3] https://github.com/tlindener/ApproximateQueries/
[4] https://www.slideshare.net/SeattleApacheFlinkMeetup/approximate-queries-and-graph-streams-on-apache-flink-theodore-vasiloudis-seattle-apache-flink-meetup
