I'm an absolute beginner with Apache Spark. I'm playing around with
Spark Streaming (version 1.5.1) in Java and want to implement a
prototype that uses HyperLogLog to estimate the number of distinct
elements. I use stream-lib from Clearspring (https://github.com/addthis/stream-lib).
I planned
I solved the problem by passing the HLL object to the function, updating it
and returning it as the new state. Obviously I just had a mental block... ;-)
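
For anyone finding this thread later, here is a rough sketch of that pattern (pass the sketch object in as the previous state, update it, return it as the new state). It is plain Java without Spark so it runs on its own. The class DistinctCounter is a hypothetical stand-in for stream-lib's HyperLogLog and counts exactly with a HashSet, and java.util.Optional stands in for the Optional type that the Spark 1.x Java API uses. The update method only mirrors the (new values, previous state) -> new state shape of the function given to updateStateByKey; it is not the code from my prototype.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class StateUpdateSketch {

    // Hypothetical stand-in for a cardinality sketch such as stream-lib's HyperLogLog.
    // It counts exactly with a HashSet; a real HLL estimates the count in bounded memory.
    static final class DistinctCounter {
        private final Set<String> seen = new HashSet<>();

        void offer(String value) {
            seen.add(value);
        }

        long cardinality() {
            return seen.size();
        }
    }

    // Same shape as a per-key state update:
    // (values that arrived in this batch, previous state) -> new state.
    static Optional<DistinctCounter> update(List<String> newValues,
                                            Optional<DistinctCounter> state) {
        // Reuse the counter from the previous batch, or create one for the first batch.
        DistinctCounter counter = state.orElseGet(DistinctCounter::new);
        for (String value : newValues) {
            counter.offer(value);
        }
        // Hand the updated counter back so it becomes the state for the next batch.
        return Optional.of(counter);
    }

    public static void main(String[] args) {
        Optional<DistinctCounter> state = Optional.empty();
        state = update(Arrays.asList("a", "b", "a"), state); // first batch
        state = update(Arrays.asList("b", "c"), state);      // second batch
        System.out.println(state.get().cardinality());       // prints 3
    }
}
```

With the real library, the counter would be a HyperLogLog and the result an estimate rather than an exact count.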
I want to work with the Kafka integration for structured streaming. I use
Spark version 2.0.0 and start the spark-shell with:
spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.0.0
As described here: