Hi,
I am tracking state in my Spark streaming application with
MapGroupsWithStateFunction, described here:
https://spark.apache.org/docs/2.4.0/api/java/org/apache/spark/sql/streaming/GroupState.html
What are the limiting factors on the number of states a job can track at
the same time? Is it memory? Could it be a bounded data structure in the
internal implementation? Anything else?
Any input on this would be valuable while I set up and test
this.

Thanks,
Arnold
