Can someone succinctly describe the challenge in adding the
`mapGroupsWithState()` API to PySpark?

I was hoping that some suboptimal but nonetheless working solution would be
available in Python, as there is with Python UDFs for example, but that
doesn't seem to be the case. The JIRA ticket for arbitrary stateful operations
in Structured Streaming <https://issues.apache.org/jira/browse/SPARK-19067>
doesn't give any indication that a Python version of the API is coming.

Is this something that will likely be added in the near future, or is it a
major undertaking? Can someone briefly describe the problem?

Nick
