Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/21477#discussion_r193631566
--- Diff: python/pyspark/sql/streaming.py ---
@@ -843,6 +844,169 @@ def trigger(self, processingTime=None, once=None, continuous=None):
        self._jwrite = self._jwrite.trigger(jTrigger)
        return self
+    def foreach(self, f):
+        """
+        Sets the output of the streaming query to be processed using the provided writer ``f``.
+        This is often used to write the output of a streaming query to arbitrary storage systems.
+        The processing logic can be specified in two ways.
+
+        #. A **function** that takes a row as input.
--- End diff ---
This is a superset of what we support in Scala. Python users are more likely
to use simple lambdas instead of defining classes, but they may also want to
write transactional logic in Python with open and close methods. Hence
providing both alternatives seems like a good idea.
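To illustrate, here is a minimal sketch of the two alternatives (a hedged
example, not the exact code from this PR: the ``rate`` source and the method
bodies are placeholders; the writer contract with a required ``process`` and
optional ``open``/``close`` follows this PR's docstring):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.readStream.format("rate").load()  # placeholder streaming source

    # Alternative 1: a plain function (or lambda), called once per row.
    df.writeStream.foreach(lambda row: print(row)).start()

    # Alternative 2: an object with open/process/close methods, which allows
    # per-partition setup and teardown, e.g. for transactional writes.
    class RowWriter:
        def open(self, partition_id, epoch_id):
            # Open a connection/transaction; return True to process this partition.
            return True

        def process(self, row):
            # Write the row to the external system.
            print(row)

        def close(self, error):
            # Commit or roll back and release resources; ``error`` is the
            # exception raised during processing, or None if it succeeded.
            pass

    df.writeStream.foreach(RowWriter()).start()

The function form keeps simple sinks to a one-liner, while the object form
provides the open/close hooks needed for transactional writes.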