Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21477#discussion_r193632883
--- Diff: python/pyspark/sql/streaming.py ---
@@ -843,6 +844,169 @@ def trigger(self, processingTime=None, once=None, continuous=None):
             self._jwrite = self._jwrite.trigger(jTrigger)
             return self
    +    def foreach(self, f):
    +        """
    +        Sets the output of the streaming query to be processed using the provided writer ``f``.
    +        This is often used to write the output of a streaming query to arbitrary storage systems.
    +        The processing logic can be specified in two ways.
    +
    +        #. A **function** that takes a row as input.
--- End diff --
(Including the response to
https://github.com/apache/spark/pull/21477#discussion_r193631209.) I kind of
agree that it's a fine idea, but I think we have usually provided consistent
API support across languages so far, unless a feature is language-specific,
for example context managers and decorators in Python.
Just for clarification, does the Scala side support the function-only form too?
Also, I know the attribute-checking approach is more "Pythonic", but I see
that the documentation has already diverged between Scala and Python. On the
other hand, that divergence adds maintenance overhead.
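
To make the two styles under discussion concrete, here is a minimal sketch of
attribute-checking dispatch. It is not the code in this PR: the helper name
`resolve_row_handler`, the `ListWriter` class, and the assumption that the
writer object exposes a `process` method are all hypothetical and used only
for illustration.

```python
def resolve_row_handler(f):
    """Hypothetical helper: return a callable that handles one row.

    Accepts either an object exposing a callable ``process`` attribute
    (assumed method name) or a plain function that takes a row.
    """
    process = getattr(f, "process", None)
    if process is not None:
        # Object style: verify the attribute is usable before accepting it.
        if not callable(process):
            raise TypeError("'process' attribute must be callable")
        return process
    if callable(f):
        # Function style: the function itself handles each row.
        return f
    raise TypeError("expected a function or an object with a 'process' method")


class ListWriter:
    """Hypothetical writer object that collects rows in memory."""

    def __init__(self):
        self.rows = []

    def process(self, row):
        self.rows.append(row)


# Function style
seen = []
resolve_row_handler(seen.append)({"id": 1})

# Object style
writer = ListWriter()
resolve_row_handler(writer)({"id": 2})
```

The function-only form is the part that may have no direct Scala counterpart,
which is where the documentation for the two languages would start to differ.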