Github user brkyvz commented on a diff in the pull request:
https://github.com/apache/spark/pull/16520#discussion_r95479800
--- Diff: python/pyspark/sql/streaming.py ---
@@ -779,15 +783,20 @@ def start(self, path=None, format=None, partitionBy=None, queryName=None, **opti
        :param path: the path in a Hadoop supported file system
        :param format: the format used to save
-
-            * ``append``: Append contents of this :class:`DataFrame` to existing data.
-            * ``overwrite``: Overwrite existing data.
-            * ``ignore``: Silently ignore this operation if data already exists.
-            * ``error`` (default case): Throw an exception if data already exists.
+        :param outputMode: specifies how data of a streaming DataFrame/Dataset is written to a
+                           streaming sink.
+
+            * `append`: Only the new rows in the streaming DataFrame/Dataset will be written to
+              the sink.
+            * `complete`: All the rows in the streaming DataFrame/Dataset will be written to the
+              sink every time there are some updates.
+            * `update`: Only the rows that were updated in the streaming DataFrame/Dataset will be
+              written to the sink every time there are some updates. If the query doesn't contain
+              aggregations, it will be equivalent to the `append` mode.
--- End diff --
ditto
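To make the three output modes in the docstring above concrete, here is a toy, pure-Python model of their semantics using a running word count as the stateful aggregation. The names (`process_batch`, the returned "sink rows") are illustrative only and are not part of the PySpark API; in real Structured Streaming, `append` on an unwatermarked aggregation is rejected rather than emitting nothing, which this sketch simplifies.

```python
from collections import Counter

def process_batch(state, batch, mode):
    """Fold `batch` into the running counts in `state` and return the
    rows that would be written to the sink under the given output mode.
    Illustrative model only, not the PySpark API."""
    updated = set()
    for word in batch:
        state[word] += 1
        updated.add(word)
    if mode == "complete":
        # complete: the entire result table is written on every trigger
        return dict(state)
    if mode == "update":
        # update: only the rows whose value changed in this trigger
        return {w: state[w] for w in updated}
    if mode == "append":
        # append: only rows that are finalized and will never change;
        # a running aggregation never finalizes rows in this toy model
        return {}
    raise ValueError("unknown output mode: %s" % mode)

state = Counter()
process_batch(state, ["a", "b", "a"], "update")    # {'a': 2, 'b': 1}
process_batch(state, ["b"], "complete")            # {'a': 2, 'b': 2}
```

The point of the model is the contrast: `complete` re-emits everything, `update` emits only changed rows, and `append` emits a row at most once.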