Github user HeartSaVioR commented on a diff in the pull request:
https://github.com/apache/spark/pull/21477#discussion_r193289567
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/ForeachWriter.scala ---
@@ -20,10 +20,48 @@ package org.apache.spark.sql
import org.apache.spark.annotation.InterfaceStability
/**
- * A class to consume data generated by a `StreamingQuery`. Typically this is used to send the
- * generated data to external systems. Each partition will use a new deserialized instance, so you
- * usually should do all the initialization (e.g. opening a connection or initiating a transaction)
- * in the `open` method.
+ * The abstract class for writing custom logic to process data generated by a query.
+ * This is often used to write the output of a streaming query to arbitrary storage systems.
--- End diff --
It looks like the doc is duplicated between `foreach()` and `ForeachWriter`.
I'm not sure how we can leave a reference in the Python doc instead of
duplicating the content, but even if the Python doc doesn't support that kind
of reference, some parts of the content seem fine to place in one or the
other, not both.
---