HeartSaVioR commented on a change in pull request #29767:
URL: https://github.com/apache/spark/pull/29767#discussion_r500188785
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala
##########
@@ -457,6 +470,17 @@ final class DataStreamWriter[T] private[sql](ds: Dataset[T]) {
     foreachBatch((batchDs: Dataset[T], batchId: Long) => function.call(batchDs, batchId))
   }
+ /**
+ * Specifies the underlying output table.
+ *
+ * @since 3.1.0
+ */
+ def table(tableName: String): DataStreamWriter[T] = {
Review comment:
I have a slightly different view on DataStreamWriter (and probably
DataFrameWriter as well):
While we don't restrict the order, I think it's quite natural to have a flow
like `define a sink` -> `set options on the sink` -> `set options on the
streaming query` -> `start the query` (a couple of the parts can be
consolidated, or the sequence can be swapped):
```
df.writeStream
.format("...")
.option("...")
.outputMode(...)
.trigger(...)
.start()
```
Right now it looks fairly arbitrary and things have gotten mixed up:
`checkpointLocation` isn't tied to the sink, yet we let end users pass it
through `option`, which is also used for sink options. The same goes for
`queryName`.
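For example, sink options and query-level settings currently end up going
through the same `option()` call (the paths and query name below are just
placeholders):
```
df.writeStream
  .format("parquet")
  .option("path", "/tmp/out")                 // sink-specific option
  .option("checkpointLocation", "/tmp/chk")   // query-level setting, passed the same way
  .option("queryName", "my_query")            // likewise query-level
  .outputMode("append")
  .start()
```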
I intended the addition of the `table` method as `defining a sink`, but if
we'd like to treat tables specially, `DataFrameWriter.insertInto` would match
that intention and I can change the method name to `insertInto` here as well.
WDYT?
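For context, this is roughly how I read `table(...)` as `defining a sink`
(the table name and checkpoint path below are made up):
```
df.writeStream
  .table("catalog.db.events")                 // define the sink: the target table
  .option("checkpointLocation", "/tmp/chk")   // query-level setting
  .outputMode("append")
  .start()
```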