arunmahadevan commented on a change in pull request #23430: [SPARK-26520][SQL]
data source v2 API refactor (micro-batch read)
URL: https://github.com/apache/spark/pull/23430#discussion_r245091849
##########
File path:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Scan.java
##########
@@ -65,4 +67,20 @@ default String description() {
default Batch toBatch() {
throw new UnsupportedOperationException("Batch scans are not supported");
}
+
+  /**
+   * Returns the physical representation of this scan for a streaming query in
+   * micro-batch mode. By default this method throws an exception; data sources
+   * must override this method to provide an implementation, if the
+   * {@link Table} that creates this scan implements
+   * {@link SupportsMicroBatchRead}.
+   *
+   * @param checkpointLocation a path to Hadoop FS scratch space that can be
+   *                           used for failure recovery. Data streams for the
+   *                           same logical source in the same query will be
+   *                           given the same checkpointLocation.
+   *
+   * @throws UnsupportedOperationException
+   */
+ default MicroBatchStream toMicroBatchStream(String checkpointLocation) {
Review comment:
In "alternative1" there is no equivalent logical `Scan`? I was thinking we
need the `Scan` (the logical scan) kept separate from the physical scans.
Also, if they don't inherit from a common parent, can they be passed to
`DatasourceV2ScanExec`?
In any case, it would be better to revisit and rename as appropriate, so that
the different ones (batch/micro-batch/continuous) share common prefixes or
suffixes that make their meaning clear.
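
To illustrate the pattern the Javadoc above describes, here is a minimal,
self-contained sketch in plain Java. The interfaces (`Scan`,
`MicroBatchStream`) and the class `ExampleScan` are simplified stand-ins
mirroring the names in this PR, not the real Spark API: the default
`toMicroBatchStream` throws, and a source whose `Table` supports micro-batch
reads overrides it.

```java
// Hypothetical, simplified mirror of the interfaces discussed in this PR.
// Names follow the diff above but this is NOT the actual Spark code.
interface MicroBatchStream {
    String checkpointLocation();
}

interface Scan {
    default String description() {
        return this.getClass().getName();
    }

    // Mirrors the proposed default: sources that do not support
    // micro-batch reads inherit this and throw.
    default MicroBatchStream toMicroBatchStream(String checkpointLocation) {
        throw new UnsupportedOperationException(
            "Micro-batch scans are not supported");
    }
}

// A source whose Table implements SupportsMicroBatchRead would override
// toMicroBatchStream to return its streaming representation.
class ExampleScan implements Scan {
    @Override
    public MicroBatchStream toMicroBatchStream(String checkpointLocation) {
        // Lambda satisfies the one-method MicroBatchStream interface.
        return () -> checkpointLocation;
    }
}
```

A caller would then do `new ExampleScan().toMicroBatchStream(path)` and get a
stream, while a scan that keeps the default would throw
`UnsupportedOperationException`, matching the documented contract.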
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]