Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r208335523
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/ContinuousReadSupportProvider.java ---
@@ -20,17 +20,30 @@
import java.util.Optional;
import org.apache.spark.annotation.InterfaceStability;
-import org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader;
+import org.apache.spark.sql.sources.v2.reader.ScanConfig;
+import org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReadSupport;
+import org.apache.spark.sql.sources.v2.reader.streaming.Offset;
import org.apache.spark.sql.types.StructType;
/**
* A mix-in interface for {@link DataSourceV2}. Data sources can implement this interface to
- * provide data reading ability for continuous stream processing.
+ * provide data reading ability for stream processing(continuous mode).
*/
@InterfaceStability.Evolving
-public interface ContinuousReadSupport extends DataSourceV2 {
+public interface ContinuousReadSupportProvider extends DataSourceV2 {
+
/**
- * Creates a {@link ContinuousReader} to scan the data from this data source.
+ * Creates a {@link ContinuousReadSupport} to scan the data from this streaming data source.
+ *
+ * The execution engine will create a {@link ContinuousReadSupport} at the start of a streaming
+ * query, alternate calls to {@link ContinuousReadSupport#newScanConfigBuilder(Offset)}
--- End diff --
I think this should be clearer about normal operation and reconfiguration. It
should say that Spark will call `newScanConfigBuilder` and will use that
`ReadSupport` instance for the duration of the streaming app, or until
`needsReconfiguration` returns true. Reconfiguration starts over by calling
`newScanConfigBuilder` again. That would make the contract clearer for
implementers.
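Roughly, the lifecycle I'd document looks like this. This is a sketch only, not
actual Spark code; `provider`, `checkpointLocation`, `options`, `queryActive()`,
`runEpoch()`, and `latestOffset()` are hypothetical placeholders standing in for
engine internals:

```java
// Sketch of the suggested contract (illustrative, not real engine code).
ContinuousReadSupport readSupport =
    provider.createContinuousReadSupport(checkpointLocation, options);
Offset start = readSupport.initialOffset();
while (queryActive()) {
  // Normal operation: build one ScanConfig and keep using it for every epoch...
  ScanConfig config = readSupport.newScanConfigBuilder(start).build();
  while (queryActive() && !readSupport.needsReconfiguration(config)) {
    runEpoch(config);  // hypothetical: run the query with the current config
  }
  // ...until needsReconfiguration returns true; then Spark starts over by
  // calling newScanConfigBuilder again from the latest committed offset.
  start = latestOffset();  // hypothetical: fetch the latest committed offset
}
```

Spelling that out in the javadoc would make it obvious to implementers when
`newScanConfigBuilder` is called and how long a `ScanConfig` stays live.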
---