GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/21946

     [SPARK-24990][SQL] merge ReadSupport and ReadSupportWithSchema

    ## What changes were proposed in this pull request?
    
    Regarding a user-specified schema, data sources may have 3 different
    behaviors:
    1. must have a user-specified schema
    2. can't have a user-specified schema
    3. can accept the user-specified schema if it's given, or infer the schema
       otherwise.
    
    I added `ReadSupportWithSchema` to support these behaviors, following data
    source v1. But it turns out we don't need this extra interface: we can just
    add a `createReader(schema, options)` method to `ReadSupport` and make it
    call `createReader(options)` by default.
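
    A minimal sketch of the merged interface described above, written against
    stand-in types so it compiles without Spark. The `Reader` interface, the
    `List<String>` schema, the `Map<String, String>` options and the three
    example sources are assumptions made for illustration; they are not the
    actual `DataSourceReader`, `StructType` or `DataSourceOptions` types, and
    the real signatures may differ.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a reader: it only reports the schema it reads with.
interface Reader {
    List<String> readSchema();
}

// Sketch of the merged interface: one required method plus a default overload.
interface ReadSupport {
    // Called when the user gives no schema.
    Reader createReader(Map<String, String> options);

    // Called when the user gives a schema. As described in the PR text,
    // the default simply delegates to createReader(options).
    default Reader createReader(List<String> schema, Map<String, String> options) {
        return createReader(options);
    }
}

// Behavior 3: use the user-specified schema if given, otherwise infer one.
final class InferringSource implements ReadSupport {
    @Override
    public Reader createReader(Map<String, String> options) {
        List<String> inferred = Collections.singletonList("value");
        return () -> inferred;
    }

    @Override
    public Reader createReader(List<String> schema, Map<String, String> options) {
        return () -> schema;
    }
}

// Behavior 1: a user-specified schema is mandatory.
final class SchemaRequiredSource implements ReadSupport {
    @Override
    public Reader createReader(Map<String, String> options) {
        throw new UnsupportedOperationException("a user-specified schema is required");
    }

    @Override
    public Reader createReader(List<String> schema, Map<String, String> options) {
        return () -> schema;
    }
}

// Overrides only the required method, so the schema overload uses the default.
final class DefaultOnlySource implements ReadSupport {
    @Override
    public Reader createReader(Map<String, String> options) {
        return () -> Collections.singletonList("value");
    }
}
```

    In this sketch, `DefaultOnlySource` ignores a user-specified schema because
    the default overload delegates to `createReader(options)`. A source that
    must reject a user-specified schema (behavior 2) would have to override
    the two-argument method and throw there.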
    
    GitHub currently has a problem syncing with the Apache git repo; please
    review the second commit.
    
    TODO: also fix the streaming API in followup PRs.
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark ds-schema

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21946.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21946
    
----
commit defc54c69aadc510c6f77e13e57f003646c461bc
Author: Wenchen Fan <wenchen@...>
Date:   2018-08-01T13:39:35Z

    [SPARK-24971][SQL] remove SupportsDeprecatedScanRow
    
    ## What changes were proposed in this pull request?
    
    This is a follow-up to https://github.com/apache/spark/pull/21118.
    
    In https://github.com/apache/spark/pull/21118 we added
    `SupportsDeprecatedScanRow`. Ideally, data sources should produce
    `InternalRow` instead of `Row` for better performance. We should remove
    `SupportsDeprecatedScanRow` and encourage data sources to produce
    `InternalRow`, which is also very easy to build.
    
    ## How was this patch tested?
    
    existing tests.
    
    Author: Wenchen Fan <wenc...@databricks.com>
    
    Closes #21921 from cloud-fan/row.

commit 19808d500a869114d84383f23056483316e52a33
Author: Wenchen Fan <wenchen@...>
Date:   2018-07-31T18:06:16Z

    merge ReadSupport and ReadSupportWithSchema

----


