GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/21946

[SPARK-24990][SQL] merge ReadSupport and ReadSupportWithSchema

## What changes were proposed in this pull request?

Regarding user-specified schemas, a data source may have one of 3 different behaviors:
1. It must have a user-specified schema.
2. It can't have a user-specified schema.
3. It can accept a user-specified schema if one is given, or infer the schema otherwise.

I added `ReadSupportWithSchema` to support these behaviors, following data source v1. But it turns out we don't need this extra interface. We can just add a `createReader(schema, options)` method to `ReadSupport` and make it call `createReader(options)` by default.

GitHub currently has a problem syncing with the Apache git repo; please review the second commit.

TODO: also fix the streaming API in follow-up PRs.

## How was this patch tested?

Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark ds-schema

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21946.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21946

----
commit defc54c69aadc510c6f77e13e57f003646c461bc
Author: Wenchen Fan <wenchen@...>
Date:   2018-08-01T13:39:35Z

    [SPARK-24971][SQL] remove SupportsDeprecatedScanRow

    ## What changes were proposed in this pull request?

    This is a follow-up of https://github.com/apache/spark/pull/21118 .

    In https://github.com/apache/spark/pull/21118 we added `SupportsDeprecatedScanRow`. Ideally, data sources should produce `InternalRow` instead of `Row` for better performance. We should remove `SupportsDeprecatedScanRow` and encourage data sources to produce `InternalRow`, which is also very easy to build.

    ## How was this patch tested?

    Existing tests.

    Author: Wenchen Fan <wenc...@databricks.com>

    Closes #21921 from cloud-fan/row.

commit 19808d500a869114d84383f23056483316e52a33
Author: Wenchen Fan <wenchen@...>
Date:   2018-07-31T18:06:16Z

    merge ReadSupport and ReadSupportWithSchema

----
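
To illustrate the interface change proposed in the pull request description, here is a minimal Java sketch of how a single `ReadSupport` interface with a default `createReader(schema, options)` overload could cover all three schema behaviors. Everything in it is a simplified assumption rather than Spark's real API: `Reader`, the `List<String>` schema, and the `Map<String, String>` options are hypothetical stand-ins for Spark's `DataSourceReader`, `StructType`, and `DataSourceOptions`, and the three source classes are invented for the example. The default method follows the wording of the description (it delegates to `createReader(options)`); the code actually merged into Spark may behave differently, for instance by rejecting the schema.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class ReadSupportSketch {
    /** Hypothetical stand-in for a reader: it only reports the schema it reads with. */
    interface Reader {
        List<String> readSchema();
    }

    /** Simplified model of the merged interface described in the PR text. */
    interface ReadSupport {
        Reader createReader(Map<String, String> options);

        // Default as described in the PR text: fall back to the schema-less overload.
        // Note that in this sketch the user-specified schema is silently ignored.
        default Reader createReader(List<String> schema, Map<String, String> options) {
            return createReader(options);
        }
    }

    /** Behavior 1: a user-specified schema is required. */
    static class SchemaRequiredSource implements ReadSupport {
        @Override
        public Reader createReader(Map<String, String> options) {
            throw new UnsupportedOperationException("a user-specified schema is required");
        }

        @Override
        public Reader createReader(List<String> schema, Map<String, String> options) {
            return () -> schema;
        }
    }

    /** Behavior 2: the source has a fixed schema and implements only one overload. */
    static class FixedSchemaSource implements ReadSupport {
        @Override
        public Reader createReader(Map<String, String> options) {
            return () -> Arrays.asList("id", "value");
        }
    }

    /** Behavior 3: use the user-specified schema if given, otherwise infer one. */
    static class FlexibleSource implements ReadSupport {
        @Override
        public Reader createReader(Map<String, String> options) {
            return () -> Arrays.asList("inferred_col");
        }

        @Override
        public Reader createReader(List<String> schema, Map<String, String> options) {
            return () -> schema;
        }
    }

    public static void main(String[] args) {
        Map<String, String> options = Collections.emptyMap();
        List<String> userSchema = Arrays.asList("a", "b");

        System.out.println(new SchemaRequiredSource().createReader(userSchema, options).readSchema());
        System.out.println(new FixedSchemaSource().createReader(userSchema, options).readSchema());
        System.out.println(new FlexibleSource().createReader(options).readSchema());
        System.out.println(new FlexibleSource().createReader(userSchema, options).readSchema());
    }
}
```

With this shape, a source that cannot take a user-specified schema implements only `createReader(options)`, while the other two kinds of source override the two-argument overload, so no separate `ReadSupportWithSchema` interface is needed.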