GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/21946
[SPARK-24990][SQL] merge ReadSupport and ReadSupportWithSchema
## What changes were proposed in this pull request?
Regarding user-specified schema, data sources may have 3 different
behaviors:
1. must have a user-specified schema
2. can't have a user-specified schema
3. can accept the user-specified schema if it's given, or infer the schema otherwise.
I added `ReadSupportWithSchema` to support these behaviors, following data
source v1. But it turns out we don't need this extra interface. We can just add
a `createReader(schema, options)` to `ReadSupport` and make it call
`createReader(options)` by default.
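As a minimal sketch of the delegation described above (not the actual Spark code): `Reader`, `InferringSource`, and `FlexibleSource` are hypothetical stand-ins, a plain `Map` replaces the options type, and a list of column names replaces the schema type. It only illustrates how a default `createReader(schema, options)` can fall back to `createReader(options)`, and how a source that accepts a user-specified schema would override it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class ReadSupportSketch {

    /** Hypothetical stand-in for a reader: it only reports the schema it reads with. */
    interface Reader {
        List<String> readSchema();
    }

    /** Simplified stand-in for the merged ReadSupport interface. */
    interface ReadSupport {
        /** Creates a reader that infers its own schema. */
        Reader createReader(Map<String, String> options);

        /** By default the user-specified schema is ignored and the call is delegated. */
        default Reader createReader(List<String> schema, Map<String, String> options) {
            return createReader(options);
        }
    }

    /** Implements only the inferring overload and inherits the delegating default. */
    static class InferringSource implements ReadSupport {
        @Override
        public Reader createReader(Map<String, String> options) {
            return () -> Arrays.asList("id", "value");
        }
    }

    /** Uses the user-specified schema when one is given, and infers otherwise. */
    static class FlexibleSource extends InferringSource {
        @Override
        public Reader createReader(List<String> schema, Map<String, String> options) {
            return () -> schema;
        }
    }

    public static void main(String[] args) {
        Map<String, String> options = Collections.emptyMap();
        List<String> userSchema = Arrays.asList("name");

        // Inherited default: falls back to the inferred schema, prints [id, value]
        System.out.println(new InferringSource().createReader(userSchema, options).readSchema());
        // Overridden overload: honors the user-specified schema, prints [name]
        System.out.println(new FlexibleSource().createReader(userSchema, options).readSchema());
    }
}
```

Note that under this sketch's default, a source that must reject a user-specified schema would need to override the two-argument overload itself, since the default silently falls back to inference.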
GitHub currently has a problem syncing with the Apache git repo, so please
review only the second commit.
TODO: also fix the streaming API in follow-up PRs.
## How was this patch tested?
existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark ds-schema
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21946.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21946
----
commit defc54c69aadc510c6f77e13e57f003646c461bc
Author: Wenchen Fan <wenchen@...>
Date: 2018-08-01T13:39:35Z
[SPARK-24971][SQL] remove SupportsDeprecatedScanRow
## What changes were proposed in this pull request?
This is a follow-up to https://github.com/apache/spark/pull/21118 .
In https://github.com/apache/spark/pull/21118 we added
`SupportsDeprecatedScanRow`. Ideally, data sources should produce `InternalRow`
instead of `Row` for better performance. We should remove
`SupportsDeprecatedScanRow` and encourage data sources to produce
`InternalRow`, which is also very easy to build.
## How was this patch tested?
existing tests.
Author: Wenchen Fan <[email protected]>
Closes #21921 from cloud-fan/row.
commit 19808d500a869114d84383f23056483316e52a33
Author: Wenchen Fan <wenchen@...>
Date: 2018-07-31T18:06:16Z
merge ReadSupport and ReadSupportWithSchema
----