GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/20603
[SPARK-23418][SQL]: Fail DataSourceV2 reads when user schema is passed, but
not supported.
## What changes were proposed in this pull request?
DataSourceV2 initially allowed a user-supplied schema for a source that does
not implement `ReadSupportWithSchema`, as long as that schema was identical to
the source's schema. This behavior is confusing: when the underlying table
changes, a previously working job can start failing with an exception saying
that user-supplied schemas are not allowed.
This reverts commit adcb25a0624, which was added to #20387 so that it could
be removed in a separate JIRA issue and PR.
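To illustrate the behavior after the revert, here is a minimal, self-contained sketch in plain Java. It does not use Spark's classes: the two interfaces are simplified, hypothetical stand-ins named after the DataSourceV2 interfaces, schemas are modeled as lists of column names, and the `resolveSchema` helper and the exception type are assumptions made for the example, not the code in this patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in: a source that always reports its own schema.
interface ReadSupport {
    List<String> readSchema();
}

// Hypothetical stand-in: a source that accepts a user-supplied schema.
interface ReadSupportWithSchema {
    List<String> readSchema(List<String> userSchema);
}

class SchemaCheckSketch {
    // Assumed helper: decide which schema a read should use.
    static List<String> resolveSchema(Object source, Optional<List<String>> userSchema) {
        if (userSchema.isPresent()) {
            if (source instanceof ReadSupportWithSchema) {
                return ((ReadSupportWithSchema) source).readSchema(userSchema.get());
            }
            // Fail unconditionally, even if the supplied schema happens to
            // match the source's schema, so the outcome does not depend on
            // the current state of the underlying table.
            throw new UnsupportedOperationException(
                "source does not support user-supplied schemas");
        }
        if (source instanceof ReadSupport) {
            return ((ReadSupport) source).readSchema();
        }
        throw new UnsupportedOperationException(
            "source requires a user-supplied schema");
    }

    public static void main(String[] args) {
        ReadSupport fixed = () -> Arrays.asList("id", "data");

        // No user schema: the source's own schema is used.
        System.out.println(resolveSchema(fixed, Optional.empty()));

        // An identical user schema is still rejected.
        try {
            resolveSchema(fixed, Optional.of(Arrays.asList("id", "data")));
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the result depends only on which interface the source implements, never on a comparison between the supplied schema and the table's current schema, which is the property the description above argues for.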
## How was this patch tested?
Existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rdblue/spark SPARK-23418-revert-adcb25a0624
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20603.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20603
----
commit 8ab22313a30dc86ad6464248f99fa704ec82ef39
Author: Ryan Blue <blue@...>
Date: 2018-01-17T21:58:12Z
SPARK-22386: DataSourceV2: Use immutable logical plans.
commit 6ac33cd3b3daf37ff46797f43c7227f56cb3dad0
Author: Ryan Blue <blue@...>
Date: 2018-01-24T19:34:42Z
SPARK-23203: Fix scala style check.
commit 466dfd1e695f346fcf986f8292d8a387ab1fb957
Author: Ryan Blue <blue@...>
Date: 2018-01-24T19:54:10Z
SPARK-23203: Fix Kafka tests, use StreamingDataSourceV2Relation.
This also removes unused imports.
commit c9e226de93d6a461272f817cbc4f8d5fb703f46d
Author: Ryan Blue <blue@...>
Date: 2018-02-02T20:30:33Z
SPARK-23204: DataFrameReader: Remove v2 table identifier parsing.
commit 580243fe2d43662015cc9bd4ef715a6b7d162628
Author: Ryan Blue <blue@...>
Date: 2018-02-02T21:48:29Z
SPARK-23203: Remove import changes from DataSourceV2Utils.
commit b97379d7f57d16aff032fecd1915c32070332b90
Author: Ryan Blue <blue@...>
Date: 2018-02-06T19:21:08Z
SPARK-23203: Remove TableIdentifier from DataSourceV2Relation.
commit c7326a37e2c737f94bb46b73b6050cfa4fdcebf9
Author: Ryan Blue <blue@...>
Date: 2018-02-07T17:22:10Z
Remove path from DataSourceV2Relation.
commit df00376f12f5aeb5de40d1c7aa0105bddae12983
Author: Ryan Blue <blue@...>
Date: 2018-02-09T19:09:55Z
Implement doCanonicalize in DataSourceV2Relation.
commit 98814910b843121fb974ed4cf637292cda41a7f1
Author: Ryan Blue <blue@...>
Date: 2018-02-09T19:12:41Z
Remove write methods from DataSourceV2Relation.
commit 6e88a9fbb35ed9687e042e643d83e28633406eed
Author: Ryan Blue <blue@...>
Date: 2018-02-09T21:04:16Z
Remove unnecessary imports.
commit 57e05c2babbcaec3ed3aa69765e1145539879c97
Author: Ryan Blue <blue@...>
Date: 2018-02-09T23:52:42Z
Add DataSourceV2Relation.create to ensure projection is always set.
commit aaa1e0395dccf69b51ee540173eceddb9cb8e267
Author: Ryan Blue <blue@...>
Date: 2018-02-13T21:24:16Z
Rename userSchema -> userSpecifiedSchema.
commit adcb25a06240dc413f58b2d1240405b0a5485578
Author: Ryan Blue <blue@...>
Date: 2018-02-13T21:57:42Z
Allow some user-supplied schemas without ReadSupportWithSchema.
This is going to be removed in a follow-up commit, so it is as
self-contained as possible. This allows user-supplied schemas when
ReadSupportWithSchema is not implemented as long as the supplied schema
is identical to the reader's schema.
commit f623080309dcf75525af6b8369e7427c31589afa
Author: Ryan Blue <blue@...>
Date: 2018-02-13T22:05:30Z
Revert "Allow some user-supplied schemas without ReadSupportWithSchema."
This reverts commit adcb25a06240dc413f58b2d1240405b0a5485578.
----