GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/22529
[SPARK-25460][BRANCH-2.4][SS] DataSourceV2: SS sources do not respect
SessionConfigSupport
## What changes were proposed in this pull request?
This PR proposes to backport SPARK-25460 to branch-2.4:
This PR proposes to respect `SessionConfigSupport` in SS datasources as
well. Currently it is only respected in batch sources:
https://github.com/apache/spark/blob/e06da95cd9423f55cdb154a2778b0bddf7be984c/sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala#L198-L203
https://github.com/apache/spark/blob/e06da95cd9423f55cdb154a2778b0bddf7be984c/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala#L244-L249
Suppose a developer writes a DataSource V2 source that supports both
structured streaming and batch jobs. Batch jobs pick up source-specific
session configuration, for example a URL to connect to and fetch data from
(which end users might not be aware of). Structured streaming, however,
ignores that configuration, so the same value has to be set explicitly
in options.
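To illustrate the behavior being aligned, here is a minimal standalone sketch (plain Python, not Spark code) of how session configs could be folded into a source's options. It assumes the `spark.datasource.<keyPrefix>.` key convention described for `SessionConfigSupport`, and it assumes explicitly passed options take precedence over session configs. The function names, the `mysource` prefix, and the sample values are hypothetical.

```python
import re


def extract_session_configs(key_prefix, session_conf):
    """Return the entries under spark.datasource.<key_prefix>.* with the prefix stripped."""
    pattern = re.compile(
        r"^spark\.datasource\." + re.escape(key_prefix) + r"\.(.+)$"
    )
    extracted = {}
    for key, value in session_conf.items():
        match = pattern.match(key)
        if match:
            # Keep only the part of the key after the source-specific prefix.
            extracted[match.group(1)] = value
    return extracted


def effective_options(key_prefix, session_conf, explicit_options):
    """Merge session configs with explicit options (assumption: explicit options win)."""
    merged = extract_session_configs(key_prefix, session_conf)
    merged.update(explicit_options)
    return merged


# Hypothetical session configuration: one source-specific key, one unrelated key.
session_conf = {
    "spark.datasource.mysource.url": "host:1234",
    "spark.sql.shuffle.partitions": "200",
}

# The source sees "url" without the user passing it as an option.
options = effective_options("mysource", session_conf, {"table": "events"})
```

With this change, a streaming read or write would see the same merged options as a batch one, rather than only the explicitly passed ones.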
## How was this patch tested?
Unit tests were added.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HyukjinKwon/spark SPARK-25460-backport
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22529.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22529
----
commit b6f8880ad6bdbcb721ca0863502ec4b6c85b162c
Author: hyukjinkwon <gurwls223@...>
Date: 2018-09-20T12:22:55Z
[SPARK-25460][SS] DataSourceV2: SS sources do not respect
SessionConfigSupport
This PR proposes to respect `SessionConfigSupport` in SS datasources as
well. Currently it is only respected in batch sources:
https://github.com/apache/spark/blob/e06da95cd9423f55cdb154a2778b0bddf7be984c/sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala#L198-L203
https://github.com/apache/spark/blob/e06da95cd9423f55cdb154a2778b0bddf7be984c/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala#L244-L249
Suppose a developer writes a DataSource V2 source that supports both
structured streaming and batch jobs. Batch jobs pick up source-specific
session configuration, for example a URL to connect to and fetch data from
(which end users might not be aware of). Structured streaming, however,
ignores that configuration, so the same value has to be set explicitly
in options.
Unit tests were added.
Closes #22462 from HyukjinKwon/SPARK-25460.
Authored-by: hyukjinkwon <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
----