GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/20448
[SPARK-23203][SQL] make DataSourceV2Relation immutable
## What changes were proposed in this pull request?
This is inspired by https://github.com/apache/spark/pull/20387, but focuses
only on making the plan immutable.
The idea is simple: instead of keeping the mutable `DataSourceReader` in
the plan, we keep the `DataSourceV2` and create the reader when needed. The
pushdown information is stored in the plan itself, instead of relying on the
mutable reader.
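As a rough illustration of the idea (not the code in this patch), the sketch below uses hypothetical stand-in types: `SketchSource`, `SketchReader`, `SketchRelation`, and `InMemorySource` are invented for this example and are not Spark's interfaces. Columns and filters are modeled as plain strings, which is also an assumption. The relation keeps only the source and the pushdown state, returns a new node on every pushdown, and builds a reader on demand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT Spark's real interfaces.
interface SketchReader {
    List<String> readSchema();
    List<String> pushedFilters();
}

interface SketchSource {
    // Builds a fresh reader configured with the requested columns and filters.
    SketchReader createReader(List<String> columns, List<String> filters);
}

// Immutable plan node: holds the source plus pushdown info, never a reader.
final class SketchRelation {
    private final SketchSource source;
    private final List<String> columns;
    private final List<String> filters;

    SketchRelation(SketchSource source, List<String> columns, List<String> filters) {
        this.source = source;
        // Defensive copies so later changes by the caller cannot leak in.
        this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
        this.filters = Collections.unmodifiableList(new ArrayList<>(filters));
    }

    List<String> columns() { return columns; }
    List<String> filters() { return filters; }

    // Filter pushdown returns a new node instead of mutating this one.
    SketchRelation withFilter(String filter) {
        List<String> next = new ArrayList<>(filters);
        next.add(filter);
        return new SketchRelation(source, columns, next);
    }

    // Column pruning likewise returns a new node.
    SketchRelation withColumns(List<String> pruned) {
        return new SketchRelation(source, pruned, filters);
    }

    // The reader is created on demand from the stored pushdown information.
    SketchReader newReader() {
        return source.createReader(columns, filters);
    }
}

// Trivial hypothetical source whose readers just echo what was pushed down.
final class InMemorySource implements SketchSource {
    @Override
    public SketchReader createReader(List<String> columns, List<String> filters) {
        return new SketchReader() {
            @Override public List<String> readSchema() { return columns; }
            @Override public List<String> pushedFilters() { return filters; }
        };
    }
}
```

Because each pushdown step yields a new `SketchRelation`, the original node can be copied or transformed again without a shared reader carrying over state from an earlier step.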
This also lets us remove two unnecessary APIs from
`SupportsPushDownCatalystFilters` and `SupportsPushDownFilters`.
## How was this patch tested?
I improved the test in `DataSourceV2Suite` to make sure this change
doesn't break column pruning or filter pushdown.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark immutable-plan
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20448.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20448
----
commit 7441334e210944e1419be4059d12c06385c586cb
Author: Wenchen Fan <wenchen@...>
Date: 2018-01-31T03:12:00Z
make DataSourceV2Relation immutable
----