GitHub user marmbrus opened a pull request:
https://github.com/apache/spark/pull/2382
[SPARK-3414][SQL] Replace LowerCaseSchema with Resolver
_This PR is a follow up to #2293 (and to a lesser extent #2262 #2334)._
In #2293 the catalog was changed to store analyzed logical plans instead of
unresolved ones. While this change fixed the reported bug (which was caused by
yet another instance of us forgetting to put in a `LowerCaseSchema` operator)
it had the consequence of breaking assumptions made by `MultiInstanceRelation`.
Specifically, we can't swap out leaf operators in a tree without
rewriting the changed expression ids (which happens when you self-join the same
RDD that has been registered as a temp table).
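To make the expression id problem above concrete, here is a minimal sketch. It assumes a heavily simplified model: `Attribute`, `Relation`, `Filter`, and `isResolved` are hypothetical stand-ins written for this illustration, not the real Catalyst classes. It only shows why an already-analyzed operator ends up with a stale reference when the leaf beneath it is replaced by a new instance.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical, simplified model; these are not the real Catalyst classes.
object SelfJoinSketch {
  private val nextId = new AtomicLong(0L)

  final case class Attribute(name: String, exprId: Long)

  // Leaf operator that can produce a copy of itself with fresh expression ids.
  final case class Relation(output: Seq[Attribute]) {
    def newInstance(): Relation =
      Relation(output.map(_.copy(exprId = nextId.getAndIncrement())))
  }

  // An already-analyzed operator: its condition refers to an attribute by id.
  final case class Filter(conditionRef: Long, child: Relation) {
    def isResolved: Boolean = child.output.exists(_.exprId == conditionRef)
  }

  def main(args: Array[String]): Unit = {
    val table = Relation(Seq(Attribute("key", nextId.getAndIncrement())))
    val analyzed = Filter(table.output.head.exprId, table)

    // Swapping only the leaf leaves the filter pointing at a stale id.
    val swapped = analyzed.copy(child = table.newInstance())

    println(analyzed.isResolved) // the original plan still resolves
    println(swapped.isResolved)  // the swapped plan no longer does
  }
}
```

In this toy model the only fix is to rewrite `conditionRef` whenever the leaf changes, which is the rewriting step the paragraph above refers to.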
In this PR, I remove the need to insert `LowerCaseSchema` operators at all,
and instead move the concern of matching up identifiers completely into
analysis. Doing so allows the test cases from both #2293 and #2262 to pass at
the same time (and likely fixes a slew of other "unknown unknown" bugs).
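As a rough illustration of moving identifier matching into analysis, the sketch below treats a resolver as a plain function that decides whether two names match. Everything in it is an assumption made for this example: the `Attribute` class, the `resolve` helper, and the two resolver values are simplified stand-ins, not the code in this PR.

```scala
// Hypothetical, self-contained sketch; not the actual Catalyst code.
object ResolverSketch {
  // A resolver decides whether two identifiers refer to the same name.
  type Resolver = (String, String) => Boolean

  val caseSensitiveResolution: Resolver = (a, b) => a == b
  val caseInsensitiveResolution: Resolver = (a, b) => a.equalsIgnoreCase(b)

  // Simplified stand-in for an attribute in a relation's output.
  final case class Attribute(name: String, exprId: Long)

  // Look up a column by name; the schema keeps its original casing,
  // so no lower-casing operator has to be inserted into the plan.
  def resolve(
      name: String,
      output: Seq[Attribute],
      resolver: Resolver): Option[Attribute] =
    output.filter(a => resolver(a.name, name)) match {
      case Seq(single) => Some(single)
      case Seq()       => None
      case ambiguous =>
        throw new IllegalArgumentException(
          s"Ambiguous reference '$name': ${ambiguous.map(_.name).mkString(", ")}")
    }

  def main(args: Array[String]): Unit = {
    val output = Seq(Attribute("userId", 1L), Attribute("Name", 2L))
    println(resolve("USERID", output, caseInsensitiveResolution)) // finds userId
    println(resolve("USERID", output, caseSensitiveResolution))   // finds nothing
  }
}
```

Because the comparison rule is passed in as a value, making case sensitivity configurable would only mean choosing a different resolver, without touching the plan itself.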
While the change from #2293 is rolled back in this PR, storing the analyzed
plan might actually be a good idea. For instance, it is confusing if you
register a temporary table, change the case sensitivity of resolution, and then
can no longer query that table. This can be addressed in a follow-up PR.
Follow-ups:
- Configurable case sensitivity
- Consider storing analyzed plans for temp tables
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/marmbrus/spark lowercase
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2382.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2382
----
commit 19a61b93b2d3f835d1f1e286b2c91cfcf16371f8
Author: Michael Armbrust <[email protected]>
Date: 2014-08-27T19:43:38Z
Decrease partitions when testing
commit 76b3e04a7140fb4e06bc51827341659282bff2f8
Author: Michael Armbrust <[email protected]>
Date: 2014-08-27T20:40:48Z
increase test parallelism
commit fd7b671267687d46d4b7304cd5cdd44c5c926b2f
Author: Michael Armbrust <[email protected]>
Date: 2014-09-09T01:58:28Z
Make parquet tests less order dependent
commit dc7cb6ea81f5cd1da2247a43e8b4f81e78b52715
Author: Michael Armbrust <[email protected]>
Date: 2014-09-10T20:59:40Z
more test fixes
commit 9188b59b0b741376608b91dd71afdd3b90ac07f9
Author: Michael Armbrust <[email protected]>
Date: 2014-09-13T05:55:19Z
Merge branch 'shufflePartitions' into lowercase
commit c2f2ec8b5967b3e69efee4ba054afb96320ac673
Author: Michael Armbrust <[email protected]>
Date: 2014-09-13T19:00:29Z
Replace LowerCaseSchema with Resolver.
----