GitHub user marmbrus opened a pull request:
https://github.com/apache/spark/pull/8285
[SPARK-10093][SQL] Avoid transformation on executors.
This is kind of a weird case, but given a sufficiently complex query plan
(in this case a `TungstenProject` with an `Exchange` underneath), we could hit
NPEs on the executors because of when we were calling
`transformAllExpressions`.
In general we should ensure that all transformations occur on the driver
and not on the executors. Some reasons for avoiding executor-side
transformations include:
- (this case) Some operator constructors require state such as access to
the Spark/SQL conf, so doing a `makeCopy` on the executor can fail.
- (unrelated reason for avoiding executor transformations) `ExprId`s are
calculated using an atomic integer, so you can violate their uniqueness
constraint by constructing them anywhere other than the driver.
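Both failure modes can be illustrated with a toy model. The sketch below is plain Python, not Spark code: `JvmProcess`, `new_expr_id`, `make_copy`, and the `shuffle_partitions` key are hypothetical names invented for illustration. It assumes each process keeps its own counter (standing in for the per-JVM atomic integer) and that only the driver holds a conf.

```python
import itertools


class JvmProcess:
    """Toy stand-in for one JVM (driver or executor). Hypothetical, not a Spark class."""

    def __init__(self, name, conf=None):
        self.name = name
        self.conf = conf  # assumption: only the driver has a conf in this model
        # Per-process counter, standing in for a per-JVM atomic integer.
        self._next_id = itertools.count()

    def new_expr_id(self):
        # IDs are unique only within this one process.
        return next(self._next_id)

    def make_copy(self, operator_name):
        # Models an operator constructor that reads configuration state.
        if self.conf is None:
            raise RuntimeError(f"{operator_name}: no conf available on {self.name}")
        return (operator_name, self.conf["shuffle_partitions"])


driver = JvmProcess("driver", conf={"shuffle_partitions": 200})
executor = JvmProcess("executor")

# IDs generated in two different processes overlap, breaking uniqueness.
driver_ids = [driver.new_expr_id() for _ in range(3)]
executor_ids = [executor.new_expr_id() for _ in range(3)]
collisions = set(driver_ids) & set(executor_ids)

# Copying an operator where the conf is missing fails (an NPE in the real case).
try:
    executor.make_copy("Exchange")
    copy_failed = False
except RuntimeError:
    copy_failed = True
```

In this model the collision and the failed copy both disappear if every `new_expr_id` and `make_copy` call goes through `driver`, which is the rule the PR argues for.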
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/marmbrus/spark transformDriver
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8285.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8285
----
commit 8d3b048f05234da6379a228e73bbebafd1dc97dc
Author: Michael Armbrust <[email protected]>
Date: 2015-08-18T20:19:39Z
[SPARK-10093][SQL] Avoid transformation on executors.
----