GitHub user davies opened a pull request:
https://github.com/apache/spark/pull/9422
[SPARK-10429] [SQL] make mutableProjection atomic
Right now, SQL's mutable projection writes each value into the mutable row
immediately after it evaluates the corresponding expression. This makes the
behavior of MutableProjection confusing and complicates the implementation of
common aggregate functions like stddev, because developers need to be aware
that when the (i+1)-th expression of a mutable projection is evaluated, the
i-th slot of the mutable row has already been updated.
This PR makes MutableProjection atomic by generating the results of all
expressions first and then copying them into mutableRow.
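The difference can be illustrated with a small, language-neutral sketch. This
is not Spark's generated code: the function names project_in_place and
project_atomic, the list-based row, and the two example expressions are all
hypothetical and serve only to show why evaluation order matters when a later
expression reads a slot that an earlier expression writes.

```python
def project_in_place(exprs, row):
    """Non-atomic: write each result into the row as soon as it is computed."""
    for i, expr in enumerate(exprs):
        # Later expressions see slots already overwritten by earlier ones.
        row[i] = expr(row)
    return row


def project_atomic(exprs, row):
    """Atomic: evaluate every expression against the old row, then copy."""
    results = [expr(row) for expr in exprs]  # all reads happen first
    for i, value in enumerate(results):      # all writes happen afterwards
        row[i] = value
    return row


# Hypothetical expressions: slot 1 reads slot 0, which slot 0's expression updates.
exprs = [
    lambda r: r[0] + r[1],  # new slot 0
    lambda r: r[0] * 10,    # new slot 1, intended to use the OLD slot 0
]

in_place_result = project_in_place(exprs, [1, 2])
atomic_result = project_atomic(exprs, [1, 2])
print(in_place_result, atomic_result)
```

With the in-place version the second expression reads the already-updated
slot 0, while the atomic version evaluates both expressions against the
original row, which is the behavior this PR aims for.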
I ran a micro-benchmark; there is no notable performance difference
between using class members and local variables.
cc @yhuai
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/davies/spark atomic_mutable
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9422.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9422
----
commit 28916288d01def045eb2295ea3c9f680fb5b0b01
Author: Davies Liu <[email protected]>
Date: 2015-11-02T22:48:15Z
make mutableProjection atomic
commit bec07dadf159246e3032e38dde64e5a9edf2756f
Author: Davies Liu <[email protected]>
Date: 2015-11-02T23:07:31Z
refactor
----