GitHub user davies reopened a pull request:
https://github.com/apache/spark/pull/7592
[SPARK-9247] [SQL] Use BytesToBytesMap for broadcast join
This PR introduces BytesToBytesMap to UnsafeHashedRelation and uses it on
executors for better performance.
It serializes all the keys and values from the Java HashMap and puts them into a
BytesToBytesMap while deserializing. All the values for the same key are stored
contiguously for better memory locality.
This PR also addresses the comments on #7480 and does some cleanup.
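
To illustrate the layout idea described above (all values for one key stored back to back in a single byte buffer), here is a minimal conceptual sketch in Python. It is not the code in this PR and does not use Spark's BytesToBytesMap API; the names `pack_relation` and `lookup`, the 4-byte length prefix, and the dict used as the key index are all assumptions made for illustration only.

```python
import struct


def pack_relation(hashed):
    """Pack {key_bytes: [value_bytes, ...]} into one flat buffer.

    Hypothetical helper: each key's values are written one after another,
    so a probe for that key reads a single contiguous region.
    """
    buf = bytearray()
    index = {}  # key -> (start offset, byte length of that key's value region)
    for key, values in hashed.items():
        start = len(buf)
        for value in values:
            buf += struct.pack("<I", len(value))  # 4-byte little-endian length prefix
            buf += value
        index[key] = (start, len(buf) - start)
    return bytes(buf), index


def lookup(buf, index, key):
    """Return all values for `key` by scanning its contiguous region."""
    if key not in index:
        return []
    offset, size = index[key]
    end = offset + size
    values = []
    while offset < end:
        (length,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        values.append(buf[offset:offset + length])
        offset += length
    return values


# Build side of a join: two rows share key b"k1", one row has key b"k2".
relation = {b"k1": [b"a", b"bc"], b"k2": [b"xyz"]}
buf, index = pack_relation(relation)
matches = lookup(buf, index, b"k1")
```

In this sketch a probe touches one region of the buffer per key instead of chasing a separate object per value, which is the memory-locality benefit the description refers to.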
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/davies/spark unsafe_map2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/7592.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #7592
----
commit fc221e078307d32fb0333d59e6df29e06474fbc4
Author: Davies Liu <[email protected]>
Date: 2015-07-22T08:23:02Z
use BytesToBytesMap for broadcast join
commit 46f1f227d6db5064de570a918c9f9aac7695e066
Author: Davies Liu <[email protected]>
Date: 2015-07-23T03:01:07Z
fix style
commit 5eb1b5a63aebc1d32911787b45198c46f7abe219
Author: Davies Liu <[email protected]>
Date: 2015-07-23T04:08:14Z
address comments in #7480
commit 1c5ad8dc04c12513eab2345064aa2de1bfb0873f
Author: Davies Liu <[email protected]>
Date: 2015-07-23T08:07:04Z
fix test
----