GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/7741
[SPARK-9411] [SQL] [WIP] Make Tungsten page sizes configurable
We need to make page sizes configurable so we can reduce them in unit tests
and increase them in real production workloads.
The following hardcoded page sizes need to be updated:
- [ ] Spark Core: UnsafeShuffleExternalSorter.PAGE_SIZE
- [ ] Spark SQL: UnsafeExternalSorter.PAGE_SIZE
- [x] Unsafe: BytesToBytesMap.PAGE_SIZE_BYTES
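
As a rough illustration of what "configurable" could mean for the constants listed above, here is a minimal sketch that looks the page size up from a settings map and falls back to a default. The class name `PageSizeConfig`, the key `example.tungsten.pageSizeBytes`, and the default value are all hypothetical and chosen for illustration; this is not the code in this patch.

```java
import java.util.Map;

// Hypothetical helper, not part of Spark. It shows a page size that is
// looked up from configuration instead of being a hardcoded constant.
final class PageSizeConfig {
    // Assumed key name and default value, for illustration only.
    static final String PAGE_SIZE_KEY = "example.tungsten.pageSizeBytes";
    static final long DEFAULT_PAGE_SIZE_BYTES = 64L * 1024 * 1024;

    private PageSizeConfig() {}

    // Returns the configured page size in bytes, or the default if unset.
    static long pageSizeBytes(Map<String, String> conf) {
        String raw = conf.get(PAGE_SIZE_KEY);
        if (raw == null) {
            return DEFAULT_PAGE_SIZE_BYTES;
        }
        long size = Long.parseLong(raw.trim());
        if (size <= 0) {
            throw new IllegalArgumentException(
                PAGE_SIZE_KEY + " must be positive, got " + size);
        }
        return size;
    }
}
```

With something along these lines, a unit test could put a small value such as "1024" under the key, while a production job could leave it unset or raise it.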
While updating the page sizes, we should also update the size calculations
that are derived from the page size so that they do not assume all pages are
the same size. This isn't strictly necessary in this patch, but it should be
done eventually as part of supporting overflow pages for large records.
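
To make the point about size calculations concrete, here is a minimal sketch of bookkeeping that sums the actual size of each page rather than multiplying a page count by a fixed page size. `PageTracker` and its methods are hypothetical names used only for illustration; they are not Spark's actual memory-management classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical bookkeeping class, not Spark's actual allocator.
// It records each page's real size so totals stay correct when an
// overflow page is larger than the regular pages.
final class PageTracker {
    private final List<Long> pageSizes = new ArrayList<>();
    private long totalBytes = 0L;

    // Records a newly allocated page of the given size in bytes.
    void addPage(long sizeBytes) {
        if (sizeBytes <= 0) {
            throw new IllegalArgumentException(
                "page size must be positive, got " + sizeBytes);
        }
        pageSizes.add(sizeBytes);
        totalBytes += sizeBytes;
    }

    int numPages() {
        return pageSizes.size();
    }

    // Sum of the recorded sizes, not numPages() * some fixed page size.
    long totalMemoryBytes() {
        return totalBytes;
    }
}
```

For example, two regular 4096-byte pages plus one 10000-byte overflow page would report 18192 bytes, where a calculation of the form numPages * PAGE_SIZE would report 12288.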
A number of unit tests also need to be updated to account for the new page
sizes.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JoshRosen/spark SPARK-9411
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/7741.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #7741
----
commit e61485828c4136660fdcdf9995ec4e1a0ca3105b
Author: Josh Rosen <[email protected]>
Date: 2015-07-29T00:44:07Z
Makes BytesToBytesMap page size configurable
----