GitHub user sitalkedia opened a pull request:
https://github.com/apache/spark/pull/13699
[SPARK-15958] Make initial buffer size for the Sorter configurable
## What changes were proposed in this pull request?
Currently the initial buffer size in the sorter is hard-coded and is too
small for large workloads. As a result, the sorter spends significant time
expanding the buffer and copying the data. It would be useful to make the
initial size configurable.
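
To illustrate the cost described above, here is a minimal standalone sketch of a buffer that doubles its capacity when full. It is not Spark's sorter code: the class `GrowableBuffer`, its methods, and the sizes used are all hypothetical and chosen only for illustration. It counts how many times the buffer grows and how many elements are copied for a given initial capacity.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not Spark's sorter implementation.
final class GrowableBuffer {
    private long[] data;
    private int size = 0;
    private int expansions = 0;      // number of times the array was regrown
    private long elementsCopied = 0; // total elements moved during regrowth

    GrowableBuffer(int initialCapacity) {
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("initialCapacity must be positive");
        }
        this.data = new long[initialCapacity];
    }

    // Appends one record, doubling the backing array when it is full.
    void append(long value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
            expansions++;
            elementsCopied += size;
        }
        data[size++] = value;
    }

    int getSize() { return size; }
    int getExpansions() { return expansions; }
    long getElementsCopied() { return elementsCopied; }

    // Fills a new buffer with `records` values, starting at `initialCapacity`.
    static GrowableBuffer fill(int initialCapacity, int records) {
        GrowableBuffer buffer = new GrowableBuffer(initialCapacity);
        for (int i = 0; i < records; i++) {
            buffer.append(i);
        }
        return buffer;
    }

    public static void main(String[] args) {
        int records = 1_000_000; // assumed workload size
        for (int initial : new int[] {4096, 1 << 20}) { // assumed initial sizes
            GrowableBuffer b = fill(initial, records);
            System.out.println("initial=" + initial
                + " expansions=" + b.getExpansions()
                + " copied=" + b.getElementsCopied());
        }
    }
}
```

In this sketch, a small initial capacity forces repeated reallocation and copying before the buffer reaches the workload size, while an initial capacity at least as large as the workload avoids both. That is the motivation for exposing the initial size as a setting.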
## How was this patch tested?
Tested by running a job on the cluster.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sitalkedia/spark config_sort_buffer_upstream
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/13699.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #13699
----
commit 82e540c00c85f0c9f2d9d34c37420752fce2d18f
Author: Sital Kedia <[email protected]>
Date: 2016-06-15T00:56:18Z
[SPARK-15958] Make initial buffer size for the Sorter configurable
----