[
https://issues.apache.org/jira/browse/TEZ-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285638#comment-14285638
]
Rajesh Balamohan commented on TEZ-1803:
---------------------------------------
- Ran terasort (500 GB) on test cluster.
- Varied tez.runtime.io.sort.mb with a 4 GB container (i.e., io.sort.mb set to
1200 and 2500). Couldn't test with a larger container size due to another
issue; raised a separate JIRA for that.
- With (tez.runtime.io.sort.mb = 2134 and pipelinedsorter with 2 threads), Map
Phase time: 257 secs
- With (tez.runtime.io.sort.mb = 1200 and pipelinedsorter with 2 threads), Map
Phase time: 294 secs
- Effectively, there is a 12-13% improvement in map phase runtime.
- Apart from this, there is a good amount of savings in disk spills. Will attach
the tez-ui counters page separately here.
- Ran teravalidate benchmark to validate the results.
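
For reference, the 12-13% figure follows from the two map phase timings reported above. A minimal sketch of the arithmetic, assuming the improvement is measured relative to the slower (1200 MB) run; the function and variable names are illustrative only and are not part of Tez:

```python
# Map phase timings reported above (seconds).
baseline_secs = 294   # tez.runtime.io.sort.mb = 1200
improved_secs = 257   # tez.runtime.io.sort.mb = 2134


def relative_improvement(baseline, improved):
    """Percentage reduction in runtime relative to the baseline run."""
    return (baseline - improved) / baseline * 100.0


pct = relative_improvement(baseline_secs, improved_secs)
print(f"Map phase runtime improvement: {pct:.1f}%")
```

Measured against the 294 sec run, the saving of 37 secs falls between 12% and 13%.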
> Support > 2gb sort buffer in pipelinedsorter
> --------------------------------------------
>
> Key: TEZ-1803
> URL: https://issues.apache.org/jira/browse/TEZ-1803
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Labels: performance
> Attachments: TEZ-1803.1.patch, TEZ-1803.2.patch, TEZ-1803.3.patch,
> TEZ-1803.4.patch, TEZ-1803.5.patch, TEZ-1803.6.patch, TEZ-1803.WIP.1.patch
>
>