[ 
https://issues.apache.org/jira/browse/TEZ-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285638#comment-14285638
 ] 

Rajesh Balamohan commented on TEZ-1803:
---------------------------------------

- Ran terasort (500 GB) on a test cluster.
- Varied tez.runtime.io.sort.mb with a 4 GB container (i.e. io.sort.mb set to 
1200 and 2500).  Couldn't test with a larger container size due to an 
unrelated issue; raised a separate JIRA for that.
- With (tez.runtime.io.sort.mb = 2134 and pipelinedsorter with 2 threads), Map 
Phase time: 257 secs
- With (tez.runtime.io.sort.mb = 1200 and pipelinedsorter with 2 threads), Map 
Phase time: 294 secs
- Effectively, that is a 12-13% improvement in map phase runtime (294 secs 
down to 257 secs).
- Apart from this, there is a good amount of savings in disk spills.  Will 
attach the tez-ui counters page separately here.
- Ran the teravalidate benchmark to validate the results.
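
For reference, the 12-13% figure can be reproduced from the two map phase 
timings above. This is only a minimal sketch: the helper name improvement_pct 
is hypothetical, and it assumes the improvement is measured relative to the 
slower (io.sort.mb = 1200) run.

```python
def improvement_pct(baseline_secs, improved_secs):
    """Percentage reduction in runtime relative to the baseline run."""
    return (baseline_secs - improved_secs) / baseline_secs * 100.0


# Map phase times reported above, in seconds.
baseline = 294  # tez.runtime.io.sort.mb = 1200
improved = 257  # tez.runtime.io.sort.mb = 2134

pct = improvement_pct(baseline, improved)
print(f"Map phase runtime improvement: {pct:.1f}%")
```

Measured against the faster run instead, the same 37 second difference would 
give a larger percentage, so the choice of baseline matters when comparing 
numbers.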

> Support > 2gb sort buffer in pipelinedsorter
> --------------------------------------------
>
>                 Key: TEZ-1803
>                 URL: https://issues.apache.org/jira/browse/TEZ-1803
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: performance
>         Attachments: TEZ-1803.1.patch, TEZ-1803.2.patch, TEZ-1803.3.patch, 
> TEZ-1803.4.patch, TEZ-1803.5.patch, TEZ-1803.6.patch, TEZ-1803.WIP.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
