[ https://issues.apache.org/jira/browse/TEZ-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated TEZ-2244:
----------------------------------
    Attachment: TEZ-2244.6.patch

Made minor changes to comments.

> PipelinedSorter: Progressive allocation for sort-buffers
> --------------------------------------------------------
>
>                 Key: TEZ-2244
>                 URL: https://issues.apache.org/jira/browse/TEZ-2244
>             Project: Apache Tez
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Gopal V
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-2244.1.patch, TEZ-2244.2.patch, TEZ-2244.3.patch, 
> TEZ-2244.4.patch, TEZ-2244.5.patch, TEZ-2244.6.patch, TEZ-2244.WIP.patch
>
>
> Currently, the sort buffers are allocated pessimistically for all tasks so 
> that the largest task's spill stays within memory.
> The chained buffer implementation inside PipelinedSorter opens up the 
> possibility of allocating only the first chunk of the sort buffer when 
> the sorter starts up.
> This allows tasks that do not use the sort buffer heavily (like a 
> grouping aggregation) to use the sort space only when the map-aggregation 
> turns itself off.
> Not reserving memory on startup hurts the worst-case scenario for the 
> pipelined sorter, but improves the average case.
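
The idea described above can be illustrated with a minimal Java sketch. This is not the code from the attached patches; the class name ProgressiveSortBuffer, its methods, and the fixed block size are all hypothetical, chosen only to show the first chunk being allocated at startup and later chunks being allocated on demand until the memory budget is reached.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration of progressive sort-buffer allocation.
 * Not the Tez PipelinedSorter implementation.
 */
final class ProgressiveSortBuffer {
    private final long maxBytes;   // total sort memory budget (assumed)
    private final int blockBytes;  // size of each chained block (assumed fixed)
    private final List<ByteBuffer> blocks = new ArrayList<>();

    ProgressiveSortBuffer(long maxBytes, int blockBytes) {
        if (blockBytes <= 0 || maxBytes < blockBytes) {
            throw new IllegalArgumentException("budget must hold at least one block");
        }
        this.maxBytes = maxBytes;
        this.blockBytes = blockBytes;
        // Only the first chunk is allocated when the sorter starts up.
        blocks.add(ByteBuffer.allocate(blockBytes));
    }

    /**
     * Appends a record, chaining a new block if the current one is full.
     * Returns false when the budget is exhausted, meaning the caller
     * would have to spill before writing more.
     */
    boolean write(byte[] record) {
        if (record.length > blockBytes) {
            throw new IllegalArgumentException("record larger than a block");
        }
        ByteBuffer current = blocks.get(blocks.size() - 1);
        if (current.remaining() < record.length) {
            if (allocatedBytes() + blockBytes > maxBytes) {
                return false; // no room left in the budget
            }
            current = ByteBuffer.allocate(blockBytes); // allocate lazily
            blocks.add(current);
        }
        current.put(record);
        return true;
    }

    long allocatedBytes() {
        return (long) blocks.size() * blockBytes;
    }

    int blockCount() {
        return blocks.size();
    }
}
```

In this sketch a task that writes little data never holds more than one block, which is the average-case benefit. A task that fills the whole budget pays for allocation while it is running rather than at startup, which is the worst-case cost the description mentions.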



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
