Gopal V created TEZ-2244:
----------------------------

             Summary: PipelinedSorter: Progressive allocation for sort-buffers
                 Key: TEZ-2244
                 URL: https://issues.apache.org/jira/browse/TEZ-2244
             Project: Apache Tez
          Issue Type: Improvement
    Affects Versions: 0.7.0
            Reporter: Gopal V


Currently, the sort buffers are allocated pessimistically for all tasks so that 
the largest task's spill stays within memory.

The chained buffer implementation inside PipelinedSorter brings up the 
possibility of allocating only the first chunk of the sort buffer when the 
sorter starts up.

This allows tasks that do not use the sort buffer heavily (like a grouping 
aggregation) to claim the sort-space only when the map-aggregation turns 
itself off.

Not reserving memory on startup hurts the worst-case scenario for the pipelined 
sorter, but improves the average case.
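
As a rough illustration of the idea, the sketch below allocates only the first 
chunk at startup and adds further chunks on demand, up to a fixed limit. It is 
not the PipelinedSorter code: the class name ProgressiveSortBuffer, its methods, 
and the chunkSize/maxChunks parameters are all hypothetical, chosen only to show 
the allocation pattern.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of progressive allocation; not the Tez implementation.
class ProgressiveSortBuffer {
    private final int chunkSize;   // bytes per chunk (assumed fixed)
    private final int maxChunks;   // upper bound on chunks, i.e. the full sort-buffer budget
    private final List<ByteBuffer> chunks = new ArrayList<>();

    ProgressiveSortBuffer(int chunkSize, int maxChunks) {
        if (chunkSize <= 0 || maxChunks <= 0) {
            throw new IllegalArgumentException("chunkSize and maxChunks must be positive");
        }
        this.chunkSize = chunkSize;
        this.maxChunks = maxChunks;
        // Only the first chunk is allocated when the sorter starts up.
        chunks.add(ByteBuffer.allocate(chunkSize));
    }

    // Appends a record. Returns false when the budget is exhausted,
    // which is the point where a real sorter would have to spill.
    boolean write(byte[] record) {
        if (record.length > chunkSize) {
            throw new IllegalArgumentException("record larger than one chunk");
        }
        ByteBuffer current = chunks.get(chunks.size() - 1);
        if (current.remaining() < record.length) {
            if (chunks.size() == maxChunks) {
                return false;
            }
            // Grow lazily: the next chunk is allocated only when it is needed.
            current = ByteBuffer.allocate(chunkSize);
            chunks.add(current);
        }
        current.put(record);
        return true;
    }

    // Memory held by the buffer so far.
    long allocatedBytes() {
        return (long) chunks.size() * chunkSize;
    }
}
```

A task that writes little never grows past the first chunk. The cost shows up 
in the worst case: because later chunks are not reserved up front, the 
allocation of a later chunk happens mid-task and can fail if other consumers 
have taken the memory in the meantime.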



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
