Gopal V created TEZ-2244:
----------------------------
Summary: PipelinedSorter: Progressive allocation for sort-buffers
Key: TEZ-2244
URL: https://issues.apache.org/jira/browse/TEZ-2244
Project: Apache Tez
Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Gopal V
Currently, the sort buffers are allocated pessimistically for all tasks so that
the largest task's spill stays within memory.
The chained buffer implementation inside PipelinedSorter makes it possible to
allocate only the first chunk of the sort buffer when the sorter starts up.
This lets tasks that do not use the sort buffer heavily (such as a grouping
aggregation) take up the sort space only when the map-aggregation turns
itself off.
Not reserving memory on startup hurts the worst-case scenario for the pipelined
sorter, but improves the average case.
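A minimal sketch of the idea, for illustration only: the sorter takes just the
first chunk at startup and adds further chunks on demand until a budget is
reached. The class ProgressiveSortBuffer and its chunkSize/maxChunks parameters
are hypothetical names, not the actual PipelinedSorter code, and spilling is
reduced to a boolean return value.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of progressive allocation; not Tez code. */
final class ProgressiveSortBuffer {
    private final int chunkSize;   // bytes per chunk (assumed fixed)
    private final int maxChunks;   // upper bound on chunks, i.e. the budget
    private final List<ByteBuffer> chunks = new ArrayList<>();

    ProgressiveSortBuffer(int chunkSize, int maxChunks) {
        if (chunkSize <= 0 || maxChunks <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.chunkSize = chunkSize;
        this.maxChunks = maxChunks;
        // Only the first chunk is allocated when the sorter starts up.
        chunks.add(ByteBuffer.allocate(chunkSize));
    }

    /**
     * Appends a record, allocating the next chunk only when the current one
     * cannot hold it. Returns false once the budget is exhausted, which is
     * where a real sorter would have to spill.
     */
    boolean write(byte[] record) {
        if (record.length > chunkSize) {
            throw new IllegalArgumentException("record larger than a chunk");
        }
        ByteBuffer current = chunks.get(chunks.size() - 1);
        if (current.remaining() < record.length) {
            if (chunks.size() == maxChunks) {
                return false; // no memory left in the budget
            }
            current = ByteBuffer.allocate(chunkSize);
            chunks.add(current);
        }
        current.put(record);
        return true;
    }

    int allocatedChunks() {
        return chunks.size();
    }

    int allocatedBytes() {
        return chunks.size() * chunkSize;
    }
}
```

In this sketch a task that writes little never grows past one chunk, while a
heavy task pays the allocation cost later, during the sort, instead of at
startup. That is the worst-case versus average-case trade-off described above.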
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)