[
https://issues.apache.org/jira/browse/TEZ-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960365#comment-15960365
]
Rajesh Balamohan commented on TEZ-3673:
---------------------------------------
50% could be the default buffer-fill percentage at which the spill is
triggered. With final merge avoidance, this could create a lot of files, for
which we need HTTP keep-alive (this needs MAPREDUCE-6850). Query 12 in TPC-H
is an example that uses this code path very heavily. Final merge avoidance
helped a lot on the map side, but ended up costing time on the reducer side
due to connection establishment.
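
To make the trade-off concrete, here is a minimal sketch of a percentage-based
spill trigger. It is not the code in UnorderedPartitionedKVWriter; the class
name SpillThresholdBuffer, its methods, and the byte-counting model are all
hypothetical and only illustrate how a lower spill percentage produces more
spill files for the same amount of output.

```java
// Hypothetical illustration only; this is NOT Tez's UnorderedPartitionedKVWriter.
// It models a single buffer that "spills" (here: just counts a spill and resets)
// once the bytes written reach a configured percentage of the buffer capacity.
final class SpillThresholdBuffer {
    private final long spillAtBytes; // fill level that triggers a spill
    private long usedBytes = 0;      // bytes accounted for since the last spill
    private int spillCount = 0;      // number of spills (one file each, assumed)

    SpillThresholdBuffer(long capacityBytes, int spillPercent) {
        if (capacityBytes <= 0 || spillPercent <= 0 || spillPercent > 100) {
            throw new IllegalArgumentException("invalid capacity or percentage");
        }
        // Never let the trigger level drop to zero for very small buffers.
        this.spillAtBytes = Math.max(1, capacityBytes * spillPercent / 100);
    }

    /** Accounts for one record; returns true if this write triggered a spill. */
    boolean write(long recordBytes) {
        usedBytes += recordBytes;
        if (usedBytes >= spillAtBytes) {
            spillCount++;  // a real writer would hand the buffer to a spill thread
            usedBytes = 0; // the buffer is free again after the spill
            return true;
        }
        return false;
    }

    int spillCount() {
        return spillCount;
    }

    long usedBytes() {
        return usedBytes;
    }
}
```

In this model, writing the same data with a 50% threshold yields twice as many
spills as a 100% threshold. If final merge avoidance keeps each spill as its
own file, that is the extra file count the reducer side would have to fetch.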
> Allocate smaller buffers in UnorderedPartitionedKVWriter
> --------------------------------------------------------
>
> Key: TEZ-3673
> URL: https://issues.apache.org/jira/browse/TEZ-3673
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Harish Jaiprakash
> Assignee: Harish Jaiprakash
> Attachments: TEZ-3673.01.patch
>
>
> UnorderedPartitionedKVWriter allocates its buffers in large chunks, which
> may or may not get filled up. In PipelinedSorter, we start off with 32MB
> chunks, but UnorderedPartitionedKVWriter can be worse because it allocates
> bigger blocks. We need to revisit this allocation.