[ 
https://issues.apache.org/jira/browse/TEZ-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960365#comment-15960365
 ] 

Rajesh Balamohan commented on TEZ-3673:
---------------------------------------

50% can be the default percentage at which the spill happens.
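
To make the threshold idea concrete, here is a minimal sketch of a buffer that spills once it is filled to a configurable percentage. It assumes the percentage refers to how full the buffer is. The class SpillingBuffer and all of its members are hypothetical names for illustration only; this is not the UnorderedPartitionedKVWriter code or the attached patch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not Tez's UnorderedPartitionedKVWriter. */
class SpillingBuffer {
    private final ByteBuffer buffer;
    private final int spillThresholdBytes;
    // Records how many bytes each spill contained (stands in for spill files).
    private final List<Integer> spillSizes = new ArrayList<>();

    SpillingBuffer(int capacityBytes, int spillPercent) {
        if (capacityBytes <= 0 || spillPercent <= 0 || spillPercent > 100) {
            throw new IllegalArgumentException("invalid capacity or spill percentage");
        }
        this.buffer = ByteBuffer.allocate(capacityBytes);
        // Use long arithmetic so large capacities do not overflow.
        this.spillThresholdBytes = (int) ((long) capacityBytes * spillPercent / 100);
    }

    void write(byte[] record) {
        if (record.length > buffer.capacity()) {
            throw new IllegalArgumentException("record larger than buffer");
        }
        // Make room first if the record does not fit in the remaining space.
        if (record.length > buffer.remaining()) {
            spill();
        }
        buffer.put(record);
        // Spill as soon as the fill level reaches the threshold.
        if (buffer.position() >= spillThresholdBytes) {
            spill();
        }
    }

    private void spill() {
        if (buffer.position() == 0) {
            return; // nothing buffered, nothing to spill
        }
        // A real writer would write the buffered bytes to a file here.
        spillSizes.add(buffer.position());
        buffer.clear();
    }

    int spillCount() {
        return spillSizes.size();
    }

    int bufferedBytes() {
        return buffer.position();
    }
}
```

In this sketch a lower percentage means more, smaller spills, which is the trade-off against the number of files the reducers then have to fetch.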

With final merge avoidance, this could create lots of files, for which we need 
HTTP keep-alive (this needs MAPREDUCE-6850). Query 12 in TPC-H is an example 
that uses this code path very heavily. Final merge avoidance helped a lot on 
the map side, but ended up costing time on the reducer side due to connection 
establishment.

> Allocate smaller buffers in UnorderedPartitionedKVWriter
> --------------------------------------------------------
>
>                 Key: TEZ-3673
>                 URL: https://issues.apache.org/jira/browse/TEZ-3673
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Harish Jaiprakash
>            Assignee: Harish Jaiprakash
>         Attachments: TEZ-3673.01.patch
>
>
> UnorderedPartitionedKVWriter allocates in bigger chunks. It may or may not 
> get filled up. In PipelinedSorter, we start off with 32MB chunks. But 
> UnorderedPartitionedKVWriter can be worse as it allocates bigger blocks. Need 
> to revisit this allocation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
