[ 
https://issues.apache.org/jira/browse/TEZ-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3680:
----------------------------------
    Attachment: TEZ-3680.2.patch

Unordered*Writer previously honored the pipelined shuffle setting. Pipelined 
shuffle may be disabled, since that config is shared by sorted and unsorted 
outputs. For Unordered*Writer, however, 
{{TEZ_RUNTIME_ENABLE_FINAL_MERGE_IN_OUTPUT}} can also be taken into account 
to disable the final merge. The current patch addresses this.
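
To illustrate the decision described above, here is a minimal sketch and not 
the patch itself. It assumes the final merge runs only when pipelined shuffle 
is off and the final-merge flag is on. The class name, the plain Map used in 
place of a Hadoop Configuration, the literal key strings and the default 
values are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Tez: shows one way the two settings
// could be combined when deciding whether to run the final merge.
class FinalMergeDecision {
    // Assumed key strings, written as plain literals for this sketch.
    static final String PIPELINED_SHUFFLE_ENABLED =
        "tez.runtime.pipelined-shuffle.enabled";
    static final String ENABLE_FINAL_MERGE_IN_OUTPUT =
        "tez.runtime.enable.final-merge.in.output";

    // Read a boolean from the map, falling back to a default when unset.
    static boolean getBoolean(Map<String, String> conf, String key, boolean def) {
        String value = conf.get(key);
        return value == null ? def : Boolean.parseBoolean(value);
    }

    // Final merge is skipped if pipelined shuffle is enabled OR if the
    // final-merge flag is explicitly turned off. Defaults are assumptions.
    static boolean shouldFinalMerge(Map<String, String> conf) {
        boolean pipelined = getBoolean(conf, PIPELINED_SHUFFLE_ENABLED, false);
        boolean finalMerge = getBoolean(conf, ENABLE_FINAL_MERGE_IN_OUTPUT, true);
        return !pipelined && finalMerge;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PIPELINED_SHUFFLE_ENABLED, "false");
        conf.put(ENABLE_FINAL_MERGE_IN_OUTPUT, "false");
        System.out.println("final merge: " + shouldFinalMerge(conf));
    }
}
```

With this reading, a job that leaves pipelined shuffle disabled for its 
sorted outputs can still turn off the final merge for unordered output 
through the second flag.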

> Optimizations to UnorderedPartitionedKVWriter
> ---------------------------------------------
>
>                 Key: TEZ-3680
>                 URL: https://issues.apache.org/jira/browse/TEZ-3680
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>         Attachments: profiler.png, TEZ-3680.1.patch, TEZ-3680.2.patch
>
>
> 1. Consider increasing the number of threads in the spill executor. 
> {{TEZ_RUNTIME_UNORDERED_OUTPUT_MAX_PER_BUFFER_SIZE_BYTES}} can be used to 
> configure the buffer size. If smaller buffer sizes are provided, spills can 
> become frequent; currently the spill executor operates in single-threaded 
> mode.
> 2. During profiling, operations such as incrementing counters and notifying 
> progress showed up. This may not be common in regular Tez jobs, but in 
> processes like LLAP (Hive-based) it is possible to get into such situations. 
> I will attach the profiler snapshot showing this. It would be good to 
> update/notify less frequently.
> 3. Optimize mergeAll().
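
As a rough illustration of item 2 in the description (update/notify less 
frequently), the sketch below accumulates increments locally and pushes them 
to the real counter only every N records. BatchedCounter, its sink callback 
and the flush threshold are hypothetical names chosen for this example; they 
are not Tez APIs and not what the attached patches necessarily do.

```java
import java.util.function.LongConsumer;

// Hypothetical wrapper: batches per-record increments so the underlying
// (possibly expensive) counter or progress call runs far less often.
class BatchedCounter {
    private final LongConsumer sink;   // receives the accumulated delta
    private final long flushEvery;     // number of increments per flush
    private long pending;              // increments not yet pushed to sink

    BatchedCounter(LongConsumer sink, long flushEvery) {
        if (flushEvery <= 0) {
            throw new IllegalArgumentException("flushEvery must be positive");
        }
        this.sink = sink;
        this.flushEvery = flushEvery;
    }

    // Called once per record; only touches the sink every flushEvery calls.
    void increment() {
        pending++;
        if (pending >= flushEvery) {
            flush();
        }
    }

    // Push whatever is pending; call this once more when the writer closes
    // so the final partial batch is not lost.
    void flush() {
        if (pending > 0) {
            sink.accept(pending);
            pending = 0;
        }
    }

    public static void main(String[] args) {
        long[] total = {0};
        BatchedCounter counter = new BatchedCounter(n -> total[0] += n, 1000);
        for (int i = 0; i < 2500; i++) {
            counter.increment();
        }
        counter.flush();
        System.out.println("records counted: " + total[0]);
    }
}
```

The trade-off is that the visible counter value lags by up to one batch, so 
the threshold would need to stay small enough for progress reporting to 
remain useful.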



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
