[ 
https://issues.apache.org/jira/browse/TEZ-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3680:
----------------------------------
    Description: 
1. Consider increasing the number of threads in spill executor. 
{{TEZ_RUNTIME_UNORDERED_OUTPUT_MAX_PER_BUFFER_SIZE_BYTES}} can be used to 
configure the buffer size. If smaller buffer sizes are provided, there is a 
chance of getting frequent spills; currently the spill executor operates in 
single threaded mode.

2. During profiling, things like incrementing the counters, notifying progress 
came up. This may not be common in regular tez jobs. But in processes like LLAP 
(hive based), it is possible to get into such situations. I will attach the 
profiler snapshot showing this. It would be good to update/notify less 
frequently.

3. Optimize mergeAll().

  was:

1. Consider increasing the number of threads in spill executor. 
{{TEZ_RUNTIME_UNORDERED_OUTPUT_MAX_PER_BUFFER_SIZE_BYTES}} can be used to 
configure the buffer size. If smaller buffer sizes are provided, there is a 
chance of getting frequent spills; currently the spill executor operates in 
single threaded mode.

2. During profiling, things like incrementing the counters, notifying progress 
came up. This may not be common in regular tez jobs. But in processes like LLAP 
(hive based), it is possible to get into such situations. I will attach the 
profiler snapshot showing this. It would be good to update/notify less 
frequently.


> Optimizations to UnorderedPartitionedKVWriter
> ---------------------------------------------
>
>                 Key: TEZ-3680
>                 URL: https://issues.apache.org/jira/browse/TEZ-3680
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>         Attachments: profiler.png
>
>
> 1. Consider increasing the number of threads in spill executor. 
> {{TEZ_RUNTIME_UNORDERED_OUTPUT_MAX_PER_BUFFER_SIZE_BYTES}} can be used to 
> configure the buffer size. If smaller buffer sizes are provided, there is a 
> chance of getting frequent spills; currently the spill executor operates in 
> single threaded mode.
> 2. During profiling, things like incrementing the counters, notifying 
> progress came up. This may not be common in regular tez jobs. But in 
> processes like LLAP (hive based), it is possible to get into such situations. 
> I will attach the profiler snapshot showing this. It would be good to 
> update/notify less frequently.
> 3. Optimize mergeAll().



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to