[ 
https://issues.apache.org/jira/browse/TEZ-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166857#comment-17166857
 ] 

Rajesh Balamohan edited comment on TEZ-4208 at 7/29/20, 3:59 AM:
-----------------------------------------------------------------

Q67 runtime with and without the patch on an internal cluster at 10 TB scale:
|| ||Without Patch||With Patch||
|Job runtime (seconds)|1961.63|1656.14|
|TaskCounter_Map_1_OUTPUT_Reducer_2| | |
|OUTPUT_BYTES_PHYSICAL|457771151796|311823523913|
|OUTPUT_RECORDS|20169930972|20169930972|
|SHUFFLE_CHUNK_COUNT|37776|5193|
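
To make the size of the improvement easier to read, the relative reductions can be derived from the figures in the table. The snippet below is only a convenience sketch: the numbers are copied from the table, while the dictionary layout and the `reduction_pct` helper are illustrative names, not part of Tez.

```python
# Figures copied from the table above: (without patch, with patch).
metrics = {
    "Job runtime (s)": (1961.63, 1656.14),
    "OUTPUT_BYTES_PHYSICAL": (457771151796, 311823523913),
    "OUTPUT_RECORDS": (20169930972, 20169930972),
    "SHUFFLE_CHUNK_COUNT": (37776, 5193),
}


def reduction_pct(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Hypothetical helper structure: metric name -> percentage reduction.
reductions = {name: reduction_pct(b, a) for name, (b, a) in metrics.items()}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower with the patch")
```

With these inputs the runtime drops by about 15.6%, physical output bytes by about 31.9%, and the shuffle chunk count by about 86.3%, while the record count is unchanged.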


> Pipelinesorter uses single SortSpan after spill
> -----------------------------------------------
>
>                 Key: TEZ-4208
>                 URL: https://issues.apache.org/jira/browse/TEZ-4208
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Priority: Major
>         Attachments: TEZ-4208.1.patch, q67_sorter.log
>
>
> Although it may have created multiple spans, Tez always uses only the first
> span after a spill. Other spans can well be larger than the first one because
> of progressive space allocation. Fixing this would reduce the number of
> spills (depending on the job) and lower the load on index cache entries,
> since fewer files have to be opened.
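
The effect described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Tez code: the span sizes, the data volume, and the `count_spills` function are all hypothetical, and the model assumes that the first pass fills every span and that each later spill is bounded by whatever buffer space is reused.

```python
import math


def count_spills(total_mb, span_sizes_mb, reuse_all_spans):
    """Count spills in a toy model of a sorter with progressively sized spans.

    Assumption: the first pass fills every span once. After that, each spill
    holds either only the first span (reuse_all_spans=False) or the combined
    capacity of all spans (reuse_all_spans=True).
    """
    first_fill = sum(span_sizes_mb)
    if total_mb <= first_fill:
        return 1  # everything fits in the initial pass; one final flush
    remaining = total_mb - first_fill
    per_spill = first_fill if reuse_all_spans else span_sizes_mb[0]
    return 1 + math.ceil(remaining / per_spill)


# Hypothetical progressive allocation: each span is twice the previous one.
spans = [16, 32, 64, 128]  # MB, illustrative values only
data = 2400                # MB of map output, illustrative value only

first_only = count_spills(data, spans, reuse_all_spans=False)
all_spans = count_spills(data, spans, reuse_all_spans=True)
print(f"first span only: {first_only} spills, all spans: {all_spans} spills")
```

In this model, reusing only the smallest span produces far more spill files than reusing the full allocated space, which matches the direction of the SHUFFLE_CHUNK_COUNT change reported in the comment; the actual ratio depends on the job and on how the sorter allocates memory.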



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
