[ 
https://issues.apache.org/jira/browse/TEZ-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111794#comment-16111794
 ] 

Muhammad Samir Khan edited comment on TEZ-3809 at 8/2/17 10:07 PM:
-------------------------------------------------------------------

I took a heap dump of the ordered word count job just before the final merge.
In the "After" case, one of the outputs was written to disk instead of being
kept in memory, which is why it has 37 entries instead of 38.

Before:

Class Name                                                                                                     | Shallow Heap | Retained Heap | Percentage
-----------------------------------------------------------------------------------------------------------------------------------------------------------
java.lang.Thread @ 0x5d2c473f8  ShuffleAndMergeRunner {Tokenizer} Thread                                       |          120 | 2,229,207,992 |     96.48%
|- java.util.ArrayList @ 0x73f978f10                                                                           |           24 | 2,229,206,760 |     96.48%
|  '- java.lang.Object[38] @ 0x73f979130                                                                       |          168 | 2,229,206,736 |     96.48%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e4a88898|           32 |    68,078,192 |      2.95%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631e0b260|           32 |    67,839,520 |      2.94%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e4a888b8|           32 |    67,700,608 |      2.93%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x73f9db168|           32 |    67,500,816 |      2.92%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60ab36218|           32 |    67,408,704 |      2.92%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631deed28|           32 |    67,367,424 |      2.92%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x743b86ee0|           32 |    67,337,936 |      2.91%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60af3a698|           32 |    67,300,896 |      2.91%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631e0c5b8|           32 |    67,282,464 |      2.91%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60ab33140|           32 |    67,264,304 |      2.91%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e4a88878|           32 |    67,127,368 |      2.91%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631e0b218|           32 |    67,098,216 |      2.90%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631e0c6c8|           32 |    67,064,504 |      2.90%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5d239a6c8|           32 |    67,003,776 |      2.90%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5d23b7e10|           32 |    66,965,296 |      2.90%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631def2b8|           32 |    66,928,032 |      2.90%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60ab351d0|           32 |    66,916,896 |      2.90%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x74805dfb8|           32 |    66,886,272 |      2.89%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60af39598|           32 |    66,718,800 |      2.89%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x73fb0fb78|           32 |    66,688,296 |      2.89%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631e0c4b0|           32 |    66,656,312 |      2.88%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60af39578|           32 |    66,629,936 |      2.88%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631deec30|           32 |    66,584,576 |      2.88%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631e0c680|           32 |    66,537,624 |      2.88%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60af3a620|           32 |    66,529,416 |      2.88%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x631e0c570|           32 |    66,229,848 |      2.87%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x73f9d8f08|           32 |    65,616,336 |      2.84%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60af39500|           32 |    63,794,840 |      2.76%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e4a7ed98|           32 |    60,026,520 |      2.60%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60ab36280|           32 |    50,731,856 |      2.20%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5d23c0da8|           32 |    48,164,288 |      2.08%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5d23b85b8|           32 |    46,712,064 |      2.02%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5d23b8310|           32 |    46,564,520 |      2.02%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e4a7e528|           32 |    45,990,432 |      1.99%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60aec1cf0|           32 |    45,642,312 |      1.98%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5d23c0a48|           32 |     8,122,976 |      0.35%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60ab362a0|           32 |     3,712,608 |      0.16%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x60aec1ca8|           32 |       481,784 |      0.02%
|     '- Total: 38 entries                                                                                     |              |               |
-----------------------------------------------------------------------------------------------------------------------------------------------------------

After:

Class Name                                                                                                     | Shallow Heap | Retained Heap | Percentage
-----------------------------------------------------------------------------------------------------------------------------------------------------------
java.lang.Thread @ 0x5ce7a24e8  ShuffleAndMergeRunner {Tokenizer} Thread                                       |          120 | 2,182,643,840 |     96.38%
|- java.util.ArrayList @ 0x7343b6ae8                                                                           |           24 | 2,182,642,112 |     96.38%
|  '- java.lang.Object[37] @ 0x7343b6c18                                                                       |          168 | 2,182,642,088 |     96.38%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x767263f70|           32 |    68,078,192 |      3.01%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x719c6dbe8|           32 |    67,839,512 |      3.00%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x7672080b0|           32 |    67,700,608 |      2.99%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x767204a18|           32 |    67,500,816 |      2.98%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e3138258|           32 |    67,408,704 |      2.98%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x73cc9bb50|           32 |    67,367,416 |      2.97%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e3140588|           32 |    67,337,928 |      2.97%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x767262eb8|           32 |    67,300,888 |      2.97%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x7155fa180|           32 |    67,282,464 |      2.97%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x767232980|           32 |    67,264,296 |      2.97%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x7632001d8|           32 |    67,127,360 |      2.96%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x738be08c8|           32 |    67,098,208 |      2.96%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x767204c50|           32 |    67,064,496 |      2.96%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x76725fb38|           32 |    67,003,776 |      2.96%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e31303e8|           32 |    66,965,296 |      2.96%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x763200058|           32 |    66,928,024 |      2.96%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e3140160|           32 |    66,916,888 |      2.95%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x715866b18|           32 |    66,886,272 |      2.95%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x721d7d950|           32 |    66,718,792 |      2.95%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x719c6b988|           32 |    66,688,296 |      2.94%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x734290ec8|           32 |    66,656,312 |      2.94%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5ce73f150|           32 |    66,629,936 |      2.94%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x767261d48|           32 |    66,584,576 |      2.94%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x721d7b6f0|           32 |    66,537,624 |      2.94%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x7632000d0|           32 |    66,529,416 |      2.94%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x767204dc0|           32 |    66,229,848 |      2.92%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x7349ccf30|           32 |    65,616,336 |      2.90%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x767260ca8|           32 |    63,794,832 |      2.82%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x76725f9f0|           32 |    60,026,512 |      2.65%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e31405c8|           32 |    50,731,848 |      2.24%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5ce73a870|           32 |    48,164,288 |      2.13%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5ce7525d8|           32 |    46,712,056 |      2.06%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e053c7d0|           32 |    45,990,432 |      2.03%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5ce74ae60|           32 |    45,642,312 |      2.02%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e317aba0|           32 |     8,122,976 |      0.36%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e316e3f8|           32 |     3,712,600 |      0.16%
|     |- org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput$InMemoryMapOutput @ 0x5e3177a48|           32 |       481,784 |      0.02%
|     '- Total: 37 entries                                                                                     |              |               |
-----------------------------------------------------------------------------------------------------------------------------------------------------------
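
As a rough cross-check of the two dumps, the sketch below subtracts the retained
sizes of the two Object[] arrays and compares the result with the one entry
(46,564,520 bytes) that appears only in the Before listing. The class and method
names are invented for illustration; only the numeric constants are copied from
the tables. Treating that entry as the output that went to disk is an
assumption based on the comment, not something the dump itself states.

```java
// Hypothetical helper: only the numeric constants come from the heap dumps.
final class HeapDumpDelta {
    // Retained heap of the Object[] that holds the InMemoryMapOutput entries.
    static final long BEFORE_ARRAY_RETAINED = 2_229_206_736L; // Object[38]
    static final long AFTER_ARRAY_RETAINED = 2_182_642_088L;  // Object[37]

    // Retained heap of the entry listed only in the Before dump
    // (assumed to be the output that was written to disk in the After run).
    static final long MISSING_ENTRY_RETAINED = 46_564_520L;

    // Total difference in retained heap between the two arrays.
    static long totalDelta() {
        return BEFORE_ARRAY_RETAINED - AFTER_ARRAY_RETAINED;
    }

    // Part of the difference that the missing entry does not account for.
    static long deltaNotExplainedByMissingEntry() {
        return totalDelta() - MISSING_ENTRY_RETAINED;
    }

    public static void main(String[] args) {
        System.out.println("total delta (bytes): " + totalDelta());
        System.out.println("remaining delta (bytes): "
                + deltaNotExplainedByMissingEntry());
    }
}
```

The remaining 128 bytes match the listings row by row: 16 of the 37 entries
common to both dumps are exactly 8 bytes smaller in the After case, and the
other 21 are unchanged. A 4-byte reduction showing up as either 0 or 8 bytes
would be consistent with 8-byte object alignment in the JVM, but that
explanation is an inference, not something the dump reports.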




> The buffer size allocated for InMemoryMapOutput can be optimized
> ----------------------------------------------------------------
>
>                 Key: TEZ-3809
>                 URL: https://issues.apache.org/jira/browse/TEZ-3809
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Muhammad Samir Khan
>            Assignee: Muhammad Samir Khan
>         Attachments: TEZ-3809.001.patch
>
>
> Related JIRAs: TEZ-3752 and TEZ-3732.
> - When shuffling input to memory, the decompressed length is used to create
>   the InMemoryMapOutput object. However, IFile.Reader's readToMemory reads 4
>   bytes less (the IFile header). These 4 bytes can be optimized away and, in
>   an extreme case of 10,000,000 fetches, this can save ~38 MB (TEZ-3732).
> - Memory-to-memory merge sums up the sizes of the input InMemoryMapOutput
>   buffers to allocate the new InMemoryMapOutput. However, each input has two
>   EOF_MARKERs, while only two are needed at the end.
> - InMemoryWriter wraps the output BoundedByteArrayOutputStream in an
>   IFileOutputStream, which writes a checksum at close. This creates an
>   inconsistency between the primary input buffers, which don't have a
>   checksum, and the merged buffers, which do. The IFileOutputStream wrapper
>   can be removed to save 4 bytes per merged buffer.
> - InMemoryWriter does not account for the two EOF_MARKERs written at close(),
>   so the getRawLength() method is off by two bytes.
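
The byte counts in the description can be worked through with a small sketch.
It is not Tez code: the class, constants, and method names are hypothetical,
and the sizes (a 4-byte header and 2 bytes for the pair of EOF_MARKERs) are
assumptions read off the description above.

```java
import java.util.Locale;

// Hypothetical arithmetic sketch; none of these names exist in Tez.
final class InMemoryBufferSizing {
    // Assumed from the description: readToMemory reads 4 bytes less than
    // the decompressed length used to size the buffer.
    static final int HEADER_BYTES = 4;
    // Assumed from the description: the two EOF_MARKERs written at close()
    // make getRawLength() "off by two bytes".
    static final int EOF_MARKER_PAIR_BYTES = 2;

    // Bytes saved overall if every fetch allocates HEADER_BYTES less.
    static long headerSavings(long fetches) {
        return fetches * HEADER_BYTES;
    }

    // Bytes over-allocated when a merge sums the sizes of its inputs:
    // every input carries its own marker pair, but only one pair is needed.
    static long redundantEofBytes(int inputs) {
        return inputs <= 1 ? 0L : (long) (inputs - 1) * EOF_MARKER_PAIR_BYTES;
    }

    public static void main(String[] args) {
        long saved = headerSavings(10_000_000L);
        double mib = saved / (1024.0 * 1024.0);
        System.out.println(String.format(Locale.ROOT,
                "10,000,000 fetches: %d bytes (about %.1f MiB)", saved, mib));
        System.out.println("merge of 38 inputs over-allocates "
                + redundantEofBytes(38) + " bytes");
    }
}
```

With these assumptions, 10,000,000 fetches give 40,000,000 bytes, which is the
"~38 MB" quoted in the description when expressed in units of 1024 * 1024 bytes.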



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
