Tsuyoshi OZAWA created TEZ-1806:
-----------------------------------

             Summary: Out of Memory with large TEZ_RUNTIME_IO_SORT_MB
                 Key: TEZ-1806
                 URL: https://issues.apache.org/jira/browse/TEZ-1806
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Tsuyoshi OZAWA


When I allocated 4 GB to each container and set TEZ_RUNTIME_IO_SORT_MB to
1.5 GB, the task failed with an OutOfMemoryError. I think it would be better to
derive the value of TEZ_RUNTIME_IO_SORT_MB automatically from the container size.
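
As a rough illustration of the suggestion above, the following sketch derives a sort buffer size from the container size. It is not Tez code: the class name `SortBufferSizing`, the method `suggestSortMb`, and the 80% heap and 40% sort-buffer fractions are all hypothetical values chosen for the example, and a real implementation would need to account for other memory consumers in the task.

```java
// Hypothetical helper, not part of Tez: derives a sort buffer size (in MB)
// from the container size instead of relying on a hand-picked value.
final class SortBufferSizing {
    // Assumed share of the container that is given to the JVM heap (percent).
    static final int HEAP_PERCENT = 80;
    // Assumed share of the heap that the sort buffer may use (percent).
    static final int SORT_PERCENT = 40;
    // Assumed upper bound so the buffer size in bytes still fits in an int.
    static final int MAX_SORT_MB = 2047;

    static int suggestSortMb(int containerMb) {
        if (containerMb <= 0) {
            throw new IllegalArgumentException("containerMb must be positive");
        }
        long heapMb = (long) containerMb * HEAP_PERCENT / 100;
        long sortMb = heapMb * SORT_PERCENT / 100;
        // Clamp to [1, MAX_SORT_MB] so the result is always usable.
        return (int) Math.max(1, Math.min(sortMb, MAX_SORT_MB));
    }

    public static void main(String[] args) {
        // Container size from this report: 4 GB.
        System.out.println(suggestSortMb(4096));
    }
}
```

Under these assumed fractions, a 4096 MB container gets a 1310 MB sort buffer, which is below the 1536 MB that was configured manually in this report.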

```
14/11/28 03:50:00 INFO tez.DAGBuilder: DAG execution complete
14/11/28 03:50:00 ERROR tez.DAGBuilder: DAG diagnostics: [Vertex failed, vertexName=2, vertexId=vertex_1417036912823_0055_1_01, diagnostics=[Task failed, taskId=task_1417036912823_0055_1_01_000003, diagnostics=[TaskAttempt 0 failed, info=[Error: Fatal Error cause TezChild exit.:java.lang.OutOfMemoryError: Java heap space
        at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.<init>(DefaultSorter.java:140)
        at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.start(OrderedPartitionedKVOutput.java:114)
        at org.apache.tez.runtime.library.processor.SimpleProcessor.preOp(SimpleProcessor.java:78)
        at org.apache.tez.runtime.library.processor.SimpleProcessor.run(SimpleProcessor.java:52)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
```



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
