pgaref commented on a change in pull request #75:
URL: https://github.com/apache/tez/pull/75#discussion_r464930639



##########
File path: tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java
##########
@@ -621,7 +621,7 @@ public synchronized void serviceInit(final Configuration 
conf) throws Exception
       }
     }
 
-    
Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,
+    rawExecutor = 
Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,

Review comment:
       Hey Attila, you are right, even though the number of threads is bounded, 
the number of pending tasks is unlimited.
   In the first version of the patch I did try to limit that as well -- maybe 
we could have a limit there as well 
    e.g., number_of_threads * 2 
https://github.com/apache/tez/pull/75/commits/8358151b555c036f33f2e131a33a246d3c006c5b#diff-755c0ec043a1800cd6cbf31823a59c8fR630
   
   While testing I realized that as we maintain an __initializerMap__ with  all 
the vertex initializers in memory, will can still hit memory issues with larger 
number of inputs dirs. Any thoughts how we could avoid that here? 
   
https://github.com/apache/tez/blob/9d2b61b576a2421ec4fb813489d896d2b89fcce9/tez-dag/src/main/java/org/apache/tez/dag/app/dag/RootInputInitializerManager.java#L136




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to