pgaref commented on a change in pull request #75:
URL: https://github.com/apache/tez/pull/75#discussion_r464930639
##########
File path: tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java
##########
@@ -621,7 +621,7 @@ public synchronized void serviceInit(final Configuration
conf) throws Exception
}
}
-
Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,
+ rawExecutor =
Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,
Review comment:
Hey Attila, you are right, even though the number of threads is bounded,
the number of pending tasks is unlimited.
In the first version of the patch I did try to limit that as well -- maybe
we could have a limit there as well
e.g., number_of_threads * 2
https://github.com/apache/tez/pull/75/commits/8358151b555c036f33f2e131a33a246d3c006c5b#diff-755c0ec043a1800cd6cbf31823a59c8fR630
While testing I realized that as we maintain an __initializerMap__ with all
the vertex initializers in memory, will can still hit memory issues with larger
number of inputs dirs. Any thoughts how we could avoid that here?
https://github.com/apache/tez/blob/9d2b61b576a2421ec4fb813489d896d2b89fcce9/tez-dag/src/main/java/org/apache/tez/dag/app/dag/RootInputInitializerManager.java#L136
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]