pgaref commented on a change in pull request #75:
URL: https://github.com/apache/tez/pull/75#discussion_r464930639
##########
File path: tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java
##########
@@ -621,7 +621,7 @@ public synchronized void serviceInit(final Configuration conf) throws Exception
}
}
- Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,
+ rawExecutor = Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,
Review comment:
Hey Attila, you are right: even though the number of threads is bounded,
the number of pending tasks is unlimited.
In the first version of the patch I did try to limit that as well by
blocking when there are no available slots. Maybe we could have a limit there,
e.g., number_of_threads * 2:
https://github.com/apache/tez/pull/75/commits/8358151b555c036f33f2e131a33a246d3c006c5b#diff-755c0ec043a1800cd6cbf31823a59c8fR630
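For illustration only, the blocking idea could look roughly like the sketch below. It is not the code from the linked commit: `BoundedExecutor` and its methods are hypothetical names, and using a `Semaphore` with `threads * 2` permits is just one assumed way to cap running plus queued tasks.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not part of Tez): caps running + queued tasks at threads * 2. */
class BoundedExecutor {
  private final ExecutorService delegate;
  private final Semaphore permits;
  private final int limit;

  BoundedExecutor(int threads) {
    this.limit = threads * 2; // assumed limit, as suggested in the comment above
    this.delegate = Executors.newFixedThreadPool(threads);
    this.permits = new Semaphore(limit);
  }

  /** Blocks the caller until a slot is free, then hands the task to the pool. */
  void submit(Runnable task) throws InterruptedException {
    permits.acquire();
    try {
      delegate.execute(() -> {
        try {
          task.run();
        } finally {
          permits.release(); // free the slot once the task has finished
        }
      });
    } catch (RuntimeException e) {
      permits.release(); // the task was rejected, so give the slot back
      throw e;
    }
  }

  /** Number of tasks currently running or waiting in the pool's queue. */
  int inFlight() {
    return limit - permits.availablePermits();
  }

  /** Stops accepting tasks and waits up to 10 seconds for submitted ones to finish. */
  boolean shutdownAndAwait() throws InterruptedException {
    delegate.shutdown();
    return delegate.awaitTermination(10, TimeUnit.SECONDS);
  }

  public static void main(String[] args) throws InterruptedException {
    BoundedExecutor executor = new BoundedExecutor(2);
    AtomicInteger completed = new AtomicInteger();
    for (int i = 0; i < 10; i++) {
      executor.submit(completed::incrementAndGet); // blocks while 4 tasks are in flight
    }
    executor.shutdownAndAwait();
    System.out.println("completed tasks: " + completed.get());
  }
}
```

With this shape the submitting thread is throttled instead of the queue growing without bound; the trade-off is that the caller stalls whenever the pool falls behind.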
Moreover, while testing I realized that, since we keep an __initializerMap__
with all the vertex initializers in memory, we can still hit memory issues
with a larger number of input dirs. Any thoughts on how we could handle that here?
https://github.com/apache/tez/blob/9d2b61b576a2421ec4fb813489d896d2b89fcce9/tez-dag/src/main/java/org/apache/tez/dag/app/dag/RootInputInitializerManager.java#L136