zeroflag commented on a change in pull request #75:
URL: https://github.com/apache/tez/pull/75#discussion_r465108083
##########
File path: tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java
##########
@@ -621,7 +621,7 @@ public synchronized void serviceInit(final Configuration
conf) throws Exception
}
}
-
Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,
+ rawExecutor =
Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,
Review comment:
Ahh, right, that also seems to be a problem. Maybe doing it in batches
would work but that complicates the logic further.
If that too much change, maybe we should go back to the original problem and
try to solve it a different way. There are 2 phases
* Creating initializer instances
* Calling the initialize method on the created instance
The 2nd used to be done in the executor's thread, the 1st was moved there as
part of TEZ-4170. The reason was that in some cases (like HiveSplitGenerator)
instance creation (which imho not supposed be slow) is slow.
An alternative solution I proposed earlier is to move the expensive part out
of HiveSplitGenerator's constructor into it's initializers method (which was
already executed by the executor's thread even before TEZ-4170). That way it
wouldn't be needed to execute the 1st phase by the executor.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]