zeroflag commented on a change in pull request #75:
URL: https://github.com/apache/tez/pull/75#discussion_r465108083



##########
File path: tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java
##########
@@ -621,7 +621,7 @@ public synchronized void serviceInit(final Configuration 
conf) throws Exception
       }
     }
 
-    
Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,
+    rawExecutor = 
Executors.newFixedThreadPool(conf.getInt(TezConfiguration.TEZ_AM_DAG_APPCONTEXT_THREAD_COUNT_LIMIT,

Review comment:
       Ahh, right, that also seems to be a problem. Maybe doing it in batches 
would work but that complicates the logic further. 
   
   If that too much change, maybe we should go back to the original problem and 
try to solve it a different way. There are 2 phases
   * Creating initializer instances
   * Calling the initialize method on the created instance
   
   The 2nd used to be done in the executor's thread, the 1st was moved there as 
part of TEZ-4170. The reason was that in some cases (like HiveSplitGenerator) 
instance creation (which imho not supposed be slow) is slow.
   An alternative solution I proposed earlier is to move the expensive part out 
of HiveSplitGenerator's constructor into it's initializers method (which was 
already executed by the executor's thread even before TEZ-4170). That way it 
wouldn't be needed to execute the 1st phase by the executor.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to