[GitHub] [beam] lukecwik commented on a diff in pull request #25338: Change to a separate UnboundedScheduledExecutor for finalizations.

via GitHub Mon, 06 Feb 2023 10:33:02 -0800


lukecwik commented on code in PR #25338:
URL: https://github.com/apache/beam/pull/25338#discussion_r1097773372



##########
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java:
##########
@@ -219,6 +219,10 @@ public static void main(
     ShortIdMap metricsShortIds = new ShortIdMap();
     ExecutorService executorService =
         options.as(ExecutorOptions.class).getScheduledExecutorService();
+    // In order to reduce memory spent on per-thread coders and buffers, we separate finalizations
+    // from standard bundle processing.
+    ExecutorService finalizationExecutorService =

Review Comment:
   Yes, the default factory is called once to populate the value the first time it is looked up, if it was never set before.
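
   To illustrate the "populate once on first lookup" behavior, here is a minimal sketch using only the JDK. `LazyDefault` is a hypothetical holder made up for this example; it is not how `PipelineOptions` is implemented, only the same idea in miniature:

```java
import java.util.function.Supplier;

/**
 * Hypothetical holder (not a Beam class): the value is either set explicitly
 * or produced exactly once by a default factory on the first lookup.
 */
final class LazyDefault<T> {
  private final Supplier<T> defaultFactory;
  private T value;
  private boolean populated;

  LazyDefault(Supplier<T> defaultFactory) {
    this.defaultFactory = defaultFactory;
  }

  /** An explicit value wins; the factory is then never consulted. */
  synchronized void set(T explicit) {
    value = explicit;
    populated = true;
  }

  /** The first call runs the factory if nothing was set; later calls reuse the result. */
  synchronized T get() {
    if (!populated) {
      value = defaultFactory.get();
      populated = true;
    }
    return value;
  }
}
```

   With this shape, two lookups return the same instance, so a second lookup for a finalization executor would hand back the executor that was already created rather than a new one.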
   
   One instance was used because it allowed users to configure it. Other than that, there were some minor benefits:
   * it was used a lot for testing, where we can inject things like the executor deep within the stack without needing to pass it through all the layers
   * it simplified shutdown, as we only had one executor to manage for the worker
   * it prevented people from creating their own, since it is easy to misconfigure an executor so that it gets starved or spawns too many threads
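
   As an example of the last point, here is a small JDK-only sketch of one well-known misconfiguration: a `ThreadPoolExecutor` with an unbounded queue never grows past its core size, whatever the maximum says. The class name, method name, and pool sizes are made up for illustration:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class StarvedPoolDemo {
  /** Submits {@code tasks} blocked tasks and returns how many threads the pool created. */
  static int poolSizeWithUnboundedQueue(int tasks) throws InterruptedException {
    // core = 1, max = 16, but the unbounded queue accepts every task,
    // so the pool never has a reason to add threads beyond the core size.
    ThreadPoolExecutor pool =
        new ThreadPoolExecutor(1, 16, 1, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    CountDownLatch release = new CountDownLatch(1);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.execute(
            () -> {
              try {
                release.await(); // keep the worker busy until released
              } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
              }
            });
      }
      return pool.getPoolSize();
    } finally {
      release.countDown();
      pool.shutdown();
      pool.awaitTermination(5, TimeUnit.SECONDS);
    }
  }
}
```

   If the queued tasks depended on each other, the single worker would be stuck forever. The opposite mistake, an unbounded cached pool fed by a bursty producer, gives the "too many threads" case.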
   
   Measuring how much memory is pinned by threads should be doable, but enumerating thread locals is hacky:
https://stackoverflow.com/questions/2001353/java-list-thread-locals
   
   Thread groups could achieve the same thing: we would still create multiple executors, but manage them centrally through one, so that each named "sub-executor" gets its own dedicated set of threads.
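
   A rough sketch of what I mean, using only the JDK. `ExecutorRegistry` and its methods are hypothetical names for this example and not an existing Beam API; the choice of a cached thread pool is also just an assumption:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical central owner of several named executors, each with its own threads. */
final class ExecutorRegistry {
  private final Map<String, ExecutorService> executors = new ConcurrentHashMap<>();

  /** Threads land in a ThreadGroup named after the sub-executor, e.g. "finalization-1". */
  private static ThreadFactory factory(String name) {
    ThreadGroup group = new ThreadGroup(name);
    AtomicInteger counter = new AtomicInteger();
    return runnable -> {
      Thread thread = new Thread(group, runnable, name + "-" + counter.incrementAndGet());
      thread.setDaemon(true);
      return thread;
    };
  }

  /** Returns the sub-executor for {@code name}, creating it on first use. */
  ExecutorService named(String name) {
    return executors.computeIfAbsent(name, n -> Executors.newCachedThreadPool(factory(n)));
  }

  /** A single shutdown point for the worker, no matter how many sub-executors exist. */
  void shutdownAll() {
    executors.values().forEach(ExecutorService::shutdown);
  }
}
```

   Bundle processing and finalization would then ask for `named("bundle-processing")` and `named("finalization")`, so their thread locals never share a thread, while tests and shutdown still go through one object.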
   
   Can you share more about where we are using large thread locals? Maybe we can tackle that directly instead.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
