lukecwik commented on code in PR #25338:
URL: https://github.com/apache/beam/pull/25338#discussion_r1097773372
##########
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java:
##########
@@ -219,6 +219,10 @@ public static void main(
ShortIdMap metricsShortIds = new ShortIdMap();
ExecutorService executorService =
options.as(ExecutorOptions.class).getScheduledExecutorService();
+ // In order to reduce memory spent on per-thread coders and buffers, we separate finalizations
+ // from standard bundle processing.
+ ExecutorService finalizationExecutorService =
Review Comment:
Yes, the default factory is called once to populate the value the first time
it is looked up, provided it was never set before.
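
To make that lookup behavior concrete, here is a minimal sketch. It is not
Beam's actual options implementation: `LazyDefault` and its methods are
hypothetical names used only for illustration. It assumes a value that can
either be set explicitly or be populated by a factory on first read.

```java
import java.util.function.Supplier;

/** Hypothetical holder for illustration; not a Beam class. */
final class LazyDefault<T> {
  private final Supplier<T> defaultFactory;
  private T value;
  private boolean populated;

  LazyDefault(Supplier<T> defaultFactory) {
    this.defaultFactory = defaultFactory;
  }

  /** An explicit value wins; the factory is then never consulted. */
  synchronized void set(T newValue) {
    value = newValue;
    populated = true;
  }

  /** Runs the factory at most once, on the first read of an unset value. */
  synchronized T get() {
    if (!populated) {
      value = defaultFactory.get();
      populated = true;
    }
    return value;
  }
}
```

Every later `get()` returns the same instance, which is why a single shared
executor falls out of this pattern unless a second option is added.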
One instance was used because it allowed users to configure it. Other than
that, there were some other minor benefits:
* it was used a lot for testing, where we can inject things like the executor
deep within the stack without needing to pass it through all the layers
* it simplified shutdown, as we only had one executor to manage for the worker
* it prevented people from creating their own, since it is easy to
misconfigure an executor so that it gets starved or spawns too many threads
Measuring the memory pinned by threads should be doable, but enumerating
thread locals is hacky:
https://stackoverflow.com/questions/2001353/java-list-thread-locals
Thread groups could be used in the same way: we would still create multiple
executors, but manage them centrally through one, so that each gets its own
dedicated set of threads, since you can create a named "sub-executor".
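
A rough sketch of what I mean by a centrally managed, named "sub-executor"
follows. `SubExecutors`, `create`, and `shutdownAll` are hypothetical names,
not existing Beam APIs, and the fixed pool size is an assumption made only to
keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical central registry for illustration; not a Beam class. */
final class SubExecutors {
  private final ThreadGroup root = new ThreadGroup("harness");
  private final List<ExecutorService> created = new ArrayList<>();

  /** Creates a pool whose threads are named and live in their own group. */
  synchronized ExecutorService create(String name, int threads) {
    ThreadGroup group = new ThreadGroup(root, name);
    AtomicInteger counter = new AtomicInteger();
    ThreadFactory factory =
        runnable -> {
          Thread t = new Thread(group, runnable, name + "-" + counter.incrementAndGet());
          t.setDaemon(true);
          return t;
        };
    ExecutorService pool = Executors.newFixedThreadPool(threads, factory);
    created.add(pool);
    return pool;
  }

  /** Single shutdown point for every pool that was handed out. */
  synchronized void shutdownAll() {
    for (ExecutorService pool : created) {
      pool.shutdown();
    }
  }
}
```

Something like `subExecutors.create("finalization", 1)` would then give
finalization its own threads, and therefore its own thread locals, while
shutdown still happens in one place.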
Can you share more about where we are using large thread locals? Maybe we
can tackle that directly instead.