scwhittle commented on code in PR #30439:
URL: https://github.com/apache/beam/pull/30439#discussion_r1507573892
##########
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/BoundedQueueExecutor.java:
##########
@@ -104,6 +106,20 @@ public void forceExecute(Runnable work, long workBytes) {
executeLockHeld(work, workBytes);
}
+ // Set the maximum/core pool size of the executor.
+ public void setMaximumPoolSize(int maximumPoolSize) {
Review Comment:
sychronized?
seems safer when comparing the atomic with other things like pool size
Perhaps the atomics in this class should just be removed and protected by
synchronized. Using the atomics within synchronized block is more overhead and
these accessors reading the atomics are not called very frequently I believe.
##########
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/BoundedQueueExecutor.java:
##########
@@ -38,6 +38,7 @@ public class BoundedQueueExecutor {
private int elementsOutstanding = 0;
private long bytesOutstanding = 0;
private final AtomicInteger activeCount = new AtomicInteger();
+ private final AtomicInteger maximumThreadCount = new AtomicInteger();
Review Comment:
would be good to add a unit test covering the new methods to this class
directly
##########
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/BoundedQueueExecutor.java:
##########
@@ -70,7 +72,7 @@ protected void beforeExecute(Thread t, Runnable r) {
protected void afterExecute(Runnable r, Throwable t) {
super.afterExecute(r, t);
synchronized (this) {
- if (activeCount.getAndDecrement() == maximumPoolSize) {
+ if (activeCount.getAndDecrement() == maximumThreadCount.get()) {
Review Comment:
I think this might need to be
if (activeCount.getAndDecrement() <= maximumThreadCount.get() &&
startTimeMaxActiveThreadsUsed > 0)
you could maybe test this by adding more work to queue than the limit,
having the work that is scheduled block, then increase the limit and let all
the work complete. We should see totalTimeMaxActiveThreadsUsed incremented
Another way to fix it could be to modify the setMaximumPoolSize to reset and
count startTimeMaxActiveThreadsUsed if the limit increased (it will be
restarted shortly as new threads start and call beforeExecute).
##########
runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorkerTest.java:
##########
@@ -2916,6 +2916,69 @@ public void testActiveThreadMetric() throws Exception {
executor.shutdown();
}
+ @Test
+ public void testOverrideMaximumThreadCount() throws Exception {
+ int maxThreads = 5;
+ int threadExpirationSec = 60;
Review Comment:
this doesn't matter right? If so set it higher so it isn't possible that it
triggers.
##########
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/BoundedQueueExecutor.java:
##########
@@ -48,6 +49,7 @@ public BoundedQueueExecutor(
int maximumElementsOutstanding,
long maximumBytesOutstanding,
ThreadFactory threadFactory) {
+ this.maximumThreadCount.set(maximumPoolSize);
Review Comment:
how about
this.maximumThreadCount = new AtomicInteger(maximumPoolSize);
instead of assignment above
##########
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/BoundedQueueExecutor.java:
##########
@@ -123,6 +139,10 @@ public int activeCount() {
return activeCount.intValue();
}
+ public int maximumThreadCount() {
+ return maximumThreadCount.intValue();
Review Comment:
github won't let me add a comment, but above allThreadsActiveTime shoudl be
synchronized so it isn't racy reading that while other threads are modifying it.
can add a @GuardedBy annotation to totalTimeMaxActiveThreadsUsed
##########
runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorkerTest.java:
##########
@@ -2916,6 +2916,69 @@ public void testActiveThreadMetric() throws Exception {
executor.shutdown();
}
+ @Test
+ public void testOverrideMaximumThreadCount() throws Exception {
+ int maxThreads = 5;
+ int threadExpirationSec = 60;
+ CountDownLatch processStart1 = new CountDownLatch(2);
+ AtomicBoolean stop = new AtomicBoolean(false);
+ // setting up actual implementation of executor instead of mocking to keep
track of
+ // active thread count.
+ BoundedQueueExecutor executor =
Review Comment:
the test that BoundedQueueExecutor works when calling setMaximumPoolSize
seems like it should be in BoundedQueueExecutorTest.
The test at the streaming dataflow worker level seems like it should have
the work update client and validate plumbing of responses to setting the max on
the queue. The queue itself could be a mock in that test.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]