This is an automated email from the ASF dual-hosted git repository.
mmarshall pushed a commit to branch branch-2.8
in repository https://gitbox.apache.org/repos/asf/pulsar.git
The following commit(s) were added to refs/heads/branch-2.8 by this push:
new 42422d8 Use `scheduleWithFixedDelay` instead of `scheduleAtFixedRate`
for java producer batch timer (#14125)
42422d8 is described below
commit 42422d84ab5c6d24b57138c39453b45d7dcfba35
Author: Kai <[email protected]>
AuthorDate: Fri Feb 4 15:29:23 2022 -0800
Use `scheduleWithFixedDelay` instead of `scheduleAtFixedRate` for java
producer batch timer (#14125)
Fixes #11100
### Motivation
We believe that the use of `scheduleAtFixedRate` in the java producer's
batch timer can result in unnecessarily high thread usage, which can become
especially problematic for applications that start many producers.
### Modifications
Replaced the use of `scheduleAtFixedRate` with `scheduleWithFixedDelay`,
which is the same behavior as previously in 2.6.x. The producer's parameter
`batchingMaxPublishDelay` implies the use of the "delay" method instead of
"rate" method as well.
### Verifying this change
- [x] Make sure that the change passes the CI checks.
This change is already covered by existing tests, such as existing pulsar
client producer tests.
Testing of the performance regression can be demonstrated by using
[this](https://github.com/klevy-toast/dropwizard-pulsar-test) artifact and
comparing a recent release of pulsar client with a manually built SNAPSHOT
version with this change:
#### Version 2.7.1 CPU & thread behavior
- While sending messages
<img width="1632" alt="image"
src="https://user-images.githubusercontent.com/42187013/152588959-8ee4beb9-70f3-4ad8-9132-240d4498dda5.png">
- While running idle producers
<img width="1613" alt="image"
src="https://user-images.githubusercontent.com/42187013/152589079-b45fce49-757a-4bfd-8ddd-c438774ecf41.png">
- 30 second profile while sending messages
<img width="1295" alt="image"
src="https://user-images.githubusercontent.com/42187013/152589222-54732bf3-44d7-40b8-8c6b-03b54ba01090.png">
#### Version 2.10.0-SNAPSHOT CPU & thread behavior
- While sending messages
<img width="1615" alt="image"
src="https://user-images.githubusercontent.com/42187013/152589391-ae243e7a-5f1f-40b7-a77c-7e3d12a84c8e.png">
- While running idle producers
<img width="1603" alt="image"
src="https://user-images.githubusercontent.com/42187013/152589436-784d9c56-043e-41fa-95e8-6a721e0adc78.png">
- 30 second profile while sending messages
<img width="1289" alt="image"
src="https://user-images.githubusercontent.com/42187013/152589619-f274545d-b9f9-48e8-8b02-e226c6dec59e.png">
These samples show fewer threads running with this change compared to
2.7.1, less time spend in `batchMessageAndSend`, and overall lower CPU usage --
note that this testing was done on a laptop machine, and we have observed,
along with [other
users](https://github.com/apache/pulsar/issues/11100#issuecomment-1007487433)
that this CPU regression can be much worse with more producers, smaller batch
intervals, and on deployed cloud applications.
### Does this pull request potentially affect one of the following parts:
*If `yes` was chosen, please highlight the changes*
- Dependencies (does it add or upgrade a dependency): (no)
- The public API: (no)
- The schema: (no)
- The default values of configurations: (no)
- The wire protocol: (no)
- The rest endpoints: (no)
- The admin cli options: (no)
- Anything that affects deployment: (no)
### Documentation
Check the box below or label this PR directly (if you have committer
privilege).
Need to update docs?
- [x] `no-need-doc`
The general behavior of the batch timer feature should not be changing
(cherry picked from commit f9e69f08c21a57980ef6f6fbb9b8c974621d9bab)
---
.../src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
a/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java
b/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java
index cb1bc47..6b23565 100644
---
a/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java
+++
b/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java
@@ -1444,7 +1444,7 @@ public class ProducerImpl<T> extends ProducerBase<T>
implements TimerTask, Conne
if (!producerCreatedFuture.isDone() &&
isBatchMessagingEnabled()) {
// schedule the first batch message task
batchTimerTask = cnx.ctx().executor()
-
.scheduleAtFixedRate(catchingAndLoggingThrowables(() -> {
+
.scheduleWithFixedDelay(catchingAndLoggingThrowables(() -> {
if (log.isTraceEnabled()) {
log.trace(
"[{}] [{}] Batching the
messages from the batch container from "