codelipenghui opened a new pull request, #15162: URL: https://github.com/apache/pulsar/pull/15162
### Motivation

https://github.com/apache/pulsar/pull/13023 introduced a performance regression. For each message, we switch from the external thread pool -> internal thread pool -> external thread pool.

Previously, we wanted to control the number of outstanding messages for a consumer that uses a listener, so after #11455 messages are no longer moved from the receiver queue to the external executor. #13023 then moved the listener trigger into the internal thread pool to fix an ordering issue, and that is the root cause of the performance regression.

Here is the flame graph showing the stack frames of the internal thread and the external thread:
[framegraph.html.txt](https://github.com/apache/pulsar/files/8483765/framegraph.html.txt)

This PR also fixes the performance issue for multi-topic consumers and key-shared subscriptions that have message listeners enabled. Before this change, messages were processed serially. After this change, parallelism improves while ordering is still guaranteed.

### Modification

- Remove the `isListenerHandlingMessage` control
- Move messages from the receiver queue to the queue of the external executor, but do not increase permits
- Increase permits before calling the message listener

After the above changes, we no longer need to call `triggerListener` from the external executor.

Here is the flame graph after applying this change:
[framegraph2.html.txt](https://github.com/apache/pulsar/files/8483771/framegraph2.html.txt)

Before this change, the consumer can't reach 50000 messages/s.
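The modification steps listed above can be illustrated with a minimal, self-contained sketch. This is not the actual `ConsumerImpl` code: the class `ListenerDispatchSketch`, its fields, and its methods are hypothetical names chosen for illustration, and a single-threaded executor is assumed as the stand-in for the external executor that preserves ordering. The sketch only shows the accounting idea: messages are handed to the external executor without touching permits, and the permit is increased on the executor thread right before the listener is invoked.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical model of the dispatch path; not the real Pulsar client code. */
class ListenerDispatchSketch {
    // Stand-in for the consumer's receiver queue (used from one thread in this sketch).
    private final Queue<String> receiverQueue = new ArrayDeque<>();
    // Stand-in for the permits that would be reported back to the broker.
    private final AtomicInteger availablePermits = new AtomicInteger();
    // Assumption: one thread per consumer keeps listener calls in order.
    private final ExecutorService externalExecutor = Executors.newSingleThreadExecutor();
    private final Consumer<String> listener;

    ListenerDispatchSketch(Consumer<String> listener) {
        this.listener = listener;
    }

    void enqueue(String message) {
        receiverQueue.add(message);
    }

    int permits() {
        return availablePermits.get();
    }

    /** Moves every queued message to the external executor without increasing permits. */
    void dispatchAll() {
        String next;
        while ((next = receiverQueue.poll()) != null) {
            final String message = next;
            externalExecutor.execute(() -> {
                // The permit is increased only now, just before the listener runs.
                availablePermits.incrementAndGet();
                listener.accept(message);
            });
        }
    }

    void shutdownAndWait() throws InterruptedException {
        externalExecutor.shutdown();
        externalExecutor.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        ListenerDispatchSketch consumer = new ListenerDispatchSketch(seen::add);
        consumer.enqueue("m1");
        consumer.enqueue("m2");
        consumer.enqueue("m3");
        consumer.dispatchAll();
        consumer.shutdownAndWait();
        System.out.println(seen + " permits=" + consumer.permits()); // [m1, m2, m3] permits=3
    }
}
```

Because permits only grow when a message is about to be handled, the number of outstanding messages stays bounded even though the messages already sit in the executor's queue, and no hop back to an internal thread is needed per message.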
After this change, the consumer can reach 400000 messages/s:

```
2022-04-14T02:14:58,208+0800 [main] INFO org.apache.pulsar.testclient.PerformanceConsumer - Throughput received: 9723124 msg --- 470142.670 msg/s --- 3.587 Mbit/s --- Latency: mean: 9476.742 ms - med: 9441 - 95pct: 11908 - 99pct: 12152 - 99.9pct: 12239 - 99.99pct: 12247 - Max: 12247
2022-04-14T02:15:08,222+0800 [main] INFO org.apache.pulsar.testclient.PerformanceConsumer - Throughput received: 14411888 msg --- 468147.684 msg/s --- 3.572 Mbit/s --- Latency: mean: 15262.627 ms - med: 15253 - 95pct: 18023 - 99pct: 18258 - 99.9pct: 18315 - 99.99pct: 18317 - Max: 18318
2022-04-14T02:15:18,236+0800 [main] INFO org.apache.pulsar.testclient.PerformanceConsumer - Throughput received: 18841513 msg --- 442446.540 msg/s --- 3.376 Mbit/s --- Latency: mean: 21164.401 ms - med: 21094 - 95pct: 23664 - 99pct: 23899 - 99.9pct: 23939 - 99.99pct: 23955 - Max: 23955
2022-04-14T02:15:28,253+0800 [main] INFO org.apache.pulsar.testclient.PerformanceConsumer - Throughput received: 23082078 msg --- 423212.525 msg/s --- 3.229 Mbit/s --- Latency: mean: 27174.714 ms - med: 27272 - 95pct: 29453 - 99pct: 29698 - 99.9pct: 29725 - 99.99pct: 29736 - Max: 29736
2022-04-14T02:15:38,268+0800 [main] INFO org.apache.pulsar.testclient.PerformanceConsumer - Throughput received: 27647013 msg --- 455823.127 msg/s --- 3.478 Mbit/s --- Latency: mean: 32438.418 ms - med: 32410 - 95pct: 34870 - 99pct: 35098 - 99.9pct: 35130 - 99.99pct: 35133 - Max: 35134
```

### Documentation

Check the box below or label this PR directly.

Need to update docs?

- [ ] `doc-required`
  (Your PR needs to update docs and you will update later)
- [x] `no-need-doc`
  (Please explain why)
- [ ] `doc`
  (Your PR contains doc changes)
- [ ] `doc-added`
  (Docs have been already added)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
