lhotari commented on PR #23983: URL: https://github.com/apache/pulsar/pull/23983#issuecomment-2657398617
> An extra context switch for each entry is costly, especially when you have many small entries and little or no batching. That's why we put it on the same thread.

@merlimat The thread switching was added in PR https://github.com/apache/pulsar/pull/9039, already in December 2020. The reason for this change is a performance concern about the #23940 changes, which removed the thread switching.

https://github.com/apache/pulsar/blob/ee5b13af5cf229c2e4846c6d34ebda59eb82330a/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L796-L826

In Pulsar use cases, synchronization on CPU-intensive operations (or blocking IO operations) in Netty IO threads could cause performance regressions. In this case, it would impact use cases where a large number of producers produce to a single topic. Blocking IO threads has a broader impact, since it affects the Netty IO of all connections sharing the same IO thread.

Before #23940, the code looked like this:

https://github.com/apache/pulsar/blob/7a79c78f8e6f4b52f13be1c6441f4b007d9a00fe/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L796-L810

Btw, in the Pulsar code base we have a problem with how IO threads are used: they process work that shouldn't be handled by IO threads at all. I have created an issue, #23865. There should be a separate thread pool for running blocking operations and CPU-intensive synchronized operations.
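
To make the trade-off concrete, here is a minimal sketch of the two options discussed above: running a synchronized section directly on the IO thread versus handing it off to a separate pool. This is not the `ManagedLedgerImpl` code. The class, the `addEntry` method, the lock, and the executor names are all hypothetical, and a plain single-threaded executor stands in for a Netty IO thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OffloadSketch {
    // Daemon threads so the JVM can exit even if the executors are not shut down.
    static ExecutorService daemonPool(int size, String name) {
        return Executors.newFixedThreadPool(size, r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true);
            return t;
        });
    }

    // Assumption: a single-threaded executor stands in for one Netty IO thread.
    static final ExecutorService ioThread = daemonPool(1, "io-thread");
    // Assumption: a separate pool for blocking or CPU-intensive synchronized work.
    static final ExecutorService workerPool = daemonPool(2, "worker-thread");

    static final Object ledgerLock = new Object();

    // Hypothetical synchronized operation; returns the name of the thread it ran on.
    static String addEntry(byte[] payload) {
        synchronized (ledgerLock) {
            return Thread.currentThread().getName();
        }
    }

    // Option 1: no thread switch. The synchronized section runs on the IO thread,
    // so lock contention stalls every connection sharing that thread.
    static CompletableFuture<String> handleInline(byte[] payload) {
        return CompletableFuture.supplyAsync(() -> addEntry(payload), ioThread);
    }

    // Option 2: one extra context switch per entry. The IO thread only hands
    // the payload over, and the synchronized section runs on the worker pool.
    static CompletableFuture<String> handleOffloaded(byte[] payload) {
        return CompletableFuture.supplyAsync(() -> payload, ioThread)
                .thenApplyAsync(OffloadSketch::addEntry, workerPool);
    }

    public static void main(String[] args) {
        byte[] entry = {1, 2, 3};
        System.out.println("inline ran on:    " + handleInline(entry).join());
        System.out.println("offloaded ran on: " + handleOffloaded(entry).join());
    }
}
```

The sketch only shows where the synchronized section executes. Whether the extra context switch costs more than contention on the IO thread depends on the workload (entry size, batching, number of producers per topic) and would need to be measured.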
