Vanlightly opened a new pull request #11570: URL: https://github.com/apache/pulsar/pull/11570
Fixes #11496. Also matches part of PIP 79.

This is a C++ implementation that closely matches the proposed Java client changes from "Reducing partitioned producer connections and lookups" (PR 10279).

### Motivation

Producers that send messages to partitioned topics start a producer per partition, even when using single partition routing. For topics that combine a large number of producers with a large number of partitions, this can put strain on the brokers. With, say, 1000 partitions and single partition routing with non-keyed messages, each producer performs 999 topic owner lookups and producer registrations that could be avoided.

PIP 79 also describes this problem; I wrote this change before realising that. This implementation can be reviewed and contrasted with the Java client implementation in https://github.com/apache/pulsar/pull/10279.

### Modifications

Allows partitioned producers to start the producers for individual partitions lazily. Starting a producer involves a topic owner lookup to find out which broker owns the partition, followed by registering the producer for that partition with the owner broker. For topics with many partitions, when using SinglePartition routing without keyed messages, all of these lookups and producer registrations are wasted except those for the single chosen partition.

This change lets the user control whether a producer on a partitioned topic uses this lazy start, via a new config in ProducerConfiguration. When ProducerConfiguration.setLazyStartPartitionedProducers(true) is set, PartitionedProducerImpl.start() becomes a synchronous operation that only does housekeeping (no network operations). The producer for any given partition is started (which includes a topic owner lookup and registration) when the first message is sent to that partition. While the producer starts, messages are buffered.
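
To make the lazy-start behaviour described above concrete, here is a minimal, self-contained sketch. It is a toy model and not the code in this PR: `ToyPartitionedProducer`, `ToyPartition`, `completeStart` and the counters are hypothetical names invented for illustration, and the "lookup and registration" is reduced to incrementing a counter. It only shows the intended effect: in eager mode every partition is started up front, while in lazy mode a partition is started on its first send and messages are buffered until that start completes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: these are NOT the Pulsar client classes.
enum class State { NotStarted, Starting, Ready };

struct ToyPartition {
    State state = State::NotStarted;
    std::vector<std::string> buffered;   // messages held while the producer starts
    std::vector<std::string> delivered;  // messages handed to the imaginary broker
};

class ToyPartitionedProducer {
public:
    ToyPartitionedProducer(std::size_t numPartitions, bool lazyStart)
        : partitions_(numPartitions) {
        if (!lazyStart) {
            // Eager mode: every partition is looked up and registered up front.
            for (std::size_t i = 0; i < partitions_.size(); ++i) {
                beginStart(i);
                completeStart(i);
            }
        }
        // Lazy mode: housekeeping only, no simulated network operations here.
    }

    void send(std::size_t partition, const std::string& msg) {
        assert(partition < partitions_.size());
        ToyPartition& p = partitions_[partition];
        if (p.state == State::NotStarted) {
            beginStart(partition);  // first message to this partition triggers the start
        }
        if (p.state == State::Ready) {
            p.delivered.push_back(msg);
        } else {
            p.buffered.push_back(msg);  // still starting: hold the message
        }
    }

    // Simulates the broker confirming the producer registration.
    void completeStart(std::size_t partition) {
        assert(partition < partitions_.size());
        ToyPartition& p = partitions_[partition];
        if (p.state != State::Starting) return;
        p.state = State::Ready;
        for (const std::string& m : p.buffered) p.delivered.push_back(m);
        p.buffered.clear();
    }

    std::size_t startsIssued() const { return startsIssued_; }
    std::size_t bufferedCount(std::size_t partition) const {
        return partitions_.at(partition).buffered.size();
    }
    std::size_t deliveredCount(std::size_t partition) const {
        return partitions_.at(partition).delivered.size();
    }

private:
    void beginStart(std::size_t partition) {
        ++startsIssued_;  // stands in for one topic owner lookup plus one registration
        partitions_[partition].state = State::Starting;
    }

    std::vector<ToyPartition> partitions_;
    std::size_t startsIssued_ = 0;
};
```

In this model, constructing the producer with 1000 partitions in eager mode issues 1000 starts, whereas the lazy variant issues none until `send` is called and then only one for the chosen partition. In the real client the behaviour is selected through `ProducerConfiguration.setLazyStartPartitionedProducers(true)`, as described above.
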
The sendTimeout timer is only activated once a producer has been fully started, which should give enough time for any buffered messages to be sent. For very short send timeouts, this setting could cause send timeouts during the start phase. The default of 30s should however not cause this issue.

### Verifying this change

This change added tests and can be verified as follows:

- BasicEndToEndTest, testPartitionedProducerConsumer
- BasicEndToEndTest, testSyncFlushBatchMessagesPartitionedTopicLazyProducers
- BasicEndToEndTest, testFlushInLazyPartitionedProducer

### Does this pull request potentially affect one of the following parts:

*If `yes` was chosen, please highlight the changes*

- Dependencies (does it add or upgrade a dependency): (no)
- The public API: (yes) - client configuration
- The schema: (no)
- The default values of configurations: (no)
- The wire protocol: (no)
- The rest endpoints: (no)
- The admin cli options: (no)
- Anything that affects deployment: (no)

### Documentation

#### For contributor

For this PR, do we need to update docs?

Yes, the new client config would need documenting. Can contribute that if this PR is accepted.

#### For committer

For this PR, do we need to update docs?

- If yes,
  - if you update docs in this PR, label this PR with the `doc` label.
  - if you plan to update docs later, label this PR with the `doc-required` label.
  - if you need help on updating docs, create a follow-up issue with the `doc-required` label.
- If no, label this PR with the `no-need-doc` label and explain why.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
