Vanlightly opened a new issue #11496:
URL: https://github.com/apache/pulsar/issues/11496


   **Is your enhancement request related to a problem? Please describe.**
   A producer on a partitioned topic starts one internal producer per 
partition, even when single partition routing is used. For topics that combine 
a large number of producers with a large number of partitions, this can put 
strain on the brokers. With, say, 1000 partitions and single partition routing 
of non-keyed messages, each producer performs 999 topic owner lookups and 
producer registrations that could be avoided.
   
   **Describe the solution you'd like**
   
   Option 1 - Strict Single Partition Routing
   When a producer is created, we have no way of knowing which partitions it 
will send to, even with single partition routing, because user code can still 
send keyed messages, and those may be routed to more than one partition. 
   
   Solution: offer a strict single partition routing mode where we guarantee 
that all messages will only be sent to a single partition, keyed or not. This 
would allow us to only start a single producer on the creation of the 
partitioned producer. 
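   
   The difference between the two routing modes can be sketched as below. This 
is only an illustration: `RoutingSketch`, both method names, and the use of 
`String.hashCode()` as the hash are assumptions made for the example, not the 
client's actual classes or hash function.

```java
import java.util.HashSet;
import java.util.Set;

final class RoutingSketch {
    /** Stand-in hash; the real client hash function is not modelled here. */
    static int hash(String key) {
        return key.hashCode();
    }

    /** Single partition routing as described above: keyed messages are hashed,
     *  non-keyed messages go to the producer's fixed partition. */
    static int singlePartition(String key, int fixedPartition, int numPartitions) {
        if (key != null) {
            return Math.floorMod(hash(key), numPartitions);
        }
        return fixedPartition;
    }

    /** Hypothetical strict mode: the key is ignored when choosing the partition. */
    static int strictSinglePartition(String key, int fixedPartition, int numPartitions) {
        return fixedPartition;
    }

    public static void main(String[] args) {
        int numPartitions = 1000;
        int fixedPartition = 7;
        Set<Integer> current = new HashSet<>();
        Set<Integer> strict = new HashSet<>();
        for (int i = 0; i < 100; i++) {
            String key = "key-" + i;
            current.add(singlePartition(key, fixedPartition, numPartitions));
            strict.add(strictSinglePartition(key, fixedPartition, numPartitions));
        }
        System.out.println("partitions touched (current): " + current.size());
        System.out.println("partitions touched (strict):  " + strict.size());
    }
}
```

   Because the strict variant can only ever return the fixed partition, a 
single internal producer is enough for the whole partitioned producer.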
   
   Option 2 - Lazy Producer Start
   Allow the producers inside the partitioned producer class to be started 
lazily, upon the first message being sent to their particular partition. This 
would be controlled via a new producer configuration, as the behaviour only 
benefits cases where:
   - the topic has a large number of partitions
   - single partition routing is used
   - messages are non-keyed
   - there are potentially a large number of producers on the topic
   
   Messages will be buffered while the connection to the topic owner is being 
established.
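   
   A minimal sketch of lazy start with buffering, assuming a simplified 
in-memory model: `LazyPartitionedProducerSketch`, `PartitionProducer` and 
`onConnected` are hypothetical names for illustration, and the connection to 
the topic owner is simulated by a method call rather than a real lookup.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class LazyPartitionedProducerSketch {
    /** Hypothetical stand-in for a per-partition producer. */
    static final class PartitionProducer {
        final int partition;
        boolean connected = false;
        final Queue<String> pending = new ArrayDeque<>(); // buffered while connecting
        final List<String> sent = new ArrayList<>();      // handed to the "broker"

        PartitionProducer(int partition) {
            this.partition = partition;
        }

        void send(String message) {
            if (connected) {
                sent.add(message);
            } else {
                pending.add(message);
            }
        }

        /** Simulates the connection completing: flush the buffer in order. */
        void onConnected() {
            connected = true;
            while (!pending.isEmpty()) {
                sent.add(pending.poll());
            }
        }
    }

    private final Map<Integer, PartitionProducer> producers = new HashMap<>();

    /** The per-partition producer is created only on the first send to it. */
    void send(int partition, String message) {
        producers.computeIfAbsent(partition, PartitionProducer::new).send(message);
    }

    PartitionProducer producerFor(int partition) {
        return producers.get(partition);
    }

    int startedProducers() {
        return producers.size();
    }
}
```

   With single partition routing and non-keyed messages, only one entry would 
ever be created in the map, regardless of how many partitions the topic has.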
   
   The downside is extra latency on the first messages published to each 
partition. The send timeout timer is only started once the producer is 
connected, so the connection delay alone should not trigger timeouts. We would 
expect send timeouts caused by this change only if the send timeout is set 
very low and the number of pending messages is high.
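   
   The timeout reasoning can be modelled roughly as follows. This is a 
simplified sketch under an assumption of my own: the timeout clock for a 
buffered message is taken to start at connection time. `SendTimeoutSketch` and 
its methods are hypothetical and do not reflect the client's real timer code.

```java
final class SendTimeoutSketch {
    private final long sendTimeoutMs;
    private long connectedAtMs = -1; // -1 means the producer is not connected yet

    SendTimeoutSketch(long sendTimeoutMs) {
        this.sendTimeoutMs = sendTimeoutMs;
    }

    void onConnected(long nowMs) {
        connectedAtMs = nowMs;
    }

    /** Assumed rule: time spent waiting for the connection does not count. */
    boolean isTimedOut(long createdAtMs, long nowMs) {
        if (connectedAtMs < 0) {
            return false; // timer not started before connection
        }
        long timerStartMs = Math.max(createdAtMs, connectedAtMs);
        return nowMs - timerStartMs > sendTimeoutMs;
    }
}
```

   Under this model a buffered message can still time out after connection if 
the backlog takes longer than the send timeout to drain, which is the low 
timeout, high pending count case mentioned above.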
   
   **Describe alternatives you've considered**
   Just options 1 and 2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
