michaeljmarshall opened a new pull request #12780:
URL: https://github.com/apache/pulsar/pull/12780


   ### Motivation
   
   When a broker receives a message for a producer that is not registered, in 
the process of being registered, or failed registration, the message gets 
ignored. Because the Pulsar protocol allows a client to deliver multiple 
messages before receiving any acknowledgement, the behavior of ignoring 
messages presents a risk for persisting messages out of order. By itself, the 
broker will not persist out of order this way because the client is only 
supposed to send messages after registering the producer successfully on the 
connection. However, as we saw in https://github.com/apache/pulsar/pull/12779, 
a client that does not follow the protocol could persist messages out of order 
unless we update the behavior.
   
   I propose closing the connection when "unexpected" messages are received.
   
   One tradeoff for this implementation is that when the broker initiates 
closing a producer, there is a chance that the whole connection will get closed 
because the producer had messages in flight. I think this is a reasonable 
tradeoff to ensure that clients not following the protocol are not able to 
persist messages out-of-order.
   
   From my perspective, this is the simplest solution that will ensure message 
order is preserved. Alternatively, we could come up with logic to try to handle 
messages sent to recently closed producers, but that would greatly increase the 
complexity for this edge case. Note that it is not sufficient to reply to each 
message with a `SendError` because the producer may have already sent later 
messages and those could be persisted if the producer is concurrently being 
created. Note also that when the Java Client producer receives a generic 
`SendError`, it reacts by closing the connection in most cases.
   
   ### Modifications
   
   * Update `ServerCnx` to close the connection when a broker receives a 
message for a producer that is not ready to handle that message.
   
   ### Verifying this change
   
   I added a test to verify this new behavior. 
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If `yes` was chosen, please highlight the changes*
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API: no
     - The schema: no
     - The default values of configurations: no
     - The wire protocol: yes and no
     - The rest endpoints: no
     - The admin cli options: no
     - Anything that affects deployment: no
   
   The wire protocol itself is not changed, but the way that the broker 
implements the protocol is changed. Since the broker can close the connection 
at any time, this change will not break any integrations.
   
   ### Documentation
   
   - [x] `doc-required` 
     
   I think we will want to update the documentation of the Pulsar protocol here 
https://pulsar.apache.org/docs/en/develop-binary-protocol/#producer. The 
current docs do not mention how a broker will handle such a case. I'll need a 
little help understanding how/where we want to update the documentation for 
this change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to