michaeljmarshall commented on pull request #13949: URL: https://github.com/apache/pulsar/pull/13949#issuecomment-1021881366
> If the client is like the followings: > > 1. The client sends the command to create the producer > > 2. The broker received the command and start to process the request, but can't complete it after the client-side operation timeout > > 3. The client tries to send a new create producer command with the same producer ID but a different request ID (Of course this usually doesn't happen) > > 4. Now the broker received the new create producer command and will use the existing `future` https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L1173 to process the new request. > > 5. Then the previous request failed, and complete the future with an exception > > 6. The client will not able to receive the response for the second request. This is a valid concern. However, the client is supposed to send a `CloseProducer` command when timing out the first request before sending a `Producer` command. When it sends the `CloseProducer` command, the above scenario is avoided. This was not done in the Java Client until I fixed it a month ago: https://github.com/apache/pulsar/pull/13161. Additionally, it was not explicitly called out in the protocol documentation until a month ago either: https://github.com/apache/pulsar/pull/12948. The consequence of my proposed PR: if a client does not follow the protocol spec, it will not receive a response (step 6), then the client will timeout waiting for a response and then it'll retry and succeed (or fail). I agree that we need to be careful implementing this. I wouldn't cherry pick these changes to previous branches. However, as I have mentioned elsewhere, I think this is the right design because we should only respond to a client once. > The server side should return a response to the client, one producer future might map to multiple request IDs, we should avoid broker miss response to the client, of course, if the connection is not available, there is no way to make the above guarantee. I don't think we need to design for this case because the client is not supposed to send two `Producer` commands without a `CloseProducer` command in between. Note that we don't store the `requestId` for a second `Producer` command, so there is no way for the broker to respond to the second request. Since the client removed the first request when it timed out, replying to it won't matter. The second client request will timeout, too, and the client will need to resend the `Producer` command a third time. > I think it's the more easy and clear way to handle both the client-side and server-side. And currently, the client-side also follow this way https://github.com/apache/pulsar/blob/master/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L534, https://github.com/apache/pulsar/blob/master/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L719 In both of these examples, the client will log warning messages if it receives responses for requests that are already completed. Those warnings won't be actionable. I think we should respond to client requests just once. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
