I view this as an edge case of the Pulsar Protocol that requires
clarification. Once we clarify the spec, we can update the clients to
conform to the spec. I don't think we need to give end users control
over this small part of the protocol.

After reviewing the design a bit more, I think we should update the
Java client to send the `CloseProducer` command, as well as update the
spec to recommend this implementation. While the `ServerCnx` class
"works" both ways, the current Java client implementation leads to
unnecessary calls, unnecessary errors, and a longer time to recovery.

Specifically, if the client fails to send a `CloseProducer` command,
it ends up getting into a sequence of retries where each new
`Producer` command receives an immediate `ErrorResponse` because the
`ServerCnx` already has a pending producer. By sending a
`CloseProducer` command, the client gives the broker permission to
stop keeping track of the original create producer request. It also
means that if the topic eventually loads, the broker will respond to
the right request id with a `ProducerSuccessResponse` command.

I will follow up with an update to the client and the protocol spec,
unless there are any objections.

Thanks,
Michael

On Thu, Nov 18, 2021 at 12:25 PM Neng Lu <nl...@apache.org> wrote:
>
> How about making the behavior when timeout configurable? And by default, it 
> will send the `CloseProducer` cmd?
>
> On 2021/11/17 05:52:21 Michael Marshall wrote:
> > Hi All,
> >
> > I noticed that the `ServerCnxTest#testCreateProducerTimeout` test
> > indicates that a producer will send a `CloserProducer` command when
> > producer creation times out for the client.
> >
> > The Java client does not follow this protocol. When the producer
> > creation times out, it just schedules another attempt to create the
> > producer.
> >
> > The C++ client does follow this protocol and is annotated with the
> > following comment:
> >
> > > // Creating the producer has timed out. We need to ensure the broker 
> > > closes the producer
> > > // in case it was indeed created, otherwise it might prevent new create 
> > > producer operation,
> > > // since we are not closing the connection
> >
> > I don't see anything in our official protocol spec indicating the
> > "right" behavior. Given the test coverage, it seems like the initial
> > design was to expect a `CloseProducer` command. However, I also see that
> > the broker's `ServerCnx` class will reply to a redundant `Producer`
> > command with a `ProducerSuccess` command, as long as the producer
> > is already created.
> >
> > Should I submit a PR to update the Java client to send a
> > `CloseProducer` command when a `Producer` command times out?
> >
> > Thanks,
> > Michael
> >

Reply via email to