lhotari commented on issue #25146: URL: https://github.com/apache/pulsar/issues/25146#issuecomment-3760011208
> Under high broker load or in extreme timing conditions, the broker might fail to deliver the close consumer command to the client, even though the TCP connection itself remains healthy.

I don't immediately see how it could be possible that the delivery of the command would fail when the TCP connection itself remains healthy. It's possible that the delivery is delayed, that's a valid case. However, if the delivery fails, that would mean that there's another bug that should be addressed directly.

In the `org.apache.pulsar.broker.service.Consumer` class, there are 2 different methods related to closing: `close` and `disconnect`. When `close` is called, it won't notify the client. Perhaps somewhere the call to `disconnect` is missing and `close` is used instead?

The close notifications are sent at https://github.com/apache/pulsar/blob/b1019ce54dd3b81cdb2804a5f37c862022132034/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L3388-L3400

It would be possible to use `ctx.writeAndFlush(msg).addListener(ChannelFutureListener.CLOSE_ON_FAILURE)` to close the connection immediately if writing to the channel fails. The reason why `.addListener` isn't used in all writes is to avoid creating a ChannelPromise instance for all invocations to writeAndFlush. Instead, `ctx.voidPromise()` is used in `org.apache.pulsar.common.util.netty.NettyChannelUtil#writeAndFlushWithVoidPromise`.
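To make the trade-off between the two write styles concrete, here is a minimal, dependency-free sketch. It is a toy model and not Netty or Pulsar code: `ToyChannel`, `ToyPromise`, `writeWithVoidPromise`, and `writeCloseOnFailure` are hypothetical names invented for illustration, and the model simplifies real Netty behavior. It only shows that the void-promise style allocates nothing per write but gives the caller no hook to react to a failure, while the listener style allocates one promise per write and can close the connection when the write fails.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class WriteFailureModel {

    /** Hypothetical per-write result holder; models the object a listener-based write must allocate. */
    static final class ToyPromise {
        private final List<Consumer<Boolean>> listeners = new ArrayList<>();

        void addListener(Consumer<Boolean> listener) {
            listeners.add(listener);
        }

        void complete(boolean success) {
            listeners.forEach(l -> l.accept(success));
        }
    }

    /** Hypothetical stand-in for a connection; not a Netty type. */
    static final class ToyChannel {
        private boolean open = true;
        private boolean failWrites;
        private int promisesAllocated;

        boolean isOpen() { return open; }
        int promisesAllocated() { return promisesAllocated; }
        void setFailWrites(boolean failWrites) { this.failWrites = failWrites; }
        void close() { open = false; }

        /** Simulated transport write: succeeds only if the channel is open and not set to fail. */
        private boolean transportWrite(String msg) {
            return open && !failWrites;
        }

        /** Void-promise style: no per-write allocation, and the caller never sees the outcome. */
        void writeWithVoidPromise(String msg) {
            transportWrite(msg);
        }

        /** Listener style: one promise per write, and a failed write closes the channel. */
        void writeCloseOnFailure(String msg) {
            ToyPromise promise = new ToyPromise();
            promisesAllocated++;
            promise.addListener(success -> {
                if (!success) {
                    close();
                }
            });
            promise.complete(transportWrite(msg));
        }
    }

    public static void main(String[] args) {
        ToyChannel a = new ToyChannel();
        a.setFailWrites(true);
        a.writeWithVoidPromise("CLOSE_CONSUMER");
        // prints "void promise: open=true, promises=0"
        System.out.println("void promise: open=" + a.isOpen() + ", promises=" + a.promisesAllocated());

        ToyChannel b = new ToyChannel();
        b.setFailWrites(true);
        b.writeCloseOnFailure("CLOSE_CONSUMER");
        // prints "close on failure: open=false, promises=1"
        System.out.println("close on failure: open=" + b.isOpen() + ", promises=" + b.promisesAllocated());
    }
}
```

In this model the cost of the listener style is one extra object per write, which matches the allocation concern described above. A possible middle ground, under the same assumptions, would be to use the listener style only for rare control commands such as the close consumer notification and keep the void-promise style for high-volume writes.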
