Hi. I don't know if it's a similar issue, but here is what just happened on our Pulsar cluster (a small three-node cluster):
One of the 3 brokers suddenly stopped sending messages: its total MSG/S OUT was zero while MSG/S IN was still at ~100k/s. So it acted as if all the consumers on this particular broker had been disconnected simultaneously. The odd thing is that **no exception was thrown, and no error or warning was issued**, either on the client side (the log level for pulsar is WARN) or on the brokers. The topics are persistent, with failover (not shared) subscriptions. All machines (clients and brokers) were under light load (0% wait, 20% CPU, low network bandwidth). After a client restart, the consumers started receiving messages again. Is there anything we can do to detect such a disconnection? I thought Pulsar clients and consumers were supposed to reconnect automatically, but maybe we need to listen to ConsumerEvents to detect disabled consumers? Client and brokers are both 2.1.0.

[ Full content available at: https://github.com/apache/incubator-pulsar/issues/2431 ]
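For what it's worth, one way to at least notice this kind of silent stall from the client side would be a small idle watchdog around the receive loop. This is only a sketch under assumptions: `StallWatchdog`, `recordMessage`, `idleMillis` and `isStalled` are hypothetical names, not part of the Pulsar client API, and the idle threshold has to be chosen from the traffic you expect on the topic.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of the Pulsar client API): remembers when the
 * last message was received and reports a stall once the idle time exceeds a limit.
 */
class StallWatchdog {
    private final long maxIdleMillis;
    private final AtomicLong lastMessageMillis;

    StallWatchdog(long maxIdleMillis, long startMillis) {
        if (maxIdleMillis <= 0) {
            throw new IllegalArgumentException("maxIdleMillis must be positive");
        }
        this.maxIdleMillis = maxIdleMillis;
        this.lastMessageMillis = new AtomicLong(startMillis);
    }

    /** Call this from the receive loop (or message listener) for every message. */
    void recordMessage(long nowMillis) {
        lastMessageMillis.set(nowMillis);
    }

    /** Milliseconds elapsed since the last recorded message. */
    long idleMillis(long nowMillis) {
        return nowMillis - lastMessageMillis.get();
    }

    /** True when no message has been recorded for longer than maxIdleMillis. */
    boolean isStalled(long nowMillis) {
        return idleMillis(nowMillis) > maxIdleMillis;
    }
}
```

The idea would be to call `recordMessage(System.currentTimeMillis())` for each received message and to check `isStalled(System.currentTimeMillis())` from a periodic task, logging at WARN or recreating the consumer when it fires. A quiet consumer is not necessarily a broken one: the topic may simply be idle, or with a failover subscription this consumer may be the standby, so the check probably needs to be combined with the broker-side in/out rates or the consumer events.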
