> On 21 Dec 2016, at 14:57, Sven Van Caekenberghe <[email protected]> wrote:
>
> Henrik,
>
> Thank you for this detailed feedback.
>
>> On 21 Dec 2016, at 12:44, Henrik Johansen <[email protected]> wrote:
>>
>> Hi Sven!
>> One thing I noticed when testing the RabbitMQ client with keepalive > 0 was
>> the connection being closed and all subscriptions lost when receiving large
>> payloads, due to a timeout: the keepalive packet could not be sent before
>> waiting to receive the next object.
>
> Indeed, I used my Stamp (STOMP) RabbitMQ client as a model for the MQTT one
> - there is a lot of similarity, especially in concept.
>
> Keep alive processing is not that easy. I tried to do it by using read
> timeouts as a source of regular opportunities to check for the need to
> process keep alive logic. But of course, if you have no outstanding read (in
> a loop), that won't work.
>
> The fact that receiving a large payload would trigger an actual keep alive
> timeout is not something that I have seen myself. It seems weird that the
> reading/transferring of incoming data would not count as activity against
> keep alive, no?
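For context, the read-timeout approach Sven describes could be sketched roughly like this (a Python sketch under my own assumptions, not the actual Stamp/MQTT client code; `KeepaliveReadLoop`, `recv_with_timeout`, and `send_ping` are illustrative names):

```python
import time

class KeepaliveReadLoop:
    """Sketch: keepalive checks piggybacked on read timeouts.

    Works only while the client sits in a read loop; with no
    outstanding read, no keepalive logic ever runs (Sven's caveat).
    """
    def __init__(self, recv_with_timeout, send_ping, keepalive_secs,
                 clock=time.monotonic):
        self.recv = recv_with_timeout    # returns a message, or None on timeout
        self.send_ping = send_ping       # e.g. sends an MQTT PINGREQ
        self.keepalive = keepalive_secs
        self.clock = clock
        self.last_activity = clock()

    def read_one(self):
        # Use a read timeout shorter than the keepalive interval, so a
        # quiet connection still yields regular opportunities to check.
        msg = self.recv(self.keepalive / 2.0)
        now = self.clock()
        if msg is None:
            # Read timed out: connection idle, maybe time to ping.
            if now - self.last_activity >= self.keepalive:
                self.send_ping()
                self.last_activity = now
        else:
            # Count incoming traffic as activity.
            self.last_activity = now
        return msg
```

Note that while a large payload is being transferred, this loop is stuck inside `self.recv` and cannot send a ping, which is exactly the failure mode discussed below.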
I never had a chance to investigate fully, but I distinctly remember having the same reaction! It was quite a while ago now, so my memory might be hazy; take the following with an appropriate grain of salt.

The first times I encountered it, it seemed quite random, occurring after extended periods of client inactivity, after receiving only small payloads... Setting a much shorter keepalive timeout than the default was (and is) very useful for reproducing and verifying the issue. The timeouts then occurred relatively shortly after I'd received a single payload with no other activity, and disappeared once I removed the resetting of the lastActivity timestamp on reads. That indicates that, at least for RabbitMQ (3.5 was the version at the time, I believe), receiving data was *not* being counted as keep-alive activity.

The issue of large incoming payloads blocking a timely keepalive write was still unresolved: the connection was consistently cut off in the middle of (its own!) multi-MB payload transfers because the keepalive packet could not be sent. I couldn't see a solution other than abandoning the elegant single-threaded approach and doing keepalive as a separate high-priority process. But at that point the architectural choice for the app changed to not include an MQ in the first delivery, so the more involved rewrite needed before deploying in production kind of got stranded :/

Cheers,
Henry
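The separate high-priority keepalive process Henrik suggests could be sketched as follows (in Pharo this would presumably be a process forked at a high priority; here a minimal Python thread-based sketch with illustrative names, not the client's actual design). The point is that pings keep flowing on their own schedule even while the main loop is blocked receiving a multi-MB payload:

```python
import threading
import time

class KeepalivePinger:
    """Sketch: keepalive as a separate background process.

    Pings are sent once per interval, independently of whether the
    main read loop is currently blocked on a large transfer.
    """
    def __init__(self, send_ping, interval_secs):
        self.send_ping = send_ping       # e.g. writes an MQTT PINGREQ
        self.interval = interval_secs
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # wait() returns False on timeout, True once stop() is called,
        # so this pings once per interval until stopped.
        while not self._stop.wait(self.interval):
            self.send_ping()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

A real implementation would also need the ping writes serialized with the main loop's writes on the shared socket (a lock or a write queue); that coordination is what makes this rewrite "more involved" than the single-threaded version.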
