On 8/10/06, Komandur <[EMAIL PROTECTED]> wrote:
>> 1. can we use an 'elastic prefetch' buffer based on a sliding window (like
>> in TCP) - this reacts to client (mis)behavior
>We could start with a prefetch of 1 and increase it over time for well
>behaving clients. However it doesn't fix the problem as a mis-behaving
>consumer could still hog at least one message - though this would
>reduce the impact from 1000 or so to 1.
Note that the prefetch window needs to follow the standard TCP approach of multiplicative decrease during a problem period and additive increase upon positive ack (IMHO, there isn't much to be gained in reinventing the TCP flow control wheel, which has been honed for over a decade).
The problem is - once a message has been sent to a consumer it's too late - the consumer is now hogging it. This differs considerably from TCP - in TCP it doesn't affect other connections if you send a little too much data to a socket.
This helps in several ways:
- Messages are dispatched as soon as possible, as a slow consumer will automatically have a smaller 'prefetch window'. In fact by decaying the 'prefetch window' (like in the latest implementations of TCP flow control), a new slow consumer's window automatically shrinks.
Growing and shrinking the prefetch windows based on the amount of time it takes to get acknowledgements back is certainly possible - though it's a different discussion and is for different reasons, as it purely tunes the prefetch sizes to their optimal level. This also assumes that you can actually grow and shrink them accurately, e.g. the prefetch buffer sizes may need to be large for performance reasons when some messages take a long time to process or when networks are slow. So adding automatically sized prefetch windows could result in windows being too small. However AMQ-850 is about a completely different problem to sizing the prefetch buffer - it's what to do about a badly behaving consumer.
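For reference, the additive-increase / multiplicative-decrease sizing being discussed could be sketched roughly as below. This is only an illustration, not ActiveMQ code: the class name, the min/max bounds and the notion of a "timely" versus "late" ack are all assumptions made for the example. Note that the floor of 1 is exactly the 'one message hog' case: the window can shrink, but it cannot take back a message that has already been dispatched.

```java
/** Hypothetical sketch, not ActiveMQ code: AIMD sizing of a per-consumer prefetch window. */
class AimdPrefetchWindow {
    private final int min;   // floor; with 1, a stuck consumer still holds one message
    private final int max;   // ceiling, e.g. a configured prefetch limit (assumed)
    private int size;

    AimdPrefetchWindow(int min, int max) {
        if (min < 1 || max < min) {
            throw new IllegalArgumentException("need 1 <= min <= max");
        }
        this.min = min;
        this.max = max;
        this.size = min;     // start small and grow for well-behaved consumers
    }

    /** Additive increase: one more slot for each ack that arrives in good time. */
    void onTimelyAck() {
        size = Math.min(max, size + 1);
    }

    /** Multiplicative decrease: halve the window when an ack is late or missing. */
    void onLateAck() {
        size = Math.max(min, size / 2);
    }

    int size() {
        return size;
    }
}
```

What counts as a "late" ack is left open here on purpose; as noted above, slow networks or long-running messages make that threshold hard to pick accurately.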
- I am not sure I understand the 'one message hog' case.
Start with a prefetch of 1. Give a consumer a message; then if the consumer doesn't do anything with it - or locks up while processing it - that message is now 'hogged': no other consumer can get the message until the consumer is closed or the client killed.
Most of the consumers are idempotent (there are too many failure cases to count on 'once and only once' delivery). So there is no harm in redelivering this one message for which no ack has been received yet.
That 1 message will not be delivered to anyone else - which is a real problem. There's the added effect on ordering too.
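To make the hog case concrete, here is a rough sketch of a per-consumer inactivity timeout that takes unacked messages back from a stuck consumer. It is not how ActiveMQ or AMQ-850 implements anything: the class, its methods and the timeout parameter are hypothetical, and time is passed in explicitly just to keep the example deterministic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not ActiveMQ code: reclaim messages from consumers that stop acking. */
class InactivityReclaimer {
    /** An in-flight message: which consumer holds it and when it was dispatched. */
    private static final class InFlight {
        final String consumerId;
        final long dispatchedAtMillis;

        InFlight(String consumerId, long dispatchedAtMillis) {
            this.consumerId = consumerId;
            this.dispatchedAtMillis = dispatchedAtMillis;
        }
    }

    private final long timeoutMillis;                         // assumed inactivity limit
    private final Deque<String> pending = new ArrayDeque<>(); // message ids awaiting dispatch
    // messageId -> holder, kept in dispatch order
    private final Map<String, InFlight> inFlight = new LinkedHashMap<>();

    InactivityReclaimer(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    void enqueue(String messageId) {
        pending.addLast(messageId);
    }

    /** Hand the next pending message to a consumer; returns null if nothing is pending. */
    String dispatch(String consumerId, long nowMillis) {
        String messageId = pending.pollFirst();
        if (messageId != null) {
            inFlight.put(messageId, new InFlight(consumerId, nowMillis));
        }
        return messageId;
    }

    void ack(String messageId) {
        inFlight.remove(messageId);
    }

    /**
     * Moves messages unacked for longer than the timeout back to the head of the
     * queue (oldest first) and returns messageId -> consumerId for each of them,
     * so the caller can decide whether to close the offending consumer.
     */
    Map<String, String> reclaimExpired(long nowMillis) {
        Map<String, String> reclaimed = new LinkedHashMap<>();
        Iterator<Map.Entry<String, InFlight>> it = inFlight.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, InFlight> entry = it.next();
            InFlight held = entry.getValue();
            if (nowMillis - held.dispatchedAtMillis > timeoutMillis) {
                reclaimed.put(entry.getKey(), held.consumerId);
                it.remove();
            }
        }
        // Re-insert in reverse so the oldest reclaimed message ends up first.
        List<String> ids = new ArrayList<>(reclaimed.keySet());
        for (int i = ids.size() - 1; i >= 0; i--) {
            pending.addFirst(ids.get(i));
        }
        return reclaimed;
    }
}
```

Even with the reclaimed message put back at the head of the queue, messages dispatched after it may already have been processed elsewhere, so the ordering concern remains.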
>> 2. When the broker detects a misbehaving client, reclaim the unAcked
>> messages for other active consumers (and make the window size 0 or 1 in
>> step 1 above)
>If a client/connection misbehaves (e.g. becomes inactive) then the
>connection is closed and all consumers are closed too causing all
>their unacked messages to be redelivered.
This sounds good. However, please note that misbehavior is not necessarily a binary state. Sometimes an ACK could be delayed for many reasons (either transient consumer (mis)behavior or other network-related issues). It is in the gray areas that TCP flow control works really well.
Agreed - which is why AMQ-850 is introduced to allow people to set an inactivity timer on specific consumers. It could just be 1 thread which is blocked on some lock - while the other threads and the rest of the connection are working fine.
--
James
-------
http://radio.weblogs.com/0112098/
