On Nov 23, 2006, at 9:25 AM, Robert Greig wrote:
> On 23/11/06, Steve Vinoski <[EMAIL PROTECTED]> wrote:
>> The failure I was getting was that the listener was expecting 10
>> messages but was getting only 5 or 3 instead. What it looks like is
>> that notify is just a bit faster than notifyAll, and the way this
>> test was running, the difference added up over time. With notifyAll
>> on my machine, the test tended to take about 85 seconds on those
>> times it passed, but took only 75-78 seconds with notify, apparently
>> enough to make it pass more reliably. Are there other threads in a
>> pool somewhere? If so, that could easily explain the difference.
> I don't understand why it would matter that notify() is faster than
> notifyAll() in this case. Semantically, shouldn't this test pass with
> both? Also, given that it is just sending 10 messages in-VM, it should
> be extremely quick. In any event, the timeout was 5 seconds, and as the
> loop was originally crafted, it waited *before* checking the count.
Well, I didn't write the test to begin with, and so don't pretend to
know all the details, but from what I can see there are multiple
receivers, each wanting to receive 10 messages. If there are enough
threads in the system such that notifyAll is needlessly notifying
threads that don't need to be notified, then the threads that
actually do need to run to increment the counts may not be getting
enough attention, and meanwhile time is passing and the 5 seconds
elapse before all receivers can collect all their messages.
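For reference, the check order and the 5-second timeout discussed above can be sketched as a guarded wait. This is only an illustration, not the actual test code: MessageCounter, increment, awaitCount and WaitLoopDemo are hypothetical names. The loop checks the count before waiting and rechecks it after every wakeup, so it should behave the same whether the notifier calls notify or notifyAll, and it tracks an overall deadline rather than restarting the timeout on each wakeup.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a per-receiver message counter in the test. */
class MessageCounter {
    private final Object lock = new Object();
    private int count;

    /** Called by the delivery thread once per received message. */
    void increment() {
        synchronized (lock) {
            count++;
            // Wakes every thread waiting on this lock; each one rechecks
            // its own condition inside awaitCount.
            lock.notifyAll();
        }
    }

    /**
     * Blocks until at least 'expected' messages have arrived or the
     * timeout elapses. Returns true if the expected count was reached.
     */
    boolean awaitCount(int expected, long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            // Check the count BEFORE waiting, and again after each wakeup.
            while (count < expected) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // overall deadline passed
                }
                // wait(0) means "wait forever", so wait at least 1 ms.
                long remainingMillis = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
                lock.wait(remainingMillis);
            }
            return true;
        }
    }
}

class WaitLoopDemo {
    public static void main(String[] args) throws InterruptedException {
        MessageCounter counter = new MessageCounter();
        // Simulated in-VM sender delivering 10 messages.
        Thread sender = new Thread(() -> {
            for (int i = 0; i < 10; i++) {
                counter.increment();
            }
        });
        sender.start();
        boolean ok = counter.awaitCount(10, 5000);
        sender.join();
        System.out.println("received all: " + ok);
    }
}
```

With a loop shaped like this, a receiver whose messages have already arrived returns immediately instead of sitting out the timeout, which is one way to rule out the wait-before-check ordering as the cause of the short counts.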
> Did the messages actually get delivered? I.e., was the count just wrong
> for some reason?
Other than the above, I don't know why or how the messages could be
delivered but the counts be wrong, but perhaps I'm missing something.
I would certainly not consider myself anywhere close to being an
expert at this point in time on any of this code.
>> Another thing is that with both notifyAll and notify, I also
>> sometimes get an IllegalStateException from AMQPFastProtocolHandler
>> at shutdown, line 198. I haven't looked into this yet.
> That is usually a symptom of another problem since it just means that
> an undecoded byte buffer has made it through to the protocol handler.
That much I know, but what I don't know yet is why it happens. Is
anyone seeing this anywhere else?
--steve