Literally JUST found this issue! Is this documented anywhere? My issue is that there *is* no sparse message distribution. Every message has a header value between 0 and 9, and none lacks that header.
I even consume messages that lack the value, so there shouldn’t be anything left over.

I think ActiveMQ should probably log an error when this happens.

On Fri, Apr 24, 2015 at 2:03 PM, Timothy Bish <tabish...@gmail.com> wrote:

> On 04/24/2015 04:50 PM, Kevin Burton wrote:
> > I’ve been working 15 hour days for the last 2-3 weeks trying to resolve this so if this is somewhat incoherent it’s probably due to lack of sleep :-P
> >
> > I think we’re experiencing a bug in ActiveMQ which is VERY hard to reproduce but happens regularly in our production setup.
> >
> > I can’t reproduce it in my test setup because it seems to require real world data. Every time I try to do so everything works fine.
> >
> > It seems you have to have the following:
> >
> > - a large number of queues which need servicing (> 1000)
> > - a fairly large number of connections (> 2000)
> > - message selectors
> > - a queue that has a large number of messages (5000).
> >
> > I have my test code now reproducing it…
> >
> > Everything works FINE if we have just a few messages. The problems arise once the queue size grows, at which point selectors don’t work.
> >
> > It seems like *early* connections win. If I create a connection to ActiveMQ early, and keep it open, it will work. But new connections don’t work. Eventually, the existing connections will fail too.
> >
> > Basically, it works JUST FINE without message selectors.
> >
> > I KNOW it’s not my code because I’ve written a basic/simple consumer which is literally just raw JMS and is < 50 lines of code.
> >
> > I also know my message selectors should match. First, they do match some percentage of the time. Second, when I consume without the message selectors, it works. I have it print the message headers and I can confirm that they should match.
> >
> > This also seems to get worse over time. The larger the queue, the less chance messages will be serviced; eventually it will just lock up entirely.
> >
> > There are no obvious errors in the ActiveMQ log. Just messages regarding queue GC.
> >
> > The box still has about 40% memory free, so I don’t think it has any issue with memory. No OutOfMemoryErrors being logged.
> >
> > I think another way to debug this could be to restart ActiveMQ itself with message tracing. Then try to get the queue to this state again, and try to consume messages and see what’s being logged while it’s failing.
> >
> > What’s frustrating here is that this is the 3rd ActiveMQ workaround I’ve had to implement.
> >
> > The first was because LevelDB was very slow… (artificially slow, it seems), so then I decided to just use the memory store. But the memory store doesn’t support priority, so instead, I implemented priority through JMS selectors. But now JMS selectors don’t work.
> >
> > :-/
> >
> This sounds a lot like the standard issue of having a deep queue and the message selector not being able to match because the maxPageSize value is limiting what the message cursor will page in. Have you tried upping the maxPageSize option? See:
> https://issues.apache.org/jira/browse/AMQ-2217
>
> --
> Tim Bish
> Sr Software Engineer | RedHat Inc.
> tim.b...@redhat.com | www.redhat.com
> twitter: @tabish121
> blog: http://timbish.blogspot.com/

--
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile <https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
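
For anyone trying to picture the maxPageSize explanation quoted above, here is a rough toy model in plain Java. It is NOT ActiveMQ code and does not use the JMS API. The class `PagingWindowDemo`, its methods, and the window size of 200 are all made up for illustration. It assumes the broker only evaluates selectors against the first maxPageSize messages it has paged in, which is my reading of Tim's description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Toy model only (hypothetical names): selectors see just the paged-in head of the queue. */
class PagingWindowDemo {

    /** Builds a queue of priority headers: `backlog` messages of priority 0, then `wanted` of priority 9. */
    static List<Integer> buildQueue(int backlog, int wanted) {
        List<Integer> queue = new ArrayList<>();
        for (int i = 0; i < backlog; i++) queue.add(0);
        for (int i = 0; i < wanted; i++) queue.add(9);
        return queue;
    }

    /** Counts messages matching the selector among the first maxPageSize messages only. */
    static int visibleMatches(List<Integer> queue, int maxPageSize, IntPredicate selector) {
        int window = Math.min(maxPageSize, queue.size());
        int matches = 0;
        for (int i = 0; i < window; i++) {
            if (selector.test(queue.get(i))) matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        // 5000 messages nobody is draining sit ahead of 100 messages we want.
        List<Integer> queue = buildQueue(5000, 100);

        // Small window (assumed size): the selector sees nothing it can match.
        System.out.println(visibleMatches(queue, 200, p -> p == 9));   // prints 0

        // Window larger than the queue depth: all 100 matches become visible.
        System.out.println(visibleMatches(queue, 10000, p -> p == 9)); // prints 100
    }
}
```

In this model a consumer selecting priority 9 starves until the window is raised above the queue depth or the head of the queue is drained. That would line up with selectors working on a shallow queue and failing as it grows. Whether it matches what the real broker does in my setup is still an assumption.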