On Thu, May 7, 2015 at 6:30 AM, Tim Bain <tb...@alumni.duke.edu> wrote:
> I agree with your approach with the WeakRunnable; I think that will achieve
> the goal without the performance hit of calling purge() after each
> cancellation.

I went ahead with this solution and it seems to be working well in production.

Since the unit tests take 24 hours to run, and I haven’t actually been able to get them to run, I gave up on that approach. I’m just being very careful with my code and making sure we can revert our queue quickly.

So far my changes have been in production for a week and have been rock solid. They have dramatically improved the performance of ActiveMQ in production for us. The current version of ActiveMQ, without our patches, simply can’t scale to this number of queues.

With my changes it doesn’t suffer the GC lockup, and GCs are now VERY fast: they take 30-90 seconds. Before, they were locking up ActiveMQ for 20 minutes at a time.

Once I finished with the lock bugs, the next problem was that queue GC was taking 100% CPU continually but never actually finishing. Now queue GC is much, much faster and everything seems to be solid.

> If you take on the synchronized keywords, you should consider whether we're
> worried about concurrent calls to doStart() and doStop(). That scenario
> could leak a Timer, and the current code doesn't protect against that
> (you'd need a flag to say whether you've been stopped, and you'd need to
> check it in doStart()).

Agreed. I decided not to take these on because they simply weren’t showing up in the profile. If it ain’t broke, don’t fix it.

The only other issue, really, is the reflection cost of creating a queue. But the performance there is linear at least. The problem with the current version of ActiveMQ is that performance degrades much faster than linearly as the number of queues grows, because CopyOnWriteArrayList is used everywhere. Refactoring those to use ConcurrentHashMap seems to completely solve the problem.

It’s nice when a plan comes together.
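
To illustrate the data structure point, here is a minimal sketch of why a CopyOnWriteArrayList hurts when it is written to once per queue. This is not the actual patch and not ActiveMQ code: the Registry interface, both implementations, and the "queue.N" names are made up for illustration. It only assumes the general JDK behaviour that every write to a CopyOnWriteArrayList copies the whole backing array, while a ConcurrentHashMap-backed set does not. Note that the set version also drops duplicates, which the list version would keep.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class RegistrySketch {

    /** Hypothetical minimal registry API, not an ActiveMQ interface. */
    public interface Registry {
        void register(String name);
        boolean unregister(String name);
        boolean contains(String name);
        int size();
    }

    /** Every add() and remove() copies the entire backing array: O(n) per write. */
    public static final class ListRegistry implements Registry {
        private final List<String> names = new CopyOnWriteArrayList<>();
        public void register(String name) { names.add(name); }
        public boolean unregister(String name) { return names.remove(name); }
        public boolean contains(String name) { return names.contains(name); }
        public int size() { return names.size(); }
    }

    /** Hash-based set view: add, remove and contains are O(1) on average. */
    public static final class MapRegistry implements Registry {
        private final Set<String> names = ConcurrentHashMap.newKeySet();
        public void register(String name) { names.add(name); }
        public boolean unregister(String name) { return names.remove(name); }
        public boolean contains(String name) { return names.contains(name); }
        public int size() { return names.size(); }
    }

    /** Registers then unregisters n names and returns the elapsed milliseconds. */
    public static long churnMillis(Registry registry, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) registry.register("queue." + i);
        for (int i = 0; i < n; i++) registry.unregister("queue." + i);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 20_000; // arbitrary size chosen for illustration
        System.out.println("CopyOnWriteArrayList: " + churnMillis(new ListRegistry(), n) + " ms");
        System.out.println("ConcurrentHashMap:    " + churnMillis(new MapRegistry(), n) + " ms");
    }
}
```

With the list, creating and removing n entries does on the order of n^2 element copies in total, so doubling the number of queues roughly quadruples the work. The timings printed will vary by machine and JVM.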
These changes have actually allowed us to ship version 5.0 of our product.

I have to say that ActiveMQ slowed our time to market by about 1-1.5 months… but I think I was also asking ActiveMQ to do a lot. Realistically, though, it should have been able to keep up. It’s just that internally it was using the wrong data structures for our use case.

-- 
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>