Thanks so much, Rob. I will test the patch against trunk and will update you
with the outcome.
- Ramayan

On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey <rob.j.godf...@gmail.com> wrote:

> On 17 October 2016 at 21:50, Rob Godfrey <rob.j.godf...@gmail.com> wrote:
>
> > On 17 October 2016 at 21:24, Ramayan Tiwari <ramayan.tiw...@gmail.com>
> > wrote:
> >
> > > Hi Rob,
> > >
> > > We are certainly interested in testing the "multi queue consumers"
> > > behavior with your patch in the new broker. We would like to know:
> > >
> > > 1. What will be the scope of the changes: client, broker, or both? We
> > > are currently running the 0.16 client, so we would like to make sure
> > > that we will be able to use these changes with the 0.16 client.
> >
> > There's no change to the client. I can't remember what was in the 0.16
> > client... the only issue would be if there are any bugs in the parsing
> > of address arguments. I can try to test that out tomorrow.
>
> OK - with a little bit of care to get around the address-parsing issues
> in the 0.16 client, I think we can get this to work. I've created the
> following JIRA:
>
> https://issues.apache.org/jira/browse/QPID-7462
>
> and attached to it are a patch which applies against trunk, and a
> separate patch which applies against the 6.0.x branch
> (https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is
> 6.0.4 plus a few other fixes which we will soon be releasing as 6.0.5).
>
> To create a consumer which uses this feature (and multi queue
> consumption) with the 0.16 client, you need to use something like the
> following as the address:
>
>     queue_01 ; {node : { type : queue }, link : { x-subscribes : {
>     arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
>     x-pull-only : true }}}}
>
> Note that the initial queue_01 has to be the name of an actual queue on
> the virtual host, but otherwise it is not actually used (if you were
> using a 0.32 or later client you could just use '' here). The actual
> queues consumed from are those in the list value associated with
> x-multiqueue. For my testing I created a list of 3000 queues here, and
> this worked fine.
>
> Let me know if you have any questions / issues.
>
> Hope this helps,
> Rob
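As an illustration of the address above (this sketch is not from the original
mail): a 0.16-client consumer might be wired up roughly as follows, assuming
the 0.16 client's AMQConnection and AMQAnyDestination classes; the broker URL,
credentials, and virtual host name are placeholders.

    import javax.jms.Connection;
    import javax.jms.Destination;
    import javax.jms.MessageConsumer;
    import javax.jms.Session;

    import org.apache.qpid.client.AMQAnyDestination;
    import org.apache.qpid.client.AMQConnection;

    public class MultiQueuePullConsumerSketch
    {
        public static void main(String[] args) throws Exception
        {
            // Placeholder broker URL, credentials and virtual host
            Connection connection = new AMQConnection(
                    "amqp://guest:guest@clientid/default?brokerlist='tcp://localhost:5672'");
            connection.start();

            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);

            // The address from the mail above: queue_01 must exist, but the
            // queues actually consumed from are those listed under x-multiqueue.
            Destination destination = new AMQAnyDestination(
                    "queue_01 ; {node : { type : queue }, link : { x-subscribes : {"
                    + " arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],"
                    + " x-pull-only : true }}}}");

            MessageConsumer consumer = session.createConsumer(destination);
            consumer.setMessageListener(message -> {
                // process the message, then commit the transacted session
            });
        }
    }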
> > > 2. My understanding is that the "pull vs. push" change is only with
> > > respect to the broker, and it does not change our architecture, where
> > > we use a MessageListener to receive messages asynchronously.
> >
> > Exactly - this is only a change within the internal broker threading
> > model. The external behaviour of the broker remains essentially
> > unchanged.
> >
> > > 3. Once the I/O refactoring is complete, we would be able to go back
> > > to using standard JMS consumers (Destination). What is the timeline
> > > and broker release version for the completion of this work?
> >
> > You might wish to continue to use the "multi queue" model, depending on
> > your actual use case, but yeah, once the I/O work is complete I would
> > hope that you could use the thousands-of-consumers model should you
> > wish. We don't have a schedule for the next phase of the I/O rework
> > right now - about all I can say is that it is unlikely to be complete
> > this year. I'd need to talk with Keith (who is currently on vacation)
> > as to when we think we may be able to schedule it.
> >
> > > Let me know once you have integrated the patch and I will re-run our
> > > performance tests to validate it.
> >
> > I'll make a patch for 6.0.x presently (I've been working on a change
> > against trunk - the patch will probably have to change a bit to apply
> > to 6.0.x).
> >
> > Cheers,
> > Rob
> >
> > > Thanks
> > > Ramayan
> > >
> > > On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey
> > > <rob.j.godf...@gmail.com> wrote:
> > >
> > > > OK - so having pondered / hacked around a bit this weekend, I think
> > > > to get decent performance from the I/O model in 6.0 for your use
> > > > case we're going to have to change things around a bit.
> > > >
> > > > Basically 6.0 is an intermediate step on our I/O / threading-model
> > > > journey. In earlier versions we used two threads per connection for
> > > > I/O (one read, one write) and then extra threads from a pool to
> > > > "push" messages from queues to connections.
> > > >
> > > > In 6.0 we moved to using a pool for the I/O threads, and also
> > > > stopped queues from "pushing" to connections while the I/O threads
> > > > were acting on the connection. It's this latter fact which is
> > > > screwing up performance for your use case here: on each network
> > > > read we tell each consumer to stop accepting pushes from the queue
> > > > until the I/O interaction has completed. This causes lots of loops
> > > > over your 3000 consumers on each session, which eats up a lot of
> > > > CPU on every network interaction.
> > > >
> > > > In the final version of our I/O refactoring we want to remove the
> > > > "pushing" from the queue, and instead have the consumers "pull" -
> > > > so that the only threads that operate on the queues (outside of
> > > > housekeeping tasks like expiry) will be the I/O threads.
> > > >
> > > > So, what we could do (and I have a patch sitting on my laptop for
> > > > this) is to look at using the "multi queue consumers" work I did
> > > > for you guys before, but augmenting it so that the consumers work
> > > > using a "pull" model rather than the push model. This will
> > > > guarantee strict fairness between the queues associated with the
> > > > consumer (which was the issue you had with this functionality
> > > > before, I believe). Using this model you'd only need a small number
> > > > (one?) of consumers per session. The patch I have adds this "pull"
> > > > mode for these consumers (essentially a preview of how all
> > > > consumers will work in the future).
> > > >
> > > > Does this seem like something you would be interested in pursuing?
> > > >
> > > > Cheers,
> > > > Rob
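To make the proposed "pull" model concrete, here is a schematic sketch -
illustrative only, not the broker's actual code, and every name in it is
invented - of how a multi-queue consumer pulling round-robin from its queue
list yields strict fairness between queues:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.List;

    // Schematic only: shows why a "pull" consumer gives strict fairness
    // across its queues while keeping all queue access on the I/O thread.
    final class PullModeConsumerSketch
    {
        private final Deque<String> queues; // queue names, rotated round-robin

        PullModeConsumerSketch(List<String> queueNames)
        {
            this.queues = new ArrayDeque<>(queueNames);
        }

        /** Called by the I/O thread when the session has credit to deliver. */
        String pullNextMessage()
        {
            for (int i = 0; i < queues.size(); i++)
            {
                String queue = queues.pollFirst();
                queues.addLast(queue); // rotate: the next pull starts at the next queue
                String message = tryTakeFrom(queue);
                if (message != null)
                {
                    return message; // each queue consulted at most once per pull
                }
            }
            return null; // nothing available on any queue
        }

        private String tryTakeFrom(String queue)
        {
            // placeholder for "attempt to acquire the next entry from the queue"
            return null;
        }
    }

Because the consumer, rather than the queue, drives delivery, only the I/O
thread ever touches the queues - which is the point of the refactoring Rob
describes.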
> > > > On 15 October 2016 at 17:30, Ramayan Tiwari
> > > > <ramayan.tiw...@gmail.com> wrote:
> > > >
> > > > > Thanks Rob. Apologies for sending this over the weekend :(
> > > > >
> > > > > Are there any docs on the new threading model? I found this on
> > > > > Confluence:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/qpid/IO+Transport+Refactoring
> > > > >
> > > > > We are also interested in understanding the threading model a
> > > > > little better, to help us figure out its impact for our usage
> > > > > patterns. It would be very helpful if there are more
> > > > > docs/JIRAs/email threads with some details.
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey
> > > > > <rob.j.godf...@gmail.com> wrote:
> > > > >
> > > > > > So I *think* this is an issue because of the extremely large
> > > > > > number of consumers. The threading model in v6 means that
> > > > > > whenever a network read occurs for a connection, it iterates
> > > > > > over the consumers on that connection - obviously where there
> > > > > > are a large number of consumers this is burdensome. I fear
> > > > > > addressing this may not be a trivial change... I shall spend
> > > > > > the rest of my afternoon pondering this...
> > > > > >
> > > > > > - Rob
> > > > > >
> > > > > > On 15 October 2016 at 17:14, Ramayan Tiwari
> > > > > > <ramayan.tiw...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Rob,
> > > > > > >
> > > > > > > Thanks so much for your response. We use transacted sessions
> > > > > > > with non-persistent delivery. The prefetch size is 1 and
> > > > > > > every message is the same size (200 bytes).
> > > > > > >
> > > > > > > Thanks
> > > > > > > Ramayan
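For reference, the consumption pattern described here - transacted sessions,
non-persistent delivery, prefetch of 1 - might be set up with the 0.16 client
roughly as below. This is a sketch: the connection URL, credentials, virtual
host, and queue name are placeholders, and the maxprefetch connection option
is the usual way the Qpid Java client caps prefetch.

    import javax.jms.Connection;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.MessageListener;
    import javax.jms.Queue;
    import javax.jms.Session;

    import org.apache.qpid.client.AMQConnection;

    public class TransactedListenerSketch
    {
        public static void main(String[] args) throws Exception
        {
            // maxprefetch='1' limits the client to one unacknowledged message at a time
            Connection connection = new AMQConnection(
                    "amqp://guest:guest@clientid/default"
                    + "?brokerlist='tcp://localhost:5672'&maxprefetch='1'");
            connection.start();

            final Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            Queue queue = session.createQueue("queue_01");

            MessageConsumer consumer = session.createConsumer(queue);
            consumer.setMessageListener(new MessageListener()
            {
                @Override
                public void onMessage(Message message)
                {
                    try
                    {
                        // ~200 ms of work per message in the tests described below,
                        // then commit the transaction
                        session.commit();
                    }
                    catch (Exception e)
                    {
                        // roll back and log in a real application
                    }
                }
            });
        }
    }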
> > > > > > > On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey
> > > > > > > <rob.j.godf...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Ramayan,
> > > > > > > >
> > > > > > > > this is interesting... in our testing (which admittedly
> > > > > > > > didn't cover the case of this many queues / listeners) we
> > > > > > > > saw the 6.0.x broker using less CPU on average than the
> > > > > > > > 0.32 broker. I'll have a look this weekend at why creating
> > > > > > > > the listeners is slower. On the dequeuing, can you give a
> > > > > > > > little more information on the usage pattern - are you
> > > > > > > > using transactions, auto-ack or client ack? What prefetch
> > > > > > > > size are you using? How large are your messages?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Rob
> > > > > > > >
> > > > > > > > On 14 October 2016 at 23:46, Ramayan Tiwari
> > > > > > > > <ramayan.tiw...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi All,
> > > > > > > > >
> > > > > > > > > We have been validating the new Qpid broker (version
> > > > > > > > > 6.0.4), have compared it against broker version 0.32,
> > > > > > > > > and are seeing major regressions. The following is a
> > > > > > > > > summary of our test setup and results:
> > > > > > > > >
> > > > > > > > > *1. Test setup*
> > > > > > > > > *a).* The Qpid broker runs on a dedicated host (12
> > > > > > > > > cores, 32 GB RAM).
> > > > > > > > > *b).* For 0.32 we allocated a 16 GB heap. For the 6.0.4
> > > > > > > > > broker we use an 8 GB heap and 8 GB direct memory.
> > > > > > > > > *c).* For 6.0.4, flow to disk has been configured at
> > > > > > > > > 60%.
> > > > > > > > > *d).* Both brokers use the BDB virtual host type.
> > > > > > > > > *e).* The brokers have around 6000 queues, and we create
> > > > > > > > > 16 listener sessions/threads spread over 3 connections,
> > > > > > > > > where each session is listening to 3000 queues. However,
> > > > > > > > > messages are only enqueued to, and processed from, 10
> > > > > > > > > queues.
> > > > > > > > > *f).* We enqueue 1 million messages across 10 different
> > > > > > > > > queues (evenly divided) at the start of the test.
> > > > > > > > > Dequeuing only starts once all the messages have been
> > > > > > > > > enqueued. We run the test for 2 hours and process as
> > > > > > > > > many messages as we can. Each message takes around 200
> > > > > > > > > milliseconds to process.
> > > > > > > > > *g).* We have used both the 0.16 and 6.0.4 clients for
> > > > > > > > > these tests (the 6.0.4 client only with the 6.0.4
> > > > > > > > > broker).
> > > > > > > > >
> > > > > > > > > *2. Test results*
> > > > > > > > > *a).* The system load average (read the notes below on
> > > > > > > > > how we compute it) for the 6.0.4 broker is 5x that of
> > > > > > > > > the 0.32 broker. During the start of the test (when we
> > > > > > > > > are not doing any dequeuing), the load average is normal
> > > > > > > > > (0.05 for the 0.32 broker and 0.1 for the new broker);
> > > > > > > > > however, while we are dequeuing messages, the load
> > > > > > > > > average is very high (around 0.5 consistently).
> > > > > > > > > *b).* The time to create listeners in the new broker has
> > > > > > > > > gone up by 220% compared to the 0.32 broker (when using
> > > > > > > > > the 0.16 client). For the old broker, creating 16
> > > > > > > > > sessions each listening to 3000 queues takes 142
> > > > > > > > > seconds; in the new broker it took 456 seconds. If we
> > > > > > > > > use the 6.0.4 client, it takes even longer: a 524%
> > > > > > > > > increase (887 seconds).
> > > > > > > > > *I).* The time to create consumers increases as we
> > > > > > > > > create more listeners on the same connections. We have
> > > > > > > > > 20 sessions (but end up using around 5 of them) on each
> > > > > > > > > connection, and we create about 3000 consumers and
> > > > > > > > > attach a MessageListener to each. Each successive
> > > > > > > > > session takes longer (approximately linearly) to set up
> > > > > > > > > the same number of consumers and listeners.
> > > > > > > > >
> > > > > > > > > *3). How we compute the system load average*
> > > > > > > > > We query the MBean SystemLoadAverage and divide it by
> > > > > > > > > the value of the MBean AvailableProcessors. Both of
> > > > > > > > > these MBeans are available under
> > > > > > > > > java.lang.OperatingSystem.
> > > > > > > > >
> > > > > > > > > I am not sure what is causing these regressions and
> > > > > > > > > would like your help in understanding them. We are aware
> > > > > > > > > of the changes to the threading model in the new broker
> > > > > > > > > - are there any design docs that we can refer to in
> > > > > > > > > order to understand these changes at a high level? Can
> > > > > > > > > we tune some parameters to address these issues?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > > Ramayan
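The load-average computation described in (3) can be reproduced against the
broker's JVM with standard JMX. A minimal sketch follows, querying the
platform java.lang:type=OperatingSystem MBean; the JMX service URL (host and
port) is a placeholder.

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;

    import javax.management.JMX;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class LoadAverageProbe
    {
        public static void main(String[] args) throws Exception
        {
            // JMX URL for the broker JVM; host and port are placeholders
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://broker-host:9010/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url))
            {
                MBeanServerConnection mbsc = connector.getMBeanServerConnection();
                OperatingSystemMXBean os = JMX.newMXBeanProxy(
                        mbsc,
                        new ObjectName(ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME),
                        OperatingSystemMXBean.class);

                // SystemLoadAverage divided by AvailableProcessors, as described above
                double normalisedLoad = os.getSystemLoadAverage() / os.getAvailableProcessors();
                System.out.println("Normalised load average: " + normalisedLoad);
            }
        }
    }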