On 17 October 2016 at 21:50, Rob Godfrey <rob.j.godf...@gmail.com> wrote:

>
>
> On 17 October 2016 at 21:24, Ramayan Tiwari <ramayan.tiw...@gmail.com>
> wrote:
>
>> Hi Rob,
>>
>> We are certainly interested in testing the "multi queue consumers"
>> behavior
>> with your patch in the new broker. We would like to know:
>>
>> 1. What will be the scope of the changes: client, broker, or both? We are
>> currently running the 0.16 client, so we would like to make sure that we
>> will be able to use these changes with it.
>>
>>
> There's no change to the client.  I can't remember what was in the 0.16
> client... the only issue would be if there are any bugs in the parsing of
> address arguments.  I can try to test that out tomorrow.
>


OK - with a little bit of care to get round the address parsing issues in
the 0.16 client... I think we can get this to work.  I've created the
following JIRA:

https://issues.apache.org/jira/browse/QPID-7462

and attached to it are a patch which applies against trunk, and a separate
patch which applies against the 6.0.x branch (
https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is 6.0.4
plus a few other fixes which we will soon be releasing as 6.0.5)

To create a consumer which uses this feature (and multi queue consumption)
for the 0.16 client you need to use something like the following as the
address:

queue_01 ; { node : { type : queue },
             link : { x-subscribes : {
               arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
                             x-pull-only : true } } } }


Note that the initial queue_01 has to be the name of an actual queue on
the virtual host, but it is otherwise not actually used (if you were
using a 0.32 or later client you could just use '' here).  The actual
queues consumed from are those in the list value associated with
x-multiqueue.  For my testing I created a list of 3000 queues here
and this worked fine.
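
A minimal sketch of what this might look like from JMS code with the
0.16 client is below.  I haven't tested this exact snippet, so treat
it as a sketch: AMQConnectionFactory / AMQAnyDestination are the
legacy Java client classes, and the connection URL and queue names
are placeholders to substitute with your own details.

import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageListener;
import javax.jms.Session;
import org.apache.qpid.client.AMQAnyDestination;
import org.apache.qpid.client.AMQConnectionFactory;

public class MultiQueuePullConsumer
{
    public static void main(String[] args) throws Exception
    {
        // placeholder connection URL - substitute your broker details
        AMQConnectionFactory factory = new AMQConnectionFactory(
                "amqp://guest:guest@clientid/default?brokerlist='tcp://localhost:5672'");
        Connection connection = factory.createConnection();

        // transacted session, matching the usage described in this thread
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);

        // queue_01 must be a real queue on the virtual host; the queues
        // actually consumed from are those listed under x-multiqueue
        String address = "queue_01 ; { node : { type : queue }, "
                + "link : { x-subscribes : { arguments : { "
                + "x-multiqueue : [ queue_01, queue_02, queue_03 ], "
                + "x-pull-only : true } } } }";

        MessageConsumer consumer =
                session.createConsumer(new AMQAnyDestination(address));
        consumer.setMessageListener(new MessageListener()
        {
            public void onMessage(Message message)
            {
                // process the message, then commit the session
            }
        });
        connection.start();
    }
}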

Let me know if you have any questions / issues,

Hope this helps,
Rob


>
>
>> 2. My understanding is that the "pull vs push" change is only with respect
>> to the broker and it does not change our architecture, where we use a
>> MessageListener to receive messages asynchronously.
>>
>
> Exactly - this is only a change within the internal broker threading
> model.  The external behaviour of the broker remains essentially unchanged.
>
>
>>
>> 3. Once the I/O refactoring is complete, we should be able to go back to
>> using standard JMS consumers (Destination). What is the timeline and broker
>> release version for the completion of this work?
>>
>
> You might wish to continue to use the "multi queue" model, depending on
> your actual use case, but yeah once the I/O work is complete I would hope
> that you could use the thousands of consumers model should you wish.  We
> don't have a schedule for the next phase of I/O rework right now - about
> all I can say is that it is unlikely to be complete this year.  I'd need to
> talk with Keith (who is currently on vacation) as to when we think we may
> be able to schedule it.
>
>
>>
>> Let me know once you have integrated the patch and I will re-run our
>> performance tests to validate it.
>>
>>
> I'll make a patch for 6.0.x presently (I've been working on a change
> against trunk - the patch will probably have to change a bit to apply to
> 6.0.x).
>
> Cheers,
> Rob
>
>> Thanks
>> Ramayan
>>
>> On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey <rob.j.godf...@gmail.com>
>> wrote:
>>
>> > OK - so having pondered / hacked around a bit this weekend, I think to
>> > get decent performance from the IO model in 6.0 for your use case
>> > we're going to have to change things around a bit.
>> >
>> > Basically 6.0 is an intermediate step on our IO / threading model
>> > journey.  In earlier versions we used 2 threads per connection for IO
>> > (one read, one write) and then extra threads from a pool to "push"
>> > messages from queues to connections.
>> >
>> > In 6.0 we moved to using a pool for the IO threads, and also stopped
>> > queues from "pushing" to connections while the IO threads were acting
>> > on the connection.  It's this latter fact which is screwing up
>> > performance for your use case here, because what happens is that on
>> > each network read we tell each consumer to stop accepting pushes from
>> > the queue until the IO interaction has completed.  This causes lots of
>> > loops over your 3000 consumers on each session, which eats up a lot of
>> > CPU on every network interaction.
>> >
>> > In the final version of our IO refactoring we want to remove the
>> > "pushing" from the queue, and instead have the consumers "pull" - so
>> > that the only threads that operate on the queues (outside of
>> > housekeeping tasks like expiry) will be the IO threads.
>> >
>> > So, what we could do (and I have a patch sitting on my laptop for
>> > this) is to look at using the "multi queue consumers" work I did for
>> > you guys before, but augmenting it so that the consumers work using a
>> > "pull" model rather than the push model.  This will guarantee strict
>> > fairness between the queues associated with the consumer (which was
>> > the issue you had with this functionality before, I believe).  Using
>> > this model you'd only need a small number (one?) of consumers per
>> > session.  The patch I have adds this "pull" mode for these consumers
>> > (essentially this is a preview of how all consumers will work in the
>> > future).
>> >
>> > Does this seem like something you would be interested in pursuing?
>> >
>> > Cheers,
>> > Rob
>> >
>> > On 15 October 2016 at 17:30, Ramayan Tiwari <ramayan.tiw...@gmail.com>
>> > wrote:
>> >
>> > > Thanks Rob. Apologies for sending this over the weekend :(
>> > >
>> > > Are there any docs on the new threading model? I found this on
>> > > Confluence:
>> > >
>> > > https://cwiki.apache.org/confluence/display/qpid/IO+Transport+Refactoring
>> > >
>> > > We are also interested in understanding the threading model a little
>> > > better to help us figure out its impact on our usage patterns. It
>> > > would be very helpful if there are more docs/JIRAs/email threads with
>> > > some details.
>> > >
>> > > Thanks
>> > >
>> > > On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey <rob.j.godf...@gmail.com>
>> > > wrote:
>> > >
>> > > > So I *think* this is an issue because of the extremely large
>> > > > number of consumers.  The threading model in v6 means that
>> > > > whenever a network read occurs for a connection, it iterates over
>> > > > the consumers on that connection - obviously where there are a
>> > > > large number of consumers this is burdensome.  I fear addressing
>> > > > this may not be a trivial change... I shall spend the rest of my
>> > > > afternoon pondering this...
>> > > >
>> > > > - Rob
>> > > >
>> > > > On 15 October 2016 at 17:14, Ramayan Tiwari <ramayan.tiw...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Hi Rob,
>> > > > >
>> > > > > Thanks so much for your response. We use transacted sessions with
>> > > > > non-persistent delivery. The prefetch size is 1 and every message
>> > > > > is the same size (200 bytes).
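>> > > > >
>> > > > > In case it's useful, a rough sketch of how we set this up is
>> > > > > below (the broker URL and queue name are placeholders; prefetch
>> > > > > is set via the legacy client's maxprefetch connection option):
>> > > > >
>> > > > > import javax.jms.Connection;
>> > > > > import javax.jms.DeliveryMode;
>> > > > > import javax.jms.MessageProducer;
>> > > > > import javax.jms.Queue;
>> > > > > import javax.jms.Session;
>> > > > > import org.apache.qpid.client.AMQConnectionFactory;
>> > > > >
>> > > > > public class TransactedNonPersistentSetup
>> > > > > {
>> > > > >     public static void main(String[] args) throws Exception
>> > > > >     {
>> > > > >         // prefetch of 1 via the maxprefetch connection option
>> > > > >         AMQConnectionFactory factory = new AMQConnectionFactory(
>> > > > >                 "amqp://guest:guest@clientid/default"
>> > > > >                 + "?brokerlist='tcp://localhost:5672'&maxprefetch='1'");
>> > > > >         Connection connection = factory.createConnection();
>> > > > >
>> > > > >         // transacted session; producer uses non-persistent delivery
>> > > > >         Session session =
>> > > > >                 connection.createSession(true, Session.SESSION_TRANSACTED);
>> > > > >         Queue queue = session.createQueue("queue_01");  // placeholder
>> > > > >         MessageProducer producer = session.createProducer(queue);
>> > > > >         producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);
>> > > > >     }
>> > > > > }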
>> > > > >
>> > > > > Thanks
>> > > > > Ramayan
>> > > > >
>> > > > > On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey <rob.j.godf...@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi Ramayan,
>> > > > > >
>> > > > > > this is interesting... in our testing (which admittedly didn't
>> > > > > > cover the case of this many queues / listeners) we saw the 6.0.x
>> > > > > > broker using less CPU on average than the 0.32 broker.  I'll have
>> > > > > > a look this weekend as to why creating the listeners is slower.
>> > > > > > On the dequeuing, can you give a little more information on the
>> > > > > > usage pattern - are you using transactions, auto-ack or client
>> > > > > > ack?  What prefetch size are you using?  How large are your
>> > > > > > messages?
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Rob
>> > > > > >
>> > > > > > On 14 October 2016 at 23:46, Ramayan Tiwari <ramayan.tiw...@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Hi All,
>> > > > > > >
>> > > > > > > We have been validating the new Qpid broker (version 6.0.4),
>> > > > > > > have compared it against broker version 0.32, and are seeing
>> > > > > > > major regressions. The following is a summary of our test setup
>> > > > > > > and results:
>> > > > > > >
>> > > > > > > *1. Test Setup *
>> > > > > > >   *a).* The Qpid broker runs on a dedicated host (12 cores,
>> > > > > > > 32 GB RAM).
>> > > > > > >   *b).* For 0.32, we allocated a 16 GB heap. For the 6.0.4
>> > > > > > > broker, we use an 8 GB heap and 8 GB direct memory.
>> > > > > > >   *c).* For 6.0.4, flow to disk has been configured at 60%.
>> > > > > > >   *d).* Both brokers use the BDB virtual host type.
>> > > > > > >   *e).* Brokers have around 6000 queues and we create 16
>> > > > > > > listener sessions/threads spread over 3 connections, where
>> > > > > > > each session is listening to 3000 queues. However, messages
>> > > > > > > are only enqueued and processed from 10 queues.
>> > > > > > >   *f).* We enqueue 1 million messages across 10 different
>> > > > > > > queues (evenly divided) at the start of the test. Dequeuing
>> > > > > > > only starts once all the messages have been enqueued. We run
>> > > > > > > the test for 2 hours and process as many messages as we can.
>> > > > > > > Each message takes around 200 milliseconds to process.
>> > > > > > >   *g).* We have used both the 0.16 and 6.0.4 clients for
>> > > > > > > these tests (the 6.0.4 client only with the 6.0.4 broker).
>> > > > > > >
>> > > > > > > *2. Test Results *
>> > > > > > >   *a).* The System Load Average (read the notes below on how
>> > > > > > > we compute it) for the 6.0.4 broker is 5x that of the 0.32
>> > > > > > > broker. At the start of the test (when we are not doing any
>> > > > > > > dequeuing), the load average is normal (0.05 for the 0.32
>> > > > > > > broker and 0.1 for the new broker); however, while we are
>> > > > > > > dequeuing messages, the load average is very high (around 0.5
>> > > > > > > consistently).
>> > > > > > >
>> > > > > > >   *b).* The time to create listeners in the new broker has
>> > > > > > > gone up by 220% compared to the 0.32 broker (when using the
>> > > > > > > 0.16 client). For the old broker, creating 16 sessions each
>> > > > > > > listening to 3000 queues took 142 seconds; in the new broker
>> > > > > > > it took 456 seconds. With the 6.0.4 client it took even
>> > > > > > > longer, a 524% increase (887 seconds).
>> > > > > > >      *I).* The time to create consumers increases as we
>> > > > > > > create more listeners on the same connections. We have 20
>> > > > > > > sessions (but end up using around 5 of them) on each
>> > > > > > > connection, and we create about 3000 consumers and attach a
>> > > > > > > MessageListener to each. Each successive session takes longer
>> > > > > > > (an approximately linear increase) to set up the same number
>> > > > > > > of consumers and listeners.
>> > > > > > >
>> > > > > > > *3). How we compute System Load Average*
>> > > > > > > We query the SystemLoadAverage attribute and divide it by the
>> > > > > > > value of the AvailableProcessors attribute. Both are available
>> > > > > > > on the java.lang:type=OperatingSystem MBean.
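>> > > > > > >
>> > > > > > > For reference, a minimal sketch of this computation using the
>> > > > > > > standard platform MXBean (this assumes in-process access; a
>> > > > > > > remote JMX client would read the same two attributes through
>> > > > > > > an MBeanServerConnection instead):
>> > > > > > >
>> > > > > > > import java.lang.management.ManagementFactory;
>> > > > > > > import java.lang.management.OperatingSystemMXBean;
>> > > > > > >
>> > > > > > > public class NormalisedLoad
>> > > > > > > {
>> > > > > > >     public static void main(String[] args)
>> > > > > > >     {
>> > > > > > >         OperatingSystemMXBean os =
>> > > > > > >                 ManagementFactory.getOperatingSystemMXBean();
>> > > > > > >         // system load average over the last minute, divided
>> > > > > > >         // by the number of processors available to the JVM
>> > > > > > >         double normalised = os.getSystemLoadAverage()
>> > > > > > >                 / os.getAvailableProcessors();
>> > > > > > >         System.out.println("Normalised load: " + normalised);
>> > > > > > >     }
>> > > > > > > }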
>> > > > > > >
>> > > > > > > I am not sure what is causing these regressions and would
>> > > > > > > like your help in understanding them. We are aware of the
>> > > > > > > changes to the threading model in the new broker; are there
>> > > > > > > any design docs that we can refer to in order to understand
>> > > > > > > these changes at a high level? Can we tune some parameters to
>> > > > > > > address these issues?
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > > Ramayan
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
