Hi Rob,

We are certainly interested in testing the "multi queue consumers" behavior
with your patch in the new broker. We would like to know:

1. What will be the scope of the changes: client, broker, or both? We are
currently running the 0.16 client, so we would like to make sure that we
will be able to use these changes with it.

2. My understanding is that the "pull vs push" change is only with respect
to the broker, and it does not change our architecture, where we use a
MessageListener to receive messages asynchronously.

3. Once the I/O refactoring is complete, we should be able to go back to
using a standard JMS consumer per Destination. What is the timeline and
broker release version for the completion of this work?
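To illustrate the push vs. pull distinction for our own notes (a plain-JDK
analogy only, not Qpid internals; the class name is made up for the sketch):
in a pull model, the consuming thread drains the queue itself, so no
queue-side thread has to touch the consumer to deliver messages.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PullModelSketch {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
        // Producer side: enqueue a couple of messages.
        queue.put("m1");
        queue.put("m2");
        // Pull model: the consumer thread polls the queue itself, rather
        // than a queue-side thread "pushing" into the consumer.
        String msg;
        while ((msg = queue.poll()) != null) {
            System.out.println("consumed " + msg); // prints "consumed m1" then "consumed m2"
        }
    }
}
```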

Let me know once you have integrated the patch and I will re-run our
performance tests to validate it.

Thanks
Ramayan
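
For reference, the normalized load-average metric described in the quoted
test report below can be computed in-process via the platform MXBean API
(the programmatic equivalent of the SystemLoadAverage and
AvailableProcessors attributes we query over JMX); a minimal sketch, with
an illustrative class name:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class NormalizedLoad {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double load = os.getSystemLoadAverage(); // 1-minute average; -1.0 if unavailable
        int cpus = os.getAvailableProcessors();
        // Normalize by core count so that 1.0 means "all cores fully busy".
        double normalized = load >= 0 ? load / cpus : Double.NaN;
        System.out.println("normalized load = " + normalized);
    }
}
```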

On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey <rob.j.godf...@gmail.com>
wrote:

> OK - so having pondered / hacked around a bit this weekend, I think to get
> decent performance from the IO model in 6.0 for your use case we're going
> to have to change things around a bit.
>
> Basically 6.0 is an intermediate step on our IO / threading model journey.
> In earlier versions we used 2 threads per connection for IO (one read, one
> write) and then extra threads from a pool to "push" messages from queues to
> connections.
>
> In 6.0 we moved to using a pool for the IO threads, and also stopped queues
> from "pushing" to connections while the IO threads were acting on the
> connection.  It's this latter fact which is screwing up performance for
> your use case here because what happens is that on each network read we
> tell each consumer to stop accepting pushes from the queue until the IO
> interaction has completed.  This is causing lots of loops over your 3000
> consumers on each session, which is eating up a lot of CPU on every network
> interaction.
>
> In the final version of our IO refactoring we want to remove the "pushing"
> from the queue, and instead have the consumers "pull" - so that the only
> threads that operate on the queues (outside of housekeeping tasks like
> expiry) will be the IO threads.
>
> So, what we could do (and I have a patch sitting on my laptop for this) is
> to look at using the "multi queue consumers" work I did for you guys
> before, but augmenting this so that the consumers work using a "pull" model
> rather than the push model.  This will guarantee strict fairness between
> the queues associated with the consumer (which was the issue you had with
> this functionality before, I believe).  Using this model you'd only need a
> small number (one?) of consumers per session.  The patch I have is to add
> this "pull" mode for these consumers (essentially this is a preview of how
> all consumers will work in the future).
>
> Does this seem like something you would be interested in pursuing?
>
> Cheers,
> Rob
>
> On 15 October 2016 at 17:30, Ramayan Tiwari <ramayan.tiw...@gmail.com>
> wrote:
>
> > Thanks Rob. Apologies for sending this over weekend :(
> >
> > Are there any docs on the new threading model? I found this on confluence:
> >
> > https://cwiki.apache.org/confluence/display/qpid/IO+Transport+Refactoring
> >
> > We are also interested in understanding the threading model a little
> > better to help us figure out its impact on our usage patterns. It would
> > be very helpful if there are more docs/JIRAs/email threads with details.
> >
> > Thanks
> >
> > On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey <rob.j.godf...@gmail.com>
> > wrote:
> >
> > > So I *think* this is an issue because of the extremely large number of
> > > consumers.  The threading model in v6 means that whenever a network read
> > > occurs for a connection, it iterates over the consumers on that
> > > connection - obviously where there are a large number of consumers this
> > > is burdensome.  I fear addressing this may not be a trivial change...  I
> > > shall spend the rest of my afternoon pondering this...
> > >
> > > - Rob
> > >
> > > On 15 October 2016 at 17:14, Ramayan Tiwari <ramayan.tiw...@gmail.com>
> > > wrote:
> > >
> > > > Hi Rob,
> > > >
> > > > Thanks so much for your response. We use transacted sessions with
> > > > non-persistent delivery. Prefetch size is 1, and every message is the
> > > > same size (200 bytes).
> > > >
> > > > Thanks
> > > > Ramayan
> > > >
> > > > On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey <
> rob.j.godf...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Ramayan,
> > > > >
> > > > > this is interesting... in our testing (which admittedly didn't
> > > > > cover the case of this many queues / listeners) we saw the 6.0.x
> > > > > broker using less CPU on average than the 0.32 broker.  I'll have a
> > > > > look this weekend as to why creating the listeners is slower.  On
> > > > > the dequeuing, can you give a little more information on the usage
> > > > > pattern - are you using transactions, auto-ack or client ack?  What
> > > > > prefetch size are you using?  How large are your messages?
> > > > >
> > > > > Thanks,
> > > > > Rob
> > > > >
> > > > > On 14 October 2016 at 23:46, Ramayan Tiwari <
> > ramayan.tiw...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > We have been validating the new Qpid broker (version 6.0.4)
> > > > > > against broker version 0.32 and are seeing major regressions.
> > > > > > Following is a summary of our test setup and results:
> > > > > >
> > > > > > *1. Test Setup *
> > > > > >   *a).* Qpid broker runs on a dedicated host (12 cores, 32 GB RAM).
> > > > > >   *b).* For 0.32, we allocated a 16 GB heap. For the 6.0.4 broker,
> > > > > > we use an 8 GB heap and 8 GB direct memory.
> > > > > >   *c).* For 6.0.4, flow to disk has been configured at 60%.
> > > > > >   *d).* Both brokers use the BDB host type.
> > > > > >   *e).* Brokers have around 6000 queues, and we create 16 listener
> > > > > > sessions/threads spread over 3 connections, where each session is
> > > > > > listening to 3000 queues. However, messages are only enqueued and
> > > > > > processed from 10 queues.
> > > > > >   *f).* We enqueue 1 million messages across 10 different queues
> > > > > > (evenly divided) at the start of the test. Dequeuing only starts
> > > > > > once all the messages have been enqueued. We run the test for 2
> > > > > > hours and process as many messages as we can. Each message takes
> > > > > > around 200 milliseconds to process.
> > > > > >   *g).* We have used both the 0.16 and 6.0.4 clients for these
> > > > > > tests (the 6.0.4 client only with the 6.0.4 broker).
> > > > > >
> > > > > > *2. Test Results *
> > > > > >   *a).* The System Load Average (read notes below on how we
> > > > > > compute it) for the 6.0.4 broker is 5x that of the 0.32 broker. At
> > > > > > the start of the test (when we are not doing any dequeuing), the
> > > > > > load average is normal (0.05 for the 0.32 broker and 0.1 for the
> > > > > > new broker); however, while we are dequeuing messages, the load
> > > > > > average is very high (around 0.5 consistently).
> > > > > >
> > > > > >   *b).* Time to create listeners in the new broker has gone up by
> > > > > > 220% compared to the 0.32 broker (when using the 0.16 client). For
> > > > > > the old broker, creating 16 sessions each listening to 3000 queues
> > > > > > takes 142 seconds; in the new broker it took 456 seconds. With the
> > > > > > 6.0.4 client it took even longer: a 524% increase (887 seconds).
> > > > > >      *I).* The time to create consumers increases as we create
> > > > > > more listeners on the same connection. We have 20 sessions (but
> > > > > > end up using around 5 of them) on each connection, and we create
> > > > > > about 3000 consumers and attach a MessageListener to each. Each
> > > > > > successive session takes longer (an approximately linear increase)
> > > > > > to set up the same number of consumers and listeners.
> > > > > >
> > > > > > *3). How we compute System Load Average *
> > > > > > We query the MBean attribute SystemLoadAverage and divide it by
> > > > > > the value of the attribute AvailableProcessors. Both of these are
> > > > > > available on the java.lang OperatingSystem MBean.
> > > > > >
> > > > > > I am not sure what is causing these regressions and would like
> > > > > > your help in understanding them. We are aware of the changes to
> > > > > > the threading model in the new broker; are there any design docs
> > > > > > we can refer to in order to understand these changes at a high
> > > > > > level? Can we tune some parameters to address these issues?
> > > > > >
> > > > > > Thanks
> > > > > > Ramayan
> > > > > >
> > > > >
> > > >
> > >
> >
>
