Re: Qpid broker 6.0.4 performance issues

2017-01-04 Thread Rob Godfrey
Hi Ramayan,

On 4 January 2017 at 23:54, Ramayan Tiwari  wrote:

> Hi Lorenz,
>
> Happy new year to everyone, hope you guys had fun!
>
> I am doing performance test runs to figure out a reasonable threshold for
> direct memory, considering our use case of small message payloads. I have a
> few more questions:
>
> 1. I assume there is no way to disable direct memory (so that we always use
> heap to store messages). However, I was wondering whether this could be
> offered as part of the broker config (unless it is a significant change and
> infeasible to do).
>

Unfortunately it's not that simple (as I guess you might have expected).
Moving to direct memory wasn't something we did as an active choice...
what actually happened is that we discovered that, when we moved to Java
NIO socket channels rather than blocking IO, the JVM libraries were taking
our heap buffers and then allocating direct memory and copying the contents
into it.  Because of issues with how these (JVM-library-managed) direct
byte buffers were being cached, we saw much more frequent out-of-memory
conditions, and from code that we really had no way to control...  So the
decision to start using direct memory was essentially forced upon us by our
move to NIO.  For the same reason it doesn't really make sense to offer a
"use heap memory" option, as you'd still end up using direct memory
(because of the Java library), just in a much harder to control way.
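
To illustrate the mechanism (a minimal sketch using only standard NIO; the host/port are placeholders for any TCP listener, e.g. a local broker): when a heap buffer is handed to a socket channel, the JVM copies it into an internal pooled direct buffer before writing, whereas a direct buffer is written as-is.

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class HeapVsDirectWrite
{
    public static void main(String[] args) throws Exception
    {
        byte[] payload = "example payload".getBytes(StandardCharsets.US_ASCII);

        // Placeholder endpoint - any TCP listener will do for the illustration.
        try (SocketChannel channel = SocketChannel.open(new InetSocketAddress("localhost", 5672)))
        {
            // Heap buffer: the JVM copies its contents into a cached direct buffer
            // before the actual socket write, so direct memory is consumed anyway.
            ByteBuffer heap = ByteBuffer.wrap(payload);
            channel.write(heap);

            // Direct buffer: allocated outside the heap up front and written without
            // the hidden copy - which is why the broker now manages these itself.
            ByteBuffer direct = ByteBuffer.allocateDirect(payload.length);
            direct.put(payload);
            direct.flip();
            channel.write(direct);
        }
    }
}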


>
> 2. We monitor Qpid broker memory and stop enqueueing at a certain threshold
> to protect the broker heap. Now that message payloads are stored in DM, we
> would like to track that as well. However, the DM size reported by the MBean
> doesn't correlate directly with the sum of message payloads (possibly because
> the byte buffers are not allocated/de-allocated that frequently). So to get
> around that, we are considering using the sum of queue sizes to get the actual
> DM usage.
>
>
Yes - the way we use direct memory is to try to cache the allocated buffers
and re-use them rather than constantly creating and GCing them (because
there are issues with how the JVM does (or rather doesn't) GC direct
buffers).  The broker's own flow control / flow-to-disk algorithms are
based around aggregating the sizes reported by the queues, so this is
certainly a viable approach.  As you know we are deprecating the use of
JMX, but you do bring up the issue that we are not exposing (through any
management channel) a way of doing deep inspection into how direct memory is
being managed.  Potentially we should add a mechanism to enquire how
much direct memory is allocated and in use vs. how much is in the cache
but actually "free".
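
In the meantime, the JVM's own buffer-pool statistics can serve as a stopgap. A minimal sketch using the standard BufferPoolMXBean; note that it reports what the JVM has allocated in total and cannot distinguish buffers the broker is actively using from those sitting in its internal cache:

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class DirectMemoryProbe
{
    public static void main(String[] args)
    {
        // The "direct" pool covers every direct ByteBuffer in the JVM, including
        // the ones the broker keeps cached for re-use.
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class))
        {
            if ("direct".equals(pool.getName()))
            {
                System.out.printf("direct buffers: count=%d, used=%d bytes, capacity=%d bytes%n",
                                  pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
            }
        }
    }
}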

-- Rob


> Thanks
> Ramayan
>
> On Wed, Dec 21, 2016 at 5:17 AM, Lorenz Quack 
> wrote:
>
> > Hi,
> >
> > Regarding 0.32 behaviour, it checked to see whether to flow a message
> > to disk when putting a message on the Queue the same way Qpid 6 does.
> > In that sense 6 is not more or less aggressive.  However, the
> > algorithm behind the decision whether or not to flow to disk has
> > changed.  This change was done as part of a larger effort to isolate
> > VirtualHosts from each other.  I would have to go back and check how
> > the algorithm worked previously but I would assume that it just
> > considered the total (estimated) amount of memory used and did no per
> > VH or per Queue allocation.  This means 6 effectively lowers the
> > threshold on individual Queues especially in the case where some
> > VirtualHosts and/or Queues are used less than others.  On the upside,
> > the broker is fairer in its resource management and a single
> > VirtualHost can no longer use up all available memory.  How exactly
> > that trade-off between fairness and efficient use of available memory
> > is made is debatable but I don't think we want to go back to the pre-6
> > model of just lumping everything together.
> >
> > Given your numbers (1 VH, 6000 Qs) each Queue would initially be
> > allocated 1/6000th of 60% of 8 GB ≈ 1 MB.  In the end state the full
> > Queues should end up with approximately 780 MB but as you noticed the
> > threshold is only recalculated periodically during housekeeping (by
> > default every 30 s) or when a VH or Queue is added or deleted.  If you
> > have DEBUG logging you should see periodic messages like "Allocating
> > target size to queues [...]"  if not then I am afraid you won't be
> > able to tell the current thresholds because they are only reported
> > once when flowToDisk becomes active/inactive.
> >
> > So I think your analysis is probably correct that the revision of the
> > threshold is always "behind" the publishing, raising it on every
> > revision but never far enough to prevent flowToDisk.  This is not
> > ideal.  We will have to address this.  However, I am afraid that in
> > the current release there is no way to influence the algorithm other
> > than setting the available memory and broker.flowToDiskThreshold.
> >

Re: Qpid broker 6.0.4 performance issues

2017-01-04 Thread Ramayan Tiwari
Hi Lorenz,

Happy new year to everyone, hope you guys had fun!

I am doing performance test runs to figure out a reasonable threshold for
direct memory, considering our use case of small message payloads. I have a
few more questions:

1. I assume there is no way to disable direct memory (so that we always use
heap to store messages). However, I was wondering whether this could be
offered as part of the broker config (unless it is a significant change and
infeasible to do).

2. We monitor Qpid broker memory and stop enqueueing at a certain threshold
to protect the broker heap. Now that message payloads are stored in DM, we
would like to track that as well. However, the DM size reported by the MBean
doesn't correlate directly with the sum of message payloads (possibly because
the byte buffers are not allocated/de-allocated that frequently). So to get
around that, we are considering using the sum of queue sizes to get the actual
DM usage.
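
A minimal sketch of that approach over the HTTP management interface. The port, credentials, endpoint path and the queueDepthBytes statistic name are assumptions based on recent Broker-J releases; verify them against the REST output of your broker version.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QueueDepthSum
{
    public static void main(String[] args) throws Exception
    {
        // Assumed management endpoint and credentials - adjust for your deployment.
        URL url = new URL("http://localhost:8080/api/latest/queue");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        String auth = Base64.getEncoder()
                .encodeToString("guest:guest".getBytes(StandardCharsets.UTF_8));
        connection.setRequestProperty("Authorization", "Basic " + auth);

        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8)))
        {
            String line;
            while ((line = reader.readLine()) != null)
            {
                body.append(line);
            }
        }

        // Crude extraction of every "queueDepthBytes" value from the JSON response;
        // their sum approximates the payload bytes held in direct memory.
        Matcher matcher = Pattern.compile("\"queueDepthBytes\"\\s*:\\s*(\\d+)").matcher(body);
        long totalBytes = 0;
        while (matcher.find())
        {
            totalBytes += Long.parseLong(matcher.group(1));
        }
        System.out.println("Sum of queue payload sizes: " + totalBytes + " bytes");
    }
}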

Thanks
Ramayan

On Wed, Dec 21, 2016 at 5:17 AM, Lorenz Quack 
wrote:

> Hi,
>
> Regarding 0.32 behaviour, it checked to see whether to flow a message
> to disk when putting a message on the Queue the same way Qpid 6 does.
> In that sense 6 is not more or less aggressive.  However, the
> algorithm behind the decision whether or not to flow to disk has
> changed.  This change was done as part of a larger effort to isolate
> VirtualHosts from each other.  I would have to go back and check how
> the algorithm worked previously but I would assume that it just
> considered the total (estimated) amount of memory used and did no per
> VH or per Queue allocation.  This means 6 effectively lowers the
> threshold on individual Queues especially in the case where some
> VirtualHosts and/or Queues are used less than others.  On the upside,
> the broker is fairer in its resource management and a single
> VirtualHost can no longer use up all available memory.  How exactly
> that trade-off between fairness and efficient use of available memory
> is made is debatable but I don't think we want to go back to the pre-6
> model of just lumping everything together.
>
> Given your numbers (1 VH, 6000 Qs) each Queue would initially be
> allocated 1/6000th of 60% of 8 GB ≈ 1 MB.  In the end state the full
> Queues should end up with approximately 780 MB but as you noticed the
> threshold is only recalculated periodically during housekeeping (by
> default every 30 s) or when a VH or Queue is added or deleted.  If you
> have DEBUG logging you should see periodic messages like "Allocating
> target size to queues [...]"  if not then I am afraid you won't be
> able to tell the current thresholds because they are only reported
> once when flowToDisk becomes active/inactive.
>
> So I think your analysis is probably correct that the revision of the
> threshold is always "behind" the publishing, raising it on every
> revision but never far enough to prevent flowToDisk.  This is not
> ideal.  We will have to address this.  However, I am afraid that in
> the current release there is no way to influence the algorithm other
> than setting the available memory and broker.flowToDiskThreshold.
>
> Regarding the MemoryStore, the algorithm triggering flowToDisk is the
> same for all stores, just the implementation of the actual writing
> messages to disk differs.  For the MemoryStore it is a noop, i.e., the
> message is not flowed to disk and remains in memory.  Performance-wise
> we do not do a lot of testing with the MemoryStore because it is not a
> typical use-case and mainly used for unit and system testing.  I would
> assume that the better distribution you are seeing is coincidental
> since that part of the code should be relatively independent of the
> store type.  Unfortunately, I cannot see any of your graphs.  I
> believe the mailing list strips all attachments.
>
> Regarding a recommendation of how to configure your DM vs Heap I would
> like to refer you to our documentation [1], especially section
> "9.11.6. Memory Tuning the Broker".  There we provide formulas to
> estimate the memory consumption of the broker for both DM and Heap.
> Note that these are estimates and you should test your chosen settings
> under a typical peak workload.  Given that your messages are small you
> will probably want to favour Heap over DM but I am reluctant to make
> an explicit recommendation.
>
> Kind regards,
> Lorenz
>
> P.S.: I am going on a 2 day vacation later today but feel free to
> continue this conversation with others on this list.
>
> [1] https://qpid.apache.org/releases/qpid-java-6.1.0/java-broker
> /book/Java-Broker-Runtime-Memory.html
>
> On 20/12/16 20:37, Ramayan Tiwari wrote:
>
>> Hi Lorenz,
>>
>> Thanks a lot for your response and explaining the flow to disk algorithm
>> in detail. I have described the test setup in detail in the first email of
>> this thread, to summarize the points again:
>> a) There is only one virtual host.
>> b) There are 6000 queues in this virtual host, but messages are only
>> enqueued to 10 queues.

Re: Qpid broker 6.0.4 performance issues

2016-12-21 Thread Lorenz Quack

Hi,

Regarding the 0.32 behaviour: it checked whether to flow a message to disk
when putting the message on the Queue, the same way Qpid 6 does.
In that sense 6 is not more or less aggressive.  However, the
algorithm behind the decision whether or not to flow to disk has
changed.  This change was done as part of a larger effort to isolate
VirtualHosts from each other.  I would have to go back and check how
the algorithm worked previously but I would assume that it just
considered the total (estimated) amount of memory used and did no per
VH or per Queue allocation.  This means 6 effectively lowers the
threshold on individual Queues especially in the case where some
VirtualHosts and/or Queues are used less than others.  On the upside,
the broker is fairer in its resource management and a single
VirtualHost can no longer use up all available memory.  How exactly
that trade-off between fairness and efficient use of available memory
is made is debatable but I don't think we want to go back to the pre-6
model of just lumping everything together.

Given your numbers (1 VH, 6000 Qs) each Queue would initially be
allocated 1/6000th of 60% of 8 GB ≈ 1 MB.  In the end state the full
Queues should end up with approximately 780 MB but as you noticed the
threshold is only recalculated periodically during housekeeping (by
default every 30 s) or when a VH or Queue is added or deleted.  If you
have DEBUG logging you should see periodic messages like "Allocating
target size to queues [...]"; if not, then I am afraid you won't be
able to tell the current thresholds because they are only reported
once when flowToDisk becomes active/inactive.
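
To make the arithmetic above concrete, a small sketch using the figures from this thread (one VirtualHost, 6000 queues, 8 GB direct memory, 60% flow-to-disk threshold):

public class InitialQueueTargetSize
{
    public static void main(String[] args)
    {
        long directMemoryBytes = 8L * 1024 * 1024 * 1024;  // 8 GB of direct memory
        double flowToDiskThreshold = 0.60;                 // broker.flowToDiskThreshold = 60%
        int virtualHosts = 1;
        int queuesPerHost = 6000;

        // A single, initially empty VirtualHost receives the whole threshold,
        // which is then split evenly across its (equally empty) queues.
        double perHost = directMemoryBytes * flowToDiskThreshold / virtualHosts;
        double perQueue = perHost / queuesPerHost;

        System.out.printf("initial per-queue target ~= %.2f MB%n", perQueue / (1024 * 1024));
        // Prints roughly 0.82 MB - the "1/6000th of 60% of 8 GB" quoted above.
    }
}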

So I think your analysis is probably correct that the revision of the
threshold is always "behind" the publishing, raising it on every
revision but never far enough to prevent flowToDisk.  This is not
ideal.  We will have to address this.  However, I am afraid that in
the current release there is no way to influence the algorithm other
than setting the available memory and broker.flowToDiskThreshold.

Regarding the MemoryStore, the algorithm triggering flowToDisk is the
same for all stores; only the implementation of actually writing
messages to disk differs.  For the MemoryStore it is a no-op, i.e., the
message is not flowed to disk and remains in memory.  Performance-wise,
we do not do a lot of testing with the MemoryStore because it is not a
typical use case and is mainly used for unit and system testing.  I would
assume that the better distribution you are seeing is coincidental
since that part of the code should be relatively independent of the
store type.  Unfortunately, I cannot see any of your graphs.  I
believe the mailing list strips all attachments.

Regarding a recommendation of how to configure your DM vs Heap I would
like to refer you to our documentation [1], especially section
"9.11.6. Memory Tuning the Broker".  There we provide formulas to
estimate the memory consumption of the broker for both DM and Heap.
Note that these are estimates and you should test your chosen settings
under a typical peak workload.  Given that your messages are small you
will probably want to favour Heap over DM but I am reluctant to make
an explicit recommendation.
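
As a rough back-of-the-envelope check (not the documented formulas - just the figures reported in this thread: ~220-byte average payloads held in direct memory, and the roughly 1 KB per-message heap overhead observed in the earlier tests):

public class BrokerMemoryEstimate
{
    public static void main(String[] args)
    {
        long messageCount = 1_000_000L;     // messages resident on the broker at peak
        long avgPayloadBytes = 220L;        // average payload size from the tests in this thread
        long heapOverheadPerMsg = 1024L;    // ~1 KB heap overhead per message, as observed

        long directBytes = messageCount * avgPayloadBytes;   // payload lives in direct memory
        long heapBytes = messageCount * heapOverheadPerMsg;  // per-message bookkeeping on the heap

        System.out.printf("estimated direct memory: %.1f MB%n", directBytes / (1024.0 * 1024));
        System.out.printf("estimated heap:          %.1f MB%n", heapBytes / (1024.0 * 1024));
        // For payloads this small the heap requirement dominates, which is why
        // favouring heap over direct memory is suggested above.
    }
}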

Kind regards,
Lorenz

P.S.: I am going on a 2 day vacation later today but feel free to
continue this conversation with others on this list.

[1] 
https://qpid.apache.org/releases/qpid-java-6.1.0/java-broker/book/Java-Broker-Runtime-Memory.html


On 20/12/16 20:37, Ramayan Tiwari wrote:

Hi Lorenz,

Thanks a lot for your response and explaining the flow to disk 
algorithm in detail. I have described the test setup in detail in the 
first email of this thread, to summarize the points again:

a) There is only one virtual host.
b) There are 6000 queues in this virtual host, but messages are only 
enqueued to 10 queues.
c) Every queue gets equal number of messages (100k) at the start of 
the test (we do not start dequeue till all the 1 million messages are 
enqueued).
d) Heap and DM memory are equal (8GB each) and DM flow to disk 
threshold is 60%.


I looked at the QUE-1014/15 log lines and the following is what I notice:
a) These log lines are not present in the 0.32 broker's log, which means 
that it's not doing any flow to disk. Is flow-to-disk behaviour 
different in the two brokers? It looks like 6.0.x is a lot more 
aggressive in this regard.


b) Since all the 1 million messages are enqueued at the start of test 
(takes about 7 mins to enqueue), flow to disk threshold revisions 
performed by the housekeeping task are not able to catch up. Or the 
rate with which thresholds are revised can not catch up with the rate 
of enqueue. In my test, revisions happened only twice (4 seconds and 5 
minutes after the test start) and from then on, the threshold was not revised for 
the queues.


To make sure that we are not getting penalized by writing to disk, I 
also did a test using Memory store type and compared the result with 
BDB store type. Apparently, BDB store i

Re: Qpid broker 6.0.4 performance issues

2016-12-20 Thread Ramayan Tiwari
Hi Lorenz,

Thanks a lot for your response and explaining the flow to disk algorithm in
detail. I have described the test setup in detail in the first email of
this thread, to summarize the points again:
a) There is only one virtual host.
b) There are 6000 queues in this virtual host, but messages are only
enqueued to 10 queues.
c) Every queue gets equal number of messages (100k) at the start of the
test (we do not start dequeue till all the 1 million messages are enqueued).
d) Heap and DM memory are equal (8GB each) and DM flow to disk threshold is
60%.

I looked at the QUE-1014/15 log lines and the following is what I notice:
a) These log lines are not present in the 0.32 broker's log, which means that
it's not doing any flow to disk. Is flow-to-disk behaviour different in the
two brokers? It looks like 6.0.x is a lot more aggressive in this regard.

b) Since all 1 million messages are enqueued at the start of the test (it
takes about 7 minutes to enqueue them), the flow-to-disk threshold revisions
performed by the housekeeping task are not able to catch up. In other words,
the rate at which thresholds are revised cannot keep up with the enqueue rate.
In my test, revisions happened only twice (4 seconds and 5 minutes after the
test start) and from then on, the threshold was not revised for the queues.

To make sure that we are not getting penalized by writing to disk, I also
did a test using Memory store type and compared the result with BDB store
type. Apparently, the BDB store is slightly more efficient (2.7%) in terms of the
number of messages delivered. The Memory store also takes more broker CPU (3%
more on average), but it's better at distributing messages in a round-robin
manner across all the queues. See the attached graphs for details.

I do notice that the flow-to-disk behaviour is almost exactly the same (QUE-1014/15
log lines are present) when running with the Memory store. I am wondering what
flow to disk does when we use the Memory store?

Since our average messages size is less than 1KB, I am really looking
forward to some recommendation around the % allocation for DM vs Heap.

Thanks
Ramayan


On Tue, Dec 20, 2016 at 4:02 AM, Lorenz Quack 
wrote:

> Hello Ramayan,
>
> glad to hear that the patch is (mostly) working for you.
> To address your points:
>
> 1. If indeed in one case flow to disk is kicking in while in
>the other one it is not, then I am not surprised that
>there is a 5% difference.  The question is whether the
>flow to disk is expected or not which leads to
>
> 2. The direct memory utilization not exceeding a certain
>value is a strong indication that flow to disk is active.
>Could you verify that by checking the logs (QUE-1014/15)?
>If the flow to disk limit is exceeded then it is expected
>that 2 million messages consume the same amount of direct
>memory as 1 million messages.  Could you share a little
>more about the test setup?  How many VirtualHost are
>running on the broker?  How many Queues are on each
>VirtualHost?  What is the Queue depth of those Queues?
>All of those factors influence the actual flow to disk
>threshold.  This is to ensure some fairness between
>VirtualHosts as far as memory consumption is concerned.
>Below I explain how threshold allocation is currently
>performed.  We are considering changing the algorithm in
>the future or making it tunable.  Your ideas, requirements,
>and input on this would certainly be of interest to us.
>
> Looking forward to hearing from you.
>
> Kind regards,
> Lorenz
>
>
> Algorithm for flow to disk threshold:
>
>  1. Take the total amount of the broker.flowToDiskThreshold and
> divide it amongst all active VirtualHosts as follows
>
>a. Half of broker.flowToDiskThreshold is evenly devided
>   amongst the VHs to ensure a minimum amount is available to
>   each VH.
>
>b. The remaining half is allocated proportional to the current
>   usage pattern.  For example, if VH1 is currently using 3
>   MB, VH2 is using 1 MB and VH3 is using 0 MB, then of the
>   remaining half 3/4 will be allocated to VH1, 1/4 to VH2,
>   and nothing to VH3.  If all VHs are empty distribute this
>   half evenly like in 1.a.
>
>  2. The VirtualHosts allocate their available memory to their
> Queues in a proportional fashion as explained above (1.b).
>
>
> Example:
>
>  * The broker.flowToDiskThreshold is set to 10 GB.
>
>  * Two Virtual Hosts with 10 Queues each.
>
>* VH1 all 10 Queues are empty.
>
>* VH2 all Queues contain 10 MB except of one Queue that
>  contains 100 MB.
>
>  * According to 1.a each VirtualHost is allocated half of 5 GB,
>i.e., 2.5 GB
>
>  * According to 1.b VH1 using 0MB does not get any additional
>memory while VH2 gets the full of the remainder of the 5 GB
>totaling 7.5 GB.
>
>  * The Queues on VH1 don't have messages on them so the
>VirtualHost falls back to allocating them equal shares: 250 MB each.

Re: Qpid broker 6.0.4 performance issues

2016-12-20 Thread Lorenz Quack

Hello Ramayan,

glad to hear that the patch is (mostly) working for you.
To address your points:

1. If indeed in one case flow to disk is kicking in while in
   the other one it is not, then I am not surprised that
   there is a 5% difference.  The question is whether the
   flow to disk is expected or not which leads to

2. The direct memory utilization not exceeding a certain
   value is a strong indication that flow to disk is active.
   Could you verify that by checking the logs (QUE-1014/15)?
   If the flow to disk limit is exceeded then it is expected
   that 2 million messages consume the same amount of direct
   memory as 1 million messages.  Could you share a little
   more about the test setup?  How many VirtualHost are
   running on the broker?  How many Queues are on each
   VirtualHost?  What is the Queue depth of those Queues?
   All of those factors influence the actual flow to disk
   threshold.  This is to ensure some fairness between
   VirtualHosts as far as memory consumption is concerned.
   Below I explain how threshold allocation is currently
   performed.  We are considering changing the algorithm in
   the future or making it tunable.  Your ideas, requirements,
   and input on this would certainly be of interest to us.

Looking forward to hearing from you.

Kind regards,
Lorenz


Algorithm for flow to disk threshold:

 1. Take the total amount of the broker.flowToDiskThreshold and
divide it amongst all active VirtualHosts as follows

   a. Half of broker.flowToDiskThreshold is evenly divided
  amongst the VHs to ensure a minimum amount is available to
  each VH.

   b. The remaining half is allocated proportional to the current
  usage pattern.  For example, if VH1 is currently using 3
  MB, VH2 is using 1 MB and VH3 is using 0 MB, then of the
  remaining half 3/4 will be allocated to VH1, 1/4 to VH2,
  and nothing to VH3.  If all VHs are empty distribute this
  half evenly like in 1.a.

 2. The VirtualHosts allocate their available memory to their
Queues in a proportional fashion as explained above (1.b).


Example:

 * The broker.flowToDiskThreshold is set to 10 GB.

 * Two Virtual Hosts with 10 Queues each.

   * VH1 all 10 Queues are empty.

   * VH2 all Queues contain 10 MB except of one Queue that
 contains 100 MB.

 * According to 1.a each VirtualHost is allocated half of 5 GB,
   i.e., 2.5 GB

 * According to 1.b VH1, using 0 MB, does not get any additional
   memory while VH2 gets the full remainder of the 5 GB,
   totaling 7.5 GB.

 * The Queues on VH1 don't have messages on them so the
   VirtualHost falls back to allocating them equal shares: 250 MB
   each.

 * On VH2 the total current memory usage is 9*10 MB + 100 MB =
   190 MB so the smaller Queues receive 10/190 * 7.5 GB = 395 MB
   while the large Queue receives 100/190 * 7.5 GB = 3950 MB.

 * In total we allocated 10 * 250 MB + 9 * 395 MB + 1 * 3950 MB
   totaling 10 GB (within bounds of rounding errors).
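
A small sketch of the allocation rules described above, reproducing the worked example (this mirrors the description in this mail, not the actual broker code; 10 GB is treated as 10 000 MB to match the round numbers):

import java.util.LinkedHashMap;
import java.util.Map;

public class FlowToDiskAllocationSketch
{
    // Rule 1: half of the broker threshold is split evenly across VirtualHosts,
    // the other half in proportion to each host's current memory usage.
    static Map<String, Double> allocateToHosts(double total, Map<String, Double> hostUsage)
    {
        double evenShare = total / 2 / hostUsage.size();
        double totalUsage = hostUsage.values().stream().mapToDouble(Double::doubleValue).sum();
        Map<String, Double> allocation = new LinkedHashMap<>();
        for (Map.Entry<String, Double> host : hostUsage.entrySet())
        {
            double proportional = totalUsage == 0
                    ? total / 2 / hostUsage.size()              // all hosts empty: split evenly
                    : total / 2 * (host.getValue() / totalUsage);
            allocation.put(host.getKey(), evenShare + proportional);
        }
        return allocation;
    }

    // Rule 2: each VirtualHost splits its allocation across its queues in proportion
    // to their current depth, falling back to equal shares when all queues are empty.
    static Map<String, Double> allocateToQueues(double hostTotal, Map<String, Double> queueUsage)
    {
        double totalUsage = queueUsage.values().stream().mapToDouble(Double::doubleValue).sum();
        Map<String, Double> allocation = new LinkedHashMap<>();
        for (Map.Entry<String, Double> queue : queueUsage.entrySet())
        {
            allocation.put(queue.getKey(), totalUsage == 0
                    ? hostTotal / queueUsage.size()
                    : hostTotal * (queue.getValue() / totalUsage));
        }
        return allocation;
    }

    public static void main(String[] args)
    {
        // The worked example: VH1 is empty, VH2 holds 9 queues of 10 MB and one of 100 MB.
        Map<String, Double> hosts = new LinkedHashMap<>();
        hosts.put("VH1", 0.0);
        hosts.put("VH2", 190.0);
        Map<String, Double> perHost = allocateToHosts(10_000.0, hosts);
        System.out.println(perHost);  // {VH1=2500.0, VH2=7500.0} - i.e. 2.5 GB and 7.5 GB

        Map<String, Double> vh2Queues = new LinkedHashMap<>();
        for (int i = 1; i <= 9; i++)
        {
            vh2Queues.put("q" + i, 10.0);
        }
        vh2Queues.put("bigQueue", 100.0);
        System.out.println(allocateToQueues(perHost.get("VH2"), vh2Queues));
        // Small queues get ~395 MB each and the big queue ~3950 MB, as in the example.
    }
}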



On 19/12/16 20:48, Ramayan Tiwari wrote:

Hi Rob,

I did another exhaustive performance test using the MultiQueueConsumer 
feature with 6.0.5 (and the patch). The broker CPU issues have been 
resolved and we no longer have the message prefetch problem (caused by 
long-running messages).


Fairness among queues is also great (not as perfect as the 0.32 broker 
though; see attached graphs). Everything looks great, except for:


1. 6.0.5 delivered around 4.6% less messages. Flow to disk triggered 
aggressively in 6.0.5 but I don't see any flow to disk happening in 
0.32 (looking for QUE-1014). This might be the reason for lesser 
message delivery.


2. Direct memory utilization in the new broker does not make sense to 
us. We did 2 tests: 1 millions and 2 million messages (220 Byte 
average message size), however, the direct memory utilization never 
exceeded 500MB (see attached graph), even when we are allocating 8GB 
for direct memory. Because there is a 1KB heap overhead with each 
message, heap utilization looks same for both 0.32  and 6.0.5. For our 
setup, this essentially means that, we are cutting our memory capacity 
by half, because now are allocating half of the available RAM to 
direct memory, but will be limited by heap anyway.


These tests were performed using 16GB RAM, where 8GB was allocated to 
heap and 8GB for Direct memory. I also changed flowToDiskThreshold to 
60%. This is one of our biggest concern with the new broker, since our 
average message size in production is less than 1KB. Currently we 
allocate all the available RAM to heap, which will be reduced in half 
with the new broker.


What is the recommendation for memory allocation (heap vs dm) in our 
use case?


Thanks
Ramayan

On Fri, Oct 28, 2016 at 5:37 AM, Keith W > wrote:


Hi Ramayan

QPID-7462 is a new (experimental) feature, so we don't consider this
appropriate for inclusion in the 6.0.5 defect release.

Re: Qpid broker 6.0.4 performance issues

2016-12-19 Thread Ramayan Tiwari
Hi Rob,

I did another exhaustive performance test using the MultiQueueConsumer
feature with 6.0.5 (and the patch). The broker CPU issues have been resolved
and we no longer have the message prefetch problem (caused by long-running
messages).

Fairness among queues is also great (not as perfect as the 0.32 broker though;
see attached graphs). Everything looks great, except for:

1. 6.0.5 delivered around 4.6% fewer messages. Flow to disk triggered
aggressively in 6.0.5, but I don't see any flow to disk happening in 0.32
(looking for QUE-1014). This might be the reason for the lower message
delivery.

2. Direct memory utilization in the new broker does not make sense to us.
We did 2 tests: 1 million and 2 million messages (220-byte average message
size); however, the direct memory utilization never exceeded 500 MB (see
attached graph), even though we are allocating 8 GB for direct memory. Because
there is a 1 KB heap overhead with each message, heap utilization looks the same
for both 0.32 and 6.0.5. For our setup, this essentially means that we are
cutting our memory capacity in half, because we are now allocating half of
the available RAM to direct memory but will be limited by heap anyway.

These tests were performed using 16 GB RAM, where 8 GB was allocated to heap
and 8 GB to direct memory. I also changed flowToDiskThreshold to 60%. This
is one of our biggest concerns with the new broker, since our average
message size in production is less than 1 KB. Currently we allocate all the
available RAM to heap, which will be cut in half with the new broker.

What is the recommendation for memory allocation (heap vs dm) in our use
case?

Thanks
Ramayan

On Fri, Oct 28, 2016 at 5:37 AM, Keith W  wrote:

> Hi Ramayan
>
> QPID-7462 is a new (experimental) feature, so we don't consider this
> appropriate for inclusion in the 6.0.5 defect release.  We follow a
> Semantic Versioning[1] strategy.
>
> The underlying issue your testing has uncovered is poor performance
> with large numbers of consumers.  QPID-7462 effectively side steps the
> problem (by introducing alternative consumer behaviour) but does not
> address the root cause. We continue to consider how best to resolve
> the problem completely, but don't yet have timelines for this change.
> It is something that will be getting attention in what remains of this
> year.  We will keep you posted.
>
> In the meanwhile, I understand this causes you a problem.  If you
> cannot adopt 6.1 (there should be another RC out soon), you could
> consider applying the patch (attached to the JIRA) to 6.0.x branch and
> building yourself.
>
> Kind regards, Keith.
>
>
> [1] http://semver.org
>
>
> On 27 October 2016 at 23:19, Ramayan Tiwari 
> wrote:
> > Hi Rob,
> >
> > I have the trunk code which I am testing with; I haven't finished the
> test
> > runs yet. I was hoping that once I validate the change, I can simply
> > release 6.0.5.
> >
> > Thanks
> > Ramayan
> >
> > On Thu, Oct 27, 2016 at 12:41 PM, Rob Godfrey 
> > wrote:
> >
> >> Hi Ramayan,
> >>
> >> did you verify that the change works for you?  You said you were going
> to
> >> test with the trunk code...
> >>
> >> I'll discuss with the other developers tomorrow about whether we can put
> >> this change into 6.0.5.
> >>
> >> Cheers,
> >> Rob
> >>
> >> On 27 October 2016 at 20:30, Ramayan Tiwari 
> >> wrote:
> >>
> >> > Hi Rob,
> >> >
> >> > I looked at the release notes for 6.0.5 and it doesn't include the fix
> >> for
> >> > large consumers issues [1]. The fix is marked for 6.1, which will not
> >> have
> >> > JMX and for us to use this version requires major changes in our
> >> monitoring
> >> > framework. Could you please include the fix in 6.0.5 release?
> >> >
> >> > Thanks
> >> > Ramayan
> >> >
> >> > [1]. https://issues.apache.org/jira/browse/QPID-7462
> >> >
> >> > On Wed, Oct 19, 2016 at 4:49 PM, Helen Kwong 
> >> wrote:
> >> >
> >> > > Hi Rob,
> >> > >
> >> > > Again, thank you so much for answering our questions and providing a
> >> > patch
> >> > > so quickly :) One more question I have: would it be possible to
> include
> >> > > test cases involving many queues and listeners (in the order of
> >> thousands
> >> > > of queues) for future Qpid releases, as part of standard perf
> testing
> >> of
> >> > > the broker?
> >> > >
> >> > > Thanks,
> >> > > Helen
> >> > >
> >> > > On Tue, Oct 18, 2016 at 10:40 AM, Ramayan Tiwari <
> >> > ramayan.tiw...@gmail.com
> >> > > > wrote:
> >> > >
> >> > >> Thanks so much Rob, I will test the patch against trunk and will
> >> update
> >> > >> you with the outcome.
> >> > >>
> >> > >> - Ramayan
> >> > >>
> >> > >> On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey <
> rob.j.godf...@gmail.com
> >> >
> >> > >> wrote:
> >> > >>
> >> > >>> On 17 October 2016 at 21:50, Rob Godfrey  >
> >> > >>> wrote:
> >> > >>>
> >> > >>> >
> >> > >>> >
> >> > >>> > On 17 October 2016 at 21:24, Ramayan Tiwari <
> >> > ramayan.tiw...@gmail.com>
> >> > >>> > wrote:
> >> > >>> >
> >> > >>> >> Hi Rob,
> >> > >>> >>
> >> > 

Re: Qpid broker 6.0.4 performance issues

2016-10-28 Thread Keith W
Hi Ramayan

QPID-7462 is a new (experimental) feature, so we don't consider this
appropriate for inclusion in the 6.0.5 defect release.  We follow a
Semantic Versioning[1] strategy.

The underlying issue your testing has uncovered is poor performance
with large numbers of consumers.  QPID-7462 effectively side steps the
problem (by introducing alternative consumer behaviour) but does not
address the root cause. We continue to consider how best to resolve
the problem completely, but don't yet have timelines for this change.
It is something that will be getting attention in what remains of this
year.  We will keep you posted.

In the meanwhile, I understand this causes you a problem.  If you
cannot adopt 6.1 (there should be another RC out soon), you could
consider applying the patch (attached to the JIRA) to 6.0.x branch and
building yourself.

Kind regards, Keith.


[1] http://semver.org


On 27 October 2016 at 23:19, Ramayan Tiwari  wrote:
> Hi Rob,
>
> I have the trunk code which I am testing with; I haven't finished the test
> runs yet. I was hoping that once I validate the change, I can simply
> release 6.0.5.
>
> Thanks
> Ramayan
>
> On Thu, Oct 27, 2016 at 12:41 PM, Rob Godfrey 
> wrote:
>
>> Hi Ramayan,
>>
>> did you verify that the change works for you?  You said you were going to
>> test with the trunk code...
>>
>> I'll discuss with the other developers tomorrow about whether we can put
>> this change into 6.0.5.
>>
>> Cheers,
>> Rob
>>
>> On 27 October 2016 at 20:30, Ramayan Tiwari 
>> wrote:
>>
>> > Hi Rob,
>> >
>> > I looked at the release notes for 6.0.5 and it doesn't include the fix
>> for
>> > large consumers issues [1]. The fix is marked for 6.1, which will not
>> have
>> > JMX and for us to use this version requires major changes in our
>> monitoring
>> > framework. Could you please include the fix in 6.0.5 release?
>> >
>> > Thanks
>> > Ramayan
>> >
>> > [1]. https://issues.apache.org/jira/browse/QPID-7462
>> >
>> > On Wed, Oct 19, 2016 at 4:49 PM, Helen Kwong 
>> wrote:
>> >
>> > > Hi Rob,
>> > >
>> > > Again, thank you so much for answering our questions and providing a
>> > patch
>> > > so quickly :) One more question I have: would it be possible to include
>> > > test cases involving many queues and listeners (in the order of
>> thousands
>> > > of queues) for future Qpid releases, as part of standard perf testing
>> of
>> > > the broker?
>> > >
>> > > Thanks,
>> > > Helen
>> > >
>> > > On Tue, Oct 18, 2016 at 10:40 AM, Ramayan Tiwari <
>> > ramayan.tiw...@gmail.com
>> > > > wrote:
>> > >
>> > >> Thanks so much Rob, I will test the patch against trunk and will
>> update
>> > >> you with the outcome.
>> > >>
>> > >> - Ramayan
>> > >>
>> > >> On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey > >
>> > >> wrote:
>> > >>
>> > >>> On 17 October 2016 at 21:50, Rob Godfrey 
>> > >>> wrote:
>> > >>>
>> > >>> >
>> > >>> >
>> > >>> > On 17 October 2016 at 21:24, Ramayan Tiwari <
>> > ramayan.tiw...@gmail.com>
>> > >>> > wrote:
>> > >>> >
>> > >>> >> Hi Rob,
>> > >>> >>
>> > >>> >> We are certainly interested in testing the "multi queue consumers"
>> > >>> >> behavior
>> > >>> >> with your patch in the new broker. We would like to know:
>> > >>> >>
>> > >>> >> 1. What will the scope of changes, client or broker or both? We
>> are
>> > >>> >> currently running 0.16 client, so would like to make sure that we
>> > will
>> > >>> >> able
>> > >>> >> to use these changes with 0.16 client.
>> > >>> >>
>> > >>> >>
>> > >>> > There's no change to the client.  I can't remember what was in the
>> > 0.16
>> > >>> > client... the only issue would be if there are any bugs in the
>> > parsing
>> > >>> of
>> > >>> > address arguments.  I can try to test that out tmr.
>> > >>> >
>> > >>>
>> > >>>
>> > >>> OK - with a little bit of care to get round the address parsing
>> issues
>> > in
>> > >>> the 0.16 client... I think we can get this to work.  I've created the
>> > >>> following JIRA:
>> > >>>
>> > >>> https://issues.apache.org/jira/browse/QPID-7462
>> > >>>
>> > >>> and attached to it are a patch which applies against trunk, and a
>> > >>> separate
>> > >>> patch which applies against the 6.0.x branch (
>> > >>> https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is
>> > >>> 6.0.4
>> > >>> plus a few other fixes which we will soon be releasing as 6.0.5)
>> > >>>
>> > >>> To create a consumer which uses this feature (and multi queue
>> > >>> consumption)
>> > >>> for the 0.16 client you need to use something like the following as
>> the
>> > >>> address:
>> > >>>
>> > >>> queue_01 ; {node : { type : queue }, link : { x-subscribes : {
>> > >>> arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
> >>> x-pull-only : true } } } }
>> > >>>
>> > >>>
>> > >>> Note that the initial queue_01 has to be a name of an actual queue on
>> > >>> the virtual host, but otherwise it is not actually used (if you were
>> > >>> using a 0.32 or later client you could just use '' here).  The actual
> >>> queues that are consumed from are in the list value associated with x-multiqueue.

Re: Qpid broker 6.0.4 performance issues

2016-10-27 Thread Ramayan Tiwari
Hi Rob,

I have the trunk code which I am testing with; I haven't finished the test
runs yet. I was hoping that once I validate the change, I can simply
release 6.0.5.

Thanks
Ramayan

On Thu, Oct 27, 2016 at 12:41 PM, Rob Godfrey 
wrote:

> Hi Ramayan,
>
> did you verify that the change works for you?  You said you were going to
> test with the trunk code...
>
> I'll discuss with the other developers tomorrow about whether we can put
> this change into 6.0.5.
>
> Cheers,
> Rob
>
> On 27 October 2016 at 20:30, Ramayan Tiwari 
> wrote:
>
> > Hi Rob,
> >
> > I looked at the release notes for 6.0.5 and it doesn't include the fix
> for
> > large consumers issues [1]. The fix is marked for 6.1, which will not
> have
> > JMX and for us to use this version requires major changes in our
> monitoring
> > framework. Could you please include the fix in 6.0.5 release?
> >
> > Thanks
> > Ramayan
> >
> > [1]. https://issues.apache.org/jira/browse/QPID-7462
> >
> > On Wed, Oct 19, 2016 at 4:49 PM, Helen Kwong 
> wrote:
> >
> > > Hi Rob,
> > >
> > > Again, thank you so much for answering our questions and providing a
> > patch
> > > so quickly :) One more question I have: would it be possible to include
> > > test cases involving many queues and listeners (in the order of
> thousands
> > > of queues) for future Qpid releases, as part of standard perf testing
> of
> > > the broker?
> > >
> > > Thanks,
> > > Helen
> > >
> > > On Tue, Oct 18, 2016 at 10:40 AM, Ramayan Tiwari <
> > ramayan.tiw...@gmail.com
> > > > wrote:
> > >
> > >> Thanks so much Rob, I will test the patch against trunk and will
> update
> > >> you with the outcome.
> > >>
> > >> - Ramayan
> > >>
> > >> On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey  >
> > >> wrote:
> > >>
> > >>> On 17 October 2016 at 21:50, Rob Godfrey 
> > >>> wrote:
> > >>>
> > >>> >
> > >>> >
> > >>> > On 17 October 2016 at 21:24, Ramayan Tiwari <
> > ramayan.tiw...@gmail.com>
> > >>> > wrote:
> > >>> >
> > >>> >> Hi Rob,
> > >>> >>
> > >>> >> We are certainly interested in testing the "multi queue consumers"
> > >>> >> behavior
> > >>> >> with your patch in the new broker. We would like to know:
> > >>> >>
> > >>> >> 1. What will the scope of changes, client or broker or both? We
> are
> > >>> >> currently running 0.16 client, so would like to make sure that we
> > will
> > >>> >> able
> > >>> >> to use these changes with 0.16 client.
> > >>> >>
> > >>> >>
> > >>> > There's no change to the client.  I can't remember what was in the
> > 0.16
> > >>> > client... the only issue would be if there are any bugs in the
> > parsing
> > >>> of
> > >>> > address arguments.  I can try to test that out tmr.
> > >>> >
> > >>>
> > >>>
> > >>> OK - with a little bit of care to get round the address parsing
> issues
> > in
> > >>> the 0.16 client... I think we can get this to work.  I've created the
> > >>> following JIRA:
> > >>>
> > >>> https://issues.apache.org/jira/browse/QPID-7462
> > >>>
> > >>> and attached to it are a patch which applies against trunk, and a
> > >>> separate
> > >>> patch which applies against the 6.0.x branch (
> > >>> https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is
> > >>> 6.0.4
> > >>> plus a few other fixes which we will soon be releasing as 6.0.5)
> > >>>
> > >>> To create a consumer which uses this feature (and multi queue
> > >>> consumption)
> > >>> for the 0.16 client you need to use something like the following as
> the
> > >>> address:
> > >>>
> > >>> queue_01 ; {node : { type : queue }, link : { x-subscribes : {
> > >>> arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
> > >>> x-pull-only : true } } } }
> > >>>
> > >>>
> > >>> Note that the initial queue_01 has to be a name of an actual queue on
> > >>> the virtual host, but otherwise it is not actually used (if you were
> > >>> using a 0.32 or later client you could just use '' here).  The actual
> > >>> queues that are consumed from are in the list value associated with
> > >>> x-multiqueue.  For my testing I created a list with 3000 queues here
> > >>> and this worked fine.
> > >>>
> > >>> Let me know if you have any questions / issues,
> > >>>
> > >>> Hope this helps,
> > >>> Rob
> > >>>
> > >>>
> > >>> >
> > >>> >
> > >>> >> 2. My understanding is that the "pull vs push" change is only with
> > >>> respect
> > >>> >> to broker and it does not change our architecture where we use
> > >>> >> MessageListener to receive messages asynchronously.
> > >>> >>
> > >>> >
> > >>> > Exactly - this is only a change within the internal broker
> threading
> > >>> > model.  The external behaviour of the broker remains essentially
> > >>> unchanged.
> > >>> >
> > >>> >
> > >>> >>
> > >>> >> 3. Once the I/O refactoring is complete, we would be able to go back
> > to
> > >>> use
> > >>> >> standard JMS consumer (Destination), what is the timeline and
> broker
> > >>> >> release version for the completion of this work?
> > >>> >>
> > >>> >
> > >>> > You might wish to continue to use the "multi queue" model, depending on your actual use case.

Re: Qpid broker 6.0.4 performance issues

2016-10-27 Thread Rob Godfrey
Hi Ramayan,

did you verify that the change works for you?  You said you were going to
test with the trunk code...

I'll discuss with the other developers tomorrow about whether we can put
this change into 6.0.5.

Cheers,
Rob

On 27 October 2016 at 20:30, Ramayan Tiwari 
wrote:

> Hi Rob,
>
> I looked at the release notes for 6.0.5 and it doesn't include the fix for
> large consumers issues [1]. The fix is marked for 6.1, which will not have
> JMX and for us to use this version requires major changes in our monitoring
> framework. Could you please include the fix in 6.0.5 release?
>
> Thanks
> Ramayan
>
> [1]. https://issues.apache.org/jira/browse/QPID-7462
>
> On Wed, Oct 19, 2016 at 4:49 PM, Helen Kwong  wrote:
>
> > Hi Rob,
> >
> > Again, thank you so much for answering our questions and providing a
> patch
> > so quickly :) One more question I have: would it be possible to include
> > test cases involving many queues and listeners (in the order of thousands
> > of queues) for future Qpid releases, as part of standard perf testing of
> > the broker?
> >
> > Thanks,
> > Helen
> >
> > On Tue, Oct 18, 2016 at 10:40 AM, Ramayan Tiwari <
> ramayan.tiw...@gmail.com
> > > wrote:
> >
> >> Thanks so much Rob, I will test the patch against trunk and will update
> >> you with the outcome.
> >>
> >> - Ramayan
> >>
> >> On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey 
> >> wrote:
> >>
> >>> On 17 October 2016 at 21:50, Rob Godfrey 
> >>> wrote:
> >>>
> >>> >
> >>> >
> >>> > On 17 October 2016 at 21:24, Ramayan Tiwari <
> ramayan.tiw...@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> Hi Rob,
> >>> >>
> >>> >> We are certainly interested in testing the "multi queue consumers"
> >>> >> behavior
> >>> >> with your patch in the new broker. We would like to know:
> >>> >>
> >>> >> 1. What will the scope of changes, client or broker or both? We are
> >>> >> currently running 0.16 client, so would like to make sure that we
> will
> >>> >> able
> >>> >> to use these changes with 0.16 client.
> >>> >>
> >>> >>
> >>> > There's no change to the client.  I can't remember what was in the
> 0.16
> >>> > client... the only issue would be if there are any bugs in the
> parsing
> >>> of
> >>> > address arguments.  I can try to test that out tmr.
> >>> >
> >>>
> >>>
> >>> OK - with a little bit of care to get round the address parsing issues
> in
> >>> the 0.16 client... I think we can get this to work.  I've created the
> >>> following JIRA:
> >>>
> >>> https://issues.apache.org/jira/browse/QPID-7462
> >>>
> >>> and attached to it are a patch which applies against trunk, and a
> >>> separate
> >>> patch which applies against the 6.0.x branch (
> >>> https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is
> >>> 6.0.4
> >>> plus a few other fixes which we will soon be releasing as 6.0.5)
> >>>
> >>> To create a consumer which uses this feature (and multi queue
> >>> consumption)
> >>> for the 0.16 client you need to use something like the following as the
> >>> address:
> >>>
> >>> queue_01 ; {node : { type : queue }, link : { x-subscribes : {
> >>> arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
> >>> x-pull-only : true } } } }
> >>>
> >>>
> >>> Note that the initial queue_01 has to be a name of an actual queue on
> >>> the virtual host, but otherwise it is not actually used (if you were
> >>> using a 0.32 or later client you could just use '' here).  The actual
> >>> queues that are consumed from are in the list value associated with
> >>> x-multiqueue.  For my testing I created a list with 3000 queues here
> >>> and this worked fine.
> >>>
> >>> Let me know if you have any questions / issues,
> >>>
> >>> Hope this helps,
> >>> Rob
> >>>
> >>>
> >>> >
> >>> >
> >>> >> 2. My understanding is that the "pull vs push" change is only with
> >>> respect
> >>> >> to broker and it does not change our architecture where we use
> >>> >> MessageListener to receive messages asynchronously.
> >>> >>
> >>> >
> >>> > Exactly - this is only a change within the internal broker threading
> >>> > model.  The external behaviour of the broker remains essentially
> >>> unchanged.
> >>> >
> >>> >
> >>> >>
> >>> >> 3. Once the I/O refactoring is complete, we would be able to go back
> to
> >>> use
> >>> >> standard JMS consumer (Destination), what is the timeline and broker
> >>> >> release version for the completion of this work?
> >>> >>
> >>> >
> >>> > You might wish to continue to use the "multi queue" model, depending
> on
> >>> > your actual use case, but yeah once the I/O work is complete I would
> >>> hope
> >>> > that you could use the thousands of consumers model should you wish.
> >>> We
> >>> > don't have a schedule for the next phase of I/O rework right now -
> >>> about
> >>> > all I can say is that it is unlikely to be complete this year.  I'd
> >>> need to
> >>> > talk with Keith (who is currently on vacation) as to when we think we
> >>> may
> >>> > be able to schedule it.
> >>> >
> >>> >
> >>> >>
> >>> >> Let me know once you have integrated the patch and I will re-run our performance tests to validate it.

Re: Qpid broker 6.0.4 performance issues

2016-10-27 Thread Ramayan Tiwari
Hi Rob,

I looked at the release notes for 6.0.5 and it doesn't include the fix for
the large consumers issue [1]. The fix is marked for 6.1, which will not have
JMX, and for us to use that version would require major changes in our monitoring
framework. Could you please include the fix in the 6.0.5 release?

Thanks
Ramayan

[1]. https://issues.apache.org/jira/browse/QPID-7462

On Wed, Oct 19, 2016 at 4:49 PM, Helen Kwong  wrote:

> Hi Rob,
>
> Again, thank you so much for answering our questions and providing a patch
> so quickly :) One more question I have: would it be possible to include
> test cases involving many queues and listeners (in the order of thousands
> of queues) for future Qpid releases, as part of standard perf testing of
> the broker?
>
> Thanks,
> Helen
>
> On Tue, Oct 18, 2016 at 10:40 AM, Ramayan Tiwari  > wrote:
>
>> Thanks so much Rob, I will test the patch against trunk and will update
>> you with the outcome.
>>
>> - Ramayan
>>
>> On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey 
>> wrote:
>>
>>> On 17 October 2016 at 21:50, Rob Godfrey 
>>> wrote:
>>>
>>> >
>>> >
>>> > On 17 October 2016 at 21:24, Ramayan Tiwari 
>>> > wrote:
>>> >
>>> >> Hi Rob,
>>> >>
>>> >> We are certainly interested in testing the "multi queue consumers"
>>> >> behavior
>>> >> with your patch in the new broker. We would like to know:
>>> >>
>>> >> 1. What will the scope of changes, client or broker or both? We are
>>> >> currently running 0.16 client, so would like to make sure that we will
>>> >> able
>>> >> to use these changes with 0.16 client.
>>> >>
>>> >>
>>> > There's no change to the client.  I can't remember what was in the 0.16
>>> > client... the only issue would be if there are any bugs in the parsing
>>> of
>>> > address arguments.  I can try to test that out tmr.
>>> >
>>>
>>>
>>> OK - with a little bit of care to get round the address parsing issues in
>>> the 0.16 client... I think we can get this to work.  I've created the
>>> following JIRA:
>>>
>>> https://issues.apache.org/jira/browse/QPID-7462
>>>
>>> and attached to it are a patch which applies against trunk, and a
>>> separate
>>> patch which applies against the 6.0.x branch (
>>> https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is
>>> 6.0.4
>>> plus a few other fixes which we will soon be releasing as 6.0.5)
>>>
>>> To create a consumer which uses this feature (and multi queue
>>> consumption)
>>> for the 0.16 client you need to use something like the following as the
>>> address:
>>>
>>> queue_01 ; {node : { type : queue }, link : { x-subscribes : {
>>> arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
>>> x-pull-only : true } } } }
>>>
>>>
>>> Note that the initial queue_01 has to be a name of an actual queue on
>>> the virtual host, but otherwise it is not actually used (if you were
>>> using a 0.32 or later client you could just use '' here).  The actual
>>> queues that are consumed from are in the list value associated with
>>> x-multiqueue.  For my testing I created a list with 3000 queues here
>>> and this worked fine.
>>>
>>> Let me know if you have any questions / issues,
>>>
>>> Hope this helps,
>>> Rob
>>>
>>>
>>> >
>>> >
>>> >> 2. My understanding is that the "pull vs push" change is only with
>>> respect
>>> >> to broker and it does not change our architecture where we use
>>> >> MessageListener to receive messages asynchronously.
>>> >>
>>> >
>>> > Exactly - this is only a change within the internal broker threading
>>> > model.  The external behaviour of the broker remains essentially
>>> unchanged.
>>> >
>>> >
>>> >>
>>> >> 3. Once the I/O refactoring is complete, we would be able to go back to
>>> use
>>> >> standard JMS consumer (Destination), what is the timeline and broker
>>> >> release version for the completion of this work?
>>> >>
>>> >
>>> > You might wish to continue to use the "multi queue" model, depending on
>>> > your actual use case, but yeah once the I/O work is complete I would
>>> hope
>>> > that you could use the thousands of consumers model should you wish.
>>> We
>>> > don't have a schedule for the next phase of I/O rework right now -
>>> about
>>> > all I can say is that it is unlikely to be complete this year.  I'd
>>> need to
>>> > talk with Keith (who is currently on vacation) as to when we think we
>>> may
>>> > be able to schedule it.
>>> >
>>> >
>>> >>
>>> >> Let me know once you have integrated the patch and I will re-run our
>>> >> performance tests to validate it.
>>> >>
>>> >>
>>> > I'll make a patch for 6.0.x presently (I've been working on a change
>>> > against trunk - the patch will probably have to change a bit to apply
>>> to
>>> > 6.0.x).
>>> >
>>> > Cheers,
>>> > Rob
>>> >
>>> > Thanks
>>> >> Ramayan
>>> >>
>>> >> On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey >> >
>>> >> wrote:
>>> >>
>>> >> > OK - so having pondered / hacked around a bit this weekend, I think
>>> to
>>> >> get
>>> >> > decent performance from the IO model in 6.0 for your use case we're
>>> >> going
>>> >> 

Re: Qpid broker 6.0.4 performance issues

2016-10-19 Thread Helen Kwong
Hi Rob,

Again, thank you so much for answering our questions and providing a patch
so quickly :) One more question I have: would it be possible to include
test cases involving many queues and listeners (on the order of thousands
of queues) for future Qpid releases, as part of standard perf testing of
the broker?

Thanks,
Helen

On Tue, Oct 18, 2016 at 10:40 AM, Ramayan Tiwari 
wrote:

> Thanks so much Rob, I will test the patch against trunk and will update
> you with the outcome.
>
> - Ramayan
>
> On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey 
> wrote:
>
>> On 17 October 2016 at 21:50, Rob Godfrey  wrote:
>>
>> >
>> >
>> > On 17 October 2016 at 21:24, Ramayan Tiwari 
>> > wrote:
>> >
>> >> Hi Rob,
>> >>
>> >> We are certainly interested in testing the "multi queue consumers"
>> >> behavior
>> >> with your patch in the new broker. We would like to know:
>> >>
>> >> 1. What will the scope of changes, client or broker or both? We are
>> >> currently running 0.16 client, so would like to make sure that we will
>> >> able
>> >> to use these changes with 0.16 client.
>> >>
>> >>
>> > There's no change to the client.  I can't remember what was in the 0.16
>> > client... the only issue would be if there are any bugs in the parsing
>> of
>> > address arguments.  I can try to test that out tmr.
>> >
>>
>>
>> OK - with a little bit of care to get round the address parsing issues in
>> the 0.16 client... I think we can get this to work.  I've created the
>> following JIRA:
>>
>> https://issues.apache.org/jira/browse/QPID-7462
>>
>> and attached to it are a patch which applies against trunk, and a separate
>> patch which applies against the 6.0.x branch (
>> https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is 6.0.4
>> plus a few other fixes which we will soon be releasing as 6.0.5)
>>
>> To create a consumer which uses this feature (and multi queue consumption)
>> for the 0.16 client you need to use something like the following as the
>> address:
>>
>> queue_01 ; {node : { type : queue }, link : { x-subscribes : {
>> arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
>> x-pull-only : true } } } }
>>
>>
>> Note that the initial queue_01 has to be a name of an actual queue on
>> the virtual host, but otherwise it is not actually used (if you were
>> using a 0.32 or later client you could just use '' here).  The actual
>> queues that are consumed from are in the list value associated with
>> x-multiqueue.  For my testing I created a list with 3000 queues here
>> and this worked fine.
>>
>> Let me know if you have any questions / issues,
>>
>> Hope this helps,
>> Rob
>>
>>
>> >
>> >
>> >> 2. My understanding is that the "pull vs push" change is only with
>> respect
>> >> to broker and it does not change our architecture where we use
>> >> MessageListener to receive messages asynchronously.
>> >>
>> >
>> > Exactly - this is only a change within the internal broker threading
>> > model.  The external behaviour of the broker remains essentially
>> unchanged.
>> >
>> >
>> >>
>> >> 3. Once the I/O refactoring is complete, we would be able to go back to
>> use
>> >> standard JMS consumer (Destination), what is the timeline and broker
>> >> release version for the completion of this work?
>> >>
>> >
>> > You might wish to continue to use the "multi queue" model, depending on
>> > your actual use case, but yeah once the I/O work is complete I would
>> hope
>> > that you could use the thousands of consumers model should you wish.  We
>> > don't have a schedule for the next phase of I/O rework right now - about
>> > all I can say is that it is unlikely to be complete this year.  I'd
>> need to
>> > talk with Keith (who is currently on vacation) as to when we think we
>> may
>> > be able to schedule it.
>> >
>> >
>> >>
>> >> Let me know once you have integrated the patch and I will re-run our
>> >> performance tests to validate it.
>> >>
>> >>
>> > I'll make a patch for 6.0.x presently (I've been working on a change
>> > against trunk - the patch will probably have to change a bit to apply to
>> > 6.0.x).
>> >
>> > Cheers,
>> > Rob
>> >
>> > Thanks
>> >> Ramayan
>> >>
>> >> On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey 
>> >> wrote:
>> >>
>> >> > OK - so having pondered / hacked around a bit this weekend, I think
>> to
>> >> get
>> >> > decent performance from the IO model in 6.0 for your use case we're
>> >> going
>> >> > to have to change things around a bit.
>> >> >
>> >> > Basically 6.0 is an intermediate step on our IO / threading model
>> >> journey.
>> >> > In earlier versions we used 2 threads per connection for IO (one
>> read,
>> >> one
>> >> > write) and then extra threads from a pool to "push" messages from
>> >> queues to
>> >> > connections.
>> >> >
>> >> > In 6.0 we move to using a pool for the IO threads, and also stopped
>> >> queues
>> >> > from "pushing" to connections while the IO threads were acting on the
>> >> > connection.  It's this latter fact which is screwing up performance
>> for
>

Re: Qpid broker 6.0.4 performance issues

2016-10-18 Thread Ramayan Tiwari
Thanks so much Rob, I will test the patch against trunk and will update you
with the outcome.

- Ramayan

On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey 
wrote:

> On 17 October 2016 at 21:50, Rob Godfrey  wrote:
>
> >
> >
> > On 17 October 2016 at 21:24, Ramayan Tiwari 
> > wrote:
> >
> >> Hi Rob,
> >>
> >> We are certainly interested in testing the "multi queue consumers"
> >> behavior
> >> with your patch in the new broker. We would like to know:
> >>
> >> 1. What will the scope of changes, client or broker or both? We are
> >> currently running 0.16 client, so would like to make sure that we will
> >> able
> >> to use these changes with 0.16 client.
> >>
> >>
> > There's no change to the client.  I can't remember what was in the 0.16
> > client... the only issue would be if there are any bugs in the parsing of
> > address arguments.  I can try to test that out tmr.
> >
>
>
> OK - with a little bit of care to get round the address parsing issues in
> the 0.16 client... I think we can get this to work.  I've created the
> following JIRA:
>
> https://issues.apache.org/jira/browse/QPID-7462
>
> and attached to it are a patch which applies against trunk, and a separate
> patch which applies against the 6.0.x branch (
> https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is 6.0.4
> plus a few other fixes which we will soon be releasing as 6.0.5)
>
> To create a consumer which uses this feature (and multi queue consumption)
> for the 0.16 client you need to use something like the following as the
> address:
>
> queue_01 ; {node : { type : queue }, link : { x-subscribes : {
> arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
> x-pull-only : true } } } }
>
>
> Note that the initial queue_01 has to be a name of an actual queue on
> the virtual host, but otherwise it is not actually used (if you were
> using a 0.32 or later client you could just use '' here).  The actual
> queues that are consumed from are in the list value associated with
> x-multiqueue.  For my testing I created a list with 3000 queues here
> and this worked fine.
>
> Let me know if you have any questions / issues,
>
> Hope this helps,
> Rob
>
>
> >
> >
> >> 2. My understanding is that the "pull vs push" change is only with
> respect
> >> to the broker and it does not change our architecture where we use
> >> MessageListener to receive messages asynchronously.
> >>
> >
> > Exactly - this is only a change within the internal broker threading
> > model.  The external behaviour of the broker remains essentially
> unchanged.
> >
> >
> >>
> >> 3. Once I/O refactoring is complete, we would be able to go back to
> using
> >> the standard JMS consumer (Destination); what is the timeline and broker
> >> release version for the completion of this work?
> >>
> >
> > You might wish to continue to use the "multi queue" model, depending on
> > your actual use case, but yeah once the I/O work is complete I would hope
> > that you could use the thousands of consumers model should you wish.  We
> > don't have a schedule for the next phase of I/O rework right now - about
> > all I can say is that it is unlikely to be complete this year.  I'd need
> to
> > talk with Keith (who is currently on vacation) as to when we think we may
> > be able to schedule it.
> >
> >
> >>
> >> Let me know once you have integrated the patch and I will re-run our
> >> performance tests to validate it.
> >>
> >>
> > I'll make a patch for 6.0.x presently (I've been working on a change
> > against trunk - the patch will probably have to change a bit to apply to
> > 6.0.x).
> >
> > Cheers,
> > Rob
> >
> > Thanks
> >> Ramayan
> >>
> >> On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey 
> >> wrote:
> >>
> >> > OK - so having pondered / hacked around a bit this weekend, I think to
> >> get
> >> > decent performance from the IO model in 6.0 for your use case we're
> >> going
> >> > to have to change things around a bit.
> >> >
> >> > Basically 6.0 is an intermediate step on our IO / threading model
> >> journey.
> >> > In earlier versions we used 2 threads per connection for IO (one read,
> >> one
> >> > write) and then extra threads from a pool to "push" messages from
> >> queues to
> >> > connections.
> >> >
> >> > In 6.0 we moved to using a pool for the IO threads, and also stopped
> >> queues
> >> > from "pushing" to connections while the IO threads were acting on the
> >> > connection.  It's this latter fact which is screwing up performance
> for
> >> > your use case here because what happens is that on each network read
> we
> >> > tell each consumer to stop accepting pushes from the queue until the
> IO
> >> > interaction has completed.  This is causing lots of loops over your
> 3000
> >> > consumers on each session, which is eating up a lot of CPU on every
> >> network
> >> > interaction.
> >> >
> >> > In the final version of our IO refactoring we want to remove the
> >> "pushing"
> >> > from the queue, and instead have the consumers "pull" - so that the
> only
> >> > threads 

Re: Qpid broker 6.0.4 performance issues

2016-10-18 Thread Rob Godfrey
On 17 October 2016 at 21:50, Rob Godfrey  wrote:

>
>
> On 17 October 2016 at 21:24, Ramayan Tiwari 
> wrote:
>
>> Hi Rob,
>>
>> We are certainly interested in testing the "multi queue consumers"
>> behavior
>> with your patch in the new broker. We would like to know:
>>
>> 1. What will be the scope of changes, client or broker or both? We are
>> currently running the 0.16 client, so would like to make sure that we will
>> be able
>> to use these changes with the 0.16 client.
>>
>>
> There's no change to the client.  I can't remember what was in the 0.16
> client... the only issue would be if there are any bugs in the parsing of
> address arguments.  I can try to test that out tmr.
>


OK - with a little bit of care to get round the address parsing issues in
the 0.16 client... I think we can get this to work.  I've created the
following JIRA:

https://issues.apache.org/jira/browse/QPID-7462

and attached to it are a patch which applies against trunk, and a separate
patch which applies against the 6.0.x branch (
https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is 6.0.4
plus a few other fixes which we will soon be releasing as 6.0.5)

To create a consumer which uses this feature (and multi queue consumption)
for the 0.16 client you need to use something like the following as the
address:

queue_01 ; {node : { type : queue }, link : { x-subscribes : {
arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
x-pull-only : true } } } }


Note that the initial queue_01 has to be a name of an actual queue on
the virtual host, but otherwise it is not actually used (if you were
using a 0.32 or later client you could just use '' here).  The actual
queues that are consumed from are in the list value associated with
x-multiqueue.  For my testing I created a list with 3000 queues here
and this worked fine.
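
For completeness, a rough (untested) sketch of wiring this up with the
standard JMS API and the usual properties-file JNDI setup - the JNDI entry
names and broker details below are just examples:

import javax.jms.*;
import javax.naming.InitialContext;

// Assumes a jndi.properties on the classpath along these lines (names are examples):
//   java.naming.factory.initial = org.apache.qpid.jndi.PropertiesFileInitialContextFactory
//   connectionfactory.qpidConnectionFactory = amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672'
//   destination.multiQueueDest = queue_01 ; {node : { type : queue }, link : { x-subscribes : \
//     { arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ], x-pull-only : true } } } }
public class MultiQueueConsumerExample
{
    public static void main(String[] args) throws Exception
    {
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("qpidConnectionFactory");
        Destination multiQueueDest = (Destination) ctx.lookup("multiQueueDest");

        Connection connection = factory.createConnection();
        final Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        MessageConsumer consumer = session.createConsumer(multiQueueDest);
        consumer.setMessageListener(new MessageListener()
        {
            public void onMessage(Message message)
            {
                try
                {
                    // process the message, then commit the transacted session
                    session.commit();
                }
                catch (JMSException e)
                {
                    e.printStackTrace();
                }
            }
        });
        connection.start();
    }
}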

Let me know if you have any questions / issues,

Hope this helps,
Rob


>
>
>> 2. My understanding is that the "pull vs push" change is only with respect
>> to the broker and it does not change our architecture where we use
>> MessageListener to receive messages asynchronously.
>>
>
> Exactly - this is only a change within the internal broker threading
> model.  The external behaviour of the broker remains essentially unchanged.
>
>
>>
>> 3. Once I/O refactoring is complete, we would be able to go back to using
>> the standard JMS consumer (Destination); what is the timeline and broker
>> release version for the completion of this work?
>>
>
> You might wish to continue to use the "multi queue" model, depending on
> your actual use case, but yeah once the I/O work is complete I would hope
> that you could use the thousands of consumers model should you wish.  We
> don't have a schedule for the next phase of I/O rework right now - about
> all I can say is that it is unlikely to be complete this year.  I'd need to
> talk with Keith (who is currently on vacation) as to when we think we may
> be able to schedule it.
>
>
>>
>> Let me know once you have integrated the patch and I will re-run our
>> performance tests to validate it.
>>
>>
> I'll make a patch for 6.0.x presently (I've been working on a change
> against trunk - the patch will probably have to change a bit to apply to
> 6.0.x).
>
> Cheers,
> Rob
>
> Thanks
>> Ramayan
>>
>> On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey 
>> wrote:
>>
>> > OK - so having pondered / hacked around a bit this weekend, I think to
>> get
>> > decent performance from the IO model in 6.0 for your use case we're
>> going
>> > to have to change things around a bit.
>> >
>> > Basically 6.0 is an intermediate step on our IO / threading model
>> journey.
>> > In earlier versions we used 2 threads per connection for IO (one read,
>> one
>> > write) and then extra threads from a pool to "push" messages from
>> queues to
>> > connections.
>> >
>> > In 6.0 we moved to using a pool for the IO threads, and also stopped
>> queues
>> > from "pushing" to connections while the IO threads were acting on the
>> > connection.  It's this latter fact which is screwing up performance for
>> > your use case here because what happens is that on each network read we
>> > tell each consumer to stop accepting pushes from the queue until the IO
>> > interaction has completed.  This is causing lots of loops over your 3000
>> > consumers on each session, which is eating up a lot of CPU on every
>> network
>> > interaction.
>> >
>> > In the final version of our IO refactoring we want to remove the
>> "pushing"
>> > from the queue, and instead have the consumers "pull" - so that the only
>> > threads that operate on the queues (outside of housekeeping tasks like
>> > expiry) will be the IO threads.
>> >
>> > So, what we could do (and I have a patch sitting on my laptop for this)
>> is
>> > to look at using the "multi queue consumers" work I did for you guys
>> > before, but augmenting this so that the consumers work using a "pull"
>> model
>> > rather than the push model.  This will guarantee strict fairness between
>> 

Re: Qpid broker 6.0.4 performance issues

2016-10-17 Thread Rob Godfrey
On 17 October 2016 at 21:24, Ramayan Tiwari 
wrote:

> Hi Rob,
>
> We are certainly interested in testing the "multi queue consumers" behavior
> with your patch in the new broker. We would like to know:
>
> 1. What will be the scope of changes, client or broker or both? We are
> currently running the 0.16 client, so would like to make sure that we will be able
> to use these changes with the 0.16 client.
>
>
There's no change to the client.  I can't remember what was in the 0.16
client... the only issue would be if there are any bugs in the parsing of
address arguments.  I can try to test that out tmr.


> 2. My understanding is that the "pull vs push" change is only with respect
> to the broker and it does not change our architecture where we use
> MessageListener to receive messages asynchronously.
>

Exactly - this is only a change within the internal broker threading
model.  The external behaviour of the broker remains essentially unchanged.


>
> 3. Once I/O refactoring is complete, we would be able to go back to using
> the standard JMS consumer (Destination); what is the timeline and broker
> release version for the completion of this work?
>

You might wish to continue to use the "multi queue" model, depending on
your actual use case, but yeah once the I/O work is complete I would hope
that you could use the thousands of consumers model should you wish.  We
don't have a schedule for the next phase of I/O rework right now - about
all I can say is that it is unlikely to be complete this year.  I'd need to
talk with Keith (who is currently on vacation) as to when we think we may
be able to schedule it.


>
> Let me know once you have integrated the patch and I will re-run our
> performance tests to validate it.
>
>
I'll make a patch for 6.0.x presently (I've been working on a change
against trunk - the patch will probably have to change a bit to apply to
6.0.x).

Cheers,
Rob

Thanks
> Ramayan
>
> On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey 
> wrote:
>
> > OK - so having pondered / hacked around a bit this weekend, I think to
> get
> > decent performance from the IO model in 6.0 for your use case we're going
> > to have to change things around a bit.
> >
> > Basically 6.0 is an intermediate step on our IO / threading model
> journey.
> > In earlier versions we used 2 threads per connection for IO (one read,
> one
> > write) and then extra threads from a pool to "push" messages from queues
> to
> > connections.
> >
> > In 6.0 we moved to using a pool for the IO threads, and also stopped
> queues
> > from "pushing" to connections while the IO threads were acting on the
> > connection.  It's this latter fact which is screwing up performance for
> > your use case here because what happens is that on each network read we
> > tell each consumer to stop accepting pushes from the queue until the IO
> > interaction has completed.  This is causing lots of loops over your 3000
> > consumers on each session, which is eating up a lot of CPU on every
> network
> > interaction.
> >
> > In the final version of our IO refactoring we want to remove the
> "pushing"
> > from the queue, and instead have the consumers "pull" - so that the only
> > threads that operate on the queues (outside of housekeeping tasks like
> > expiry) will be the IO threads.
> >
> > So, what we could do (and I have a patch sitting on my laptop for this)
> is
> > to look at using the "multi queue consumers" work I did for you guys
> > before, but augmenting this so that the consumers work using a "pull"
> model
> > rather than the push model.  This will guarantee strict fairness between
> > the queues associated with the consumer (which was the issue you had with
> > this functionality before, I believe).  Using this model you'd only need
> a
> > small number (one?) of consumers per session.  The patch I have is to add
> > this "pull" mode for these consumers (essentially this is a preview of
> how
> > all consumers will work in the future).
> >
> > Does this seem like something you would be interested in pursuing?
> >
> > Cheers,
> > Rob
> >
> > On 15 October 2016 at 17:30, Ramayan Tiwari 
> > wrote:
> >
> > > Thanks Rob. Apologies for sending this over the weekend :(
> > >
> > > Are there any docs on the new threading model? I found this on
> > confluence:
> > >
> > > https://cwiki.apache.org/confluence/display/qpid/IO+
> > Transport+Refactoring
> > >
> > > We are also interested in understanding the threading model a little
> > better
> > > to help us figure out its impact on our usage patterns. Would be very
> > > helpful if there are more docs/JIRA/email-threads with some details.
> > >
> > > Thanks
> > >
> > > On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey 
> > > wrote:
> > >
> > > > So I *think* this is an issue because of the extremely large number
> of
> > > > consumers.  The threading model in v6 means that whenever a network
> > read
> > > > occurs for a connection, it iterates over the consumers on that
> > > connection
> > > > - obviously where there are a lar

Re: Qpid broker 6.0.4 performance issues

2016-10-17 Thread Ramayan Tiwari
Hi Rob,

We are certainly interested in testing the "multi queue consumers" behavior
with your patch in the new broker. We would like to know:

1. What will be the scope of changes, client or broker or both? We are
currently running the 0.16 client, so would like to make sure that we will be able
to use these changes with the 0.16 client.

2. My understanding is that the "pull vs push" change is only with respect
to the broker and it does not change our architecture where we use
MessageListener to receive messages asynchronously.

3. Once I/O refactoring is complete, we would be able to go back to using
the standard JMS consumer (Destination); what is the timeline and broker
release version for the completion of this work?

Let me know once you have integrated the patch and I will re-run our
performance tests to validate it.

Thanks
Ramayan

On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey 
wrote:

> OK - so having pondered / hacked around a bit this weekend, I think to get
> decent performance from the IO model in 6.0 for your use case we're going
> to have to change things around a bit.
>
> Basically 6.0 is an intermediate step on our IO / threading model journey.
> In earlier versions we used 2 threads per connection for IO (one read, one
> write) and then extra threads from a pool to "push" messages from queues to
> connections.
>
> In 6.0 we moved to using a pool for the IO threads, and also stopped queues
> from "pushing" to connections while the IO threads were acting on the
> connection.  It's this latter fact which is screwing up performance for
> your use case here because what happens is that on each network read we
> tell each consumer to stop accepting pushes from the queue until the IO
> interaction has completed.  This is causing lots of loops over your 3000
> consumers on each session, which is eating up a lot of CPU on every network
> interaction.
>
> In the final version of our IO refactoring we want to remove the "pushing"
> from the queue, and instead have the consumers "pull" - so that the only
> threads that operate on the queues (outside of housekeeping tasks like
> expiry) will be the IO threads.
>
> So, what we could do (and I have a patch sitting on my laptop for this) is
> to look at using the "multi queue consumers" work I did for you guys
> before, but augmenting this so that the consumers work using a "pull" model
> rather than the push model.  This will guarantee strict fairness between
> the queues associated with the consumer (which was the issue you had with
> this functionality before, I believe).  Using this model you'd only need a
> small number (one?) of consumers per session.  The patch I have is to add
> this "pull" mode for these consumers (essentially this is a preview of how
> all consumers will work in the future).
>
> Does this seem like something you would be interested in pursuing?
>
> Cheers,
> Rob
>
> On 15 October 2016 at 17:30, Ramayan Tiwari 
> wrote:
>
> > Thanks Rob. Apologies for sending this over the weekend :(
> >
> > Are there any docs on the new threading model? I found this on
> confluence:
> >
> > https://cwiki.apache.org/confluence/display/qpid/IO+
> Transport+Refactoring
> >
> > We are also interested in understanding the threading model a little
> better
> > to help us figure out its impact on our usage patterns. Would be very
> > helpful if there are more docs/JIRA/email-threads with some details.
> >
> > Thanks
> >
> > On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey 
> > wrote:
> >
> > > So I *think* this is an issue because of the extremely large number of
> > > consumers.  The threading model in v6 means that whenever a network
> read
> > > occurs for a connection, it iterates over the consumers on that
> > connection
> > > - obviously where there are a large number of consumers this is
> > > burdensome.  I fear addressing this may not be a trivial change...  I
> > shall
> > > spend the rest of my afternoon pondering this...
> > >
> > > - Rob
> > >
> > > On 15 October 2016 at 17:14, Ramayan Tiwari 
> > > wrote:
> > >
> > > > Hi Rob,
> > > >
> > > > Thanks so much for your response. We use transacted sessions with
> > > > non-persistent delivery. Prefetch size is 1 and every message is the same
> > > size
> > > > (200 bytes).
> > > >
> > > > Thanks
> > > > Ramayan
> > > >
> > > > On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey <
> rob.j.godf...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Ramayan,
> > > > >
> > > > > this is interesting... in our testing (which admittedly didn't
> cover
> > > the
> > > > > case of this many queues / listeners) we saw the 6.0.x broker using
> > > less
> > > > > CPU on average than the 0.32 broker.  I'll have a look this weekend
> > as
> > > to
> > > > > why creating the listeners is slower.  On the dequeuing, can you
> give
> > a
> > > > > little more information on the usage pattern - are you using
> > > > transactions,
> > > > > auto-ack or client ack?  What prefetch size are you using?  How
> large
> > > are
> > > > > your messages?
> > > > >

Re: Qpid broker 6.0.4 performance issues

2016-10-16 Thread Rob Godfrey
OK - so having pondered / hacked around a bit this weekend, I think to get
decent performance from the IO model in 6.0 for your use case we're going
to have to change things around a bit.

Basically 6.0 is an intermediate step on our IO / threading model journey.
In earlier versions we used 2 threads per connection for IO (one read, one
write) and then extra threads from a pool to "push" messages from queues to
connections.

In 6.0 we moved to using a pool for the IO threads, and also stopped queues
from "pushing" to connections while the IO threads were acting on the
connection.  It's this latter fact which is screwing up performance for
your use case here because what happens is that on each network read we
tell each consumer to stop accepting pushes from the queue until the IO
interaction has completed.  This is causing lots of loops over your 3000
consumers on each session, which is eating up a lot of CPU on every network
interaction.

In the final version of our IO refactoring we want to remove the "pushing"
from the queue, and instead have the consumers "pull" - so that the only
threads that operate on the queues (outside of housekeeping tasks like
expiry) will be the IO threads.
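
To make the push / pull distinction concrete, here is a toy sketch
(illustrative only - these are not the broker's actual classes):

import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Toy model of the two dispatch styles.
class ToyQueue
{
    private final Queue<String> messages = new ArrayDeque<String>();

    // "Push" style: a queue-side thread drives delivery, iterating over every
    // consumer attached to the queue on each pass.
    void pushTo(List<ToyConsumer> consumers)
    {
        for (ToyConsumer consumer : consumers)
        {
            if (!messages.isEmpty() && consumer.hasCredit())
            {
                consumer.deliver(messages.poll());
            }
        }
    }

    // "Pull" style: the IO thread servicing one consumer asks the queue for the
    // next message only when that consumer is ready to write it to the wire.
    String pull()
    {
        return messages.poll();
    }
}

class ToyConsumer
{
    boolean hasCredit() { return true; }
    void deliver(String message) { /* write the message to the connection */ }
}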

So, what we could do (and I have a patch sitting on my laptop for this) is
to look at using the "multi queue consumers" work I did for you guys
before, but augmenting this so that the consumers work using a "pull" model
rather than the push model.  This will guarantee strict fairness between
the queues associated with the consumer (which was the issue you had with
this functionality before, I believe).  Using this model you'd only need a
small number (one?) of consumers per session.  The patch I have is to add
this "pull" mode for these consumers (essentially this is a preview of how
all consumers will work in the future).

Does this seem like something you would be interested in pursuing?

Cheers,
Rob

On 15 October 2016 at 17:30, Ramayan Tiwari 
wrote:

> Thanks Rob. Apologies for sending this over the weekend :(
>
> Are there any docs on the new threading model? I found this on confluence:
>
> https://cwiki.apache.org/confluence/display/qpid/IO+Transport+Refactoring
>
> We are also interested in understanding the threading model a little better
> to help us figure out its impact on our usage patterns. Would be very
> helpful if there are more docs/JIRA/email-threads with some details.
>
> Thanks
>
> On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey 
> wrote:
>
> > So I *think* this is an issue because of the extremely large number of
> > consumers.  The threading model in v6 means that whenever a network read
> > occurs for a connection, it iterates over the consumers on that
> connection
> > - obviously where there are a large number of consumers this is
> > burdensome.  I fear addressing this may not be a trivial change...  I
> shall
> > spend the rest of my afternoon pondering this...
> >
> > - Rob
> >
> > On 15 October 2016 at 17:14, Ramayan Tiwari 
> > wrote:
> >
> > > Hi Rob,
> > >
> > > Thanks so much for your response. We use transacted sessions with
> > > non-persistent delivery. Prefetch size is 1 and every message is the same
> > size
> > > (200 bytes).
> > >
> > > Thanks
> > > Ramayan
> > >
> > > On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey 
> > > wrote:
> > >
> > > > Hi Ramayan,
> > > >
> > > > this is interesting... in our testing (which admittedly didn't cover
> > the
> > > > case of this many queues / listeners) we saw the 6.0.x broker using
> > less
> > > > CPU on average than the 0.32 broker.  I'll have a look this weekend
> as
> > to
> > > > why creating the listeners is slower.  On the dequeuing, can you give
> a
> > > > little more information on the usage pattern - are you using
> > > transactions,
> > > > auto-ack or client ack?  What prefetch size are you using?  How large
> > are
> > > > your messages?
> > > >
> > > > Thanks,
> > > > Rob
> > > >
> > > > On 14 October 2016 at 23:46, Ramayan Tiwari <
> ramayan.tiw...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > We have been validating the new Qpid broker (version 6.0.4) and
> have
> > > > > compared against broker version 0.32 and are seeing major
> > regressions.
> > > > > Following is the summary of our test setup and results:
> > > > >
> > > > > *1. Test Setup *
> > > > >   *a). *Qpid broker runs on a dedicated host (12 cores, 32 GB RAM).
> > > > >   *b).* For 0.32, we allocated 16 GB heap. For the 6.0.4 broker, we use
> > 8GB
> > > > > heap and 8GB direct memory.
> > > > >   *c).* For 6.0.4, flow to disk has been configured at 60%.
> > > > >   *d).* Both the brokers use BDB host type.
> > > > >   *e).* Brokers have around 6000 queues and we create 16 listener
> > > > > sessions/threads spread over 3 connections, where each session is
> > > > listening
> > > > > to 3000 queues. However, messages are only enqueued and processed
> > from
> > > 10
> > > > > queues.
> > > > >   *f).* We enqueue 1 million messages across 10 different queues
>

Re: Qpid broker 6.0.4 performance issues

2016-10-15 Thread Ramayan Tiwari
Thanks Rob. Apologies for sending this over the weekend :(

Are there any docs on the new threading model? I found this on confluence:

https://cwiki.apache.org/confluence/display/qpid/IO+Transport+Refactoring

We are also interested in understanding the threading model a little better
to help us figure out its impact on our usage patterns. Would be very
helpful if there are more docs/JIRA/email-threads with some details.

Thanks

On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey 
wrote:

> So I *think* this is an issue because of the extremely large number of
> consumers.  The threading model in v6 means that whenever a network read
> occurs for a connection, it iterates over the consumers on that connection
> - obviously where there are a large number of consumers this is
> burdensome.  I fear addressing this may not be a trivial change...  I shall
> spend the rest of my afternoon pondering this...
>
> - Rob
>
> On 15 October 2016 at 17:14, Ramayan Tiwari 
> wrote:
>
> > Hi Rob,
> >
> > Thanks so much for your response. We use transacted sessions with
> > non-persistent delivery. Prefetch size is 1 and every message is the same
> size
> > (200 bytes).
> >
> > Thanks
> > Ramayan
> >
> > On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey 
> > wrote:
> >
> > > Hi Ramayan,
> > >
> > > this is interesting... in our testing (which admittedly didn't cover
> the
> > > case of this many queues / listeners) we saw the 6.0.x broker using
> less
> > > CPU on average than the 0.32 broker.  I'll have a look this weekend as
> to
> > > why creating the listeners is slower.  On the dequeuing, can you give a
> > > little more information on the usage pattern - are you using
> > transactions,
> > > auto-ack or client ack?  What prefetch size are you using?  How large
> are
> > > your messages?
> > >
> > > Thanks,
> > > Rob
> > >
> > > On 14 October 2016 at 23:46, Ramayan Tiwari 
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > We have been validating the new Qpid broker (version 6.0.4) and have
> > > > compared against broker version 0.32 and are seeing major
> regressions.
> > > > Following is the summary of our test setup and results:
> > > >
> > > > *1. Test Setup *
> > > >   *a). *Qpid broker runs on a dedicated host (12 cores, 32 GB RAM).
> > > >   *b).* For 0.32, we allocated 16 GB heap. For the 6.0.4 broker, we use
> 8GB
> > > > heap and 8GB direct memory.
> > > >   *c).* For 6.0.4, flow to disk has been configured at 60%.
> > > >   *d).* Both the brokers use BDB host type.
> > > >   *e).* Brokers have around 6000 queues and we create 16 listener
> > > > sessions/threads spread over 3 connections, where each session is
> > > listening
> > > > to 3000 queues. However, messages are only enqueued and processed
> from
> > 10
> > > > queues.
> > > >   *f).* We enqueue 1 million messages across 10 different queues
> > (evenly
> > > > divided), at the start of the test. Dequeue only starts once all the
> > > > messages have been enqueued. We run the test for 2 hours and process
> as
> > > > many messages as we can. Each message takes around 200
> milliseconds to process.
> > > >   *g).* We have used both 0.16 and 6.0.4 clients for these tests
> (6.0.4
> > > > client only with 6.0.4 broker)
> > > >
> > > > *2. Test Results *
> > > >   *a).* System Load Average (read notes below on how we compute it),
> > for
> > > > 6.0.4 broker is 5x compared to 0.32 broker. During start of the test
> > > (when
> > > > we are not doing any dequeue), load average is normal (0.05 for 0.32
> > > broker
> > > > and 0.1 for new broker), however, while we are dequeuing messages,
> the
> > > load
> > > > average is very high (around 0.5 consistently).
> > > >
> > > >   *b). *Time to create listeners in new broker has gone up by 220%
> > > compared
> > > > to 0.32 broker (when using 0.16 client). For old broker, creating 16
> > > > sessions each listening to 3000 queues takes 142 seconds and in new
> > > broker
> > > > it took 456 seconds. If we use 6.0.4 client, it took even longer at
> > 524%
> > > > increase (887 seconds).
> > > >  *I).* The time to create consumers increases as we create more
> > > > listeners on the same connections. We have 20 sessions (but end up
> > using
> > > > around 5 of them) on each connection and we create about 3000
> consumers
> > > and
> > > > attach MessageListener to it. Each successive session takes longer
> > > > (approximately linear increase) to set up the same number of consumers and
> > > > listeners.
> > > >
> > > > *3). How we compute System Load Average *
> > > > We query the MBean SystemLoadAverage and divide it by the value of the
> > MBean
> > > > AvailableProcessors. Both of these MBeans are available under
> > > > java.lang.OperatingSystem.
> > > >
> > > > I am not sure what is causing these regressions and would like your
> > help
> > > in
> > > > understanding it. We are aware of the changes with respect to
> > > threading
> > > > model in the new broker, are there any design docs that we can refer
> to
> > > > understand thes

Re: Qpid broker 6.0.4 performance issues

2016-10-15 Thread Rob Godfrey
So I *think* this is an issue because of the extremely large number of
consumers.  The threading model in v6 means that whenever a network read
occurs for a connection, it iterates over the consumers on that connection
- obviously where there are a large number of consumers this is
burdensome.  I fear addressing this may not be a trivial change...  I shall
spend the rest of my afternoon pondering this...

- Rob

On 15 October 2016 at 17:14, Ramayan Tiwari 
wrote:

> Hi Rob,
>
> Thanks so much for your response. We use transacted sessions with
> non-persistent delivery. Prefetch size is 1 and every message is the same size
> (200 bytes).
>
> Thanks
> Ramayan
>
> On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey 
> wrote:
>
> > Hi Ramayan,
> >
> > this is interesting... in our testing (which admittedly didn't cover the
> > case of this many queues / listeners) we saw the 6.0.x broker using less
> > CPU on average than the 0.32 broker.  I'll have a look this weekend as to
> > why creating the listeners is slower.  On the dequeuing, can you give a
> > little more information on the usage pattern - are you using
> transactions,
> > auto-ack or client ack?  What prefetch size are you using?  How large are
> > your messages?
> >
> > Thanks,
> > Rob
> >
> > On 14 October 2016 at 23:46, Ramayan Tiwari 
> > wrote:
> >
> > > Hi All,
> > >
> > > We have been validating the new Qpid broker (version 6.0.4) and have
> > > compared against broker version 0.32 and are seeing major regressions.
> > > Following is the summary of our test setup and results:
> > >
> > > *1. Test Setup *
> > >   *a). *Qpid broker runs on a dedicated host (12 cores, 32 GB RAM).
> > >   *b).* For 0.32, we allocated 16 GB heap. For the 6.0.4 broker, we use 8GB
> > > heap and 8GB direct memory.
> > >   *c).* For 6.0.4, flow to disk has been configured at 60%.
> > >   *d).* Both the brokers use BDB host type.
> > >   *e).* Brokers have around 6000 queues and we create 16 listener
> > > sessions/threads spread over 3 connections, where each session is
> > listening
> > > to 3000 queues. However, messages are only enqueued and processed from
> 10
> > > queues.
> > >   *f).* We enqueue 1 million messages across 10 different queues
> (evenly
> > > divided), at the start of the test. Dequeue only starts once all the
> > > messages have been enqueued. We run the test for 2 hours and process as
> > > many messages as we can. Each message takes around 200 milliseconds to process.
> > >   *g).* We have used both 0.16 and 6.0.4 clients for these tests (6.0.4
> > > client only with 6.0.4 broker)
> > >
> > > *2. Test Results *
> > >   *a).* System Load Average (read notes below on how we compute it),
> for
> > > 6.0.4 broker is 5x compared to 0.32 broker. During start of the test
> > (when
> > > we are not doing any dequeue), load average is normal (0.05 for 0.32
> > broker
> > > and 0.1 for new broker), however, while we are dequeuing messages, the
> > load
> > > average is very high (around 0.5 consistently).
> > >
> > >   *b). *Time to create listeners in new broker has gone up by 220%
> > compared
> > > to 0.32 broker (when using 0.16 client). For old broker, creating 16
> > > sessions each listening to 3000 queues takes 142 seconds and in new
> > broker
> > > it took 456 seconds. If we use 6.0.4 client, it took even longer at
> 524%
> > > increase (887 seconds).
> > >  *I).* The time to create consumers increases as we create more
> > > listeners on the same connections. We have 20 sessions (but end up
> using
> > > around 5 of them) on each connection and we create about 3000 consumers
> > and
> > > attach MessageListener to it. Each successive session takes longer
> > > (approximately linear increase) to set up the same number of consumers and
> > > listeners.
> > >
> > > *3). How we compute System Load Average *
> > > We query the MBean SystemLoadAverage and divide it by the value of the
> MBean
> > > AvailableProcessors. Both of these MBeans are available under
> > > java.lang.OperatingSystem.
> > >
> > > I am not sure what is causing these regressions and would like your
> help
> > in
> > > understanding it. We are aware of the changes with respect to
> > threading
> > > model in the new broker, are there any design docs that we can refer to
> > > understand these changes at a high level? Can we tune some parameters
> to
> > > address these issues?
> > >
> > > Thanks
> > > Ramayan
> > >
> >
>


Re: Qpid broker 6.0.4 performance issues

2016-10-15 Thread Ramayan Tiwari
Hi Rob,

Thanks so much for your response. We use transacted sessions with
non-persistent delivery. Prefetch size is 1 and every message is the same size
(200 bytes).
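
Our client setup is roughly the following (a sketch - the connection URL,
host and queue name are placeholders, and prefetch is capped via the
legacy client's maxprefetch connection option):

import javax.jms.*;
import org.apache.qpid.client.AMQConnectionFactory;

// Sketch of the test client: prefetch 1, transacted session, non-persistent
// delivery, ~200 byte payloads. URL, host and queue names are placeholders.
public class TestClientSketch
{
    public static void main(String[] args) throws Exception
    {
        ConnectionFactory factory = new AMQConnectionFactory(
                "amqp://guest:guest@testclient/test?maxprefetch='1'"
                + "&brokerlist='tcp://broker-host:5672'");
        Connection connection = factory.createConnection();
        connection.start();

        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        MessageProducer producer = session.createProducer(session.createQueue("queue_01"));
        producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);

        BytesMessage message = session.createBytesMessage();
        message.writeBytes(new byte[200]);
        producer.send(message);
        session.commit();

        connection.close();
    }
}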

Thanks
Ramayan

On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey 
wrote:

> Hi Ramayan,
>
> this is interesting... in our testing (which admittedly didn't cover the
> case of this many queues / listeners) we saw the 6.0.x broker using less
> CPU on average than the 0.32 broker.  I'll have a look this weekend as to
> why creating the listeners is slower.  On the dequeuing, can you give a
> little more information on the usage pattern - are you using transactions,
> auto-ack or client ack?  What prefetch size are you using?  How large are
> your messages?
>
> Thanks,
> Rob
>
> On 14 October 2016 at 23:46, Ramayan Tiwari 
> wrote:
>
> > Hi All,
> >
> > We have been validating the new Qpid broker (version 6.0.4) and have
> > compared against broker version 0.32 and are seeing major regressions.
> > Following is the summary of our test setup and results:
> >
> > *1. Test Setup *
> >   *a). *Qpid broker runs on a dedicated host (12 cores, 32 GB RAM).
> >   *b).* For 0.32, we allocated 16 GB heap. For the 6.0.4 broker, we use 8GB
> > heap and 8GB direct memory.
> >   *c).* For 6.0.4, flow to disk has been configured at 60%.
> >   *d).* Both the brokers use BDB host type.
> >   *e).* Brokers have around 6000 queues and we create 16 listener
> > sessions/threads spread over 3 connections, where each session is
> listening
> > to 3000 queues. However, messages are only enqueued and processed from 10
> > queues.
> >   *f).* We enqueue 1 million messages across 10 different queues (evenly
> > divided), at the start of the test. Dequeue only starts once all the
> > messages have been enqueued. We run the test for 2 hours and process as
> > many messages as we can. Each message takes around 200 milliseconds to process.
> >   *g).* We have used both 0.16 and 6.0.4 clients for these tests (6.0.4
> > client only with 6.0.4 broker)
> >
> > *2. Test Results *
> >   *a).* System Load Average (read notes below on how we compute it), for
> > 6.0.4 broker is 5x compared to 0.32 broker. During start of the test
> (when
> > we are not doing any dequeue), load average is normal (0.05 for 0.32
> broker
> > and 0.1 for new broker), however, while we are dequeuing messages, the
> load
> > average is very high (around 0.5 consistently).
> >
> >   *b). *Time to create listeners in new broker has gone up by 220%
> compared
> > to 0.32 broker (when using 0.16 client). For old broker, creating 16
> > sessions each listening to 3000 queues takes 142 seconds and in new
> broker
> > it took 456 seconds. If we use 6.0.4 client, it took even longer at 524%
> > increase (887 seconds).
> >  *I).* The time to create consumers increases as we create more
> > listeners on the same connections. We have 20 sessions (but end up using
> > around 5 of them) on each connection and we create about 3000 consumers
> and
> > attach MessageListener to it. Each successive session takes longer
> > (approximately linear increase) to set up the same number of consumers and
> > listeners.
> >
> > *3). How we compute System Load Average *
> > We query the MBean SystemLoadAverage and divide it by the value of the MBean
> > AvailableProcessors. Both of these MBeans are available under
> > java.lang.OperatingSystem.
> >
> > I am not sure what is causing these regressions and would like your help
> in
> > understanding it. We are aware of the changes with respect to
> threading
> > model in the new broker, are there any design docs that we can refer to
> > understand these changes at a high level? Can we tune some parameters to
> > address these issues?
> >
> > Thanks
> > Ramayan
> >
>


Re: Qpid broker 6.0.4 performance issues

2016-10-15 Thread Rob Godfrey
Hi Ramayan,

this is interesting... in our testing (which admittedly didn't cover the
case of this many queues / listeners) we saw the 6.0.x broker using less
CPU on average than the 0.32 broker.  I'll have a look this weekend as to
why creating the listeners is slower.  On the dequeuing, can you give a
little more information on the usage pattern - are you using transactions,
auto-ack or client ack?  What prefetch size are you using?  How large are
your messages?

Thanks,
Rob

On 14 October 2016 at 23:46, Ramayan Tiwari 
wrote:

> Hi All,
>
> We have been validating the new Qpid broker (version 6.0.4) and have
> compared against broker version 0.32 and are seeing major regressions.
> Following is the summary of our test setup and results:
>
> *1. Test Setup *
>   *a). *Qpid broker runs on a dedicated host (12 cores, 32 GB RAM).
>   *b).* For 0.32, we allocated 16 GB heap. For the 6.0.4 broker, we use 8GB
> heap and 8GB direct memory.
>   *c).* For 6.0.4, flow to disk has been configured at 60%.
>   *d).* Both the brokers use BDB host type.
>   *e).* Brokers have around 6000 queues and we create 16 listener
> sessions/threads spread over 3 connections, where each session is listening
> to 3000 queues. However, messages are only enqueued and processed from 10
> queues.
>   *f).* We enqueue 1 million messages across 10 different queues (evenly
> divided), at the start of the test. Dequeue only starts once all the
> messages have been enqueued. We run the test for 2 hours and process as
> many messages as we can. Each message takes around 200 milliseconds to process.
>   *g).* We have used both 0.16 and 6.0.4 clients for these tests (6.0.4
> client only with 6.0.4 broker)
>
> *2. Test Results *
>   *a).* System Load Average (read notes below on how we compute it), for
> 6.0.4 broker is 5x compared to 0.32 broker. During start of the test (when
> we are not doing any dequeue), load average is normal (0.05 for 0.32 broker
> and 0.1 for new broker), however, while we are dequeuing messages, the load
> average is very high (around 0.5 consistently).
>
>   *b). *Time to create listeners in new broker has gone up by 220% compared
> to 0.32 broker (when using 0.16 client). For old broker, creating 16
> sessions each listening to 3000 queues takes 142 seconds and in new broker
> it took 456 seconds. If we use 6.0.4 client, it took even longer at 524%
> increase (887 seconds).
>  *I).* The time to create consumers increases as we create more
> listeners on the same connections. We have 20 sessions (but end up using
> around 5 of them) on each connection and we create about 3000 consumers and
> attach MessageListener to it. Each successive session takes longer
> (approximately linear increase) to set up the same number of consumers and
> listeners.
>
> *3). How we compute System Load Average *
> We query the MBean SystemLoadAverage and divide it by the value of the MBean
> AvailableProcessors. Both of these MBeans are available under
> java.lang.OperatingSystem.
>
> I am not sure what is causing these regressions and would like your help in
> understanding it. We are aware of the changes with respect to threading
> model in the new broker, are there any design docs that we can refer to
> understand these changes at a high level? Can we tune some parameters to
> address these issues?
>
> Thanks
> Ramayan
>


Qpid broker 6.0.4 performance issues

2016-10-14 Thread Ramayan Tiwari
Hi All,

We have been validating the new Qpid broker (version 6.0.4) and have
compared against broker version 0.32 and are seeing major regressions.
Following is the summary of our test setup and results:

*1. Test Setup *
  *a). *Qpid broker runs on a dedicated host (12 cores, 32 GB RAM).
  *b).* For 0.32, we allocated 16 GB heap. For the 6.0.4 broker, we use 8GB
heap and 8GB direct memory.
  *c).* For 6.0.4, flow to disk has been configured at 60%.
  *d).* Both the brokers use BDB host type.
  *e).* Brokers have around 6000 queues and we create 16 listener
sessions/threads spread over 3 connections, where each session is listening
to 3000 queues. However, messages are only enqueued and processed from 10
queues.
  *f).* We enqueue 1 million messages across 10 different queues (evenly
divided), at the start of the test. Dequeue only starts once all the
messages have been enqueued. We run the test for 2 hours and process as
many messages as we can. Each message takes around 200 milliseconds to process.
  *g).* We have used both 0.16 and 6.0.4 clients for these tests (6.0.4
client only with 6.0.4 broker)

*2. Test Results *
  *a).* System Load Average (read notes below on how we compute it), for
6.0.4 broker is 5x compared to 0.32 broker. During start of the test (when
we are not doing any dequeue), load average is normal (0.05 for 0.32 broker
and 0.1 for new broker), however, while we are dequeuing messages, the load
average is very high (around 0.5 consistently).

  *b). *Time to create listeners in new broker has gone up by 220% compared
to 0.32 broker (when using 0.16 client). For old broker, creating 16
sessions each listening to 3000 queues takes 142 seconds and in new broker
it took 456 seconds. If we use 6.0.4 client, it took even longer at 524%
increase (887 seconds).
 *I).* The time to create consumers increases as we create more
listeners on the same connections. We have 20 sessions (but end up using
around 5 of them) on each connection and we create about 3000 consumers and
attach MessageListener to it. Each successive session takes longer
(approximately linear increase) to set up the same number of consumers and
listeners.
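
The listener setup we time is essentially the loop below (a sketch - the
JNDI lookup and the queue-naming scheme are placeholders for our actual
test code):

import javax.jms.*;
import javax.naming.InitialContext;

// Sketch of how consumers/listeners are created and the per-session setup timed.
public class ListenerSetupTiming
{
    public static void main(String[] args) throws Exception
    {
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("qpidConnectionFactory");
        Connection connection = factory.createConnection();
        connection.start();

        int sessionsToUse = 5;          // sessions actually used per connection
        int consumersPerSession = 3000; // one consumer (and listener) per queue

        for (int s = 0; s < sessionsToUse; s++)
        {
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            long start = System.nanoTime();
            for (int q = 0; q < consumersPerSession; q++)
            {
                MessageConsumer consumer =
                        session.createConsumer(session.createQueue("queue_" + q));
                consumer.setMessageListener(new MessageListener()
                {
                    public void onMessage(Message message)
                    {
                        // process the message (~200 ms), then commit
                    }
                });
            }
            System.out.println("session " + s + " setup took "
                    + (System.nanoTime() - start) / 1000000 + " ms");
        }
    }
}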

*3). How we compute System Load Average *
We query the MBean SystemLoadAverage and divide it by the value of the MBean
AvailableProcessors. Both of these MBeans are available under
java.lang.OperatingSystem.
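
In code, the check is along these lines (a sketch - the JMX service URL is
a placeholder for however the broker JVM is reached):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Sketch of the load-average calculation (the JMX URL below is a placeholder).
public class LoadAverageProbe
{
    public static void main(String[] args) throws Exception
    {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            ObjectName os = new ObjectName("java.lang:type=OperatingSystem");
            double loadAverage = (Double) mbsc.getAttribute(os, "SystemLoadAverage");
            int processors = (Integer) mbsc.getAttribute(os, "AvailableProcessors");
            System.out.printf("normalised load = %.3f%n", loadAverage / processors);
        }
        finally
        {
            connector.close();
        }
    }
}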

I am not sure what is causing these regressions and would like your help in
understanding it. We are aware of the changes with respect to threading
model in the new broker, are there any design docs that we can refer to
understand these changes at a high level? Can we tune some parameters to
address these issues?

Thanks
Ramayan