> On 2012-03-08 11:00:25, Gordon Sim wrote:
> > /trunk/qpid/cpp/src/qpid/broker/Queue.cpp, line 957
> > <https://reviews.apache.org/r/4232/diff/1/?file=88780#file88780line957>
> >
> > Switching from the consumerLock to messageLock may simplify the code but
> > obviously won't improve the concurrency. The lock previously used here
> > was dedicated to protecting an entirely distinct set of data. However, I
> > would agree that in reality there is likely to be little gained.

This may be overkill on my part: the reason for dropping the consumerLock in
favor of the messageLock was simply locking consistency across the observer
interface - using the same lock for all observer callouts. Maybe not strictly
necessary, but I believe the observers currently assume the queue is providing
the locking, and that implies using the same lock everywhere.
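Roughly the pattern I have in mind - a minimal sketch only, with simplified
stand-in types (std::mutex stands in for the broker's qpid::sys::Mutex, and
QueueObserver/consumerAdded() are illustrative, not the real interface):

#include <mutex>
#include <vector>

struct QueueObserver {
    virtual void consumerAdded() = 0;   // illustrative callout
    virtual ~QueueObserver() {}
};

class Queue {
    std::mutex messageLock;                // one lock guards message state
    std::vector<QueueObserver*> observers;

public:
    void addConsumer() {
        // Take the same lock the enqueue/dequeue paths take, so every
        // observer callout runs under one consistent lock.
        std::lock_guard<std::mutex> l(messageLock);
        // ... update consumer state ...
        for (QueueObserver* o : observers)
            o->consumerAdded();            // observers may assume the lock is held
    }
};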
> On 2012-03-08 11:00:25, Gordon Sim wrote:
> > /trunk/qpid/cpp/src/qpid/broker/Queue.cpp, line 411
> > <https://reviews.apache.org/r/4232/diff/1/?file=88780#file88780line411>
> >
> > As it stands, I believe this is unsafe. If you have an empty queue, with
> > a consumeNextMessage() from a consumer concurrent with a push() from a
> > publisher, then it is possible the consumer does not get a message *and*
> > does not get notified that there is one.
> >
> > Moving the listeners.addListener() and the listeners.populate() (from
> > push()) outside the lock scope means that the consuming thread can fail
> > to get a message from the allocator, the publishing thread then adds a
> > message, and before the consuming thread adds the consumer to the set of
> > listeners, the publishing thread populates its set (which will be empty)
> > for notification.
> >
> > Thus the message remains on the queue without being delivered to an
> > interested consumer until something else happens to retrigger a
> > notification.

+1000, thanks - I see the problem now.
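To convince myself, a self-contained sketch of the race and the fix -
simplified stand-ins, not the real Queue.cpp (std::mutex for qpid::sys::Mutex,
a plain vector for QueueListeners, ints for messages):

// The losing interleaving Gordon describes:
//   consumer:  lock { queue empty, no message } unlock
//   publisher: lock { enqueue message } unlock
//   publisher: snapshot listeners for notification  <- set still empty!
//   consumer:  addListener(c)                       <- too late; nobody
//              notifies c, and the message sits on the queue
#include <deque>
#include <mutex>
#include <vector>

struct Consumer { virtual void notify() = 0; virtual ~Consumer() {} };

class Queue {
    std::mutex messageLock;
    std::deque<int> messages;          // stand-in for the message store
    std::vector<Consumer*> listeners;  // stand-in for QueueListeners

public:
    bool consumeNextMessage(int& m, Consumer* c) {
        std::lock_guard<std::mutex> l(messageLock);
        if (!messages.empty()) {
            m = messages.front();
            messages.pop_front();
            return true;
        }
        // Must stay under the SAME lock as the emptiness check above,
        // so a concurrent push() is guaranteed to see this listener.
        listeners.push_back(c);
        return false;
    }

    void push(int m) {
        std::vector<Consumer*> toNotify;
        {
            std::lock_guard<std::mutex> l(messageLock);
            messages.push_back(m);
            toNotify.swap(listeners);  // like listeners.populate(copy)
        }
        // Callbacks fire outside the lock, as push() does today.
        for (Consumer* c : toNotify)
            c->notify();
    }
};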
> On 2012-03-08 11:00:25, Gordon Sim wrote:
> > /trunk/qpid/cpp/src/qpid/broker/Queue.cpp, line 1071
> > <https://reviews.apache.org/r/4232/diff/1/?file=88780#file88780line1071>
> >
> > This looks like a subtle change to the logic that is perhaps
> > unintentional? Previously, if ctxt was non-null the isEnqueued() test
> > would still be executed, and if false, the method would return. That
> > test is now skipped for the transactional case.
> >
> > (If I'm right, then a transactional dequeue attempt on a persistent
> > message that has been removed - say, due to a ring queue policy limit
> > being reached - will result in a second dequeue attempt on the store
> > that will most likely fail.)

What the hell did I do? Thanks for catching that - totally unintended.
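For the record, the intended control flow - again a simplified sketch with
stand-in types, not the actual Queue.cpp code: the isEnqueued() guard has to
run whether or not a transaction context is supplied.

#include <mutex>
#include <set>

struct TransactionContext {};   // stand-in for the broker's txn context

class Queue {
    std::mutex messageLock;
    std::set<int> enqueued;     // stand-in: sequence numbers still queued

    bool isEnqueued(int seq) { return enqueued.count(seq) != 0; }
    void dequeued(int seq) { enqueued.erase(seq); }
    void storeDequeue(TransactionContext*, int) {
        // stand-in for the store-level dequeue of a persistent message
    }

public:
    bool dequeue(TransactionContext* ctxt, int seq) {
        {
            std::lock_guard<std::mutex> l(messageLock);
            if (!isEnqueued(seq))   // guard runs regardless of ctxt -
                return false;       // e.g. already removed by ring policy
            if (!ctxt)
                dequeued(seq);      // non-txn: update queue state now
        }
        storeDequeue(ctxt, seq);    // txn case defers queue-state removal
        return true;
    }
};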
- Kenneth


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4232/#review5717
-----------------------------------------------------------


On 2012-03-07 19:23:36, Kenneth Giusti wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4232/
> -----------------------------------------------------------
> 
> (Updated 2012-03-07 19:23:36)
> 
> 
> Review request for qpid, Andrew Stitcher, Alan Conway, Gordon Sim, and Ted Ross.
> 
> 
> Summary
> -------
> 
> In order to improve our parallelism across threads, this patch reduces the
> scope over which the Queue's message lock is held.
> 
> I've still got more testing to do, but the simple test cases below show
> pretty good performance improvements on my multi-core box. I'd like to get
> some early feedback, as there are a lot of little changes to the queue
> locking that will need vetting.
> 
> As far as performance is concerned, I ran the following tests on an 8-core
> Xeon X5550 2.67GHz box, using qpid-send and qpid-receive to generate
> traffic. All traffic was generated in parallel. All numbers are in
> messages/second, sorted by best overall throughput.
> 
> Test 1: Funnel Test (shared queue, 2 producers, 1 consumer)
> 
> test setup and execute info:
> qpidd --auth no -p 8888 --no-data-dir --worker-threads=3
> qpid-config -b $SRC_HOST add queue srcQ1 --max-queue-size=1200000000 --max-queue-count=4000000 --flow-stop-size=0 --flow-stop-count=0
> qpid-receive -b $SRC_HOST -a srcQ1 -f -m 4000000 --capacity 2000 --ack-frequency 1000 --print-content no --report-total &
> qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
> qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
> 
>     trunk:                      patched:
>     tp(m/s)                     tp(m/s)
> 
>     S1    :: S2    :: R1        S1    :: S2    :: R1
>     73144 :: 72524 :: 112914    91120 :: 83278 :: 133730
>     70299 :: 68878 :: 110905    91079 :: 87418 :: 133649
>     70002 :: 69882 :: 110241    83957 :: 83600 :: 131617
>     70523 :: 68681 :: 108372    83911 :: 82845 :: 128513
>     68135 :: 68022 :: 107949    85999 :: 81863 :: 125575
> 
> 
> Test 2: Quad Client Shared Test (1 queue, 2 senders x 2 receivers)
> 
> test setup and execute info:
> qpidd --auth no -p 8888 --no-data-dir --worker-threads=4
> qpid-config -b $SRC_HOST add queue srcQ1 --max-queue-size=1200000000 --max-queue-count=4000000 --flow-stop-size=0 --flow-stop-count=0
> qpid-receive -b $SRC_HOST -a srcQ1 -f -m 2000000 --capacity 2000 --ack-frequency 1000 --print-content no --report-total &
> qpid-receive -b $SRC_HOST -a srcQ1 -f -m 2000000 --capacity 2000 --ack-frequency 1000 --print-content no --report-total &
> qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
> qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
> 
>     trunk:                               patched:
>     tp(m/s)                              tp(m/s)
> 
>     S1    :: S2    :: R1    :: R2        S1    :: S2    :: R1    :: R2
>     52826 :: 52364 :: 52386 :: 52275     64826 :: 64780 :: 64811 :: 64638
>     51457 :: 51269 :: 51399 :: 51213     64630 :: 64157 :: 64562 :: 64157
>     50557 :: 50432 :: 50468 :: 50366     64023 :: 63832 :: 64092 :: 63748
> 
> Note: pre-patch, qpidd ran steadily at about 270% CPU during this test.
> With the patch, qpidd's CPU utilization was steady at 300%.
> 
> 
> Test 3: Federation Funnel (two federated brokers, 2 producers upstream, 1 consumer downstream)
> 
> test setup and execute info:
> qpidd --auth no -p 8888 --no-data-dir --worker-threads=3 -d
> qpidd --auth no -p 9999 --no-data-dir --worker-threads=2 -d
> qpid-config -b $SRC_HOST add queue srcQ1 --max-queue-size=600000000 --max-queue-count=2000000 --flow-stop-size=0 --flow-stop-count=0
> qpid-config -b $SRC_HOST add queue srcQ2 --max-queue-size=600000000 --max-queue-count=2000000 --flow-stop-size=0 --flow-stop-count=0
> qpid-config -b $DST_HOST add queue dstQ1 --max-queue-size=1200000000 --max-queue-count=4000000 --flow-stop-size=0 --flow-stop-count=0
> qpid-config -b $DST_HOST add exchange fanout dstX1
> qpid-config -b $DST_HOST bind dstX1 dstQ1
> qpid-route --ack=1000 queue add $DST_HOST $SRC_HOST dstX1 srcQ1
> qpid-route --ack=1000 queue add $DST_HOST $SRC_HOST dstX1 srcQ2
> qpid-receive -b $DST_HOST -a dstQ1 -f -m 4000000 --capacity 2000 --ack-frequency 1000 --print-content no --report-total &
> qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
> qpid-send -b $SRC_HOST -a srcQ2 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
> 
>     trunk:                     patched:
>     tp(m/s)                    tp(m/s)
> 
>     S1    :: S2    :: R1       S1    :: S2    :: R1
>     86150 :: 84146 :: 87665    90877 :: 88534 :: 108418
>     89188 :: 88319 :: 82469    87014 :: 86753 :: 107386
>     88221 :: 85298 :: 82455    89790 :: 88573 :: 104119
> 
> Still TBD:
>   - fix message group errors
>   - verify/fix cluster
> 
> 
> This addresses bug QPID-3890.
>     https://issues.apache.org/jira/browse/QPID-3890
> 
> 
> Diffs
> -----
> 
>   /trunk/qpid/cpp/src/qpid/broker/Queue.h 1297185
>   /trunk/qpid/cpp/src/qpid/broker/Queue.cpp 1297185
>   /trunk/qpid/cpp/src/qpid/broker/QueueListeners.h 1297185
>   /trunk/qpid/cpp/src/qpid/broker/QueueListeners.cpp 1297185
> 
> Diff: https://reviews.apache.org/r/4232/diff
> 
> 
> Testing
> -------
> 
> unit tests (non-cluster)
> performance tests as described above
> 
> 
> Thanks,
> 
> Kenneth
