-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4232/#review5717
-----------------------------------------------------------
The numbers are encouraging! As noted in the comments below, however, I think this still needs some work to be safe.


/trunk/qpid/cpp/src/qpid/broker/Queue.cpp
<https://reviews.apache.org/r/4232/#comment12449>

As it stands, I believe this is unsafe. If an empty queue sees a consumeNextMessage() from a consumer running concurrently with a push() from a publisher, it is possible that the consumer does not get a message *and* does not get notified that one has arrived. Moving the listeners.addListener() and the listeners.populate() (from push()) outside the lock scope allows this interleaving: the consuming thread fails to get a message from the allocator; the publishing thread then adds a message and, before the consuming thread has added the consumer to the set of listeners, populates its notification set (which is still empty). The message therefore remains on the queue, undelivered to an interested consumer, until something else happens to retrigger a notification (see the first sketch below).


/trunk/qpid/cpp/src/qpid/broker/Queue.cpp
<https://reviews.apache.org/r/4232/#comment12451>

Switching from the consumerLock to the messageLock may simplify the code, but it obviously won't improve concurrency: the lock previously used here was dedicated to protecting an entirely distinct set of data. However, I would agree that in reality there is likely to be little gained.


/trunk/qpid/cpp/src/qpid/broker/Queue.cpp
<https://reviews.apache.org/r/4232/#comment12450>

This looks like a subtle, perhaps unintentional, change to the logic. Previously, if ctxt was non-null the isEnqueued() test was still executed and, if it returned false, the method returned. That test is now skipped for the transactional case. (If I'm right, a transactional dequeue attempt on a persistent message that has already been removed, say because a ring queue policy limit was reached, will result in a second dequeue attempt on the store, which will most likely fail. See the second sketch below.)


- Gordon
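
To make the interleaving in the first comment concrete, here is a minimal sketch. It is not the actual Queue.cpp code: QueueSketch, Consumer and the container types below are simplified stand-ins, and consumeNextMessage(), push(), addListener() and populate() are modelled only closely enough that the ordering of the numbered steps matches the scenario described.

    #include <deque>
    #include <mutex>
    #include <string>
    #include <vector>

    struct Consumer {
        // In the real broker this would carry a notify() callback that
        // re-drives delivery; it is left empty because only the
        // interleaving matters here.
    };

    struct QueueSketch {
        std::mutex messageLock;
        std::deque<std::string> messages;
        std::vector<Consumer*> listeners;

        // Consumer thread
        bool consumeNextMessage(Consumer* c, std::string& out) {
            {
                std::lock_guard<std::mutex> l(messageLock);
                if (!messages.empty()) {
                    out = messages.front();
                    messages.pop_front();
                    return true;
                }
                // (1) queue observed empty; the lock is dropped at the end
                //     of this scope, before the consumer registers interest
            }
            // (2) the publisher's push() below can run to completion here
            {
                std::lock_guard<std::mutex> l(messageLock);
                // (3) too late: push() has already taken its (empty)
                //     listener snapshot, so this consumer is never told
                //     about the message enqueued in step (2)
                listeners.push_back(c);  // addListener() outside the empty-check scope
            }
            return false;
        }

        // Publisher thread
        void push(const std::string& msg) {
            {
                std::lock_guard<std::mutex> l(messageLock);
                messages.push_back(msg);  // (2) message enqueued
            }
            std::vector<Consumer*> toNotify;
            {
                std::lock_guard<std::mutex> l(messageLock);
                toNotify.swap(listeners); // populate(): snapshot taken in a separate critical section
            }
            // notify everything in toNotify: empty in this interleaving, so
            // the message sits on the queue until something else wakes a consumer
        }
    };

The implication is that the empty-queue check and addListener() need to be atomic with respect to push()'s enqueue-and-populate, e.g. performed in the same critical section, or the queue must be re-checked after the listener has been registered.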
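
Likewise, a sketch of the control-flow change flagged in the third comment. The names below (TransactionContext, QueuedMessage, dequeueFromStore(), dequeueBefore/dequeueAfter) are illustrative stand-ins rather than the exact code in the diff; the only point is where the isEnqueued() guard sits relative to the ctxt test.

    struct TransactionContext {};
    struct QueuedMessage {};

    struct DequeueSketch {
        // Stand-in: returns false when the message has already been removed
        // from the queue, e.g. displaced by a ring queue policy.
        bool isEnqueued(const QueuedMessage&) const { return false; }

        // Stand-in for the store dequeue; a second call for the same message
        // is the failure mode described in the comment.
        void dequeueFromStore(TransactionContext*, const QueuedMessage&) {}

        // Pre-patch shape: the isEnqueued() guard runs for both the
        // transactional (ctxt != 0) and non-transactional paths.
        bool dequeueBefore(TransactionContext* ctxt, const QueuedMessage& msg) {
            if (!isEnqueued(msg)) return false;   // guard applies to every caller
            dequeueFromStore(ctxt, msg);
            return true;
        }

        // Post-patch shape as read in the review: the transactional path
        // skips the guard, so an already-removed persistent message reaches
        // the store a second time.
        bool dequeueAfter(TransactionContext* ctxt, const QueuedMessage& msg) {
            if (!ctxt && !isEnqueued(msg)) return false;  // guard only when non-transactional
            dequeueFromStore(ctxt, msg);
            return true;
        }
    };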

On 2012-03-07 19:23:36, Kenneth Giusti wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4232/
> -----------------------------------------------------------
> 
> (Updated 2012-03-07 19:23:36)
> 
> 
> Review request for qpid, Andrew Stitcher, Alan Conway, Gordon Sim, and Ted Ross.
> 
> 
> Summary
> -------
> 
> In order to improve our parallelism across threads, this patch reduces the scope over which the Queue's message lock is held.
> 
> I've still got more testing to do, but the simple test cases below show pretty good performance improvements on my multi-core box. I'd like to get some early feedback, as there are a lot of little changes to the queue locking that will need vetting.
> 
> As far as performance is concerned - I've run the following tests on an 8 core Xeon X5550 2.67Ghz box. I used qpid-send and qpid-receive to generate traffic. All traffic generated in parallel. All numbers are in messages/second - sorted by best overall throughput.
> 
> 
> Test 1: Funnel Test (shared queue, 2 producers, 1 consumer)
> 
> test setup and execute info:
> 
>   qpidd --auth no -p 8888 --no-data-dir --worker-threads=3
>   qpid-config -b $SRC_HOST add queue srcQ1 --max-queue-size=1200000000 --max-queue-count=4000000 --flow-stop-size=0 --flow-stop-count=0
>   qpid-receive -b $SRC_HOST -a srcQ1 -f -m 4000000 --capacity 2000 --ack-frequency 1000 --print-content no --report-total &
>   qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
>   qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
> 
>   trunk:                          patched:
>   tp(m/s)                         tp(m/s)
> 
>   S1 :: S2 :: R1                  S1 :: S2 :: R1
>   73144 :: 72524 :: 112914        91120 :: 83278 :: 133730
>   70299 :: 68878 :: 110905        91079 :: 87418 :: 133649
>   70002 :: 69882 :: 110241        83957 :: 83600 :: 131617
>   70523 :: 68681 :: 108372        83911 :: 82845 :: 128513
>   68135 :: 68022 :: 107949        85999 :: 81863 :: 125575
> 
> 
> Test 2: Quad Client Shared Test (1 queue, 2 senders x 2 receivers)
> 
> test setup and execute info:
> 
>   qpidd --auth no -p 8888 --no-data-dir --worker-threads=4
>   qpid-config -b $SRC_HOST add queue srcQ1 --max-queue-size=1200000000 --max-queue-count=4000000 --flow-stop-size=0 --flow-stop-count=0
>   qpid-receive -b $SRC_HOST -a srcQ1 -f -m 2000000 --capacity 2000 --ack-frequency 1000 --print-content no --report-total &
>   qpid-receive -b $SRC_HOST -a srcQ1 -f -m 2000000 --capacity 2000 --ack-frequency 1000 --print-content no --report-total &
>   qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
>   qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
> 
>   trunk:                                  patched:
>   tp(m/s)                                 tp(m/s)
> 
>   S1 :: S2 :: R1 :: R2                    S1 :: S2 :: R1 :: R2
>   52826 :: 52364 :: 52386 :: 52275        64826 :: 64780 :: 64811 :: 64638
>   51457 :: 51269 :: 51399 :: 51213        64630 :: 64157 :: 64562 :: 64157
>   50557 :: 50432 :: 50468 :: 50366        64023 :: 63832 :: 64092 :: 63748
> 
> **Note: pre-patch, qpidd ran steadily at about 270% cpu during this test.
> With the patch, qpidd's cpu utilization was steady at 300%
> 
> 
> Test 3: Federation Funnel (two federated brokers, 2 producers upstream, 1 consumer downstream)
> 
> test setup and execute info:
> 
>   qpidd --auth no -p 8888 --no-data-dir --worker-threads=3 -d
>   qpidd --auth no -p 9999 --no-data-dir --worker-threads=2 -d
>   qpid-config -b $SRC_HOST add queue srcQ1 --max-queue-size=600000000 --max-queue-count=2000000 --flow-stop-size=0 --flow-stop-count=0
>   qpid-config -b $SRC_HOST add queue srcQ2 --max-queue-size=600000000 --max-queue-count=2000000 --flow-stop-size=0 --flow-stop-count=0
>   qpid-config -b $DST_HOST add queue dstQ1 --max-queue-size=1200000000 --max-queue-count=4000000 --flow-stop-size=0 --flow-stop-count=0
>   qpid-config -b $DST_HOST add exchange fanout dstX1
>   qpid-config -b $DST_HOST bind dstX1 dstQ1
>   qpid-route --ack=1000 queue add $DST_HOST $SRC_HOST dstX1 srcQ1
>   qpid-route --ack=1000 queue add $DST_HOST $SRC_HOST dstX1 srcQ2
>   qpid-receive -b $DST_HOST -a dstQ1 -f -m 4000000 --capacity 2000 --ack-frequency 1000 --print-content no --report-total &
>   qpid-send -b $SRC_HOST -a srcQ1 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
>   qpid-send -b $SRC_HOST -a srcQ2 -m 2000000 --content-size 300 --capacity 2000 --report-total --sequence no --timestamp no &
> 
>   trunk:                          patched:
>   tp(m/s)                         tp(m/s)
> 
>   S1 :: S2 :: R1                  S1 :: S2 :: R1
>   86150 :: 84146 :: 87665         90877 :: 88534 :: 108418
>   89188 :: 88319 :: 82469         87014 :: 86753 :: 107386
>   88221 :: 85298 :: 82455         89790 :: 88573 :: 104119
> 
> Still TBD:
>   - fix message group errors
>   - verify/fix cluster
> 
> 
> This addresses bug qpid-3890.
>     https://issues.apache.org/jira/browse/qpid-3890
> 
> 
> Diffs
> -----
> 
>   /trunk/qpid/cpp/src/qpid/broker/Queue.h 1297185
>   /trunk/qpid/cpp/src/qpid/broker/Queue.cpp 1297185
>   /trunk/qpid/cpp/src/qpid/broker/QueueListeners.h 1297185
>   /trunk/qpid/cpp/src/qpid/broker/QueueListeners.cpp 1297185
> 
> Diff: https://reviews.apache.org/r/4232/diff
> 
> 
> Testing
> -------
> 
> unit test (non-cluster)
> performance tests as described above.
> 
> 
> Thanks,
> 
> Kenneth
> 
