[
https://issues.apache.org/jira/browse/AMQ-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16620375#comment-16620375
]
Gary Tully commented on AMQ-7028:
---------------------------------
I pulled the mKahaDB fix.
On the XA scenario, AMQ-6707 has ~8 commits, all of which are relevant but none
of which are on the 5.15.x branch. It may be best to include all the fixes
together, but this is obviously a larger change.
Maybe try to pull in all the relevant changes and open a PR once the branch is
good.
{code:java}
git log apache/master | grep AMQ-6707
AMQ-6707 - fix destination filter delegate param, refactor-auto-gen method; jees
AMQ-6707 - ensure entryLocator is used for rollback of prepared add to avoid NPE, relates to AMQ-5567
AMQ-6707 - remove duplicated started state flag
AMQ-6707 - ensure trace logging does not flip cacheEnabled flag outside required sync
AMQ-6707 - fix trace log reporting in error
AMQ-6707 - skip tracked ack dependent test for leveldb
AMQ-6707 - mKahadb, track recovered tx per store for completion, resolve test regression
AMQ-6707 - JDBC XA recovery and completion.
{code}
> Poor performance when concurrentStoreAndDispatchQueues + slow FS + Slow
> Consumers
> ---------------------------------------------------------------------------------
>
> Key: AMQ-7028
> URL: https://issues.apache.org/jira/browse/AMQ-7028
> Project: ActiveMQ
> Issue Type: Improvement
> Components: KahaDB
> Affects Versions: 5.15.4
> Reporter: Alan Protasio
> Priority: Major
>
> Using a high-latency FS (such as NFS) to store KahaDB files and setting
> concurrentStoreAndDispatchQueues=true may cause poor performance for slow
> consumers. This happens because this option makes ActiveMQ write the
> produced messages one by one to the underlying file system (this is
> implemented using a single-threaded ExecutorService).
> Let's say that each write to the FS takes 10 ms and the queue has slow
> consumers. In this case, no matter how many concurrent messages the
> producers try to send to the queue, the maximum throughput we can
> achieve is 100 TPS. Turning this flag off, we see much better performance
> when sending messages in parallel, as those messages can be batched
> to the FS in a single write (the performance increases with the number of
> messages being sent in parallel).
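The 100 TPS ceiling above follows directly from the write latency. Here is a minimal sketch of that arithmetic, assuming an idealized model in which a batched write costs the same as a single write. The class name, method names, and the batch size of 20 are illustrative only and come from neither the issue nor ActiveMQ.

```java
public class SerializedWriteThroughput {

    // Upper bound on messages/second when every message needs its own
    // write and writes are executed one at a time (single writer thread).
    static double serializedMaxTps(double writeLatencyMs) {
        return 1000.0 / writeLatencyMs;
    }

    // Upper bound when up to batchSize concurrently sent messages share
    // one write. Assumption: a batched write takes as long as a single write.
    static double batchedMaxTps(double writeLatencyMs, int batchSize) {
        return batchSize * 1000.0 / writeLatencyMs;
    }

    public static void main(String[] args) {
        double latencyMs = 10.0; // per-write latency from the example above
        System.out.println("serialized: " + serializedMaxTps(latencyMs) + " msg/s");
        // Hypothetical batch of 20 concurrent sends per write.
        System.out.println("batched:    " + batchedMaxTps(latencyMs, 20) + " msg/s");
    }
}
```

In this model the serialized bound does not depend on how many producers are sending, while the batched bound scales with the number of messages that fit into one write.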
> Looking at the ActiveMQ code, we found that there is a flag used in LevelDB to
> detect whether the queue has fast or slow consumers and decide whether to use
> concurrentStoreAndDispatch:
> https://issues.apache.org/jira/browse/AMQ-3750
> However, this flag is not used in the KahaDB implementation.
> We made a code change to pass the flag to the KahaDBStore and use it to
> decide whether the message will be stored asynchronously.
> We think that there is no reason to try to "StoreAndDispatch" if the
> destination has slow consumers. It only adds overhead and, with a
> high-latency FS, gives really poor performance when the queue has slow
> consumers. For fast consumers, this change has no effect, giving the best
> of the two options.
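The decision described above can be sketched roughly as follows. This only illustrates the idea and is not the actual patch: the class, enum, and method names are hypothetical, and the way the slow-consumer flag reaches the store is assumed here, not taken from the KahaDB code.

```java
public class StoreModeSketch {

    // Hypothetical labels for the two ways a message can be persisted.
    enum StoreMode {
        ASYNC_STORE_AND_DISPATCH, // dispatch while the store write is pending
        SYNC_STORE                // wait for the write, which allows batching
    }

    // Use the concurrent store-and-dispatch path only when it is enabled
    // and the destination is not reporting slow consumers.
    static StoreMode chooseStoreMode(boolean concurrentStoreAndDispatchQueues,
                                     boolean hasSlowConsumers) {
        if (concurrentStoreAndDispatchQueues && !hasSlowConsumers) {
            return StoreMode.ASYNC_STORE_AND_DISPATCH;
        }
        return StoreMode.SYNC_STORE;
    }

    public static void main(String[] args) {
        System.out.println(chooseStoreMode(true, false));
        System.out.println(chooseStoreMode(true, true));
        System.out.println(chooseStoreMode(false, false));
    }
}
```

With the option enabled, fast consumers keep the existing behaviour, and only destinations with slow consumers fall back to the synchronous path.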
> Some Results:
> Original Version:
> Fast Consumers:
> Producer
> mean rate = 8248.50 calls/second
> min = 0.42 milliseconds
> max = 756.61 milliseconds
> mean = 11.30 milliseconds
> stddev = 44.05 milliseconds
> median = 6.02 milliseconds
> 75% <= 9.79 milliseconds
> 95% <= 18.15 milliseconds
> 98% <= 27.71 milliseconds
> 99% <= 123.51 milliseconds
> 99.9% <= 756.61 milliseconds
> Slow consumers:
> Producer
> mean rate = 84.29 calls/second
> min = 86.27 milliseconds
> max = 1467.53 milliseconds
> mean = 1082.55 milliseconds
> stddev = 154.04 milliseconds
> median = 1075.94 milliseconds
> 75% <= 1169.10 milliseconds
> 95% <= 1308.90 milliseconds
> 98% <= 1350.85 milliseconds
> 99% <= 1363.61 milliseconds
> 99.9% <= 1466.67 milliseconds
> Patched Version:
> Fast Consumers:
> Producer
> count = 890783
> mean rate = 8099.33 calls/second
> min = 0.47 milliseconds
> max = 2259.10 milliseconds
> mean = 13.90 milliseconds
> stddev = 84.84 milliseconds
> median = 5.00 milliseconds
> 75% <= 9.08 milliseconds
> 95% <= 15.66 milliseconds
> 98% <= 32.94 milliseconds
> 99% <= 355.52 milliseconds
> 99.9% <= 731.69 milliseconds
> Slow consumers:
> Producer
> mean rate = 1732.25 calls/second
> 1-minute rate = 1811.80 calls/second
> min = 17.52 milliseconds
> max = 1249.54 milliseconds
> mean = 50.95 milliseconds
> stddev = 130.68 milliseconds
> median = 28.73 milliseconds
> 75% <= 32.51 milliseconds
> 95% <= 57.04 milliseconds
> 98% <= 461.46 milliseconds
> 99% <= 937.87 milliseconds
> 99.9% <= 1249.48 milliseconds