[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

Richard Wagg (JIRA) Fri, 28 Mar 2014 10:52:34 -0700

    [ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951061#comment-13951061
 ]


Richard Wagg commented on AMQ-5077:
-----------------------------------

Leaving aside the question of message loss on failover, we have 2 goals here:
- Getting the maximum possible throughput to consumers.
- Never blocking/delaying a producer until the JMS hits an understood/visible
limit (memory/diskstore).
I need to come up with a better test case to see how a larger
ProducerWindowSize affects this, but for the moment I don't believe it's
working as we would want it to.

Currently the limit on the rate at which we're able to deliver messages from
producers to consumers is the speed at which the JMS can write/remove messages
from the index & diskstore.
This happens in such a way that producers block on the send() call.

org.apache.activemq.store.kahadb.KahaDBStore:
public Future<Object> asyncAddQueueMessage(final ConnectionContext context, 
final Message message)
public void removeAsyncMessage(ConnectionContext context, MessageAck ack)

My reading of the code is that messages can be dispatched before the store task 
has completed, and if the ACK arrives before the store completes, then the 
store operation is cancelled. 
This also implies that a message could be delivered without being written to 
disk. I'm not sure at what point in this process the producer receives the ACK. 
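
To make that reading concrete, here is a minimal Java sketch of the cancel-on-ACK idea. It is not the KahaDBStore code: the class PendingStoreSketch and all of its methods are hypothetical, and the store thread is simulated by an explicit drainStoreQueue() call so the ordering is easy to follow (a real broker would need proper synchronisation).

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.FutureTask;

/** Hypothetical illustration only; this is not how KahaDBStore is written. */
public class PendingStoreSketch {
    // Store tasks waiting for the (simulated) store thread.
    private final Queue<FutureTask<String>> storeQueue = new ArrayDeque<>();
    // Tasks that an ACK may still cancel, keyed by message id.
    private final Map<String, FutureTask<String>> pending = new HashMap<>();
    private int diskWrites = 0;

    /** Queue the store and return at once, so dispatch does not wait for the disk. */
    public void asyncAdd(String messageId) {
        FutureTask<String> task = new FutureTask<String>(() -> {
            diskWrites++;            // stands in for the journal and index write
            return messageId;
        });
        pending.put(messageId, task);
        storeQueue.add(task);
    }

    /** Returns true if the ACK arrived early enough to cancel the store. */
    public boolean onAck(String messageId) {
        FutureTask<String> task = pending.remove(messageId);
        return task != null && task.cancel(false);
    }

    /** Simulates the store thread catching up; cancelled tasks are skipped. */
    public void drainStoreQueue() {
        FutureTask<String> task;
        while ((task = storeQueue.poll()) != null) {
            task.run();              // no-op if the task was cancelled
        }
    }

    public int diskWrites() {
        return diskWrites;
    }
}
```

If onAck() runs before drainStoreQueue(), the write never happens; if the drain wins, the ACK finds a completed task and a separate remove from the store is needed. With a fast SAN, that second ordering is the common one.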

If the consumer were quick enough to receive, process and ACK the messages in 
question, then we'd optimise away the need to ever write to the diskstore, and 
not have an issue here. 
However, our SAN seems to be fast enough, combined with network latency, to 
ensure that the disk writes are nearly always in progress before the ACK 
arrives. 

In this case, all the work to write/remove messages from the diskstore & index, 
and the synchronisation overheads of doing this, happen on the NIO worker 
threads. 
This delays the producers in a way that isn't visible to the producer. Whether
sending messages sync or async, all the producer code sees is that calling
send() takes longer.
My understanding is that increasing the producer send window would allow it to
keep more messages in flight before it has received ACKs for them, but would
not help when it's blocked at the network level.
I'll see if I can come up with a more specific test case that shows the effect
of varying the producer send window.

What I think we're looking for is some option along the lines of
ConcurrentDispatchThenStoreIfNeeded: first dispatch the message, then wait for
a timeout period, and only persist the message to the diskstore (incurring
the disk/synchronisation penalties) if an ACK doesn't arrive in time.
The timeout would be short (<100ms?), the option would respect memory limits
on the broker for total messages in flight, and it would allow the producer
send rate to scale with the slowest consumer receive speed, rather than the
sum of all queue writes possible on the JMS.
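
A rough Java sketch of what I mean, under stated assumptions: no such broker option exists today, the class and method names are made up for illustration, time is passed in explicitly instead of using a real timer, and maxUnstored is only a stand-in for the broker's memory limit.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the proposed behaviour; not an existing ActiveMQ option. */
public class DispatchThenStoreIfNeededSketch {
    private final long ackTimeoutMillis;  // grace period before persisting
    private final int maxUnstored;        // stand-in for a broker memory limit
    // Dispatched but not yet persisted: message id -> persist deadline.
    private final Map<String, Long> unstored = new LinkedHashMap<>();
    // Ids of messages that had to be written to the store, in write order.
    private final List<String> written = new ArrayList<>();

    public DispatchThenStoreIfNeededSketch(long ackTimeoutMillis, int maxUnstored) {
        this.ackTimeoutMillis = ackTimeoutMillis;
        this.maxUnstored = maxUnstored;
    }

    /** Dispatch first; write straight away only when the in-memory limit is reached. */
    public void dispatch(String id, long nowMillis) {
        if (unstored.size() >= maxUnstored) {
            written.add(id);
        } else {
            unstored.put(id, nowMillis + ackTimeoutMillis);
        }
    }

    /** An ACK inside the grace period means the message never touches the store. */
    public void ack(String id) {
        // A real broker would also have to remove an already persisted message.
        unstored.remove(id);
    }

    /** Called periodically: persist every message whose ACK did not arrive in time. */
    public void tick(long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = unstored.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMillis) {
                written.add(entry.getKey());
                it.remove();
            }
        }
    }

    public List<String> written() {
        return written;
    }
}
```

With a 100 ms timeout and a limit of 2, a message ACKed at once is never written, a message still unACKed at 100 ms is written on the next tick, and a third message dispatched while two are already held in memory is written immediately.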

Current behaviour: 

Producer -> broker with topic -> queue routings -> consumers:
- Producer is blocked by speed at which broker can write to all queues. 
Consumers receive messages at speed JMS can write. Queue write limit is global. 

Producer -> broker with embedded routing bean -> consumers (router waits for 
send call to complete before acking the message received): 
- Producer is able to write up to producer window size. Embedded bean is 
blocked by broker write speed - consumers receive messages at speed JMS can 
write. Queue write limit is global. 

Ideal situation: 
Producer -> broker with topic -> queue routings and 
ConcurrentDispatchThenStoreIfNeeded set: 
Producer is able to write up to producer window size. Broker is able to 
dispatch to consumers at consumer receive rate limit, only writing to disk if 
consumers slow down. 

Does that make sense, or do you think I'm misunderstanding the issue here?

Thanks,
Richard

> Improve performance of ConcurrentStoreAndDispatch
> -------------------------------------------------
>
>                 Key: AMQ-5077
>                 URL: https://issues.apache.org/jira/browse/AMQ-5077
>             Project: ActiveMQ
>          Issue Type: Wish
>          Components: Message Store
>    Affects Versions: 5.9.0
>         Environment: 5.9.0.redhat-610343
>            Reporter: Jason Shepherd
>            Assignee: Gary Tully
>         Attachments: Test combinations.xlsx, compDesPerf.tar.gz, 
> topicRouting.zip
>
>
> We have publishers publishing to a topic which has 5 topic -> queue routings, 
> and gets a max message rate attainable of ~833 messages/sec, with each 
> message around 5k in size.
> To test this i set up a JMS config with topic queues:
> Topic
> TopicRouted.1
> ...
> TopicRouted.11
> Each topic has an increasing number of routings to queues, and a client is 
> set up to subscribe to all the queues.
> Rough message rates:
> routings   messages/sec
> 0          2500
> 1          1428
> 2          2000
> 3          1428
> 4          1111
> 5          833
> This occurs whether the broker config has producerFlowControl set to true
> or false, and with KahaDB disk synching turned off. We also tried
> experimenting with concurrentStoreAndDispatch, but that didn't seem to help.
> LevelDB didn't give any notable performance improvement either.
> We also have asyncSend enabled on the producer, and have a requirement to use 
> persistent messages. We have also experimented with sending messages in a 
> transaction, but that hasn't really helped.
> It seems like producer throughput rate across all queue destinations, all 
> connections and all publisher machines is limited by something on the broker, 
> through a mechanism which is not producer flow control. I think the prime 
> suspect is still contention on the index.
> We did some tests with the YourKit profiler.
> The profiler was attached to the broker at startup and allowed to run, and
> then a topic publisher was started, routing to 5 queues.
> Profiler statistics were reset, the publisher allowed to run for 60 seconds, 
> and then profiling snapshot was taken. During that time, ~9600 messages were 
> logged as being sent for a rate of ~160/sec.
> This ties in roughly with the invocation counts recorded in the snapshot (i 
> think) - ~43k calls. 
> From what i can work out, in the snapshot (filtering everything but 
> org.apache.activemq.store.kahadb), 
> For the 60 second sample period, 
> 24.8 seconds elapsed in 
> org.apache.activemq.store.kahadb.KahaDbTransactionStore$1.removeAsyncMessage(ConnectionContext,
>  MessageAck).
> 18.3 seconds elapsed in 
> org.apache.activemq.store.kahadb.KahaDbTransactionStore$1.asyncAddQueueMessage(ConnectionContext,
>  Message, boolean).
> From these, a further large portion of the time is spent inside 
> MessageDatabase:
> org.apache.activemq.store.kahadb.MessageDatabase.process(KahaRemoveMessageCommand,
>  Location) - 10 secs elapsed
> org.apache.activemq.store.kahadb.MessageDatabase.process(KahaAddMessageCommand,
>  Location) - 8.5 secs elapsed.
> As both of these lock on indexLock.writeLock(), and both take place on the 
> NIO transport threads, i think this accounts for at least some of the message 
> throughput limits. As messages are added and removed from the index one by 
> one, regardless of sync type settings, this adds a fair amount of overhead. 
> While we're not synchronising on writes to disk, we are performing work on 
> the NIO worker thread which can block on locks, and could account for the 
> behaviour we've seen client side. 
> To Reproduce:
> 1. Install a broker and use the attached configuration.
> 2. Use the 5.8.0 example ant script to consume from the queues, 
> TopicQueueRouted.1 - 5. eg:
>    ant consumer -Durl=tcp://localhost:61616 -Dsubject=TopicQueueRouted.1 
> -Duser=admin -Dpassword=admin -Dmax=-1
> 3. Use the modified version of 5.8.0 example ant script (attached) to send 
> messages to topics, TopicRouted.1 - 5, eg:
>    ant producer 
> -Durl='tcp://localhost:61616?jms.useAsyncSend=true&wireFormat.tightEncodingEnabled=false&keepAlive=true&wireFormat.maxInactivityDuration=60000&socketBufferSize=32768'
>  -Dsubject=TopicRouted.1 -Duser=admin -Dpassword=admin -Dmax=1 -Dtopic=true 
> -DsleepTime=0 -Dmax=10000 -DmessageSize=5000
> This modified version of the script computes the number of messages per
> second and prints it to the console.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
