[ 
https://issues.apache.org/jira/browse/ARTEMIS-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17345064#comment-17345064
 ] 

Francesco Nigro edited comment on ARTEMIS-3289 at 5/15/21, 3:48 PM:
--------------------------------------------------------------------

I've uploaded the flamegraphs from 10 seconds of sampling (at 100 Hz) of a 
replica hammered by 8 producers + 8 consumers sending 100-byte durable 
messages.

In the above test case, throughput with the changes for this Jira is the same 
as on main, so the only relevant difference is CPU usage, which is much lower 
(~30%) with the new version.


was (Author: nigrofranz):
I've uploaded the flamegraphs of 10 seconds of sampling (at 100 Hz) of a 
replica hammered by 8 producers + 8 consumers sending 100 bytes durable 
messages.

Throughput for the changes for this Jira vs main is the same, so the only 
relevant difference is the CPU usage, much lower (~30%) for the new version.

> Reduce journal appender executor Thread wakeup cost
> ---------------------------------------------------
>
>                 Key: ARTEMIS-3289
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3289
>             Project: ActiveMQ Artemis
>          Issue Type: Improvement
>            Reporter: Francesco Nigro
>            Assignee: Francesco Nigro
>            Priority: Major
>         Attachments: 3289_backup.html, image-2021-05-11-09-32-15-538.png, 
> main_backup.html
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> As mentioned in https://issues.apache.org/jira/browse/ARTEMIS-2877, one of the 
> major factors that reduce the scalability of a shared-nothing 
> replication setup is the thread wake-up cost of the {{JournalImpl}}'s 
> {{appendExecutor}} I/O threads.
> See the flamegraph below for a busy replica while appending replicated 
> journal records:
> !image-2021-05-11-09-32-15-538.png|width=966,height=313!
> The violet bars represent the CPU cycles spent waking up the Journal appender 
> thread(s): although https://issues.apache.org/jira/browse/ARTEMIS-2877 allows 
> the backup to batch append tasks as much as possible, the signaling cost 
> still seems too high compared with the rest of the replica packet processing.
> Given that the append executor is an ordered executor built on top of the I/O 
> thread pool, see {{ActiveMQServerImpl}}:
> {code:java}
>       if (serviceRegistry.getIOExecutorService() != null) {
>          this.ioExecutorFactory = new OrderedExecutorFactory(serviceRegistry.getIOExecutorService());
>       } else {
>          ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
>             @Override
>             public ThreadFactory run() {
>                return new ActiveMQThreadFactory("ActiveMQ-IO-server-" + this.toString(), false, ClientSessionFactoryImpl.class.getClassLoader());
>             }
>          });
>          this.ioExecutorPool = new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>(), tFactory);
>          this.ioExecutorFactory = new OrderedExecutorFactory(ioExecutorPool);
>       }
> {code}
> And given that it uses a {{SynchronousQueue}} to submit/take new wakeup 
> tasks, it is worth investigating whether a different thread pool, executor, 
> or "sleeping" strategy could reduce this cost under heavy load and improve 
> response time with/without replication.
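
To make the wake-up cost described above concrete, here is a minimal sketch, not the Artemis code: {{SimpleOrderedExecutor}}, its {{handoffs}} counter and the two helper methods are hypothetical names introduced only for illustration. It assumes an ordered executor that hands a drain task to the pool each time it goes from idle to busy, and it builds the pool with the same {{ThreadPoolExecutor}} arguments as the snippet above (minus the thread factory). The counter shows how many pool hand-offs, and therefore potential thread wake-ups, a given submission pattern triggers.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class WakeupCostSketch {

    /** Hypothetical ordered executor for illustration; NOT the Artemis implementation. */
    static final class SimpleOrderedExecutor implements Executor {
        private final Executor pool;
        private final ConcurrentLinkedQueue<Runnable> tasks = new ConcurrentLinkedQueue<>();
        private final AtomicInteger pending = new AtomicInteger(); // queued or running tasks
        final AtomicInteger handoffs = new AtomicInteger();        // drain tasks given to the pool

        SimpleOrderedExecutor(Executor pool) {
            this.pool = pool;
        }

        @Override
        public void execute(Runnable task) {
            tasks.add(task);
            // Only the idle -> busy transition needs a pool thread (a wake-up).
            if (pending.getAndIncrement() == 0) {
                handoffs.incrementAndGet();
                pool.execute(this::drain);
            }
        }

        boolean isIdle() {
            return pending.get() == 0;
        }

        // Runs queued tasks in FIFO order; this sketch assumes tasks do not throw.
        private void drain() {
            do {
                tasks.poll().run();
            } while (pending.decrementAndGet() != 0);
        }
    }

    // Same pool shape as in the description: no core threads, direct hand-off queue.
    static ExecutorService newIoPool() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    /** Submits tasks one by one, letting the ordered executor go idle in between. */
    static int handoffsWhenIdleBetweenTasks(int count) throws InterruptedException {
        ExecutorService pool = newIoPool();
        try {
            SimpleOrderedExecutor ordered = new SimpleOrderedExecutor(pool);
            for (int i = 0; i < count; i++) {
                CountDownLatch done = new CountDownLatch(1);
                ordered.execute(done::countDown);
                done.await();
                while (!ordered.isIdle()) {
                    Thread.yield();
                }
            }
            return ordered.handoffs.get();
        } finally {
            pool.shutdown();
        }
    }

    /** Submits all tasks while the first one is still running, so they share one drain. */
    static int handoffsWhenBatched(int count) throws InterruptedException {
        ExecutorService pool = newIoPool();
        try {
            SimpleOrderedExecutor ordered = new SimpleOrderedExecutor(pool);
            CountDownLatch gate = new CountDownLatch(1);
            CountDownLatch done = new CountDownLatch(count);
            ordered.execute(() -> {
                try {
                    gate.await(); // keep the drain busy until every task is queued
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.countDown();
            });
            for (int i = 1; i < count; i++) {
                ordered.execute(done::countDown);
            }
            gate.countDown();
            done.await();
            return ordered.handoffs.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("idle between tasks: " + handoffsWhenIdleBetweenTasks(1000));
        System.out.println("batched:            " + handoffsWhenBatched(1000));
    }
}
```

In this sketch, a producer that lets the executor go idle between submissions pays one pool hand-off per task, while a burst submitted during an active drain pays a single hand-off in total. How closely this matches the real {{appendExecutor}} behaviour is an assumption; the flamegraphs attached to the issue remain the actual evidence.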



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
