[
https://issues.apache.org/jira/browse/ARTEMIS-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734656#comment-16734656
]
ASF GitHub Bot commented on ARTEMIS-2216:
-----------------------------------------
Github user franz1981 commented on the issue:
https://github.com/apache/activemq-artemis/pull/2484
@michaelandrepearce Done, the PR has been sent; now we just have to wait for
the perf results on it :)
I have improved the live page cache behaviour/reliability quite a bit
(especially in case of OOME), but sadly I see that the most called method,
`getMessage`, cannot be improved any further without making the lock-free code
a real nightmare.
The original version was O(n), with the cost depending on which message was
queried, because it had to walk the linked list of paged messages.
In my version I have amortized the cost by using an interesting hybrid
between an ArrayList and a LinkedList, similar to
https://en.wikipedia.org/wiki/Unrolled_linked_list, but (very) optimized for
addition.
I'm mentioning this because I have long wanted to design a
single-threaded version of this same data structure to be used as the main
data structure inside QueueImpl.
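
For readers unfamiliar with the idea, here is a minimal single-threaded sketch of an append-optimized unrolled linked list. It is not the code from the PR (which is lock-free and more involved); the class name `ChunkedAppendList` and the chunk size of 32 are assumptions made only for illustration. It shows why a lookup gets cheaper: `get` follows one link per chunk instead of one link per message.

```java
/** Hypothetical illustration only; not the class used in the PR. */
final class ChunkedAppendList<T> {
    private static final int CHUNK_SIZE = 32; // assumed chunk capacity

    /** One node of the linked list, holding a fixed-size array of elements. */
    private static final class Chunk {
        final Object[] items = new Object[CHUNK_SIZE];
        Chunk next;
    }

    private final Chunk head = new Chunk();
    private Chunk tail = head;
    private int size;

    /** Appends in O(1): writes into the tail chunk, linking a fresh chunk when it is full. */
    void add(T item) {
        int offset = size % CHUNK_SIZE;
        if (size > 0 && offset == 0) {
            Chunk fresh = new Chunk();
            tail.next = fresh;
            tail = fresh;
        }
        tail.items[offset] = item;
        size++;
    }

    /** Follows index / CHUNK_SIZE links, then indexes directly into the chunk's array. */
    @SuppressWarnings("unchecked")
    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index + ", size: " + size);
        }
        Chunk chunk = head;
        for (int hops = index / CHUNK_SIZE; hops > 0; hops--) {
            chunk = chunk.next;
        }
        return (T) chunk.items[index % CHUNK_SIZE];
    }

    int size() {
        return size;
    }
}
```

In this sketch a lookup is still linear in the index, but with CHUNK_SIZE times fewer pointer hops than a plain linked list, while appends never copy existing elements the way a growing ArrayList does.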
> Use a specific executor for pageSyncTimer
> -----------------------------------------
>
> Key: ARTEMIS-2216
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2216
> Project: ActiveMQ Artemis
> Issue Type: Improvement
> Affects Versions: 2.6.3
> Reporter: Qihong Xu
> Priority: Major
> Attachments: contention_MASTER_global.svg, contention_PR_global.svg,
> contention_PR_single.svg
>
>
> Improving throughput in paging mode is one of our concerns, since our cluster
> uses paging a lot.
> We found that pageSyncTimer in PagingStoreImpl shares the same executor, taken
> from the thread pool, with pageCursorProvider. In a heavy-load scenario, such
> as hundreds of consumers receiving messages simultaneously, it becomes
> difficult for pageSyncTimer to get the executor because of contention. As a
> result, page sync is delayed and producers suffer low throughput.
>
> To achieve higher performance, we assigned a dedicated executor to
> pageSyncTimer to avoid the contention, and ran a small-scale test on a single
> modified broker.
>
> Broker: 4C/8G/500G SSD
> Producer: 200 threads, non-transactional send
> Consumer: 200 threads, transactional receive
> Message text size: random, 100-200 bytes
> AddressFullPolicy: PAGE
>
> Test result:
> | |Only Send TPS|Only Receive TPS|Send&Receive TPS|
> |Original ver|38k|33k|3k/30k|
> |Modified ver|38k|34k|30k/12.5k|
>
> The table above shows that on the modified broker, under heavy load, send
> TPS improves from “poor” (3k) to “extremely fast” (30k), while receive TPS
> drops from “extremely fast” (30k) to “not bad” (12.5k). Since consumer
> systems usually have a long processing chain after receiving messages, we
> don’t need a very high receive TPS. Instead, we want to guarantee send TPS to
> cope with traffic peaks and to lower producers’ delay. Moreover, total send
> and receive TPS rises from 33k to about 43k. Given all of the above, this
> trade-off seems beneficial and acceptable.
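
The effect described in the issue can be sketched with plain `java.util.concurrent` executors. This is only an illustration under stated assumptions, not Artemis code: the class `DedicatedExecutorSketch`, the pool size of 2, and the blocking "cursor" tasks are all hypothetical stand-ins for the shared thread pool and the pageCursorProvider work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; names and sizes are assumptions. */
final class DedicatedExecutorSketch {

    /** Returns true if the sync task ran while the shared pool was still saturated. */
    static boolean syncRunsWhileSharedPoolBusy(boolean useDedicatedExecutor) {
        ExecutorService shared = Executors.newFixedThreadPool(2);        // stands in for the common pool
        ExecutorService dedicated = Executors.newSingleThreadExecutor(); // reserved for page sync
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch syncDone = new CountDownLatch(1);
        try {
            // Saturate the shared pool with "cursor" work that blocks until released.
            for (int i = 0; i < 2; i++) {
                shared.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Submit the "page sync" task to the chosen executor.
            ExecutorService syncExecutor = useDedicatedExecutor ? dedicated : shared;
            syncExecutor.execute(syncDone::countDown);
            // On the shared pool the sync task stays queued behind the blocked work.
            return syncDone.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();
            shared.shutdown();
            dedicated.shutdown();
        }
    }
}
```

Calling `syncRunsWhileSharedPoolBusy(false)` models the original setup, where the sync task waits behind consumer-side work; calling it with `true` models the proposed change, where the sync task has its own thread and is no longer delayed by that work.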
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)