[ 
https://issues.apache.org/jira/browse/ARTEMIS-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734656#comment-16734656
 ] 

ASF GitHub Bot commented on ARTEMIS-2216:
-----------------------------------------

Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2484
  
    @michaelandrepearce Done, the PR has been sent; now we can just wait for the 
perf results on it :)
    I have improved the live page cache behaviour/reliability quite a bit 
(especially in case of an OOME), but sadly I see that the most-called method, 
`getMessage`, cannot be improved any further without making the lock-free code 
a real nightmare.
    The original version was O(n), depending on which message was queried, 
because it needs to walk the entire linked list of paged messages. 
    In my version I have amortized the cost by using an interesting hybrid 
between an ArrayList and a LinkedList, similar to 
https://en.wikipedia.org/wiki/Unrolled_linked_list, but (very) optimized for 
addition.
    I'm mentioning this because I have long wanted to design a 
single-threaded version of this same data structure to be used as the main 
data structure inside QueueImpl.
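
    To make the idea concrete, here is a minimal single-threaded sketch of an 
append-only list built from fixed-size array chunks linked together. It is 
not the code from the PR: the class name `ChunkedAppendList`, the `chunkSize` 
parameter and the method names are all hypothetical, and the lock-free 
details of the real live page cache are left out entirely. It only shows why 
a lookup needs about index / chunkSize hops instead of one hop per message.

```java
/** Hypothetical sketch: single-threaded, append-only list stored as linked array chunks. */
final class ChunkedAppendList<T> {

    /** One fixed-size array of elements plus a link to the next chunk. */
    private static final class Chunk {
        final Object[] items;
        Chunk next;

        Chunk(int capacity) {
            items = new Object[capacity];
        }
    }

    private final int chunkSize;
    private final Chunk head;
    private Chunk tail;
    private int size;

    ChunkedAppendList(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.head = new Chunk(chunkSize);
        this.tail = head;
    }

    /** Appends in O(1); a new chunk is allocated only when the tail chunk is full. */
    void add(T item) {
        int offset = size % chunkSize;
        if (size > 0 && offset == 0) {
            Chunk fresh = new Chunk(chunkSize);
            tail.next = fresh;
            tail = fresh;
        }
        tail.items[offset] = item;
        size++;
    }

    /** Follows index / chunkSize links, then indexes into the chunk array; null if out of range. */
    @SuppressWarnings("unchecked")
    T get(int index) {
        if (index < 0 || index >= size) {
            return null;
        }
        Chunk chunk = head;
        for (int hops = index / chunkSize; hops > 0; hops--) {
            chunk = chunk.next;
        }
        return (T) chunk.items[index % chunkSize];
    }

    int size() {
        return size;
    }
}
```

    With a chunk size of 32, for example, reading element 1000 follows 31 
links rather than 1000, while appends stay as cheap as on a plain linked list.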



> Use a specific executor for pageSyncTimer
> -----------------------------------------
>
>                 Key: ARTEMIS-2216
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2216
>             Project: ActiveMQ Artemis
>          Issue Type: Improvement
>    Affects Versions: 2.6.3
>            Reporter: Qihong Xu
>            Priority: Major
>         Attachments: contention_MASTER_global.svg, contention_PR_global.svg, 
> contention_PR_single.svg
>
>
> Improving throughput in paging mode is one of our concerns, since our cluster 
> uses paging a lot.
> We found that pageSyncTimer in PagingStoreImpl shares the same executor, taken 
> from the thread pool, with pageCursorProvider. Under heavy load, such as 
> hundreds of consumers receiving messages simultaneously, it becomes difficult 
> for pageSyncTimer to get the executor because of contention. Page sync is 
> therefore delayed and producers suffer low throughput.
>  
> To achieve higher performance we assigned a dedicated executor to pageSyncTimer 
> to avoid this contention, and we ran a small-scale test on a single modified broker.
>  
> Broker: 4C/8G/500G SSD
> Producer: 200 threads, non-transactional send
> Consumer: 200 threads, transactional receive
> Message text size: 100-200 bytes, chosen randomly
> AddressFullPolicy: PAGE
>  
> Test result:
> | |Only Send TPS|Only Receive TPS|Send&Receive TPS (send/receive)|
> |Original ver|38k|33k|3k/30k|
> |Modified ver|38k|34k|30k/12.5k|
>  
> The table above shows that on the modified broker send TPS improves from “poor” 
> to “extremely fast”, while receive TPS drops from “extremely fast” to 
> “not bad” under heavy load. Considering that consumer systems usually have a 
> long processing chain after receiving messages, we don’t need a very high 
> receive TPS. Instead, we want to guarantee send TPS to cope with traffic peaks 
> and lower producers’ delay. Moreover, total send plus receive TPS rises 
> from 33k to about 43k. Given all of the above, this trade-off seems beneficial 
> and acceptable.
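
The executor split described in the issue can be illustrated with plain 
`java.util.concurrent`. This is only a sketch under stated assumptions, not 
the Artemis change itself: `DedicatedSyncExecutorDemo`, `sharedPool` and 
`syncExecutor` are hypothetical names, the pool size of 2 is arbitrary, and 
the blocked tasks merely stand in for busy cursor work. It shows that a sync 
task submitted to its own executor completes even while every thread of the 
shared pool is occupied.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: a dedicated executor is not starved by a saturated shared pool. */
final class DedicatedSyncExecutorDemo {

    /** Returns true if the sync task finished while the shared pool was still fully busy. */
    static boolean syncRunsWhileSharedPoolIsBusy() {
        // Stands in for the broker thread pool shared with cursor work.
        ExecutorService sharedPool = Executors.newFixedThreadPool(2);
        // Dedicated to the page sync task only.
        ExecutorService syncExecutor = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy every shared thread with work that blocks until released.
            for (int i = 0; i < 2; i++) {
                sharedPool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // The sync task does not queue behind the blocked work.
            Future<String> sync = syncExecutor.submit(() -> "synced");
            return "synced".equals(sync.get(1, TimeUnit.SECONDS));
        } catch (Exception e) {
            // Timeout or execution failure: the sync task did not complete in time.
            return false;
        } finally {
            release.countDown();
            sharedPool.shutdown();
            syncExecutor.shutdown();
        }
    }
}
```

Had the sync task been submitted to `sharedPool` instead, it would have 
waited in the queue until one of the blocked tasks finished, which is the 
delay the issue describes.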



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
