[
https://issues.apache.org/jira/browse/ARTEMIS-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736158#comment-16736158
]
ASF GitHub Bot commented on ARTEMIS-2216:
-----------------------------------------
Github user franz1981 commented on the issue:
https://github.com/apache/activemq-artemis/pull/2484
@michaelandrepearce I would first like to trigger a CI job of some kind;
maybe @clebertsuconic can help with his superbox (just this time) so we get an
answer sooner?
Re the cache: I was already thinking of sending another PR, but I have verified
that it is virtually impossible for the cache to be the reason of the consumer
slow-down. These are the numbers of a benchmark comparing it with the original
version:
```
Benchmark               (size)      (type)   Mode  Cnt          Score          Error  Units
CacheBench.getMessage1      32     chunked  thrpt   10  150039261.251 ± 12504804.694  ops/s
CacheBench.getMessage1      32  linkedlist  thrpt   10   31776755.611 ±  1405838.635  ops/s
CacheBench.getMessage1    1024     chunked  thrpt   10   31433127.788 ±  3902498.303  ops/s
CacheBench.getMessage1    1024  linkedlist  thrpt   10    2638404.341 ±   119171.758  ops/s
CacheBench.getMessage1  102400     chunked  thrpt   10     344799.911 ±    27267.965  ops/s
CacheBench.getMessage1  102400  linkedlist  thrpt   10      20020.925 ±     5392.418  ops/s
CacheBench.getMessage3      32     chunked  thrpt   10  384605640.192 ± 35164543.632  ops/s
CacheBench.getMessage3      32  linkedlist  thrpt   10   14124979.510 ±  2875341.272  ops/s
CacheBench.getMessage3    1024     chunked  thrpt   10   90418506.375 ±  4593688.556  ops/s
CacheBench.getMessage3    1024  linkedlist  thrpt   10    1562687.000 ±    91433.926  ops/s
CacheBench.getMessage3  102400     chunked  thrpt   10     978575.016 ±    44800.161  ops/s
CacheBench.getMessage3  102400  linkedlist  thrpt   10      21614.717 ±     5344.742  ops/s
```
Here `getMessage1` is `LivePageCacheImpl::getMessage` called at random
positions by 1 thread, and `getMessage3` is the same call at random positions
by 3 threads.
`chunked` is the new version and `linkedlist` the original one: the
difference is quite large, and the new version scales linearly...
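To make the comparison concrete, here is a minimal sketch of the "chunked" idea under the assumption that it stores messages in fixed-size array chunks: `getMessage(i)` becomes two array lookups instead of an O(n) walk of a linked list. The class and field names below are illustrative, not the actual Artemis types.

```java
// Hypothetical sketch: append-only chunked storage with O(1) random access.
final class ChunkedCache<T> {
    private static final int CHUNK_SIZE = 32;      // fixed chunk capacity
    private Object[][] chunks = new Object[4][];   // directory of chunks
    private int size;

    void add(T element) {
        int chunkIndex = size / CHUNK_SIZE;
        if (chunkIndex == chunks.length) {         // grow the chunk directory
            Object[][] bigger = new Object[chunks.length * 2][];
            System.arraycopy(chunks, 0, bigger, 0, chunks.length);
            chunks = bigger;
        }
        if (chunks[chunkIndex] == null) {
            chunks[chunkIndex] = new Object[CHUNK_SIZE];
        }
        chunks[chunkIndex][size % CHUNK_SIZE] = element;
        size++;
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        // two index operations, no traversal: chunk, then slot within it
        return (T) chunks[index / CHUNK_SIZE][index % CHUNK_SIZE];
    }

    int size() {
        return size;
    }
}
```

Because a lookup never traverses earlier elements, the cost of `get` is independent of the position, which is consistent with the large gap the benchmark shows at random positions.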
> Use a specific executor for pageSyncTimer
> -----------------------------------------
>
> Key: ARTEMIS-2216
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2216
> Project: ActiveMQ Artemis
> Issue Type: Improvement
> Affects Versions: 2.6.3
> Reporter: Qihong Xu
> Priority: Major
> Attachments: contention_MASTER_global.svg, contention_PR_global.svg,
> contention_PR_single.svg
>
>
> Improving throughput in paging mode is one of our concerns, since our cluster
> uses paging a lot.
> We found that pageSyncTimer in PagingStoreImpl shares the same executor with
> pageCursorProvider, taken from the thread pool. In heavy-load scenarios, such
> as hundreds of consumers receiving messages simultaneously, it becomes
> difficult for pageSyncTimer to get the executor due to contention. Page sync
> is therefore delayed and producers suffer low throughput.
>
> To achieve higher performance we assign a dedicated executor to pageSyncTimer
> to avoid this contention. We then ran a small-scale test on a single modified
> broker.
>
> Broker: 4C/8G/500G SSD
> Producer: 200 threads, non-transactional send
> Consumer: 200 threads, transactional receive
> Message text size: 100-200 bytes randomly
> AddressFullPolicy: PAGE
>
> Test result:
> | |Only Send TPS|Only Receive TPS|Send&Receive TPS (send/receive)|
> |Original ver.|38k|33k|3k/30k|
> |Modified ver.|38k|34k|30k/12.5k|
>
> The table above shows that on the modified broker send TPS improves from
> “poor” to “extremely fast”, while receive TPS drops from “extremely fast” to
> “not bad” under heavy load. Considering that consumer systems usually have a
> long processing chain after receiving messages, we don’t need extremely fast
> receive TPS. Instead, we want to guarantee send TPS to cope with traffic
> peaks and lower the producers’ delay. Moreover, the combined send and receive
> TPS rises from 33k to about 43k. Given all of the above, this trade-off seems
> beneficial and acceptable.
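The dedicated-executor change described in the issue can be sketched as follows. This is a minimal illustration using `java.util.concurrent` directly; the class and method names are illustrative assumptions, not the actual Artemis API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: the page sync timer gets its own single-threaded executor, so a
// sync task is never queued behind cursor work submitted by many consumers.
final class PagingStoreSketch {
    // shared pool standing in for the executor used by pageCursorProvider
    private final ExecutorService cursorExecutor = Executors.newFixedThreadPool(4);
    // dedicated executor standing in for the one given to pageSyncTimer
    private final ExecutorService syncExecutor = Executors.newSingleThreadExecutor();

    void scheduleCursorScan(Runnable scan) {
        cursorExecutor.execute(scan);    // competes only with other cursor work
    }

    void schedulePageSync(Runnable sync) {
        syncExecutor.execute(sync);      // runs promptly, independent of cursor load
    }

    void shutdown() {
        cursorExecutor.shutdown();
        syncExecutor.shutdown();
    }
}
```

The design trade-off matches the test result: syncs (which gate producer acknowledgements) stop waiting behind consumer-driven cursor tasks, at the cost of one extra thread and somewhat less executor capacity for receives.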
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)