[ 
https://issues.apache.org/jira/browse/ARTEMIS-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16733824#comment-16733824
 ] 

ASF GitHub Bot commented on ARTEMIS-2214:
-----------------------------------------

Github user qihongxu commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    @michaelandrepearce 
    There are some cases that trigger lots of rollbacks in a short period of 
time. For example, if we upgrade our server while thousands of consumers are 
receiving messages, closing the server causes massive rollbacks to the 
original address, so the queue might block on reading GC'ed pages. Under 
these circumstances the upgrade takes more than 5-10 minutes (e.g. with 2000 
consumers) and has a negative impact on downstream systems.
    
    > deliveryTime can be set in the constructor like transactionID, messageID, 
etc :)
    
    @wy96f 
    Yes, we do find a similar block in 
PagedReferenceImpl::getScheduledDeliveryTime(): if deliveryTime is not set, 
it calls getMessage() during rollback. 
    
    Detailed stacks for both situations are shown in the attachment.
    
    Considering priority only occupies one byte, it might be worthwhile to add 
it to PageRef to improve stability :) As for deliveryTime, since it is already 
in PageRef, we can simply add `this.deliveryTime = 
message.getMessage().getScheduledDeliveryTime();` in the constructor to avoid 
blocking on rollback.
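
    A minimal, self-contained sketch of the caching idea follows. It does not 
use the real Artemis classes: `Message`, `SimpleMessage`, `PagedMessageStub` 
and `CachedPagedReference` are hypothetical stand-ins, and the `loads` counter 
only simulates the cost of re-reading a GC'ed page. It assumes the message is 
still available when the reference is constructed.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a broker message; not the Artemis interface.
interface Message {
    byte getPriority();
    long getScheduledDeliveryTime();
}

// Trivial immutable message used only for illustration.
final class SimpleMessage implements Message {
    private final byte priority;
    private final long scheduledDeliveryTime;

    SimpleMessage(byte priority, long scheduledDeliveryTime) {
        this.priority = priority;
        this.scheduledDeliveryTime = scheduledDeliveryTime;
    }

    @Override public byte getPriority() { return priority; }
    @Override public long getScheduledDeliveryTime() { return scheduledDeliveryTime; }
}

// Simulates a paged message: every getMessage() call stands for a
// potentially expensive page reload, so the calls are counted.
final class PagedMessageStub {
    final AtomicInteger loads = new AtomicInteger();
    private final Message message;

    PagedMessageStub(Message message) { this.message = message; }

    Message getMessage() {
        loads.incrementAndGet();
        return message;
    }
}

// Reference that copies priority and deliveryTime once, in the constructor,
// so later reads (e.g. during rollback) never touch the page again.
final class CachedPagedReference {
    private final PagedMessageStub paged;
    private final byte priority;
    private final long deliveryTime;

    CachedPagedReference(PagedMessageStub paged) {
        this.paged = paged;
        Message m = paged.getMessage(); // single load at construction time
        this.priority = m.getPriority();
        this.deliveryTime = m.getScheduledDeliveryTime();
    }

    byte getPriority() { return priority; }
    long getScheduledDeliveryTime() { return deliveryTime; }
}
```

    In this sketch, any number of calls to `getPriority()` or 
`getScheduledDeliveryTime()` leaves `loads` at 1, which is the property the 
rollback path needs.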


> Cache durable&priority in PagedReference to avoid blocks in consuming paged 
> messages
> ------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-2214
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2214
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.6.3
>            Reporter: Qihong Xu
>            Priority: Major
>         Attachments: stacks.txt
>
>
> We recently performed a test on the Artemis broker and found a severe 
> performance issue.
> When paged messages are being consumed, decrementMetrics in 
> QueuePendingMessageMetrics calls ‘getMessage’ to check whether they are 
> durable. The queue can then be locked for a long time because the page 
> may have been GC'ed and needs to be reloaded entirely. Other operations that 
> rely on the queue are blocked during this time, which causes a significant 
> TPS drop. Detailed stacks are attached below.
> This also happens when a consumer is closed and messages are pushed back to 
> the queue: Artemis checks priority on return if these messages are paged.
> To solve the issue, durable and priority need to be cached in PagedReference, 
> just like messageID, transactionID and so on. I have applied a patch to fix 
> the issue. Any review is appreciated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
