[ 
https://issues.apache.org/jira/browse/QPID-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Vavricka updated QPID-8681:
---------------------------------
    Summary: [Broker-J] Addressing lock contention in Sorted Queues under high 
load by optimizing property fetching  (was: Addressing lock contention in 
Sorted Queues under high load by optimizing property fetching)

> [Broker-J] Addressing lock contention in Sorted Queues under high load by 
> optimizing property fetching
> ------------------------------------------------------------------------------------------------------
>
>                 Key: QPID-8681
>                 URL: https://issues.apache.org/jira/browse/QPID-8681
>             Project: Qpid
>          Issue Type: Improvement
>          Components: Broker-J
>    Affects Versions: qpid-java-broker-7.0.9, qpid-java-broker-9.2.0
>            Reporter: Sudheer Sana
>            Assignee: Robert Godfrey
>            Priority: Major
>              Labels: contention, performance
>             Fix For: qpid-java-broker-9.2.1
>
>         Attachments: image-2025-01-27-22-41-16-668.png, 
> image-2025-01-27-22-42-13-739.png
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> *Summary:*
> Apache Qpid Broker-J provides Sorted Queues, which allow users to implement a 
> message re-enqueue with delay feature. This feature is crucial in scenarios 
> where messages dequeued from regular queues cannot be processed immediately 
> due to certain preconditions (e.g., available resources, concurrency limits). 
> These messages are re-enqueued to Sorted Queues, where they are sorted based 
> on their delay expiry time. Periodic jobs then check these Sorted Queues for 
> expired messages and move them back to the regular queues for processing.
> Under high load conditions (e.g., a re-enqueue rate of ~7,000 messages per 
> second), we observed lock contention in the broker that causes it to stop 
> responding to REST API calls. These calls time out if a timeout is set on 
> the client side, and otherwise simply hang. The REST API is used to 
> periodically fetch queue statistics.
> *Analysis:*
> By analyzing Java Flight Recorder (JFR) data, we identified the root cause of 
> the contention:
>  # REST API calls that retrieve queue depths query a specific, predefined 
> set of properties from the broker.
>  # However, for each requested property, the broker was inadvertently 
> fetching all 26 properties, resulting in repeated and excessive lock 
> acquisition attempts on the Sorted Queue data structure.
>  # Among the requested properties, the *oldestMessageAge* property (used for 
> delay queues) significantly contributed to the contention by increasing the 
> number of lock requests.
>  # On the application side, querying for the *oldestMessageAge* property is 
> unnecessary when dealing with delay queues, so avoiding this query will 
> further mitigate the contention issue.
> *Stack Trace:*
> The following stack trace illustrates the contention observed during the 
> issue:
> {code:java}
>     at org.apache.qpid.server.queue.SortedQueueEntryList.next(SortedQueueEntryList.java:292)
>     at org.apache.qpid.server.queue.SortedQueueEntryList$QueueEntryIteratorImpl.atTail(SortedQueueEntryList.java:686)
>     at org.apache.qpid.server.queue.SortedQueueEntryList$QueueEntryIteratorImpl.advance(SortedQueueEntryList.java:698)
>     at org.apache.qpid.server.queue.SortedQueueEntryList.getOldestEntry(SortedQueueEntryList.java:350)
>     at org.apache.qpid.server.queue.AbstractQueue.getOldestMessageArrivalTime(AbstractQueue.java:1518)
>     at org.apache.qpid.server.queue.AbstractQueue.getOldestMessageAge(AbstractQueue.java:1546)
>     at jdk.internal.reflect.GeneratedMethodAccessor167.invoke(Unknown Source:-1)
>     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:566)
>     at org.apache.qpid.server.model.ConfiguredObjectMethodAttributeOrStatistic.getValue(ConfiguredObjectMethodAttributeOrStatistic.java:68)
>     at org.apache.qpid.server.model.ConfiguredObjectMethodStatistic.getValue(ConfiguredObjectMethodStatistic.java:26)
>     at org.apache.qpid.server.model.AbstractConfiguredObject.getStatistics(AbstractConfiguredObject.java:3181)
>     at org.apache.qpid.server.queue.SortedQueueImplWithAccessChecking.getStatistics(SortedQueueImplWithAccessChecking.java:42)
>     at org.apache.qpid.server.model.AbstractConfiguredObject.getStatistics(AbstractConfiguredObject.java:3168)
>     at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectExpressionFactory$ConfiguredObjectPropertyExpression.getValue(ConfiguredObjectExpressionFactory.java:312)
>     at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectExpressionFactory$ConfiguredObjectPropertyExpression.evaluate(ConfiguredObjectExpressionFactory.java:285)
>     at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectExpressionFactory$ConfiguredObjectPropertyExpression.evaluate(ConfiguredObjectExpressionFactory.java:272)
>     at org.apache.qpid.server.filter.ComparisonExpression.evaluate(ComparisonExpression.java:388)
>     at org.apache.qpid.server.filter.ComparisonExpression.matches(ComparisonExpression.java:580)
>     at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectQuery.filterObjects(ConfiguredObjectQuery.java:210)
>     at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectQuery.<init>(ConfiguredObjectQuery.java:86)
>     at org.apache.qpid.server.management.plugin.servlet.rest.QueryServlet.performQuery(QueryServlet.java:93)
>     at org.apache.qpid.server.management.plugin.servlet.rest.QueryServlet.doGet(QueryServlet.java:56)
>     at org.apache.qpid.server.management.plugin.servlet.rest.AbstractServlet.doGet(AbstractServlet.java:128)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
> {code}
>  * Java Flight Recorder (JFR) data
> !image-2025-01-27-22-41-16-668.png!
> *Steps to Reproduce:*
>  # Configure a Qpid Broker with Sorted Queues.
>  # Enable REST APIs for queue statistics retrieval.
>  # Simulate high load by re-enqueuing ~7,000 messages per second to the 
> Sorted Queues.
>  # Monitor broker performance and REST API response times.
> *Expected Behavior:* The broker should handle high re-enqueue rates without 
> contention issues, and REST API calls for queue statistics should not time 
> out.
> *Actual Behavior:* Under high load conditions, the broker experiences 
> contention on the Sorted Queue data structure, leading to REST API timeouts.
> *Proposed Solution:* 
> To address this issue, we propose optimizing the property fetching mechanism 
> in the Qpid Broker in ConfiguredObjectExpressionFactory.java:
>  # Modify the broker code (ConfiguredObjectExpressionFactory) to retrieve 
> only the specifically requested properties rather than all 26 properties.  
> !image-2025-01-27-22-42-13-739.png!
>  # Avoid querying the oldestMessageAge property on the application side when 
> dealing with delay queues, as it is not required for processing.
>  # These optimizations will reduce the load on the Sorted Queue's locking 
> mechanism and prevent redundant data fetches, thereby improving the broker's 
> responsiveness under high load conditions.
> We have tested the proposed changes in our environment and observed 
> significant performance improvements, including reduced lock contention and 
> faster REST API responses under high load.
>  
> _*(Originally opened by Ram Mantripragada, but Reporter updated to Sudheer 
> Sana, following discussion below)*_
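
The difference between fetching every statistic for each requested property and fetching only the requested one can be illustrated with a small, self-contained sketch. This is not Broker-J code: `FakeQueue`, `getAllStatistics`, `getStatistic`, and the `statN` names are hypothetical stand-ins, and the sketch assumes for simplicity that evaluating any statistic costs exactly one lock request, which is only a rough model of the real Sorted Queue. The figure of 26 statistics comes from the issue description above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class StatisticFetchSketch {

    /** Hypothetical stand-in for a queue; NOT the Broker-J queue model. */
    static final class FakeQueue {
        final AtomicInteger lockRequests = new AtomicInteger();
        final Map<String, Supplier<Object>> statistics = new LinkedHashMap<>();

        FakeQueue(int statisticCount) {
            for (int i = 0; i < statisticCount; i++) {
                final int value = i;
                statistics.put("stat" + i, () -> {
                    // Assumption: each statistic evaluation costs one lock request.
                    lockRequests.incrementAndGet();
                    return value;
                });
            }
        }

        /** Evaluates every statistic, whether or not the caller needs it. */
        Map<String, Object> getAllStatistics() {
            Map<String, Object> result = new LinkedHashMap<>();
            statistics.forEach((name, getter) -> result.put(name, getter.get()));
            return result;
        }

        /** Evaluates only the named statistic. */
        Object getStatistic(String name) {
            Supplier<Object> getter = statistics.get(name);
            return getter == null ? null : getter.get();
        }
    }

    /** Models the reported behaviour: one full fetch per requested property. */
    static int lockRequestsFetchingAll(int statisticCount, List<String> requested) {
        FakeQueue queue = new FakeQueue(statisticCount);
        for (String name : requested) {
            queue.getAllStatistics().get(name);
        }
        return queue.lockRequests.get();
    }

    /** Models the proposed behaviour: one targeted fetch per requested property. */
    static int lockRequestsFetchingRequested(int statisticCount, List<String> requested) {
        FakeQueue queue = new FakeQueue(statisticCount);
        for (String name : requested) {
            queue.getStatistic(name);
        }
        return queue.lockRequests.get();
    }

    public static void main(String[] args) {
        List<String> requested = List.of("stat0", "stat5", "stat9");
        System.out.println(lockRequestsFetchingAll(26, requested));       // 3 * 26 = 78
        System.out.println(lockRequestsFetchingRequested(26, requested)); // 3
    }
}
```

Under these assumptions, the number of lock requests per query drops from (requested properties × 26) to just the number of requested properties. Leaving a costly statistic such as oldestMessageAge out of the request reduces it further.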



