[ https://issues.apache.org/jira/browse/QPID-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17921754#comment-17921754 ]
Robbie Gemmell commented on QPID-8681:
--------------------------------------

I said comment, rather than change the assignee, because setting an assignee requires rights that typical user accounts do not have, which is why you couldn't set the assignee. So simply comment that you intend to raise a PR if you do (or Sudheer in this case). Or better yet, raise the PR immediately after the new Jira and comment that you have done so, if you have a change prepared already.

If you have spent a long time working on and testing something before getting to raising the Jira, and you know you are going to raise a PR, then either make it clear on the Jira, when raising it, that a PR is coming later, or else just wait the small amount of extra time needed until you (or Sudheer in this case) are ready to raise the PR, and then create the Jira, raise the PR right after, and comment about it.

You asked about the screenshot as being indicative of a PR coming. It isn't. In my experience, it's actually generally more indicative of the reverse. Most of the time, when people present screenshots of potential changes instead of patches or PRs with the actual change, and give no indication that they (or in this case, Sudheer) are going to prepare such a patch or PR, or don't immediately do so along with the Jira, and then several hours pass, they (or someone else) typically _don't_ then raise a PR. It's actually far more typically an indication that they won't, or sometimes aren't allowed to.

I'm not going to revert what is in the end a fairly trivial change, even if impactful in your scenario, just to apply an effectively equal (but different) change that was only presented several hours after the first was pushed. This Jira also serves as some attribution of your/Sudheer's efforts in itself, and in the end Rob did actually make that change himself and commit it, so I won't unwind it. If Rob wants to burn some of his time unwinding his own change he can feel free to.

Following this discussion you / Sudheer hopefully now have a better idea of how to proceed in future in cases where you actually intend to raise PRs with contributions and being credited on an actual commit is important to you / Sudheer.

No one is going to release 7.0.x at this point; it has been superseded by multiple other 7.x, 8.x and 9.x release streams in the 5 years since it ceased being released. The only current release stream is 9.x (and has been for years already), and any security fixes would be made there.

I don't believe there is any Slack/other discussion channel. The users (not dev) mailing list, or Jira, or a PR, is where discussion usually happens.

> Addressing lock contention in Sorted Queues under high load by optimizing property fetching
> -------------------------------------------------------------------------------------------
>
>                 Key: QPID-8681
>                 URL: https://issues.apache.org/jira/browse/QPID-8681
>             Project: Qpid
>          Issue Type: Improvement
>          Components: Broker-J
>    Affects Versions: qpid-java-broker-7.0.9
>            Reporter: Ram Mantripragada
>            Assignee: Robert Godfrey
>            Priority: Critical
>              Labels: contention, performance
>             Fix For: qpid-java-broker-9.2.1
>
>         Attachments: image-2025-01-27-22-41-16-668.png, image-2025-01-27-22-42-13-739.png
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> *Summary:*
> Apache Qpid Broker-J provides Sorted Queues, which allow users to implement a
> message re-enqueue with delay feature. This feature is crucial in scenarios
> where messages dequeued from regular queues cannot be processed immediately
> due to certain preconditions (e.g., available resources, concurrency limits).
> These messages are re-enqueued to Sorted Queues, where they are sorted based
> on their delay expiry time. Periodic jobs then check these Sorted Queues for
> expired messages and move them back to the regular queues for processing.
> Under high load conditions (e.g., a re-enqueue rate of ~7,000 messages per
> second), we observed that the broker experiences contention issues, causing
> it to stop responding to REST API calls. These calls time out if REST
> timeouts are set on the client side; otherwise they simply hang. The APIs
> are used to periodically fetch queue statistics.
> *Analysis:*
> By analyzing Java Flight Recorder (JFR) data, we identified the root cause of
> the contention:
> # REST API calls to retrieve queue depths required querying some specific
> predefined properties from the broker.
> # However, for each requested property, the broker was inadvertently
> fetching all 26 properties, resulting in repeated and excessive lock
> acquisition attempts on the Sorted Queue data structure.
> # Among the requested properties, the *oldestMessageAge* property (used for
> delay queues) significantly contributed to the contention by increasing the
> number of lock requests.
> # On the application side, querying for the *oldestMessageAge* property is
> unnecessary when dealing with delay queues, so avoiding this query will
> further mitigate the contention issue.
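The effect described in the analysis can be illustrated with a toy model. The sketch below is not Broker-J code: `ToyQueue`, its `stat0`..`stat25` statistic names, and the assumption that every statistic takes the queue lock exactly once are all hypothetical simplifications. It only compares how many lock acquisitions occur when each requested property triggers a fetch of all 26 statistics versus a fetch of just that property.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class StatisticFetchSketch {

    /** Toy stand-in for a queue; NOT the Broker-J implementation. */
    static final class ToyQueue {
        int lockAcquisitions;
        private final Map<String, Supplier<Object>> statistics = new LinkedHashMap<>();

        ToyQueue(int statisticCount) {
            for (int i = 0; i < statisticCount; i++) {
                final int value = i;
                // Assumption: evaluating any statistic takes the queue lock once.
                statistics.put("stat" + i, () -> {
                    lockAcquisitions++;
                    return value;
                });
            }
        }

        /** Evaluates every statistic, as the pre-fix code path effectively did. */
        Map<String, Object> getAllStatistics() {
            Map<String, Object> values = new LinkedHashMap<>();
            statistics.forEach((name, supplier) -> values.put(name, supplier.get()));
            return values;
        }

        /** Evaluates only the named statistic. */
        Object getStatistic(String name) {
            Supplier<Object> supplier = statistics.get(name);
            return supplier == null ? null : supplier.get();
        }
    }

    /** Each requested property triggers a fetch of all statistics. */
    static int acquisitionsFetchingAll(ToyQueue queue, List<String> requested) {
        queue.lockAcquisitions = 0;
        for (String name : requested) {
            queue.getAllStatistics().get(name);
        }
        return queue.lockAcquisitions;
    }

    /** Each requested property triggers a fetch of that property only. */
    static int acquisitionsFetchingRequested(ToyQueue queue, List<String> requested) {
        queue.lockAcquisitions = 0;
        for (String name : requested) {
            queue.getStatistic(name);
        }
        return queue.lockAcquisitions;
    }

    public static void main(String[] args) {
        ToyQueue queue = new ToyQueue(26);
        List<String> requested = List.of("stat0", "stat5", "stat25");
        System.out.println("fetch-all: " + acquisitionsFetchingAll(queue, requested));
        System.out.println("targeted:  " + acquisitionsFetchingRequested(queue, requested));
    }
}
```

In this model, three requested properties cost 78 lock acquisitions on the fetch-all path and 3 on the targeted path. Real statistics differ in how expensive they are, so the ratio in a live broker will not be exactly this.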
> *Stack Trace:*
> The following stack trace illustrates the contention observed during the
> issue:
> {code:java}
> at org.apache.qpid.server.queue.SortedQueueEntryList.next(SortedQueueEntryList.java:292)
> at org.apache.qpid.server.queue.SortedQueueEntryList$QueueEntryIteratorImpl.atTail(SortedQueueEntryList.java:686)
> at org.apache.qpid.server.queue.SortedQueueEntryList$QueueEntryIteratorImpl.advance(SortedQueueEntryList.java:698)
> at org.apache.qpid.server.queue.SortedQueueEntryList.getOldestEntry(SortedQueueEntryList.java:350)
> at org.apache.qpid.server.queue.AbstractQueue.getOldestMessageArrivalTime(AbstractQueue.java:1518)
> at org.apache.qpid.server.queue.AbstractQueue.getOldestMessageAge(AbstractQueue.java:1546)
> at jdk.internal.reflect.GeneratedMethodAccessor167.invoke(Unknown Source:-1)
> at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:566)
> at org.apache.qpid.server.model.ConfiguredObjectMethodAttributeOrStatistic.getValue(ConfiguredObjectMethodAttributeOrStatistic.java:68)
> at org.apache.qpid.server.model.ConfiguredObjectMethodStatistic.getValue(ConfiguredObjectMethodStatistic.java:26)
> at org.apache.qpid.server.model.AbstractConfiguredObject.getStatistics(AbstractConfiguredObject.java:3181)
> at org.apache.qpid.server.queue.SortedQueueImplWithAccessChecking.getStatistics(SortedQueueImplWithAccessChecking.java:42)
> at org.apache.qpid.server.model.AbstractConfiguredObject.getStatistics(AbstractConfiguredObject.java:3168)
> at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectExpressionFactory$ConfiguredObjectPropertyExpression.getValue(ConfiguredObjectExpressionFactory.java:312)
> at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectExpressionFactory$ConfiguredObjectPropertyExpression.evaluate(ConfiguredObjectExpressionFactory.java:285)
> at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectExpressionFactory$ConfiguredObjectPropertyExpression.evaluate(ConfiguredObjectExpressionFactory.java:272)
> at org.apache.qpid.server.filter.ComparisonExpression.evaluate(ComparisonExpression.java:388)
> at org.apache.qpid.server.filter.ComparisonExpression.matches(ComparisonExpression.java:580)
> at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectQuery.filterObjects(ConfiguredObjectQuery.java:210)
> at org.apache.qpid.server.management.plugin.servlet.query.ConfiguredObjectQuery.<init>(ConfiguredObjectQuery.java:86)
> at org.apache.qpid.server.management.plugin.servlet.rest.QueryServlet.performQuery(QueryServlet.java:93)
> at org.apache.qpid.server.management.plugin.servlet.rest.QueryServlet.doGet(QueryServlet.java:56)
> at org.apache.qpid.server.management.plugin.servlet.rest.AbstractServlet.doGet(AbstractServlet.java:128)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
> {code}
> * Java Flight Recorder (JFR) data
> !image-2025-01-27-22-41-16-668.png!
> *Steps to Reproduce:*
> # Configure a Qpid Broker with Sorted Queues.
> # Enable REST APIs for queue statistics retrieval.
> # Simulate high load by re-enqueuing ~7,000 messages per second to the
> Sorted Queues.
> # Monitor broker performance and REST API response times.
> *Expected Behavior:* The broker should handle high re-enqueue rates without
> contention issues, and REST API calls for queue statistics should not time
> out.
> *Actual Behavior:* Under high load conditions, the broker experiences
> contention on the Sorted Queue data structure, leading to REST API timeouts.
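On the client side, a monitoring job can bound the hanging calls described above with explicit timeouts, and can name only the statistics it needs so that oldestMessageAge is never evaluated. The following is a minimal sketch, assuming a broker reachable at localhost:8080 and a query path of `/api/latest/querybroker/queue` with a `select` parameter. The host, port, path, and parameter are assumptions to check against the REST API documentation for your broker version, and authentication is omitted. `QueueStatsClientSketch` and `buildRequest` are hypothetical names.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;
import java.util.List;

public class QueueStatsClientSketch {

    // Assumed broker address and query path; adjust to your deployment.
    static final String BASE_URL = "http://localhost:8080/api/latest/querybroker/queue";

    /** Builds a GET request that selects only the named attributes/statistics. */
    static HttpRequest buildRequest(List<String> selected, Duration requestTimeout) {
        String select = String.join(",", selected);
        return HttpRequest.newBuilder(URI.create(BASE_URL + "?select=" + select))
                .timeout(requestTimeout) // fail the call instead of hanging indefinitely
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();

        // oldestMessageAge is deliberately left out of the selection.
        HttpRequest request = buildRequest(
                List.of("name", "queueDepthMessages"), Duration.ofSeconds(10));

        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        } catch (HttpTimeoutException e) {
            // Raised when the connect or request timeout elapses.
            System.err.println("Queue statistics request timed out: " + e.getMessage());
        }
    }
}
```

A timeout only protects the client from hanging; it does not reduce the contention inside the broker.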
> *Proposed Solution:*
> To address this issue, we propose optimizing the property fetching mechanism
> in the Qpid Broker in ConfiguredObjectExpressionFactory.java:
> # Modify the broker code (ConfiguredObjectExpressionFactory) to retrieve
> only the specifically requested properties rather than all 26 properties.
> !image-2025-01-27-22-42-13-739.png!
> # Avoid querying the oldestMessageAge property on the application side when
> dealing with delay queues, as it is not required for processing.
> # These optimizations will reduce the load on the Sorted Queue's locking
> mechanism and prevent redundant data fetches, thereby improving the broker's
> responsiveness under high load conditions.
> We have tested the proposed changes in our environment and observed
> significant performance improvements, including reduced lock contention and
> faster REST API responses under high load.
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)