[ https://issues.apache.org/jira/browse/ARTEMIS-2418?focusedWorklogId=273223&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273223 ]
ASF GitHub Bot logged work on ARTEMIS-2418:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Jul/19 11:13
            Start Date: 08/Jul/19 11:13
    Worklog Time Spent: 10m
      Work Description: wy96f commented on pull request #2743: ARTEMIS-2418 Race conditions between cursor movement and page writing
URL: https://github.com/apache/activemq-artemis/pull/2743

The current code of CursorIterator::internalGetNext is complicated and hard to follow. It also has two race conditions between cursor movement and page writing:

1. Suppose the cursor's initial position is (p1, 0) and page p1 is live with 0 messages. When we call internalGetNext(), the cursor moves to the next page, so its position is now (p2, 0). Meanwhile p1 is filled with message m1 and p2 is created with m2. We then retrieve m2 from p2 rather than m1.

2. Suppose the cursor's initial position is (p1, 1) and page p1 is non-live with 1 message. When we call internalGetNext(), the cursor moves to position (p2, 0) and gets a null page cache because p2 has not been created yet. Then p2 is created with m1 and p3 is created with m2, so the current writing page is p3. After the while loop the cursor moves to position (p3, 0) and we retrieve m2 from p3 rather than m1.

In both cases message m1 is missed, and subsequent page files won't be deleted unless the broker restarts.
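To make the first race concrete, here is a minimal single-threaded Java sketch that replays the interleaving from case 1 deterministically. It is a toy model, not Artemis code: `Store`, `naiveNext`, `guardedNext`, and `replay` are hypothetical names, the "concurrent" writer is simulated by a callback fired at the racy point, and the guard shown (only skip a page once it is no longer the writing page) is just one possible way to close the race, not necessarily what the pull request does.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of race 1. None of these types are Artemis classes. */
final class CursorRaceSketch {

    /** Pages are numbered from 1; the last page in the list is the live (writing) page. */
    static final class Store {
        final List<List<String>> pages = new ArrayList<>();
        Store() { pages.add(new ArrayList<>()); }              // p1 starts live and empty
        int writingPage() { return pages.size(); }
        void write(String msg) { pages.get(pages.size() - 1).add(msg); }
        void rollPage() { pages.add(new ArrayList<>()); }      // close current page, open the next
        String read(int page, int index) {
            if (page > pages.size()) return null;              // page not created yet
            List<String> p = pages.get(page - 1);
            return index < p.size() ? p.get(index) : null;
        }
    }

    /** Naive cursor: an empty read advances to the next page straight away. pos = {page, index}. */
    static String naiveNext(Store store, int[] pos, Runnable concurrentWriter) {
        String msg = store.read(pos[0], pos[1]);
        if (msg == null) {
            pos[0]++; pos[1] = 0;                              // cursor moves to (p2, 0) ...
            concurrentWriter.run();                            // ... while the writer fills p1 and creates p2
            msg = store.read(pos[0], pos[1]);
        }
        if (msg != null) pos[1]++;
        return msg;
    }

    /** Guarded cursor (an assumption, one possible fix): never skip the page that is still being written. */
    static String guardedNext(Store store, int[] pos, Runnable concurrentWriter) {
        int writing = store.writingPage();                     // snapshot taken before reading
        String msg = store.read(pos[0], pos[1]);
        concurrentWriter.run();                                // same interleaving as in naiveNext
        while (msg == null && pos[0] < writing) {              // only closed pages may be skipped
            pos[0]++; pos[1] = 0;
            writing = store.writingPage();
            msg = store.read(pos[0], pos[1]);
        }
        if (msg != null) pos[1]++;
        return msg;
    }

    /** Replays case 1 and returns the messages the cursor delivers, in order. */
    static List<String> replay(boolean guarded) {
        Store store = new Store();
        int[] pos = {1, 0};                                    // cursor at (p1, 0), p1 live with 0 messages
        Runnable writer = () -> { store.write("m1"); store.rollPage(); store.write("m2"); };
        Runnable idle = () -> { };
        List<String> delivered = new ArrayList<>();
        String msg = guarded ? guardedNext(store, pos, writer) : naiveNext(store, pos, writer);
        if (msg != null) delivered.add(msg);
        while ((msg = guarded ? guardedNext(store, pos, idle) : naiveNext(store, pos, idle)) != null) {
            delivered.add(msg);
        }
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println("naive: " + replay(false));
        System.out.println("guarded: " + replay(true));
    }
}
```

Running `main` prints `naive: [m2]` and `guarded: [m1, m2]`. The naive cursor skips m1 for good because it left p1 while p1 could still grow; the guarded cursor returns nothing on the first call, stays at (p1, 0), and picks up m1 on the next call.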
Issue Time Tracking
-------------------

            Worklog Id:     (was: 273223)
            Time Spent: 10m
    Remaining Estimate: 0h

> Race conditions between cursor movement and page writing
> --------------------------------------------------------
>
>                 Key: ARTEMIS-2418
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2418
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.9.0
>            Reporter: yangwei
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h