MarkGaox opened a new pull request, #2698:
URL: https://github.com/apache/helix/pull/2698

   ### Issues
   
   - [X] My PR addresses the following Helix issues and references them in the 
PR description:
   - This PR mainly addresses the recent behavior regression of helix-lock
   
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI 
changes:
   - The incompatibility between the old and new helix-lock versions was introduced by the last update to helix-lock in Dec 2020, https://github.com/apache/helix/pull/1564, which added priority and notification support to Helix locks.
   #### This PR addresses two separate regressions
   1. When the new helix-lock version requests a lock on a path currently locked by the old helix-lock version, the following happens:
   * The lock request sees the current lock priority as -1 (because the priority field is not present in the lock ZNode), which is lower than the priority 0 of the lock being requested. It therefore tries to preempt the current lock owner by writing its own user id, priority and waiting timeout to the requestor fields in the lock ZNode. The pending timeout of the lock request is set to -1, since the current lock cleanup timeout is -1 (because the cleanup timeout field is not present in the lock ZNode). The lock status of the lock request now becomes PENDING.
   * The lock request waits on a CountDownLatch for the pending timeout, which is -1, so the wait returns immediately.
   * The current lock owner won’t clean up and release the lock, since it is using the old helix-lock version, which doesn’t react to lock requests with higher priority.
   * Since the lock status of the lock request is still PENDING and the lock request is not forceful by default (forceful is also not desired for Espresso use cases), an exception is thrown saying “Cleanup has not been finished by lock owner”, which breaks clients' workflow.
   2. A non-owner is able to `unlock()` a lock if its priority is higher than the priority recorded in the lock. This should be avoided as well. If a non-owner wants to acquire a lock, it should only call `tryLock()`.
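
The first regression can be reproduced in isolation with a small model. This is only a sketch under stated assumptions: the map stands in for the lock ZNode, and the class, method and field names (`LegacyLockRegressionSketch`, `readInt`, `PRIORITY`, `CLEANUP_TIMEOUT`, `OWNER`) are hypothetical, not the helix-lock API. It shows how the two missing fields default to -1, why the priority-0 request then tries to preempt, and why `CountDownLatch.await` with a negative timeout returns without waiting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative model only: the map stands in for the lock ZNode, and the
// field names below are hypothetical, not the helix-lock record keys.
class LegacyLockRegressionSketch {
  static final String PRIORITY = "PRIORITY";
  static final String CLEANUP_TIMEOUT = "CLEANUP_TIMEOUT";

  // Reads an integer field, falling back to defaultValue when it is absent.
  static int readInt(Map<String, String> znode, String field, int defaultValue) {
    String raw = znode.get(field);
    return raw == null ? defaultValue : Integer.parseInt(raw);
  }

  // A request tries to preempt only when its priority is strictly higher.
  static boolean wouldPreempt(int requestPriority, int ownerPriority) {
    return requestPriority > ownerPriority;
  }

  // Waits up to timeoutMs for the owner to release; true only if it did.
  static boolean waitForCleanup(CountDownLatch released, long timeoutMs) {
    try {
      // A zero or negative timeout means await() does not wait at all.
      return released.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    // A ZNode written by the old version has no priority or cleanup timeout.
    Map<String, String> oldZnode = new HashMap<>();
    oldZnode.put("OWNER", "old-client");

    int ownerPriority = readInt(oldZnode, PRIORITY, -1);
    int pendingTimeoutMs = readInt(oldZnode, CLEANUP_TIMEOUT, -1);

    boolean preempt = wouldPreempt(0, ownerPriority);
    // The old owner never counts the latch down, so it stays at 1.
    boolean cleaned = waitForCleanup(new CountDownLatch(1), pendingTimeoutMs);

    System.out.println("tries to preempt: " + preempt);
    System.out.println("cleanup finished: " + cleaned);
    if (preempt && !cleaned) {
      System.out.println("request is still PENDING, so the non-forceful caller fails");
    }
  }
}
```

In this model the request both decides to preempt and gives up waiting at once, which matches the sequence described in the bullets above.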
   
   #### To resolve the two issues discussed above
   1. Change the default value of priority from -1 to 0. This way, a lock request made from the new helix-lock version with the default priority won't be able to acquire a lock held by the old helix-lock version.
   2. When processing an unlock request, the updater should read the priority from the overwriting lockInfo instead of from the `ZKDistributedNonblockingLock` class field. Since the priority of any `unlock()` request in the overwriting lockInfo is always 0, the priority of `unlock()` is always less than or equal to the priority of any lock. Thus, unless the `unlock()` requestor is the owner, the requestor can't unlock, which is the expected behavior.
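
The effect of both changes can be sketched as a single overwrite check. This is a simplified model, not the actual updater in helix-lock: the class `UnlockPrioritySketch`, the methods `priorityOf`, `mayOverwrite` and `lockInfo`, and the `OWNER` / `PRIORITY` keys are all hypothetical, and lock timeouts and forceful requests are left out. The assumption is that an overwrite is allowed when the lock is free, when the requestor is the owner, or when the overwriting lockInfo carries a strictly higher priority.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only; names are hypothetical, not the helix-lock API.
// Lock timeouts and forceful requests are deliberately left out.
class UnlockPrioritySketch {
  static final int DEFAULT_PRIORITY = 0; // was -1 before the fix

  // Priority recorded in a lockInfo, or the default when the field is absent.
  static int priorityOf(Map<String, String> lockInfo) {
    String raw = lockInfo.get("PRIORITY");
    return raw == null ? DEFAULT_PRIORITY : Integer.parseInt(raw);
  }

  // The priority comes from the overwriting lockInfo, never from a field of
  // the requesting lock object.
  static boolean mayOverwrite(String requesterId, Map<String, String> overwriting,
      Map<String, String> current) {
    String owner = current.get("OWNER");
    if (owner == null || owner.equals(requesterId)) {
      return true; // free lock, or the owner acting on its own lock
    }
    return priorityOf(overwriting) > priorityOf(current);
  }

  // Builds a lockInfo map; null arguments leave the field absent.
  static Map<String, String> lockInfo(String owner, Integer priority) {
    Map<String, String> info = new HashMap<>();
    if (owner != null) {
      info.put("OWNER", owner);
    }
    if (priority != null) {
      info.put("PRIORITY", String.valueOf(priority));
    }
    return info;
  }

  public static void main(String[] args) {
    // Lock written by the old version: no priority field.
    Map<String, String> legacyLock = lockInfo("old-client", null);
    Map<String, String> defaultRequest = lockInfo("new-client", null);
    System.out.println("default request preempts legacy lock: "
        + mayOverwrite("new-client", defaultRequest, legacyLock));

    // unlock() overwrites with an empty lockInfo, so its priority is 0.
    Map<String, String> heldLock = lockInfo("client-a", 1);
    Map<String, String> unlockInfo = lockInfo(null, null);
    System.out.println("non-owner can unlock: "
        + mayOverwrite("client-b", unlockInfo, heldLock));
    System.out.println("owner can unlock: "
        + mayOverwrite("client-a", unlockInfo, heldLock));
  }
}
```

Because the unlock path always compares priority 0 against the recorded priority, a high priority configured on the non-owner's lock object no longer matters in this model; only a `tryLock()`-style request that carries its own higher priority can pass the comparison.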
   
   
   
   ### Tests
   `mvn test -Dtest=TestZKHelixNonblockingLock,TestZKHelixNonblockingLockWithPriority -pl helix-lock`
   - [X] The following tests are written for this issue:
   - 
   ```
   [INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 54.493 s - in TestSuite
   [INFO] 
   [INFO] Results:
   [INFO] 
   [INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0
   [INFO] 
   [INFO] 
   [INFO] --- jacoco:0.8.6:report (generate-code-coverage-report) @ helix-lock ---
   [INFO] Loading execution data file /Users/xiaxgao/IdeaProjects/helix_ps/helix-lock/target/jacoco.exec
   [INFO] Analyzed bundle 'Apache Helix :: Distributed Lock' with 13 classes
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  56.611 s
   [INFO] Finished at: 2023-11-15T10:39:34-08:00
   [INFO] ------------------------------------------------------------------------
   
   ```
   
   
   - The following is the result of the "mvn test" command on the appropriate 
module:
   
   
   ### Changes that Break Backward Compatibility (Optional)
   
   - My PR contains changes that break backward compatibility or previous 
assumptions for certain methods or API. They include:
   
   
   ### Documentation (Optional)
   
   - In case of new functionality, my PR adds documentation in the following 
wiki page:
   
   
   ### Commits
   
   - My commits all reference appropriate Apache Helix GitHub issues in their 
subject lines. In addition, my commits follow the guidelines from "[How to 
write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
