keith-turner commented on a change in pull request #2329:
URL: https://github.com/apache/accumulo/pull/2329#discussion_r740272026
##########
File path:
core/src/main/java/org/apache/accumulo/fate/zookeeper/DistributedReadWriteLock.java
##########
@@ -218,22 +237,54 @@ public boolean tryLock() {
}
SortedMap<Long,byte[]> entries = qlock.getEarlierEntries(entry);
Iterator<Entry<Long,byte[]>> iterator = entries.entrySet().iterator();
- if (!iterator.hasNext())
+ if (!iterator.hasNext()) {
throw new IllegalStateException("Did not find our own lock in the queue: " + this.entry
+ " userData " + new String(this.userData, UTF_8) + " lockType " + lockType());
- return iterator.next().getKey().equals(entry);
+ }
+ if (!failBlockers) {
+ return iterator.next().getKey().equals(entry);
+ } else {
+ ZooStore<DistributedReadWriteLock> zs;
+ try {
+ zs = new ZooStore<>(zooPath, zrw);
+ } catch (KeeperException | InterruptedException e1) {
+ log.error("Error creating zoo store", e1);
+ return false;
+ }
+ final AdminUtil<DistributedReadWriteLock> util = new AdminUtil<>();
+ boolean result = true;
+ while (iterator.hasNext()) {
+ Entry<Long,byte[]> e = iterator.next();
+ if (!e.getKey().equals(entry)) {
+ result &= util.prepFail(zs, zrw, zooManagerPath, Long.toString(e.getKey(), 16));
Review comment:
I think that as long as the manager and this code use the same ZooStore
object, the call to reserve will address the 2nd problem I mentioned
above. I see a third problem: prepFail transitions FATE ops to
FAILED_IN_PROGRESS. This means those FATE ops will unwind and execute their
undo() operations, which could modify the metadata table and ZooKeeper. So this
could lead to other FATE ops that no longer hold the lock still modifying
persisted state related to the table. We would probably need to wait for these
FATE ops to transition from FAILED_IN_PROGRESS to FAILED before getting the
lock. We have to be careful about how this wait is done. If all threads in the
FATE thread pool are waiting for other operations to transition from
FAILED_IN_PROGRESS to FAILED, then no threads would be left available
to transition those operations, leading to deadlock.
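
One way to wait without tying up pool threads might be a non-blocking check, roughly like the sketch below. Everything in it is hypothetical: `BlockerWaitSketch`, `TxStatus`, `STATUS`, and `blockersFullyFailed` are stand-ins for illustration and not part of the FATE or ZooStore API. The idea is that tryLock() would return false while any blocker is still short of FAILED, so the calling thread goes back to the pool and stays available to run the blockers' undo() steps.

```java
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names come from Accumulo's FATE API. */
public class BlockerWaitSketch {

  /** Stand-in for the FATE transaction states named in the comment. */
  public enum TxStatus { IN_PROGRESS, FAILED_IN_PROGRESS, FAILED }

  /** Stand-in status store keyed by transaction id (assumption, not ZooStore). */
  public static final Map<Long, TxStatus> STATUS = new ConcurrentHashMap<>();

  /**
   * Non-blocking check: returns true only when every blocker has reached
   * FAILED, i.e. its undo() work is finished. It never sleeps or parks, so a
   * pool thread calling it cannot starve the threads that run undo().
   * An unknown transaction id is treated as "not yet failed" (a conservative
   * assumption made for this sketch).
   */
  public static boolean blockersFullyFailed(Collection<Long> blockerTxIds) {
    for (Long txid : blockerTxIds) {
      if (STATUS.get(txid) != TxStatus.FAILED) {
        return false;
      }
    }
    return true;
  }
}
```

With something like this, the lock attempt is simply retried the next time the operation is scheduled, instead of blocking inside the lock code until the transition happens.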