[ https://issues.apache.org/jira/browse/CURATOR-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833414#comment-13833414 ]
Orcun Simsek edited comment on CURATOR-79 at 8/5/14 7:01 PM:
-------------------------------------------------------------
Also adding a test that fails (a slight modification of the test attached in
the original thread):
{code:title=Test.java|borderStyle=solid}
private static final int SESSION_TIMEOUT_MS = 180 * 1000;
private static final int CONNECTION_TIMEOUT_MS = 16 * 1000;
private static final int BASE_SLEEP_TIME_MS = 1000;
private static final int MAX_SLEEP_TIME_MS = 16 * 1000;
private static final int MAX_RETRIES = 10;

@Test
public void testInterruptDeadlock() throws Exception {
    CuratorFramework client = createClientWithNamespace("testCluster", "127.0.0.1:2181");
    client.start();
    final InterProcessMutex lock = new InterProcessMutex(client, "/testInterruption");
    try {
        try {
            lock.acquire();
            // Set the interrupt flag while the lock is held, so release() runs on an interrupted thread.
            Thread.currentThread().interrupt();
            lock.release();
        } catch (InterruptedException e) {
            // Restore the interrupt flag, then release if this process still holds the lock.
            Thread.currentThread().interrupt();
            if (lock.isAcquiredInThisProcess()) {
                lock.release();
            }
        }
        // The lock should be free again at this point.
        assertTrue(lock.acquire(10, TimeUnit.MILLISECONDS));
    } finally {
        if (lock.isAcquiredInThisProcess()) {
            lock.release();
        }
    }
}

private static CuratorFramework createClientWithNamespace(String clusterName, String connectString) {
    RetryPolicy retryPolicy =
            new BoundedExponentialBackoffRetry(BASE_SLEEP_TIME_MS, MAX_SLEEP_TIME_MS, MAX_RETRIES);
    return CuratorFrameworkFactory.builder()
            .sessionTimeoutMs(SESSION_TIMEOUT_MS)
            .connectionTimeoutMs(CONNECTION_TIMEOUT_MS)
            .namespace(clusterName)
            .retryPolicy(retryPolicy)
            .connectString(connectString)
            .build();
}
{code}
was (Author: ortschun):
Also adding a test that fails (a slight modification of the test attached in
the original thread):
{code:title=Test.java|borderStyle=solid}
@Test
public void testInterruptDeadlock() throws Exception {
    CuratorFramework client = CuratorFrameworkFactory.builder()
            .connectString("127.0.0.1:2181")
            .retryPolicy(new RetryNTimes(10, 1000))
            .build();
    client.start();
    Thread.currentThread().interrupt();
    final InterProcessMutex lock = new InterProcessMutex(client, "/testInterruption4");
    try {
        try {
            lock.acquire();
            lock.release();
        } catch (InterruptedException e) {
            if (lock.isAcquiredInThisProcess()) {
                lock.release();
            }
        }
        assertTrue(lock.acquire(10, TimeUnit.MILLISECONDS));
    } finally {
        if (lock.isAcquiredInThisProcess()) {
            System.out.println("Lock released successfully.");
            lock.release();
        }
    }
}
{code}
> InterProcessMutex doesn't clean up after interrupt
> --------------------------------------------------
>
> Key: CURATOR-79
> URL: https://issues.apache.org/jira/browse/CURATOR-79
> Project: Apache Curator
> Issue Type: Bug
> Affects Versions: 2.0.0-incubating, 2.1.0-incubating, 2.2.0-incubating,
> 2.3.0
> Reporter: Orcun Simsek
> Assignee: Jordan Zimmerman
>
> InterProcessMutex can deadlock if a thread is interrupted during acquire().
> Specifically, CreateBuilderImpl.pathInForeground submits a create request to
> ZooKeeper, and an InterruptedException is thrown after the node is created in
> ZK but before ZK.create returns. Because ZK.create propagates a
> non-KeeperException, Curator assumes the create has failed and does not
> retry, which leaves the node orphaned. At some point in the future, the
> orphaned node becomes the next in the acquisition sequence, but it is never
> reclaimed because the ZK session has not expired.
> <stack trace attached in comments below>
> Curator should catch the InterruptedException and other non-KeeperExceptions,
> and delete the created node before propagating these exceptions.
> (as originally discussed on
> https://groups.google.com/forum/#!topic/curator-users/9ii5of8SbdQ)
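
The cleanup suggested in the description above can be sketched roughly as
follows. This is only an illustration against an in-memory stand-in, not
Curator or ZooKeeper code: FakeStore, createSequential, deleteByTag and
createLockNode are all hypothetical names. Tagging the node name with a random
UUID is also an assumption of this sketch, made so that a node can still be
found and deleted when create() never returned its path.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

public class OrphanCleanupSketch {

    /** Hypothetical in-memory stand-in for the server side; NOT a ZooKeeper or Curator API. */
    static class FakeStore {
        final Map<String, String> nodes = new ConcurrentSkipListMap<>();
        private final AtomicInteger sequence = new AtomicInteger();
        boolean interruptAfterCreate = false;

        /** Creates a sequential node; can simulate an interrupt that arrives after the node exists. */
        String createSequential(String prefix) throws InterruptedException {
            String path = prefix + String.format("%010d", sequence.getAndIncrement());
            nodes.put(path, "");
            if (interruptAfterCreate) {
                // The node now exists, but the caller never learns its path.
                throw new InterruptedException("interrupted after create");
            }
            return path;
        }

        /** Deletes every node whose path contains the given tag. */
        void deleteByTag(String tag) {
            nodes.keySet().removeIf(path -> path.contains(tag));
        }
    }

    /** Creates a lock node and removes any node it may have left behind before rethrowing. */
    static String createLockNode(FakeStore store, String basePath) throws InterruptedException {
        // Assumption of this sketch: a unique tag in the name identifies "our" node later.
        String tag = UUID.randomUUID().toString();
        try {
            return store.createSequential(basePath + "/" + tag + "-lock-");
        } catch (InterruptedException | RuntimeException e) {
            store.deleteByTag(tag); // clean up the possibly orphaned node
            throw e;                // then propagate the original exception
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        store.interruptAfterCreate = true;
        try {
            createLockNode(store, "/testInterruption");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        }
        System.out.println("nodes left after interrupted create: " + store.nodes.size());
    }
}
```

With the cleanup in place, the simulated interrupted create leaves no node
behind in the fake store. Against a real ZooKeeper ensemble the delete itself
could fail or be interrupted, which this sketch does not attempt to handle.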