[ https://issues.apache.org/jira/browse/JCR-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Unico Hommes updated JCR-3888:
------------------------------
Description:
When clustering is enabled, SharedItemStateManager#Update#begin acquires a lock
on the database (line 573) and subsequently acquires a write lock (line 579):
{code:java}
// let listener know about change
try {
    eventChannel.updateCreated(this); // line 573
} catch (ClusterException e) {
    throw new ItemStateException(e.getMessage(), e);
}
try {
    writeLock = acquireWriteLock(local); // line 579
} finally {
    if (writeLock == null) {
        eventChannel.updateCancelled(this); // line 582
    }
}
{code}
In SharedItemStateManager#Update#end, the update is committed and the write
lock is released:
{code:java}
eventChannel.updateCommitted(this, path); // line 849
setAttribute(ATTRIBUTE_UPDATE_SIZE, null);
if (writeLock != null) {
    // exception occurred before downgrading lock
    writeLock.release();
    writeLock = null;
} else if (readLock != null) {
    try {
        if (succeeded) {
            /* dispatch the events */
            events.dispatch();
        }
    } finally {
        readLock.release(); // 863
    }
}
{code}
However, the implementation of eventChannel#updateCommitted in
ClusterNode#WorkspaceUpdateChannel#updateCommitted has the following code:
{code:java}
long journalUpdateSize = record.update(); // line 700
long recordRevision = record.getRevision();
setRevision(recordRevision); // line 703
{code}
The database lock is released during the call to record.update() on line 700.
If a contending thread then acquires the database lock at
SharedItemStateManager:573 before this thread can set the revision in the
database at ClusterNode:703, we have a deadlock: this thread holds the write
lock in SharedItemStateManager, so the contending thread waits at
SharedItemStateManager:579, while the contending thread holds the database
lock, so this thread waits at ClusterNode:703.
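The interleaving described above can be sketched in isolation. The following is
a minimal illustration only, not Jackrabbit code: the class LockOrderSketch, the
method reproduce() and both ReentrantLock instances are hypothetical stand-ins
(the real database lock is not a java.util.concurrent lock), and timed tryLock
calls replace the unbounded waits so that the sketch terminates instead of
hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone sketch of the lock ordering problem; not Jackrabbit code.
final class LockOrderSketch {

    /** Returns true if both threads ended up waiting for the lock held by the other. */
    static boolean reproduce() {
        // Assumed stand-ins for the database lock and the SharedItemStateManager write lock.
        final ReentrantLock databaseLock = new ReentrantLock();
        final ReentrantLock writeLock = new ReentrantLock();
        final CountDownLatch dbReleased = new CountDownLatch(1);   // committer freed the database lock
        final CountDownLatch dbTaken = new CountDownLatch(1);      // contender now holds it
        final CountDownLatch attemptsDone = new CountDownLatch(2); // both lock attempts have finished
        final AtomicBoolean committerBlocked = new AtomicBoolean();
        final AtomicBoolean contenderBlocked = new AtomicBoolean();

        Thread committer = new Thread(() -> {
            try {
                databaseLock.lock();      // begin(): updateCreated, SharedItemStateManager:573
                writeLock.lock();         // begin(): acquireWriteLock, SharedItemStateManager:579
                databaseLock.unlock();    // end(): record.update() frees the lock, ClusterNode:700
                dbReleased.countDown();
                dbTaken.await();          // the contender slips in here
                // setRevision needs the database lock again, ClusterNode:703
                boolean got = databaseLock.tryLock(200, TimeUnit.MILLISECONDS);
                committerBlocked.set(!got);
                if (got) {
                    databaseLock.unlock();
                }
                attemptsDone.countDown();
                attemptsDone.await();     // keep the write lock until both attempts are over
                writeLock.unlock();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread contender = new Thread(() -> {
            try {
                dbReleased.await();
                databaseLock.lock();      // begin(): updateCreated, SharedItemStateManager:573
                dbTaken.countDown();
                // begin(): acquireWriteLock, SharedItemStateManager:579
                boolean got = writeLock.tryLock(200, TimeUnit.MILLISECONDS);
                contenderBlocked.set(!got);
                if (got) {
                    writeLock.unlock();
                }
                attemptsDone.countDown();
                attemptsDone.await();     // keep the database lock until both attempts are over
                databaseLock.unlock();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        committer.start();
        contender.start();
        try {
            committer.join();
            contender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return committerBlocked.get() && contenderBlocked.get();
    }

    public static void main(String[] args) {
        System.out.println("both threads blocked: " + reproduce());
    }
}
```

In this sketch each thread keeps the lock it already owns until both timed
attempts have finished, which mirrors the unbounded waits in the real code: the
committing thread never gets the database lock back while it holds the write
lock, and the contending thread never gets the write lock while it holds the
database lock.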
was:
When using clustering SharedItemStateManager#Update#begin sets a lock on the
database (line 573) and subsequently a write lock is set (line 582):
{code:java}
// let listener know about change
try {
    eventChannel.updateCreated(this); // line 573
} catch (ClusterException e) {
    throw new ItemStateException(e.getMessage(), e);
}
try {
    writeLock = acquireWriteLock(local); // line 579
} finally {
    if (writeLock == null) {
        eventChannel.updateCancelled(this); // line 582
    }
}
{code}
In SharedItemStateManager#Update#end the update is committed and the write lock
released:
{code:java}
eventChannel.updateCommitted(this, path); // line 849
setAttribute(ATTRIBUTE_UPDATE_SIZE, null);
if (writeLock != null) {
    // exception occurred before downgrading lock
    writeLock.release();
    writeLock = null;
} else if (readLock != null) {
    try {
        if (succeeded) {
            /* dispatch the events */
            events.dispatch();
        }
    } finally {
        readLock.release(); // 863
    }
}
{code}
However, the implementation of eventChannel#updateCommitted in
ClusterNode#WorkspaceUpdateChannel#updateCommitted has the following code:
{code:java}
long journalUpdateSize = record.update(); // line 700
long recordRevision = record.getRevision();
setRevision(recordRevision); // line 703
{code}
The database lock is freed during the call to update the record on line 700.
If a contending thread now sets a database lock at SharedItemStateManager:573
before this thread can set the revision in the database at ClusterNode:703 we
have a dead lock. This is because this thread holds the write lock in
SharedItemStateManager causing the contending thread to wait at
SharedItemStateManager:579 while the contending thread holds the database lock
causing this thread to wait at ClusterNode:703
> Possible dead lock in SharedItemStateManager in clustered environment
> ---------------------------------------------------------------------
>
> Key: JCR-3888
> URL: https://issues.apache.org/jira/browse/JCR-3888
> Project: Jackrabbit Content Repository
> Issue Type: Bug
> Affects Versions: 2.10.1
> Reporter: Unico Hommes
> Assignee: Unico Hommes
> Fix For: 2.10.2