https://bz.apache.org/bugzilla/show_bug.cgi?id=69527

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |NEEDINFO

--- Comment #14 from Mark Thomas <ma...@apache.org> ---
The full description of the original concurrency issue:

There is a race condition if concurrent threads are trying to PUT and DELETE
the same resource. The following sequence is problematic:

- Target resource exists on the file system but is not currently cached.
- Processing starts with the PUT thread. The existing resource is added to the
cache (entry-1) and the cache size is increased by the resource size.
- Just before the existing cache entry is removed in StandardRoot#write(),
processing switches to the DELETE thread.
- The DELETE thread retrieves entry-1 from the cache and invalidates it because
the cached last modified date no longer matches the file on the file system,
which the PUT thread has replaced.
- entry-1 is removed from the cache. That reduces the cache size by the
resource size.

So far, so good. The cache size is consistent.

- The DELETE thread then creates a new cache entry (entry-2).
- Processing switches back to the PUT thread before entry-2 is validated in
Cache#getResource().
- The PUT thread removes entry-2 from the cache.
- The cache size is reduced by zero because entry-2 has not yet been validated,
so its webResource is still null.
- The DELETE thread then continues and validates the resource (the file has not
yet been deleted). That sets the size of entry-2 to the resource size and
increases the size of the resource cache by the resource size.

At this point entry-2 has been both added to and removed from the cache, but
with inconsistent sizes: zero on removal and the resource size on addition.
This corrupts the cache size value.
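
The size accounting in the sequence above can be replayed deterministically in
a single thread. What follows is only a minimal sketch: CacheSizeReplay, Entry
and replay() are hypothetical names for a simplified model, not Tomcat's actual
Cache or CachedResource code. Each step is labelled with the thread that would
perform it in the real interleaving.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of the cache size accounting.
// This is NOT Tomcat's implementation.
public class CacheSizeReplay {

    static final AtomicLong cacheSize = new AtomicLong();

    static class Entry {
        // null until the entry has been validated
        Long resourceSize;

        long getSize() {
            return resourceSize == null ? 0L : resourceSize.longValue();
        }
    }

    // Replays the problematic interleaving and returns the final cache size.
    static long replay(long fileSize) {
        cacheSize.set(0);

        // PUT thread: existing resource is cached as entry-1
        Entry entry1 = new Entry();
        entry1.resourceSize = Long.valueOf(fileSize);
        cacheSize.addAndGet(entry1.getSize());

        // DELETE thread: entry-1 is invalidated and removed
        cacheSize.addAndGet(-entry1.getSize());

        // DELETE thread: entry-2 is created but not yet validated
        Entry entry2 = new Entry();

        // PUT thread: removes entry-2, which still reports a size of zero
        cacheSize.addAndGet(-entry2.getSize());

        // DELETE thread: validates entry-2 and adds its size to the cache
        entry2.resourceSize = Long.valueOf(fileSize);
        cacheSize.addAndGet(entry2.getSize());

        // No entries remain in the cache, so a consistent size would be zero
        return cacheSize.get();
    }

    public static void main(String[] args) {
        System.out.println("Cache size with no entries: " + replay(1024));
    }
}
```

In this model the cache ends up empty but the recorded size is still the
resource size, which is the inconsistency described above.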

The original fix first ensured that whatever value was first used for the
content length was cached so add/remove was consistent. Further testing showed
a lock was required in case there were multiple threads calling
getContentLength() at the same time. That is how I arrived at the original fix.

There may still be a narrow timing issue as the original fix used
cachedContentLengthLock in getContentLength() and synchronized (this) in
validateResource(). Those locks should almost certainly be the same to ensure
consistency.
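
As a rough illustration of the single-lock idea, the sketch below guards both
the cached content length and the validation state with one lock object. It
assumes a simplified entry: SingleLockEntry, its lock field and the
LongSupplier standing in for the underlying resource are all hypothetical and
are not the actual CachedResource code.

```java
import java.util.function.LongSupplier;

// Hypothetical stand-in for a cached resource entry.
// This is NOT Tomcat's CachedResource.
public class SingleLockEntry {

    // One lock guards both the validation state and the cached length
    private final Object lock = new Object();
    // Assumed stand-in for reading the length of the underlying resource
    private final LongSupplier underlyingLength;

    private boolean validated;          // guarded by lock
    private Long cachedContentLength;   // guarded by lock

    public SingleLockEntry(LongSupplier underlyingLength) {
        this.underlyingLength = underlyingLength;
    }

    public void validate() {
        synchronized (lock) {
            validated = true;
        }
    }

    // Whatever value is computed first is cached and returned from then on,
    // so adding to and removing from the cache use the same size.
    public long getContentLength() {
        synchronized (lock) {
            if (cachedContentLength == null) {
                long length = validated ? underlyingLength.getAsLong() : 0L;
                cachedContentLength = Long.valueOf(length);
            }
            return cachedContentLength.longValue();
        }
    }
}
```

With a single lock, a caller of getContentLength() cannot observe a
half-completed validation, and the first value returned stays stable whether
it was computed before or after validate() ran.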

Next step is to confirm that using a single lock addresses the original
concurrency issue.

Note: While the original concurrency issue is now well understood, the issue
reported here is not. More information is needed to determine the root cause of
the issue reported here so it can be addressed.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org
