[
https://issues.apache.org/jira/browse/OAK-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcel Reutegger resolved OAK-8986.
-----------------------------------
Fix Version/s: 1.28.0
Resolution: Fixed
Applied updated patch to trunk: http://svn.apache.org/r1876190
[~smiroslav], thanks for reporting and providing a patch!
> Segment flush thread can remain in TIMED_WAITING state even when segment
> queue is empty
> ----------------------------------------------------------------------------------------
>
> Key: OAK-8986
> URL: https://issues.apache.org/jira/browse/OAK-8986
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: segment-azure
> Affects Versions: 1.24.0, 1.26.0
> Reporter: Miroslav Smiljanic
> Assignee: Marcel Reutegger
> Priority: Major
> Fix For: 1.28.0
>
> Attachments: OAK-8986.patch, proposed_patch.patch, test.patch,
> test_and_proposed_patch.patch
>
>
> If the calling thread is in the interrupted state while executing
> [SegmentWriteQueue.addToQueue|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L166],
> an InterruptedException is thrown and wrapped in an IOException.
> Right before queue.offer is called, the element is added to the segmentsByUUID
> map, and in this case it is never removed.
> Normally the removal happens in the thread that reads from the
> [queue|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L100]
> and invokes
> [consume(SegmentWriteAction segment)|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L117].
> Since the item is not removed from the segmentsByUUID map, the
> [flusher|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L183]
> thread remains in the TIMED_WAITING state.
> The TarMK flush thread exclusively holds a monitor needed by a number of
> other threads, which causes the repository to be blocked.
> {noformat}
> "TarMK flush [/opt/aem/launcher/repository/segmentstore-composite-global]"
> #82 daemon prio=5 os_prio=0 cpu=83628.24ms elapsed=291420.48s
> tid=0x00007fce902f3000 nid=0x1c2b in Object.wait() [0x00007fce00aa5000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait([email protected]/Native Method)
> - waiting on <no object reference available>
> at
> org.apache.jackrabbit.oak.segment.azure.queue.SegmentWriteQueue.flush(SegmentWriteQueue.java:183)
> - waiting to re-lock in wait() <0x00000006b4911830> (a
> java.util.concurrent.ConcurrentHashMap)
> at
> org.apache.jackrabbit.oak.segment.azure.AzureSegmentArchiveWriter.flush(AzureSegmentArchiveWriter.java:187)
> at
> org.apache.jackrabbit.oak.segment.file.tar.TarWriter.flush(TarWriter.java:186)
> - locked <0x00000006b4911960> (a java.lang.Object)
> at
> org.apache.jackrabbit.oak.segment.file.tar.TarFiles.flush(TarFiles.java:535)
> at
> org.apache.jackrabbit.oak.segment.file.FileStore.lambda$tryFlush$9(FileStore.java:359)
> at
> org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$232/0x000000080067ac40.flush(Unknown
> Source)
> at
> org.apache.jackrabbit.oak.segment.file.TarRevisions.doFlush(TarRevisions.java:236)
> at
> org.apache.jackrabbit.oak.segment.file.TarRevisions.tryFlush(TarRevisions.java:216)
> at
> org.apache.jackrabbit.oak.segment.file.FileStore.tryFlush(FileStore.java:357)
> at
> org.apache.jackrabbit.oak.segment.file.FileStore.lambda$new$5(FileStore.java:212)
> at
> org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$203/0x000000080064b440.run(Unknown
> Source)
> at
> org.apache.jackrabbit.oak.segment.file.SafeRunnable.run(SafeRunnable.java:67)
> at
> java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:515)
> at
> java.util.concurrent.FutureTask.runAndReset([email protected]/FutureTask.java:305)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run([email protected]/ScheduledThreadPoolExecutor.java:305)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1128)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
> at java.lang.Thread.run([email protected]/Thread.java:834)
> {noformat}
> Here is the test case that demonstrates the problem.
> [^test.patch]
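The failure mode described in the quoted report can be sketched in isolation. The class below is a hypothetical, simplified stand-in and not the Oak SegmentWriteQueue or the committed patch: the names WriteQueueSketch, addLeaky, addWithCleanup and the bounded flush are assumptions made for illustration. It only shows why an entry left in the map after an interrupted offer keeps a flush-style wait loop from finishing, and one possible way to clean up.

```java
import java.io.IOException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified stand-in; not the Oak SegmentWriteQueue class.
public class WriteQueueSketch {
    private final BlockingQueue<UUID> queue = new ArrayBlockingQueue<>(10);
    private final Map<UUID, byte[]> segmentsByUUID = new ConcurrentHashMap<>();

    // Leaky variant: the map entry is added first and stays behind
    // when offer() throws InterruptedException.
    void addLeaky(UUID id, byte[] data) throws IOException {
        segmentsByUUID.put(id, data);
        try {
            queue.offer(id, 1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            throw new IOException(e);
        }
    }

    // One possible cleanup (an assumption, not the committed fix):
    // remove the map entry whenever the element was not queued.
    void addWithCleanup(UUID id, byte[] data) throws IOException {
        segmentsByUUID.put(id, data);
        boolean queued = false;
        try {
            queued = queue.offer(id, 1, TimeUnit.SECONDS);
            if (!queued) {
                throw new IOException("Can't add segment to the queue");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IOException(e);
        } finally {
            if (!queued) {
                segmentsByUUID.remove(id);
            }
        }
    }

    // Flush-style loop: timed waits while the map is non-empty.
    // Bounded by maxMillis here only so that the demo terminates.
    boolean flush(long maxMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxMillis;
        synchronized (segmentsByUUID) {
            while (!segmentsByUUID.isEmpty()) {
                long left = deadline - System.currentTimeMillis();
                if (left <= 0) {
                    return false; // still pending: an unbounded loop would keep waiting
                }
                segmentsByUUID.wait(Math.min(left, 100));
            }
        }
        return true;
    }

    // Adds one segment from an already interrupted thread, then flushes.
    static boolean flushCompletesAfterInterruptedAdd(boolean cleanup) {
        WriteQueueSketch q = new WriteQueueSketch();
        Thread.currentThread().interrupt();
        try {
            if (cleanup) {
                q.addWithCleanup(UUID.randomUUID(), new byte[0]);
            } else {
                q.addLeaky(UUID.randomUUID(), new byte[0]);
            }
        } catch (IOException expected) {
            // wraps the InterruptedException thrown by offer()
        } finally {
            Thread.interrupted(); // clear the flag so flush() can wait
        }
        try {
            return q.flush(300);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("leaky, flush completes: " + flushCompletesAfterInterruptedAdd(false));
        System.out.println("cleanup, flush completes: " + flushCompletesAfterInterruptedAdd(true));
    }
}
```

In this sketch the leaky variant reports false, because nothing ever consumes the orphaned map entry, while the cleanup variant reports true. No consumer thread is modelled, so the sketch says nothing about the normal path where consume() removes the entry.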
--
This message was sent by Atlassian Jira
(v8.3.4#803005)