[ https://issues.apache.org/jira/browse/OAK-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nitin Gupta closed OAK-8986.
----------------------------

> Segment flush thread can remain in TIMED_WAITING state even when segment queue is empty
> ---------------------------------------------------------------------------------------
>
>                 Key: OAK-8986
>                 URL: https://issues.apache.org/jira/browse/OAK-8986
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-azure
>    Affects Versions: 1.24.0, 1.26.0
>            Reporter: Miroslav Smiljanic
>            Assignee: Marcel Reutegger
>            Priority: Major
>             Fix For: 1.30.0
>
>         Attachments: OAK-8986.patch, proposed_patch.patch, test.patch, test_and_proposed_patch.patch
>
>
> If the calling thread is in the interrupted state during execution of [SegmentWriteQueue.addToQueue|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L166], an InterruptedException is thrown and wrapped in an IOException.
> Right before the call to queue.offer, the element is added to the segmentsByUUID map, and in this case it is never removed.
> Normally the removal happens in the thread that reads from the [queue|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L100] and invokes [consume(SegmentWriteAction segment)|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L117].
> Since the item is not removed from the segmentsByUUID map, the [flusher|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L183] thread remains in the TIMED_WAITING state.
> The TarMK flush thread exclusively holds a monitor needed by a number of other threads, so the repository is blocked.
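The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the Oak code: the class `SketchWriteQueue`, its methods, the 100 ms timeout and the queue capacity are all hypothetical, and only the names `segmentsByUUID` and `queue` are borrowed from the report. The "safe" variant shows one possible remedy (undoing the map insertion when the offer fails); it is not necessarily what the attached patches do.

```java
import java.io.IOException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the queue described in the report; not the Oak class.
class SketchWriteQueue {
    private final Map<UUID, byte[]> segmentsByUUID = new ConcurrentHashMap<>();
    private final BlockingQueue<UUID> queue = new LinkedBlockingQueue<>(1);

    // Mirrors the reported behaviour: the map entry survives a failed offer.
    void addToQueueLeaky(UUID id, byte[] data) throws IOException {
        segmentsByUUID.put(id, data);
        try {
            if (!queue.offer(id, 100, TimeUnit.MILLISECONDS)) {
                throw new IOException("queue full");
            }
        } catch (InterruptedException e) {
            throw new IOException(e);
        }
    }

    // One possible remedy (an assumption): roll back the map entry on any failure.
    void addToQueueSafe(UUID id, byte[] data) throws IOException {
        segmentsByUUID.put(id, data);
        boolean queued = false;
        try {
            queued = queue.offer(id, 100, TimeUnit.MILLISECONDS);
            if (!queued) {
                throw new IOException("queue full");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IOException(e);
        } finally {
            if (!queued) {
                segmentsByUUID.remove(id);
            }
        }
    }

    // Number of entries a flusher would still be waiting for.
    int pending() {
        return segmentsByUUID.size();
    }

    // Calls one variant from an already-interrupted thread and reports what is left.
    static int pendingAfterInterruptedAdd(boolean safe) {
        SketchWriteQueue q = new SketchWriteQueue();
        Thread.currentThread().interrupt();
        try {
            if (safe) {
                q.addToQueueSafe(UUID.randomUUID(), new byte[0]);
            } else {
                q.addToQueueLeaky(UUID.randomUUID(), new byte[0]);
            }
        } catch (IOException expected) {
            // offer() observed the interrupt and the add failed
        } finally {
            Thread.interrupted(); // clear the flag so the caller is unaffected
        }
        return q.pending();
    }
}
```

In this sketch, `SketchWriteQueue.pendingAfterInterruptedAdd(false)` leaves one orphaned entry in the map while nothing was queued, which is the condition a flusher waiting for the map to drain would never see resolved. With `true` the map is empty again after the failed add.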
> {noformat}
> "TarMK flush [/opt/aem/launcher/repository/segmentstore-composite-global]" #82 daemon prio=5 os_prio=0 cpu=83628.24ms elapsed=291420.48s tid=0x00007fce902f3000 nid=0x1c2b in Object.wait() [0x00007fce00aa5000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>     at java.lang.Object.wait(java.base@11.0.3/Native Method)
>     - waiting on <no object reference available>
>     at org.apache.jackrabbit.oak.segment.azure.queue.SegmentWriteQueue.flush(SegmentWriteQueue.java:183)
>     - waiting to re-lock in wait() <0x00000006b4911830> (a java.util.concurrent.ConcurrentHashMap)
>     at org.apache.jackrabbit.oak.segment.azure.AzureSegmentArchiveWriter.flush(AzureSegmentArchiveWriter.java:187)
>     at org.apache.jackrabbit.oak.segment.file.tar.TarWriter.flush(TarWriter.java:186)
>     - locked <0x00000006b4911960> (a java.lang.Object)
>     at org.apache.jackrabbit.oak.segment.file.tar.TarFiles.flush(TarFiles.java:535)
>     at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$tryFlush$9(FileStore.java:359)
>     at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$232/0x000000080067ac40.flush(Unknown Source)
>     at org.apache.jackrabbit.oak.segment.file.TarRevisions.doFlush(TarRevisions.java:236)
>     at org.apache.jackrabbit.oak.segment.file.TarRevisions.tryFlush(TarRevisions.java:216)
>     at org.apache.jackrabbit.oak.segment.file.FileStore.tryFlush(FileStore.java:357)
>     at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$new$5(FileStore.java:212)
>     at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$203/0x000000080064b440.run(Unknown Source)
>     at org.apache.jackrabbit.oak.segment.file.SafeRunnable.run(SafeRunnable.java:67)
>     at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.3/Executors.java:515)
>     at java.util.concurrent.FutureTask.runAndReset(java.base@11.0.3/FutureTask.java:305)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11.0.3/ScheduledThreadPoolExecutor.java:305)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.3/ThreadPoolExecutor.java:1128)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.3/ThreadPoolExecutor.java:628)
>     at java.lang.Thread.run(java.base@11.0.3/Thread.java:834)
> {noformat}
> Here is the test case that demonstrates the problem.
> [^test.patch]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)