Michael Dürig created OAK-7867:
----------------------------------
Summary: Flush thread gets stuck when input stream of binaries blocks
Key: OAK-7867
URL: https://issues.apache.org/jira/browse/OAK-7867
Project: Jackrabbit Oak
Issue Type: Bug
Components: segment-tar
Reporter: Michael Dürig
Assignee: Michael Dürig
Fix For: 1.10
This issue tackles the root cause of the severe data loss that has been reported
in OAK-7852:
When the input stream of a binary value blocks indefinitely on read, the flush
thread of the segment store gets blocked:
{noformat}
"pool-2-thread-1" #15 prio=5 os_prio=31 tid=0x00007fb0f21e3000 nid=0x5f03 waiting on condition [0x000070000a46d000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x000000076bba62b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at com.google.common.util.concurrent.Monitor.await(Monitor.java:963)
        at com.google.common.util.concurrent.Monitor.enterWhen(Monitor.java:402)
        at org.apache.jackrabbit.oak.segment.SegmentBufferWriterPool.safeEnterWhen(SegmentBufferWriterPool.java:179)
        at org.apache.jackrabbit.oak.segment.SegmentBufferWriterPool.flush(SegmentBufferWriterPool.java:138)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter.flush(DefaultSegmentWriter.java:138)
        at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$doFlush$8(FileStore.java:307)
        at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$22/1345968304.flush(Unknown Source)
        at org.apache.jackrabbit.oak.segment.file.TarRevisions.doFlush(TarRevisions.java:237)
        at org.apache.jackrabbit.oak.segment.file.TarRevisions.flush(TarRevisions.java:195)
        at org.apache.jackrabbit.oak.segment.file.FileStore.doFlush(FileStore.java:306)
        at org.apache.jackrabbit.oak.segment.file.FileStore.flush(FileStore.java:318)
{noformat}
The condition {{0x000070000a46d000}} is waiting for the following thread to
return its {{SegmentBufferWriter}}, which will never happen if
{{InputStream.read(...)}} does not progress.
{noformat}
"pool-1-thread-1" #14 prio=5 os_prio=31 tid=0x00007fb0f223a800 nid=0x5d03 runnable [0x000070000a369000]
   java.lang.Thread.State: RUNNABLE
        at com.google.common.io.ByteStreams.read(ByteStreams.java:833)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.internalWriteStream(DefaultSegmentWriter.java:641)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeStream(DefaultSegmentWriter.java:618)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeBlob(DefaultSegmentWriter.java:577)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeProperty(DefaultSegmentWriter.java:691)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeProperty(DefaultSegmentWriter.java:677)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeNodeUncached(DefaultSegmentWriter.java:900)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeNode(DefaultSegmentWriter.java:799)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.access$800(DefaultSegmentWriter.java:252)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$8.execute(DefaultSegmentWriter.java:240)
        at org.apache.jackrabbit.oak.segment.SegmentBufferWriterPool.execute(SegmentBufferWriterPool.java:105)
        at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter.writeNode(DefaultSegmentWriter.java:235)
        at org.apache.jackrabbit.oak.segment.SegmentWriter.writeNode(SegmentWriter.java:79)
{noformat}
This issue is critical: such a misbehaving input stream causes the flush
thread to get stuck, which prevents transient segments from being flushed and
thus causes data loss.
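
The interaction between the two threads can be illustrated with a minimal, JDK-only sketch. It is not Oak code: {{FlushStallDemo}}, {{BlockingInputStream}} and {{flushCompletes}} are hypothetical names, a {{Semaphore}} permit stands in for the borrowed {{SegmentBufferWriter}}, and the flush side uses a timeout only so that the demo terminates (the stack trace above shows the real flush waiting without one).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class FlushStallDemo {

    /** Hypothetical stream whose read() blocks until release() is called. */
    static final class BlockingInputStream extends InputStream {
        private final CountDownLatch released = new CountDownLatch(1);

        @Override
        public int read() throws IOException {
            try {
                released.await(); // simulates a read that makes no progress
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while blocked", e);
            }
            return -1; // end of stream once unblocked
        }

        void release() {
            released.countDown();
        }
    }

    /**
     * A writer thread borrows the single "buffer writer" (the permit), then
     * consumes the stream. The caller plays the flush thread and waits for
     * the permit to be returned. Returns true if that happens in time.
     */
    static boolean flushCompletes(InputStream in, long timeoutMillis)
            throws InterruptedException {
        Semaphore bufferWriter = new Semaphore(1); // stand-in for a SegmentBufferWriter
        CountDownLatch borrowed = new CountDownLatch(1);

        Thread writer = new Thread(() -> {
            bufferWriter.acquireUninterruptibly(); // borrow the writer
            borrowed.countDown();
            try {
                while (in.read() != -1) {
                    // consume the binary; a blocking stream never gets past read()
                }
            } catch (IOException ignored) {
                // irrelevant for this sketch
            } finally {
                bufferWriter.release(); // return the writer
            }
        }, "writer");
        writer.setDaemon(true); // do not keep the JVM alive if the stream never unblocks
        writer.start();

        borrowed.await(); // make sure the writer is borrowed before "flushing"
        boolean flushed = bufferWriter.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        if (flushed) {
            bufferWriter.release();
        }
        return flushed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("well-behaved stream, flush completes: "
                + flushCompletes(new ByteArrayInputStream(new byte[16]), 1000));
        System.out.println("blocking stream, flush completes: "
                + flushCompletes(new BlockingInputStream(), 200));
    }
}
```

With a well-behaved stream the writer is returned and the simulated flush goes through. With the blocking stream the permit is never released, so the simulated flush gives up after its timeout, where the real flush thread would keep waiting.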