[
https://issues.apache.org/jira/browse/OAK-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346451#comment-14346451
]
Chetan Mehrotra edited comment on OAK-2570 at 3/4/15 6:13 AM:
--------------------------------------------------------------
The [attached patch|^OAK-2570.patch] wraps the {{LuceneIndexProvider}} with a {{BackgroundObserver}}.
[~mduerig] Can you review the usage of {{BackgroundObserver}}? In this case it is not required to distinguish between local and external changes, so it is fine if content changes are collapsed; hence I set the queue size to 5. Probably even a size of 1 would be better. Possibly {{BackgroundObserver}} could provide another option to indicate that diffs are always merged while the current one is being processed (similar to size = 1).
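To illustrate the "size = 1 with merge" semantics suggested above, here is a minimal, self-contained sketch: the commit thread never blocks, and a newer root state simply replaces a still-pending one, which is safe here because the latest state subsumes earlier ones. All names are illustrative, not the actual Oak {{BackgroundObserver}} API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class CollapsingObserver {
    // at most one pending state; a newer state overwrites the older one
    private final AtomicReference<String> pending = new AtomicReference<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    final AtomicInteger processed = new AtomicInteger();

    // called on the commit thread: never blocks
    public void contentChanged(String rootState) {
        if (pending.getAndSet(rootState) == null) {
            // no worker is draining, so schedule one
            executor.execute(() -> {
                String s;
                while ((s = pending.getAndSet(null)) != null) {
                    process(s);
                }
            });
        }
    }

    private void process(String rootState) {
        processed.incrementAndGet(); // stand-in for the expensive index reopen
    }

    public void shutdown() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        CollapsingObserver o = new CollapsingObserver();
        for (int i = 0; i < 1000; i++) {
            o.contentChanged("r" + i);
        }
        o.shutdown();
        // intermediate states were collapsed; the last one is always processed
        System.out.println("processed " + o.processed.get() + " of 1000 states");
    }
}
```

The key property is that a state is only ever dropped when a newer one has replaced it, so the observer always converges on the latest revision.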
[~alex.parvulescu] This would make writing tests a bit difficult, as it would not be easy to determine when the index has caught up. There are two options:
# Disable the usage of {{BackgroundObserver}} via config, i.e. on test setup set {{enableOpenIndexAsync}} to false
# Use a different approach. Currently we record the time at which the index writer is closed; we can surface that information as part of the JMX stats for Lucene
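For the second option, a test could poll the surfaced timestamp until the index has caught up past a known point in time. A rough sketch, where {{IndexStats}} and {{awaitIndexedSince}} are hypothetical names, not the actual Oak API:

```java
import java.util.concurrent.atomic.AtomicLong;

public class IndexStats {
    // updated whenever the index writer is closed; would be exposed via JMX
    private final AtomicLong lastWriterClose = new AtomicLong(-1);

    public void writerClosed() {
        lastWriterClose.set(System.currentTimeMillis());
    }

    // a test polls until the index has been (re)opened after the given time
    public boolean awaitIndexedSince(long sinceMillis, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (lastWriterClose.get() >= sinceMillis) {
                return true;
            }
            Thread.sleep(10);
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        IndexStats stats = new IndexStats();
        long mark = System.currentTimeMillis();
        // simulate the async index writer closing shortly after a commit
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) { }
            stats.writerClosed();
        }).start();
        System.out.println("caught up: " + stats.awaitIndexedSince(mark, 2000));
    }
}
```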
[~mreutegg] [~mduerig] [~alex.parvulescu] [~mmarth] Given that such an operation might cause delays in some critical code paths, I would like to use {{BackgroundObserver}} by default. Alternatively, for safety, we could expose the OSGi config and only enable this on systems where such slowness is observed, enabling it by default later. On trunk we would keep it enabled.
was (Author: chetanm):
The [attached patch|^OAK-2570.patch] wraps the {{LuceneIndexProvider}} with a {{BackgroundObserver}}.
[~mduerig] Can you review the usage of {{BackgroundObserver}}?
[~alex.parvulescu] This would make writing tests a bit difficult, as it would not be easy to determine when the index has caught up. There are two options:
* Disable the usage of {{BackgroundObserver}} via config, i.e. on test setup set {{enableOpenIndexAsync}} to false
* Use a different approach. Currently we record the time at which the index writer is closed; we can surface that information as part of the JMX stats for Lucene
[~mreutegg] [~mduerig] [~alex.parvulescu] [~mmarth] Given that such an operation might cause delays in some critical code paths, I would like to use {{BackgroundObserver}} by default. Alternatively, for safety, we could expose the OSGi config and only enable this on systems where such slowness is observed, enabling it by default later. On trunk we would keep it enabled.
> Open indexes in IndexTracker non blocking way
> ---------------------------------------------
>
> Key: OAK-2570
> URL: https://issues.apache.org/jira/browse/OAK-2570
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: oak-lucene
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.2, 1.0.12
>
> Attachments: OAK-2570.patch
>
>
> Currently {{IndexTracker}} opens the indexes in the same thread where the commit
> happens, as it acts as an {{Observer}}. If opening the index takes a long time,
> it blocks the commit thread and might delay other operations on the same thread.
> For example, {{DocumentNodeStore}} invokes the {{Observer}} in
> {{backgroundRead}}, and that call does not complete until the {{Observer}}
> returns. For the system to stay responsive, {{backgroundRead}} has to be
> quick, as it holds the write lock. Due to the current implementation, the
> system at times turns unresponsive.
> Looking at a thread dump from one such unresponsive system, it appears that
> the slowness is due to the Lucene index taking time to reopen.
> The thread which holds the {{backgroundOperationLock}} for {{read}} is:
> {noformat}
> "pool-9-thread-1" Id=101 in RUNNABLE
>   at java.io.FileInputStream.open(Native Method)
>   at java.io.FileInputStream.<init>(FileInputStream.java:146)
>   at org.apache.jackrabbit.core.data.LazyFileInputStream.open(LazyFileInputStream.java:104)
>   at org.apache.jackrabbit.core.data.LazyFileInputStream.read(LazyFileInputStream.java:163)
>   at com.google.common.io.ByteStreams.read(ByteStreams.java:828)
>   at com.google.common.io.ByteStreams.readFully(ByteStreams.java:695)
>   at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:218)
>   at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
>   at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
>   at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
>   at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
>   at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
>   at org.apache.lucene.util.fst.FST.<init>(FST.java:318)
>   at org.apache.lucene.util.fst.FST.<init>(FST.java:304)
>   at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader.<init>(BlockTreeTermsReader.java:484)
>   at org.apache.lucene.codecs.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:176)
>   at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:437)
>   at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
>   at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
>   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
>   at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
>   at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
>   at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
>   at org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.<init>(IndexNode.java:94)
>   at org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.open(IndexNode.java:62)
>   at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker$1.leave(IndexTracker.java:96)
>   at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:153)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.dispatch(DocumentNodeState.java:433)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:260)
>   at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.dispatch(DocumentNodeState.java:433)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:260)
>   at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52)
>   at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:105)
>   - locked org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker@6243264e
>   at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider.contentChanged(LuceneIndexProvider.java:61)
>   at org.apache.jackrabbit.oak.spi.commit.CompositeObserver.contentChanged(CompositeObserver.java:53)
>   - locked org.apache.jackrabbit.oak.spi.commit.CompositeObserver@2687fd48
>   at org.apache.jackrabbit.oak.spi.commit.ChangeDispatcher.contentChanged(ChangeDispatcher.java:79)
>   - locked org.apache.jackrabbit.oak.spi.commit.ChangeDispatcher@2a0595bb
>   at org.apache.jackrabbit.oak.plugins.document.CommitQueue.afterTrunkCommit(CommitQueue.java:131)
>   - locked org.apache.jackrabbit.oak.plugins.document.CommitQueue@54a93d8a
>   at org.apache.jackrabbit.oak.plugins.document.CommitQueue.done(CommitQueue.java:83)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.done(DocumentNodeStore.java:586)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:187)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:91)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:39)
>   at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$InMemory.merge(AbstractNodeStoreBranch.java:525)
>   at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch.merge(AbstractNodeStoreBranch.java:318)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.merge(DocumentNodeStoreBranch.java:135)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.merge(DocumentRootBuilder.java:159)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.merge(DocumentNodeStore.java:1339)
>   at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.mergeWithConcurrencyCheck(AsyncIndexUpdate.java:420)
>   at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:389)
>   at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:300)
>   - locked org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@401daa3f
>   at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
>   at org.quartz.core.JobRunShell.run(JobRunShell.java:207)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Further, the {{DocumentNodeStore background thread}} is also waiting to acquire
> the write lock:
> {noformat}
> "DocumentNodeStore background thread" Id=82 in WAITING on lock=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@6990d994
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
>   at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
>   at org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:156)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:1662)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundOperations(DocumentNodeStore.java:1456)
>   - locked org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore@350b4c49
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundOperation.execute(DocumentNodeStore.java:2064)
>   at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:2039)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Now, as per the javadoc of {{ReentrantReadWriteLock}}, the read lock is not
> granted while there is a waiter for the write lock. Hence in such a case no
> further reads are possible and the system does not respond well.
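This lock interaction can be reproduced in isolation. The sketch below is not Oak code; it uses a fair lock and the timed {{tryLock}} (which, unlike the untimed form, honours the wait queue) so that the outcome is deterministic: once a writer is queued, a new reader is not granted the read lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadersBlockedByQueuedWriter {
    // returns whether a second reader can take the read lock while a writer is queued
    static boolean secondReaderAcquires() throws InterruptedException {
        // fair mode makes the "reader queues behind a waiting writer" rule deterministic
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
        lock.readLock().lock(); // plays the slow Observer call holding the read lock

        Thread writer = new Thread(() -> { // plays backgroundWrite waiting for the write lock
            lock.writeLock().lock();
            lock.writeLock().unlock();
        });
        writer.start();
        while (!lock.hasQueuedThreads()) {
            Thread.sleep(5); // wait until the writer is parked in the queue
        }

        AtomicBoolean acquired = new AtomicBoolean(true);
        Thread reader = new Thread(() -> {
            try {
                // the timed tryLock honours fairness, so it waits behind the queued writer
                boolean ok = lock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
                acquired.set(ok);
                if (ok) {
                    lock.readLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();
        reader.join();

        lock.readLock().unlock(); // releasing the read lock unblocks everything
        writer.join();
        return acquired.get();
    }

    public static void main(String[] args) throws Exception {
        // prints "second reader got the lock: false"
        System.out.println("second reader got the lock: " + secondReaderAcquires());
    }
}
```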
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)