[ https://issues.apache.org/jira/browse/OAK-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15208131#comment-15208131 ]
Chetan Mehrotra commented on OAK-4099:
--------------------------------------
bq. E.g. those callbacks are sometimes triggered while holding locks, which
can easily lead to deadlocks.
Yup, so my thought was along the same lines as {{RefreshOnGC}}: upon receiving
the callback, close the current indexes in {{IndexTracker}} via an async job,
i.e. in a separate thread. With this, the next call to fetch an index would
open a new index.
The {{root}} reference would get updated in the next observation call anyway. I
can come up with a patch for this if you think this would be fine to do.
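To make the proposal concrete, here is a minimal sketch of the idea, not the actual patch. {{OpenIndex}} and {{AsyncClosingTracker}} are hypothetical stand-ins and do not reflect the real {{IndexTracker}} API. The callback only schedules the close on a separate thread, so it never does the work while the caller may be holding locks, and the next fetch lazily opens a new index. The sketch deliberately ignores readers that are still using the old index when it gets closed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for an opened index; not an Oak class. */
final class OpenIndex {
    volatile boolean closed;

    void close() {
        closed = true;
    }
}

/** Hypothetical tracker; the real IndexTracker has a different API. */
final class AsyncClosingTracker {
    private final AtomicReference<OpenIndex> current = new AtomicReference<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Returns the currently open index, lazily opening a new one if none is held. */
    OpenIndex acquire() {
        while (true) {
            OpenIndex idx = current.get();
            if (idx != null) {
                return idx;
            }
            OpenIndex fresh = new OpenIndex();
            if (current.compareAndSet(null, fresh)) {
                return fresh;
            }
            fresh.close(); // another thread opened one first; discard ours
        }
    }

    /** Invoked from the GC callback: only schedules the close, never closes inline. */
    CompletableFuture<Void> onGcCallback() {
        return CompletableFuture.runAsync(() -> {
            OpenIndex old = current.getAndSet(null);
            if (old != null) {
                old.close();
            }
        }, executor);
    }

    void shutdown() {
        executor.shutdown();
    }

    public static void main(String[] args) {
        AsyncClosingTracker tracker = new AsyncClosingTracker();
        OpenIndex before = tracker.acquire();
        tracker.onGcCallback().join(); // wait for the async close to finish
        OpenIndex after = tracker.acquire();
        System.out.println("old closed: " + before.closed
                + ", new index opened: " + (after != before));
        tracker.shutdown();
    }
}
```

The single-threaded executor is just an assumption for the sketch; any background job facility would do, as long as the close does not run on the callback thread.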
bq. Furthermore there is no guarantee that these callbacks are called "on
time". That is, once you get the callback an underlying reference might have
gone already.
Can you elaborate a bit more?
> Lucene index appear to be corrupted with compaction enabled
> -----------------------------------------------------------
>
> Key: OAK-4099
> URL: https://issues.apache.org/jira/browse/OAK-4099
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: lucene
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Priority: Minor
> Labels: resilience
> Fix For: 1.6
>
>
> While running on SegmentNodeStore with online compaction enabled, it can happen
> that access to the Lucene index starts failing with a SegmentNotFoundException:
> {noformat}
> Caused by: org.apache.jackrabbit.oak.plugins.segment.SegmentNotFoundException: Segment a949519a-8903-44f9-a17e-b6d83fb32186 not found
> at org.apache.jackrabbit.oak.plugins.segment.file.FileStore.readSegment(FileStore.java:870)
> at org.apache.jackrabbit.oak.plugins.segment.SegmentTracker.getSegment(SegmentTracker.java:136)
> at org.apache.jackrabbit.oak.plugins.segment.SegmentId.getSegment(SegmentId.java:108)
> at org.apache.jackrabbit.oak.plugins.segment.Record.getSegment(Record.java:82)
> at org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:64)
> at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:259)
> at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:307)
> at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:404)
> at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:411)
> at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
> at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2397)
> at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekCeil(BlockTreeTermsReader.java:1973)
> at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:225)
> at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:78)
> at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
> at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
> at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
> at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:418)
> at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:636)
> at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:683)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)
> {noformat}
> The above segmentId was mentioned in the compaction log
> {noformat}
> 06.03.2016 02:03:30.706 *INFO* [TarMK flush thread [/app/repository/segmentstore], active since Sun Mar 06 02:03:29 GMT 2016, previous max duration 8218ms] org.apache.jackrabbit.oak.plugins.segment.file.TarReader-GC Cleaned segments from data00233a.tar:
> 37ec786e-a9f7-46eb-a3b5-ce5d4777ea01,
> f36051fe-d8c4-46d1-ac1d-081946389eb6, fae91ff2-8ca6-4ac1-a8d8-d4bd09b7f6a6,
> 16d87f09-721b-4155-a9c8-b8ecf471bfc3,
> e641f1a3-b323-44e6-aad0-7b894a1efb69,
> edc9d141-6c05-42c9-a2a2-d7130fd9c826, b602372c-b17a-448a-a8e9-8bdccc64fb82,
> acc2f032-07ba-46ed-a9c7-d3a05ab53d7a,
> a7323ed2-b2de-4006-ae51-e4f84165a0e4,
> cb320c70-5ca9-4ed1-a972-e87a6bba9f9b, f45afd7e-5417-42dd-a2f7-4624f74b6c6e,
> c66f66ef-cdd0-4327-abc6-bf910cb5768d,
> 7f925a07-ff56-4613-ac8f-272a0e481926,
> 4ad044ec-3b2d-4c3e-aeb0-d5f5a04bc23e, 82f1c3aa-2e0c-421c-a033-e4ffcb6002c7,
> 1387655b-f633-4011-a55c-d9580e40929b,
> c50c94fc-2e8b-4904-a37f-0a33cc001312,
> 7915e9ce-bb9d-4628-ad6f-e7f2844b2399, e7cd013b-a147-426a-af29-fa025058a08a,
> f16d43b0-2113-4808-aea6-5910102e5c7d,
> ...
> *31edad2e-e14b-463d-a6af-540bac6009f1*,
> ...,
> *a949519a-8903-44f9-a17e-b6d83fb32186*,
> ...
> {noformat}
> *Note that the system recovered after a restart, so the corruption was transient.*