[ 
https://issues.apache.org/jira/browse/OAK-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206859#comment-15206859
 ] 

Michael Dürig commented on OAK-4099:
------------------------------------

Something along the lines of approach B should solve this. However, the 
callbacks of the {{GCMonitor}} are meant for monitoring, logging, and similar 
purposes. Putting anything significant there will result in brittle coupling 
with the gc process. E.g. those callbacks are sometimes triggered while locks 
are held, which can easily lead to deadlocks. Have a look at 
{{org.apache.jackrabbit.oak.jcr.repository.RepositoryImpl.RefreshOnGC}}, which 
does something similar for refreshing the sessions with observation listeners 
after a gc. This implementation simply sets a flag so the session knows it needs 
to refresh itself the next time it is accessed. 
Furthermore, there is no guarantee that these callbacks are called "on time". 
That is, by the time you get the callback, an underlying reference might 
already be gone. 
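
To illustrate the flag-based pattern described above, here is a minimal sketch. All names in it ({{GcCallback}}, {{LazyRefreshSession}}, {{read}}) are hypothetical stand-ins: this is not the actual {{RefreshOnGC}} code and not Oak's {{GCMonitor}} interface, just the general shape of "do nothing in the callback except record that a gc happened".

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a gc notification callback (NOT Oak's GCMonitor). */
interface GcCallback {
    void compacted();
}

/** Hypothetical session-like object that refreshes lazily after a gc. */
class LazyRefreshSession implements GcCallback {
    // Written by the gc thread, read by the thread using the session.
    private final AtomicBoolean refreshPending = new AtomicBoolean(false);

    // Only touched by the thread using the session.
    private int refreshCount = 0;

    @Override
    public void compacted() {
        // May be invoked while the gc process holds locks, so keep it trivial:
        // no I/O, no locking, no calls back into the store. Just set the flag.
        refreshPending.set(true);
    }

    /** Any access checks the flag first and refreshes at most once per pending gc. */
    public String read(String path) {
        if (refreshPending.compareAndSet(true, false)) {
            refresh();
        }
        return "content of " + path;
    }

    private void refresh() {
        // Placeholder for dropping stale references and re-reading the current state.
        refreshCount++;
    }

    public int getRefreshCount() {
        return refreshCount;
    }
}
```

The actual refresh runs on the thread that uses the session, outside of whatever locks the gc process holds. Note that this only addresses the deadlock concern; since the callback may arrive late, the accessing code can still hit a reference that is already gone before the flag is seen.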

Like I said, with OAK-3348 this is all in flux. Once we have a better 
understanding of where these things move, we also need to discuss how the rest 
of the stack should handle the retention policy of the underlying store. 
Ironically, this was already the topic of OAK-114, where the initial idea was 
[10 minutes should suffice | 
https://issues.apache.org/jira/browse/OAK-114?focusedCommentId=13406509&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13406509].

> Lucene index appear to be corrupted with compaction enabled
> -----------------------------------------------------------
>
>                 Key: OAK-4099
>                 URL: https://issues.apache.org/jira/browse/OAK-4099
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>            Priority: Minor
>              Labels: resilience
>             Fix For: 1.6
>
>
> While running on SegmentNodeStore with online compaction enabled, it can 
> happen that access to the Lucene index starts failing with 
> SegmentNotFoundException
> {noformat}
> Caused by: 
> org.apache.jackrabbit.oak.plugins.segment.SegmentNotFoundException: Segment 
> a949519a-8903-44f9-a17e-b6d83fb32186 not found
>        at 
> org.apache.jackrabbit.oak.plugins.segment.file.FileStore.readSegment(FileStore.java:870)
>        at 
> org.apache.jackrabbit.oak.plugins.segment.SegmentTracker.getSegment(SegmentTracker.java:136)
>        at 
> org.apache.jackrabbit.oak.plugins.segment.SegmentId.getSegment(SegmentId.java:108)
>        at 
> org.apache.jackrabbit.oak.plugins.segment.Record.getSegment(Record.java:82)
>        at 
> org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:64)
>        at 
> org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:259)
>        at 
> org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:307)
>        at 
> org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:404)
>        at 
> org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:411)
>        at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
>        at 
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2397)
>        at 
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekCeil(BlockTreeTermsReader.java:1973)
>        at 
> org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:225)
>        at 
> org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:78)
>        at 
> org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
>        at 
> org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
>        at 
> org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
>        at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:418)
>        at 
> org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:636)
>        at 
> org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:683)
>        at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)
> {noformat}
> The above segmentId was mentioned in the compaction log
> {noformat}
> 06.03.2016 02:03:30.706 *INFO* [TarMK flush thread 
> [/app/repository/segmentstore], active since Sun Mar 06 02:03:29 GMT 2016, 
> previous max duration 8218ms] 
> org.apache.jackrabbit.oak.plugins.segment.file.TarReader-GC Cleaned segments 
> from data00233a.tar:
>        37ec786e-a9f7-46eb-a3b5-ce5d4777ea01, 
> f36051fe-d8c4-46d1-ac1d-081946389eb6, fae91ff2-8ca6-4ac1-a8d8-d4bd09b7f6a6, 
> 16d87f09-721b-4155-a9c8-b8ecf471bfc3,
>        e641f1a3-b323-44e6-aad0-7b894a1efb69, 
> edc9d141-6c05-42c9-a2a2-d7130fd9c826, b602372c-b17a-448a-a8e9-8bdccc64fb82, 
> acc2f032-07ba-46ed-a9c7-d3a05ab53d7a,
>        a7323ed2-b2de-4006-ae51-e4f84165a0e4, 
> cb320c70-5ca9-4ed1-a972-e87a6bba9f9b, f45afd7e-5417-42dd-a2f7-4624f74b6c6e, 
> c66f66ef-cdd0-4327-abc6-bf910cb5768d,
>        7f925a07-ff56-4613-ac8f-272a0e481926, 
> 4ad044ec-3b2d-4c3e-aeb0-d5f5a04bc23e, 82f1c3aa-2e0c-421c-a033-e4ffcb6002c7, 
> 1387655b-f633-4011-a55c-d9580e40929b,
>        c50c94fc-2e8b-4904-a37f-0a33cc001312, 
> 7915e9ce-bb9d-4628-ad6f-e7f2844b2399, e7cd013b-a147-426a-af29-fa025058a08a, 
> f16d43b0-2113-4808-aea6-5910102e5c7d,
> ...
> *31edad2e-e14b-463d-a6af-540bac6009f1*,
> ...,
> *a949519a-8903-44f9-a17e-b6d83fb32186*,
> ...
> {noformat}
> *Note that the system recovered after a restart, so the corruption was transient*



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
