[ 
https://issues.apache.org/jira/browse/OAK-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322672#comment-14322672
 ] 

Michael Dürig commented on OAK-2294:
------------------------------------

Good stuff, thanks for looking into this! One minor nag: the segment overflow 
check in {{Segment#getRefCount}} will never trigger as {{n & 0xff > 255}} never 
holds: masking with {{0xff}} limits the value to the range 0..255.
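
To illustrate why that check is dead code, here is a minimal sketch. The class 
name, the method names and the MAX_REFS constant are made up for illustration 
and are not taken from the Oak source; only the masked comparison mirrors the 
expression quoted above, written with the explicit parentheses Java requires.

```java
final class RefCountCheck {
    // Hypothetical constant standing in for the 255 segment references limit.
    static final int MAX_REFS = 255;

    // Shape of the check discussed above: the mask caps the value at 255,
    // so the comparison is false for every possible n.
    static boolean maskedCheck(int n) {
        return (n & 0xff) > 255;
    }

    // One possible alternative: compare the unmasked count against the limit
    // before it is truncated to a single byte.
    static boolean overflows(int n) {
        return n > MAX_REFS;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 255, 256, 300, 100_000}) {
            System.out.printf("n=%d masked=%b unmasked=%b%n",
                    n, maskedCheck(n), overflows(n));
        }
    }
}
```

Whether the count is still available unmasked at the point of the check 
depends on the actual code, so this only shows the principle.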

On a more general note, would it make sense to also explore the viability of an 
alternative solution where we'd lift the 255 segment references limit? One way 
to do this would be to reserve the last slot as an "extension slot" pointing to 
more references. Really not sure whether this would be time well spent or 
whether we'd rather go with what you did.
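
As a rough idea of what such an extension slot could look like, here is an 
in-memory sketch. Everything in it is an assumption for illustration: the 
class ExtensibleRefTable, the block size and the use of strings as segment 
ids are hypothetical and say nothing about how the on-disk format would be 
laid out. Each block offers 255 slots, of which the last one is reserved for 
the link to the next block, leaving 254 direct references per block.

```java
import java.util.ArrayList;
import java.util.List;

final class ExtensibleRefTable {
    static final int SLOTS = 255;        // slots per block (the current limit)
    static final int DIRECT = SLOTS - 1; // last slot is reserved as extension slot

    // Following an extension slot is modelled as moving to the next list entry.
    private final List<String[]> blocks = new ArrayList<>();
    private int count;

    // Adds a reference and returns its index; opens a new block when needed.
    int add(String segmentId) {
        int block = count / DIRECT;
        if (block == blocks.size()) {
            blocks.add(new String[DIRECT]);
        }
        blocks.get(block)[count % DIRECT] = segmentId;
        return count++;
    }

    // Resolves an index by walking to the right block first.
    String get(int index) {
        if (index < 0 || index >= count) {
            throw new IllegalStateException("RefId '" + index + "' doesn't exist");
        }
        return blocks.get(index / DIRECT)[index % DIRECT];
    }

    int size() {
        return count;
    }

    int blockCount() {
        return blocks.size();
    }

    public static void main(String[] args) {
        ExtensibleRefTable table = new ExtensibleRefTable();
        for (int i = 0; i < 1000; i++) {
            table.add("segment-" + i);
        }
        System.out.println(table.size() + " refs in " + table.blockCount() + " blocks");
    }
}
```

The cost of this approach is an extra indirection for references beyond the 
first block, which is part of what would need to be evaluated.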

In both cases we still need to tackle the issue with backward compatibility and 
migration as we are changing the storage format. Migration could be done as we 
go: accept the old and new format when reading and write the new format when 
writing. Proactive migration could then be done by running offline compaction 
once. What's more difficult is preventing previous Oak versions from running 
with the new storage format. The segment header includes a version field, which 
we could use. Unfortunately, none of the current Oak versions check that field, 
effectively pretending to be infinitely forward compatible :( 
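
The "migrate as we go" idea combined with a version check could be sketched 
roughly like this. The class SegmentFormatCheck, its methods and the version 
numbers 10 and 11 are placeholders invented for this example; they are not 
the values or APIs Oak uses.

```java
final class SegmentFormatCheck {
    // Placeholder version numbers, not the real values of the header field.
    static final int OLD_VERSION = 10;
    static final int NEW_VERSION = 11;

    // Readers accept both the old and the new format.
    static boolean canRead(int version) {
        return version == OLD_VERSION || version == NEW_VERSION;
    }

    // A reader that checks the version field fails fast on unknown formats
    // instead of silently misinterpreting the segment.
    static int checkReadable(int version) {
        if (!canRead(version)) {
            throw new IllegalStateException("Unsupported segment version: " + version);
        }
        return version;
    }

    // Writers always emit the new format, so rewriting a segment
    // (for example during offline compaction) migrates it.
    static int rewrite(int version) {
        checkReadable(version);
        return NEW_VERSION;
    }

    public static void main(String[] args) {
        System.out.println("old -> " + rewrite(OLD_VERSION));
        System.out.println("new -> " + rewrite(NEW_VERSION));
    }
}
```

Such a check only protects versions that ship with it, which is exactly the 
problem with the releases that are already out.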



> Corrupt repository after concurrent version operations
> ------------------------------------------------------
>
>                 Key: OAK-2294
>                 URL: https://issues.apache.org/jira/browse/OAK-2294
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segmentmk
>            Reporter: Michael Dürig
>            Assignee: Alex Parvulescu
>              Labels: corruption
>             Fix For: 1.2, 1.0.12
>
>         Attachments: OAK-2294-2.patch, OAK-2294-v3.patch, OAK-2294.patch
>
>
> Performing version operations (checkin / checkout / addVersionLabel) 
> concurrently can corrupt the repository. 
> Executing the following code in parallel from multiple threads demonstrates 
> this:
> {code}
> Version version = versionManager.checkin(vPath);
> versionManager.checkout(vPath);
> String label = version.getName() + " " + Thread.currentThread().getName();
> version.getContainingHistory()
>     .addVersionLabel(version.getName(), label, true);
> {code}
> In my tests this eventually led to all sorts of exceptions:
> {noformat}
> java.lang.IllegalStateException: RefId '85' doesn't exist in data segment 0c5c0814-902c-429c-ad41-cd82aea276a2
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.getRefId(Segment.java:196)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.internalReadRecordId(Segment.java:307)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.readRecordId(Segment.java:303)
>       at org.apache.jackrabbit.oak.plugins.segment.MapRecord.getBucketList(MapRecord.java:134)
>       at org.apache.jackrabbit.oak.plugins.segment.MapRecord.getEntries(MapRecord.java:347)
>       at org.apache.jackrabbit.oak.plugins.segment.MapRecord.getEntries(MapRecord.java:325)
>       at org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:474)
>       at org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:394)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:544)
> ...
> {noformat}
> {noformat}
> java.lang.IllegalStateException: String is too long: 2159501163930351661
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.loadString(Segment.java:352)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.readString(Segment.java:319)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.readString(Segment.java:313)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.loadTemplate(Segment.java:418)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.readTemplate(Segment.java:367)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.readTemplate(Segment.java:361)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.getTemplate(SegmentNodeState.java:78)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:408)
> ...
> {noformat}
> {noformat}
> java.lang.IllegalStateException
>       at com.google.common.base.Preconditions.checkState(Preconditions.java:134)
>       at org.apache.jackrabbit.oak.plugins.segment.file.TarWriter.writeEntry(TarWriter.java:206)
>       at org.apache.jackrabbit.oak.plugins.segment.file.TarWriter.writeEntry(TarWriter.java:200)
>       at org.apache.jackrabbit.oak.plugins.segment.file.FileStore.writeSegment(FileStore.java:682)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.flush(SegmentWriter.java:228)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.prepare(SegmentWriter.java:329)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeTemplate(SegmentWriter.java:969)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1039)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1062)
>       at org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:395)
> ...
> {noformat}
> {noformat}
> Caused by: java.lang.IllegalArgumentException: Invalid type tag: 81
>       at org.apache.jackrabbit.oak.api.Type.fromTag(Type.java:202)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.loadTemplate(Segment.java:418)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.readTemplate(Segment.java:367)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.readTemplate(Segment.java:361)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.getTemplate(SegmentNodeState.java:78)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.getProperty(SegmentNodeState.java:122)
> ...
> {noformat}
> {noformat}
> Caused by: java.lang.IllegalStateException
>       at com.google.common.base.Preconditions.checkState(Preconditions.java:134)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.pos(Segment.java:178)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.loadString(Segment.java:326)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.readString(Segment.java:319)
>       at org.apache.jackrabbit.oak.plugins.segment.Segment.readString(Segment.java:313)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentPropertyState.getValue(SegmentPropertyState.java:174)
>       at org.apache.jackrabbit.oak.plugins.segment.SegmentPropertyState.getValue(SegmentPropertyState.java:147)
>       at org.apache.jackrabbit.oak.plugins.memory.AbstractPropertyState.equal(AbstractPropertyState.java:53)
> ...
> {noformat}
> Will attach a patch with a test case shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
