hi shaun,

are you sure that this is a 1.3.1-specific issue?
i remember an earlier post where you described the same problem,
but apparently you weren't using 1.3.1:
http://www.nabble.com/Strange-%22ignoring-nonexistent-item%22-and-removeitem-fails-tf4169086.html

On 8/17/07, sbarriba <[EMAIL PROTECTED]> wrote:
> Hi Stefan et al,
> Further update on this, plus some answers to your questions.
>
> The consistency check and fix logic in JackRabbit 1.3.1 solved all but 1 of
> the issues. However, although the log reports the remaining issue has been
> fixed each time, this message appears after repeated restarts :(
>
> org.apache.jackrabbit.core.persistence.bundle.BundleDbPersistenceManager -
> acme: checked 1000/0 bundles...
> org.apache.jackrabbit.core.persistence.bundle.BundleDbPersistenceManager -
> NodeState fe75116c-5617-423b-8c9a-4a964b667f20 references unexistent child
> {http://www.acme.co.uk/xmlns/contentmodel}components with id
> d3c09b52-d3be-4d3c-8807-b7827d337973
> org.apache.jackrabbit.core.persistence.bundle.BundleDbPersistenceManager -
> acme: checked 2000/0 bundles...
> org.apache.jackrabbit.core.persistence.bundle.BundleDbPersistenceManager -
> acme: Fixing 1 inconsistent bundle(s)...
> org.apache.jackrabbit.core.persistence.bundle.BundleDbPersistenceManager -
> acme: Fixing bundle fe75116c-5617-423b-8c9a-4a964b667f20
> org.apache.jackrabbit.core.persistence.bundle.BundleDbPersistenceManager -
> acme: checked 2505/0 bundles.
>
> Is the consistency checker the only way to fix up these problems, or is
> there any way we can 'open the hood' to investigate further?

only by getting your hands really dirty and by delving deep into the code...

>
> Stefan wrote:
> "did you notice anything peculiar about the corrupt nodes? is there a chance
> to reconstruct the steps that lead to this state?"
>
> What tools would you recommend using to review the corrupt nodes? We only
> currently use the command contrib. project.
>
> Reproducing this scenario is proving really difficult.
> The original corruption occurred when a user was creating a particularly
> complex node object which included the creation, deletion and re-ordering of
> various same-name-siblings. After several hours of attempts we have yet to
> reproduce the event. Frustrating, but we know it's occurred at least twice.
>
> "furthermore, could you please share some details about your
> config/deployment?"
>
> Sure.
> - JackRabbit 1.3.1
> - MySql Bundle Persistence Manager
> - Clustered across 2 nodes - only 1 node is read-write, the other is
>   read-only to the repos
> - Spring used to provide a JackRabbit JCRSessionInHttpSession pattern
>   for the editors who are using a web-based UI.

i am not familiar with this. how is the repository instance accessed/created?
can you rule out the possibility that a 3rd r/w non-cluster aware instance
is created?

> - MySql 5.0.45
> - Tomcat 5.0.30
> - Sun JDK 1.5
> - Redhat Enterprise Linux
>
> All suggestions welcome.

hmm, just a few random guesses.... could be

- a bundle db pm-related issue
- a clustering- or clustering-config related issue
- an issue caused by multiple r/w jackrabbit instances accessing the same db
- a jr core issue

since this is a rather sophisticated setup it's not gonna be easy to
investigate. however, we'd definitely need more information about
the operations that lead to the corrupt state.

btw: please feel free to create a jira issue.

cheers
stefan

> Regards,
> Shaun
>
>
> -----Original Message-----
> From: Stefan Guggisberg [mailto:[EMAIL PROTECTED]
> Sent: 17 August 2007 11:26
> To: [email protected]
> Subject: Re: Node corruption using Jackrabbit 1.3.1?
>
> hi shaun,
>
> On 8/16/07, sbarriba <[EMAIL PROTECTED]> wrote:
> > Hi all,
> >
> > We upgraded to JackRabbit 1.3.1 a few days ago.
> >
> > We have since seen a couple of occasions where we've been able to get the
> > repository in an indeterminate state.
> > The following output shows the state
> > of a node which has an ordered child node property called acme:components
> > e.g.
> >
> > [miq:FooBar] > nt:base
> > orderable
> > + acme:components (acme:Component) multiple COPY
> >
> > We have an instance of FooBar where acme:components[5] has disappeared??
> >
> > e.g.
> >
> > name                           type            node      new       modified
> > ------------------------------ --------------- --------- --------- ---------
> > acme:components                acme:Section    true      false     false
> > acme:components[2]             acme:Text       true      false     false
> > acme:components[3]             acme:Text       true      false     false
> > acme:components[4]             acme:Text       true      false     false
> > acme:components[6]             acme:Section    true      false     false
> > acme:components[7]             acme:Section    true      false     false
> > jcr:created                    Date            false     false     false
> > jcr:primaryType                Name            false     false     false
> > jcr:uuid                       String          false     false     false
> >
> > I presume this could happen if the deletion of the child node succeeded but
> > the saving of the parent FooBar node failed for some reason?
>
> that should not be possible since the changelog of a save operation is stored
> atomically. if an error occurs during processing of the change log all
> previous changes are rolled back.
>
> >
> > Surely this is a state that should never happen?
>
> absolutely, and the problem you're describing is very alarming indeed!
>
> did you notice anything peculiar about the corrupt nodes? is there a chance
> to reconstruct the steps that lead to this state?
>
> furthermore, could you please share some details about your
> config/deployment?
>
> the only possible explanation i can currently come up with is
> that there are multiple jackrabbit instances accessing the same
> database...
>
> cheers
> stefan
>
> >
> > Regards,
> >
> > Shaun
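
The child listing quoted above skips from acme:components[4] to
acme:components[6]. Since JCR same-name-sibling indices are 1-based and
normally contiguous, a gap like that is one visible symptom of the corruption
discussed in this thread. The following is only a rough sketch of how such
gaps could be spotted in a dumped listing. It uses plain Java with no
Jackrabbit or JCR calls, and the class name SnsGapCheck and the method
missingIndices are made up for illustration. It works on child names as
strings and says nothing about the cause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jackrabbit: scans child node names
// as printed in a listing and reports missing same-name-sibling indices.
public class SnsGapCheck {
    // Matches "name" or "name[index]"; a name without an index counts as index 1.
    private static final Pattern CHILD = Pattern.compile("^(.+?)(?:\\[(\\d+)\\])?$");

    /** Returns, per child name, the sibling indices absent below the highest one seen. */
    public static Map<String, List<Integer>> missingIndices(List<String> childNames) {
        Map<String, TreeSet<Integer>> seen = new TreeMap<>();
        for (String child : childNames) {
            Matcher m = CHILD.matcher(child);
            if (!m.matches()) {
                continue; // skip empty or unparsable entries
            }
            int index = (m.group(2) == null) ? 1 : Integer.parseInt(m.group(2));
            seen.computeIfAbsent(m.group(1), k -> new TreeSet<>()).add(index);
        }
        Map<String, List<Integer>> gaps = new TreeMap<>();
        for (Map.Entry<String, TreeSet<Integer>> e : seen.entrySet()) {
            List<Integer> missing = new ArrayList<>();
            for (int i = 1; i <= e.getValue().last(); i++) {
                if (!e.getValue().contains(i)) {
                    missing.add(i);
                }
            }
            if (!missing.isEmpty()) {
                gaps.put(e.getKey(), missing);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Child node names copied from the listing in the quoted message.
        List<String> children = List.of(
                "acme:components", "acme:components[2]", "acme:components[3]",
                "acme:components[4]", "acme:components[6]", "acme:components[7]");
        System.out.println(missingIndices(children)); // prints {acme:components=[5]}
    }
}
```

Running this kind of check over listings taken before and after an editing
session might help narrow down which operation leaves the gap behind, which is
the information asked for above.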
