I'm working on some tests to reproduce this issue. However, looking at the 1.2.1 code, in the commit() method of FileJournal.java, I can't find where the lock on the global revision file is acquired. Here is the code:

    public void commit() throws JournalException {
        try {
            out.writeChar('\0');
            out.close();

            long nextRevision = record.getNextRevision();

            FileRecordLog recordLog = new FileRecordLog(journal);
            if (!recordLog.isNew()) {
                if (nextRevision - recordLog.getFirstRevision() > maximumSize) {
                    switchLogs();
                    recordLog = new FileRecordLog(journal);
                }
            }
            recordLog.append(record);

            tempLog.delete();
            globalRevision.set(nextRevision);
            instanceRevision.set(nextRevision);

        } catch (IOException e) {
            String msg = "Unable to close journal log " + tempLog + ": " + e.getMessage();
            throw new JournalException(msg);
        } finally {
            out = null;
            globalRevision.unlock();
            writeMutex.release();
        }
    }

The call to globalRevision.set() (which implies a lock) is made after the call to recordLog.append(), so I think the write is not protected.

I'm rather new to JCR and Jackrabbit, so maybe I'm missing something, but the cluster feature is very important for our product. I'm going to develop some classes to test basic cluster operation and hope they help to further improve this area.

Regards,

On 19/02/07, Dominique Pfister <[EMAIL PROTECTED]> wrote:
Hi Miguel,

writing to the journal log file should only be possible after having obtained an exclusive lock on the global revision file (R), located in the same directory as the journal log file (L). The exact sequence of operations is as follows:

- exclusive lock is obtained on R
- journal entry appended to log file L
- revision counter is updated in R
- exclusive lock is released on R

This should rule out simultaneous writes to the log file L. Are you easily able to reproduce this issue, starting with an empty journal file? I could eliminate some small issues in the not yet released 1.2.2 branch, but I still would be very glad to know more about how and when this problem arises...

Kind regards
Dominique

On 2/14/07, Miguel Ángel Jiménez <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm trying the new cluster feature of Jackrabbit 1.2.1 and found some
> issues. Using FileJournal to synchronize state between instances, we are
> experiencing some errors that point to a possible corruption of the log
> file:
>
> 2007-02-14 10:34:00,911 ERROR [org.apache.jackrabbit.core.RepositoryImpl] Unable to start clustered node, forcing shutdown...
> org.apache.jackrabbit.core.cluster.JournalException: Unable to iterate over modified records: malformed input around byte 178
>     at org.apache.jackrabbit.core.cluster.FileJournal.sync(FileJournal.java:313)
>     at org.apache.jackrabbit.core.cluster.ClusterNode.sync(ClusterNode.java:217)
>     at org.apache.jackrabbit.core.cluster.ClusterNode.start(ClusterNode.java:164)
>     at org.apache.jackrabbit.core.RepositoryImpl.<init>(RepositoryImpl.java:308)
>     at org.apache.jackrabbit.core.RepositoryImpl.create(RepositoryImpl.java:573)
>     at org.apache.jackrabbit.core.jndi.BindableRepository.createRepository(BindableRepository.java:174)
>     at org.apache.jackrabbit.core.jndi.BindableRepository.init(BindableRepository.java:138)
>     at org.apache.jackrabbit.core.jndi.BindableRepository.create(BindableRepository.java:125)
>     at org.apache.jackrabbit.core.jndi.BindableRepositoryFactory.createInstance(BindableRepositoryFactory.java:59)
>     at org.apache.jackrabbit.core.jndi.RegistryHelper.registerRepository(RegistryHelper.java:60)
>     at com.germinus.xpression.cms.jcr.EmbeddedRepositoryFactory.getRepository(EmbeddedRepositoryFactory.java:50)
>     at com.germinus.xpression.cms.jcr.JCRUtil.initRepository(JCRUtil.java:243)
>     ...
> Caused by: java.io.UTFDataFormatException: malformed input around byte 178
>     at java.io.DataInputStream.readUTF(DataInputStream.java:639)
>     at org.apache.jackrabbit.core.cluster.FileRecord.readCreator(FileRecord.java:242)
>     at org.apache.jackrabbit.core.cluster.FileRecord.<init>(FileRecord.java:106)
>     at org.apache.jackrabbit.core.cluster.FileRecordCursor.next(FileRecordCursor.java:101)
>     at org.apache.jackrabbit.core.cluster.FileJournal.sync(FileJournal.java:303)
>     ... 130 more
>
> Perhaps I'm wrong but looks like two instances are writing the file
> simultaneously. Is this behaviour known or misconfiguration? The journal log
> is placed in a shared folder on a Linux machine and exported by SAMBA to the
> instances. I have tested the lock file capabilities of the shared filesystem
> and they are ok.
>
> --
> Miguel.
>
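
For reference, the four-step sequence quoted above (lock R, append to L, update the counter in R, unlock R) could be sketched with plain java.nio file locks as follows. This is only a minimal illustration and not Jackrabbit's FileJournal code: the class name LockedAppendSketch, the append() method, the single-long layout of the revision file, and incrementing the counter by one per entry are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; this is not Jackrabbit's FileJournal.
class LockedAppendSketch {

    /** Appends one entry to log file L while holding an exclusive lock on revision file R. */
    static long append(Path revisionFile, Path logFile, byte[] entry) throws IOException {
        try (FileChannel rev = FileChannel.open(revisionFile,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Step 1: obtain the exclusive lock on R (blocks until it is available).
            FileLock lock = rev.lock();
            try {
                // Assumed layout: R holds a single long; an empty file means revision 0.
                long next = readRevision(rev) + 1;

                // Step 2: append the entry to L, only while the lock on R is held.
                try (FileChannel log = FileChannel.open(logFile,
                        StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                        StandardOpenOption.APPEND)) {
                    ByteBuffer data = ByteBuffer.wrap(entry);
                    while (data.hasRemaining()) {
                        log.write(data);
                    }
                    log.force(true);
                }

                // Step 3: update the revision counter in R.
                ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
                buf.putLong(next).flip();
                long pos = 0;
                while (buf.hasRemaining()) {
                    pos += rev.write(buf, pos);
                }
                rev.force(true);
                return next;
            } finally {
                // Step 4: release the exclusive lock on R.
                lock.release();
            }
        }
    }

    /** Reads the counter from R, or returns 0 if R does not contain a full long yet. */
    private static long readRevision(FileChannel rev) throws IOException {
        if (rev.size() < Long.BYTES) {
            return 0L;
        }
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
        long pos = 0;
        while (buf.hasRemaining()) {
            int n = rev.read(buf, pos);
            if (n < 0) {
                break;
            }
            pos += n;
        }
        buf.flip();
        return buf.getLong();
    }
}
```

The point of the sketch is the ordering: the append happens strictly between acquiring and releasing the lock on R. Note that a java.nio FileLock is held on behalf of the whole JVM, so it guards against other processes but not against other threads in the same process; those would need a separate in-process mutex.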
-- Miguel.