[
https://issues.apache.org/jira/browse/CASSANDRA-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195049#comment-13195049
]
Rick Branson commented on CASSANDRA-2930:
-----------------------------------------
After analyzing the posted corrupt commit log, I found that the serialization
code wrote out an incorrect supercolumn. The commit log code faithfully
checksummed the data and wrote it out as if it were valid. It appears that the
serialized subcolumn count exceeded the number of subcolumns actually written
by 1, so the deserialization code expects another subcolumn, runs out of
buffer, and bails with an EOFException.
I was unable to reproduce the problem with Cathy's script after running it
about half a dozen times.
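To illustrate the failure mode described above, here is a minimal,
self-contained sketch. It is not Cassandra's serialization code: the class
CountMismatchDemo, its wire format (an int count followed by length-prefixed
names), and the use of CRC32 are all assumptions made for the example. It shows
that a checksum only vouches for the bytes that were written, so a count that
is one too high passes the checksum and then fails during deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical demo class; the format below is NOT Cassandra's on-disk format.
final class CountMismatchDemo {

    // Writes a count prefix followed by one length-prefixed entry per name.
    // declaredCount is passed separately so the caller can make it disagree
    // with the number of entries actually written.
    static byte[] serialize(int declaredCount, String... names) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(declaredCount);
        for (String name : names) {
            byte[] raw = name.getBytes(StandardCharsets.UTF_8);
            out.writeShort(raw.length);
            out.write(raw);
        }
        return bytes.toByteArray();
    }

    // Trusts the count prefix and reads that many entries.
    // Returns the number of entries read.
    static int deserialize(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int count = in.readInt();
        int read = 0;
        for (int i = 0; i < count; i++) {
            int length = in.readUnsignedShort(); // EOFException if no bytes remain
            byte[] raw = new byte[length];
            in.readFully(raw);                   // EOFException if the entry is truncated
            read++;
        }
        return read;
    }

    // True if deserialization runs off the end of the buffer.
    static boolean hitsEof(byte[] data) throws IOException {
        try {
            deserialize(data);
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    // Checksum over the raw bytes; it says nothing about their structure.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        byte[] bad = serialize(3, "a", "b");   // declares 3 entries, writes 2
        long stored = checksum(bad);           // computed at "write" time
        System.out.println("checksum verifies: " + (stored == checksum(bad)));
        System.out.println("deserialize hits EOF: " + hitsEof(bad));
    }
}
```

In this sketch the reader has no way to detect the problem before it runs out
of input, because the only integrity check covers the bytes and not the
relationship between the count and the entries that follow it.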
> corrupt commitlog
> -----------------
>
> Key: CASSANDRA-2930
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2930
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8.1
> Environment: Linux, amd64.
> Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
> Reporter: ivan
> Assignee: Rick Branson
> Attachments: CommitLog-1310637513214.log
>
>
> We get an "Exception encountered during startup" error when Cassandra starts.
> Error messages:
> INFO 13:56:28,736 Finished reading /var/lib/cassandra/commitlog/CommitLog-1310637513214.log
> ERROR 13:56:28,736 Exception encountered during startup.
> java.io.IOError: java.io.EOFException
> at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:265)
> at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:281)
> at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:236)
> at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493)
> at java.util.concurrent.ConcurrentSkipListMap.<init>(ConcurrentSkipListMap.java:1443)
> at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:419)
> at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:139)
> at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:127)
> at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:382)
> at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:278)
> at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:158)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:368)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:80)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readFully(DataInputStream.java:180)
> at java.io.DataInputStream.readFully(DataInputStream.java:152)
> at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:394)
> at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:368)
> at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:87)
> at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:261)
> ... 13 more
> After some debugging I found that in some serialized supercolumns the column
> counter is less than the number of serialized columns. The difference was
> always 1 in corrupt commitlogs. This error always appears with supercolumns
> that have more than one column, but the commitlog also contains properly
> serialized supercolumns.
> I have no clue yet why this error happens. I suspect it may be a race
> condition.
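The race-condition suspicion above can be made concrete with a small sketch of
one hypothetical mechanism. This is an assumption for illustration only, not a
diagnosis and not Cassandra's code: the class StaleCountDemo and its format are
invented here. The serializer writes the element count first and then iterates
the map, and a Runnable stands in for another thread mutating the map between
those two steps, so the example stays deterministic.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical demo class; it only models "count written before the body".
final class StaleCountDemo {

    // Writes size() first, then every entry. interleavedWrite simulates a
    // concurrent mutation that lands after the count has been written.
    static byte[] serialize(ConcurrentSkipListMap<String, String> columns,
                            Runnable interleavedWrite) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(columns.size());      // snapshot of the count
        interleavedWrite.run();            // simulated write from another thread
        for (Map.Entry<String, String> e : columns.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
        return bytes.toByteArray();
    }

    // The count the stream claims to contain.
    static int declaredCount(byte[] data) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(data)).readInt();
    }

    // The number of entries actually present after the count prefix.
    static int actualCount(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        in.readInt();                      // skip the declared count
        int pairs = 0;
        while (in.available() > 0) {
            in.readUTF();
            in.readUTF();
            pairs++;
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        ConcurrentSkipListMap<String, String> columns = new ConcurrentSkipListMap<>();
        columns.put("a", "1");
        columns.put("b", "2");

        // An insert after the count is written: declared count is too low.
        byte[] grown = serialize(columns, () -> columns.put("c", "3"));
        System.out.println(declaredCount(grown) + " declared, " + actualCount(grown) + " written");

        // A removal after the count is written: declared count is too high.
        byte[] shrunk = serialize(columns, () -> columns.remove("c"));
        System.out.println(declaredCount(shrunk) + " declared, " + actualCount(shrunk) + " written");
    }
}
```

Under this assumption the count can be off by one in either direction,
depending on whether the interleaved operation adds or removes an entry. Taking
a snapshot of the entries first and deriving the count from that snapshot would
keep the two consistent.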