[ https://issues.apache.org/jira/browse/CASSANDRA-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roland von Herget updated CASSANDRA-5365:
-----------------------------------------
Fix Version/s: 1.2.3
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-5365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5365
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2.2
> Environment: 3-node Debian Linux cluster
> Oracle JDK, java version "1.7.0_15"
> Hadoop 1.0.4
> 1 node running a Hadoop job which inserts data
> Reporter: Roland von Herget
> Priority: Critical
> Fix For: 1.2.3
>
>
> Create the keyspace and column family via the CLI:
> {code}
> create keyspace vectorization WITH placement_strategy = 'SimpleStrategy' AND
>   strategy_options = {replication_factor:2};
> create column family dict WITH comparator = UTF8Type AND
>   key_validation_class = UTF8Type AND
>   column_metadata = [{column_name: id, validation_class: LongType},
>                      {column_name: df, validation_class: LongType}];
> {code}
> Now I run my Hadoop job, which reads data from another keyspace on the same
> cluster and reduces it into dict.
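>
> For context, a minimal sketch of the kind of reducer and output wiring such a
> job typically uses (assuming the standard ColumnFamilyOutputFormat/Thrift
> path; the class name, the id derivation, and the driver settings below are
> placeholders, not the actual job code):
> {code}
> import java.io.IOException;
> import java.nio.ByteBuffer;
> import java.util.ArrayList;
> import java.util.List;
>
> import org.apache.cassandra.hadoop.ColumnFamilyOutputFormat;
> import org.apache.cassandra.hadoop.ConfigHelper;
> import org.apache.cassandra.thrift.Column;
> import org.apache.cassandra.thrift.ColumnOrSuperColumn;
> import org.apache.cassandra.thrift.Mutation;
> import org.apache.cassandra.utils.ByteBufferUtil;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.Reducer;
>
> // Illustrative only -- not the job from this report.
> // Writes one row per term into the 'dict' column family, with 'id' and 'df'
> // stored as LongType values, via Cassandra's Thrift-based output format.
> public class DictReducer extends Reducer<Text, LongWritable, ByteBuffer, List<Mutation>>
> {
>     @Override
>     public void reduce(Text term, Iterable<LongWritable> counts, Context ctx)
>             throws IOException, InterruptedException
>     {
>         long df = 0;
>         for (LongWritable c : counts)
>             df += c.get();
>         long id = term.hashCode() & 0x7fffffffL; // placeholder id derivation
>
>         List<Mutation> mutations = new ArrayList<Mutation>();
>         mutations.add(mutationFor("id", id));
>         mutations.add(mutationFor("df", df));
>         // row key is the term itself (key_validation_class = UTF8Type)
>         ctx.write(ByteBufferUtil.bytes(term.toString()), mutations);
>     }
>
>     private static Mutation mutationFor(String name, long value)
>     {
>         Column col = new Column();
>         col.setName(ByteBufferUtil.bytes(name));   // comparator = UTF8Type
>         col.setValue(ByteBufferUtil.bytes(value)); // validation_class = LongType
>         col.setTimestamp(System.currentTimeMillis() * 1000); // microseconds
>
>         ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
>         cosc.setColumn(col);
>         Mutation m = new Mutation();
>         m.setColumn_or_supercolumn(cosc);
>         return m;
>     }
>
>     // Output side of the driver; address and partitioner are placeholders for
>     // this cluster's actual settings.
>     static void configureOutput(Job job)
>     {
>         job.setOutputKeyClass(ByteBuffer.class);
>         job.setOutputValueClass(List.class);
>         job.setOutputFormatClass(ColumnFamilyOutputFormat.class);
>         ConfigHelper.setOutputInitialAddress(job.getConfiguration(), "127.0.0.1");
>         ConfigHelper.setOutputPartitioner(job.getConfiguration(), "org.apache.cassandra.dht.Murmur3Partitioner");
>         ConfigHelper.setOutputColumnFamily(job.getConfiguration(), "vectorization", "dict");
>     }
> }
> {code}
>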
> Afterwards I try to get some values via the CLI:
> {code}
> [default@unknown] use vectorization;
> [default@vectorization] assume dict keys as ascii;
> [default@vectorization] get dict['xyz'];
> => (column=df, value=329305, timestamp=1363715523545000)
> => (column=id, value=8477047, timestamp=1363715523545000)
> Returned 2 results.
> Elapsed time: 38 msec(s).
> [default@vectorization] get dict['14'];
> null
> TimedOutException()
>         at org.apache.cassandra.thrift.Cassandra$get_slice_result.read(Cassandra.java:7874)
> [...]
> {code}
> and on the server:
> {code}
> ERROR 09:42:46,834 Exception in thread Thread[ReadStage:42281,5,main]
> java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>         at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>         at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:106)
>         at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:38)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>         at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:90)
>         at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:171)
>         at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
>         at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
>         at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>         at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:294)
>         at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>         at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1363)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1220)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1132)
>         at org.apache.cassandra.db.Table.getRow(Table.java:355)
>         at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
>         at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
>         at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
>         ... 3 more
> Caused by: java.io.EOFException
>         at java.io.RandomAccessFile.readFully(RandomAccessFile.java:416)
>         at java.io.RandomAccessFile.readFully(RandomAccessFile.java:394)
>         at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:380)
>         at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
>         at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
>         at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:108)
>         at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
>         at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:73)
>         at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:102)
>         ... 23 more
> {code}
> Which keys work and which don't seems to be completely random, and the set
> differs on each retry (drop the CF, create a new CF, rerun the Hadoop job).
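>
> For what it's worth, the bottom of the trace points at a length-prefixed value
> read: ByteBufferUtil.readWithLength reads a value length and then that many
> bytes, and readFully throws EOFException when the data file ends before the
> declared length is satisfied, i.e. the sstable data looks truncated or carries
> a bogus length. A rough, illustrative approximation of that read pattern (not
> the actual Cassandra source):
> {code}
> import java.io.DataInput;
> import java.io.IOException;
> import java.nio.ByteBuffer;
>
> // Rough approximation of the failing read path
> // (ByteBufferUtil.readWithLength -> RandomAccessFile.readFully):
> // a length prefix is read first, then exactly that many bytes must follow.
> final class LengthPrefixedRead
> {
>     static ByteBuffer readWithLength(DataInput in) throws IOException
>     {
>         int length = in.readInt();   // value length recorded in the sstable
>         byte[] bytes = new byte[length];
>         in.readFully(bytes);         // EOFException if the file ends before
>                                      // 'length' bytes are available
>         return ByteBuffer.wrap(bytes);
>     }
> }
> {code}
> If that reading is right, it might also explain why the failing keys look
> random: only reads that touch the affected sstable region would fail.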
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira