Roland von Herget created CASSANDRA-5365:
--------------------------------------------
Summary: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
Key: CASSANDRA-5365
URL: https://issues.apache.org/jira/browse/CASSANDRA-5365
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.2.2
Environment: 3-node Debian Linux cluster
Oracle JDK - java version "1.7.0_15"
Hadoop 1.0.4
1 node running a Hadoop job which inserts data
Reporter: Roland von Herget
Priority: Critical
Create the keyspace and column family via the CLI:
{code}
create keyspace vectorization WITH placement_strategy = 'SimpleStrategy' AND
strategy_options = {replication_factor:2};
create column family dict WITH comparator = UTF8Type AND
key_validation_class=UTF8Type AND column_metadata = [ {column_name: id,
validation_class: LongType} {column_name: df, validation_class: LongType}];
{code}
Next I run my Hadoop job, which reads data from another keyspace (same cluster)
and reduces it into dict.
Afterwards I try to read some values via the CLI:
{code}
[default@unknown] use vectorization;
[default@vectorization] assume dict keys as ascii;
[default@vectorization] get dict['xyz'];
=> (column=df, value=329305, timestamp=1363715523545000)
=> (column=id, value=8477047, timestamp=1363715523545000)
Returned 2 results.
Elapsed time: 38 msec(s).
[default@vectorization] get dict['14'];
null
TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$get_slice_result.read(Cassandra.java:7874)
[...]
{code}
At the same time, the server logs:
{code}
ERROR 09:42:46,834 Exception in thread Thread[ReadStage:42281,5,main]
java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:106)
    at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:38)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:90)
    at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:171)
    at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
    at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:294)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1363)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1220)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1132)
    at org.apache.cassandra.db.Table.getRow(Table.java:355)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
    ... 3 more
Caused by: java.io.EOFException
    at java.io.RandomAccessFile.readFully(RandomAccessFile.java:416)
    at java.io.RandomAccessFile.readFully(RandomAccessFile.java:394)
    at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:380)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
    at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:108)
    at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
    at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:73)
    at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:102)
    ... 23 more
{code}
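The innermost frames of the server trace (readWithLength calling readFully) suggest that a length-prefixed value was being read when the file ended before the declared number of bytes. The following is only a minimal Python sketch of that general failure mode, not Cassandra's code. The 4-byte big-endian length prefix and the names read_with_length and encode are assumptions made for illustration, and the sketch says nothing about why the SSTable would be short in the first place.

```python
import io
import struct


def read_with_length(stream):
    """Read a 4-byte big-endian length, then exactly that many bytes.

    Hypothetical helper for illustration; the prefix size is an assumption.
    """
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("stream ended inside the length prefix")
    (length,) = struct.unpack(">i", header)
    if length < 0:
        raise ValueError("negative length prefix: %d" % length)
    body = stream.read(length)
    if len(body) < length:
        # The prefix promised more bytes than the stream still holds.
        raise EOFError("expected %d bytes, got %d" % (length, len(body)))
    return body


def encode(value):
    """Prefix a byte string with its length (hypothetical helper)."""
    return struct.pack(">i", len(value)) + value


intact = encode(b"329305")
truncated = intact[:-2]  # simulate a file that is two bytes too short

value = read_with_length(io.BytesIO(intact))  # reads back b"329305"

try:
    read_with_length(io.BytesIO(truncated))
    truncated_failed = False
except EOFError:
    truncated_failed = True
```

A reader built this way cannot tell a truncated file from a wrong length prefix: both surface as an end-of-file error at the point where the value body is read.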
Which keys work and which don't seems to be completely random, and the set of
failing keys differs on each retry (drop cf, create new cf, rerun hadoop job).