Hi folks,

We have a Cassandra 0.6.6 cluster running in production. We want to run
Hadoop (version 0.20.2) jobs over this cluster in order to generate
reports.
I modified the word_count example in the contrib folder of the Cassandra
distribution. While the program runs fine for small datasets (on the
order of 100-200 MB) on a small cluster (2 machines), it starts to give
errors when run on a bigger cluster (5 machines) with a much larger
dataset (400 GB). Here is the error we get:

java.lang.RuntimeException: TimedOutException()
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:186)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:236)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:104)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:98)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: TimedOutException()
        at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11094)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:628)
        at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:602)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:164)
        ... 11 more

I came across this page on the Cassandra wiki -
http://wiki.apache.org/cassandra/HadoopSupport - and tried raising the
ulimit and reducing the batch size (roughly as in the sketch below). These
did not help: though the number of successful map tasks increased, the job
eventually fails because the total number of map tasks is huge.
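
In case we got it wrong, this is roughly the batch-size change we tried.
I understand ConfigHelper.setRangeBatchSize is the knob in 0.6.x for the
cassandra.range.batch.size property; please correct me if that is not the
right one for 0.6.6. The value 256 is just an example.

import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;

public class BatchSizeTweak
{
    // Sketch of the batch-size change we tried; 256 is an example value.
    // The default is 4096 rows per get_range_slices call, if I read
    // ColumnFamilyRecordReader correctly.
    public static void lowerBatchSize(Configuration conf)
    {
        ConfigHelper.setRangeBatchSize(conf, 256);
        // Equivalent raw property, in case the helper differs in this version:
        conf.setInt("cassandra.range.batch.size", 256);
    }
}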

Any idea what could be causing this? The program we are running is a very
slight modification of the word_count example with respect to reading from
Cassandra; the only changes are the specific keyspace, column family and
columns. The rest of the reading code is the same as the word_count example
that ships with Cassandra 0.6.6. A sketch of our job setup is below.
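
For completeness, this is essentially how we configure the input side,
following the word_count example. "Reports", "Events" and "body" are
placeholders here, not our real keyspace, column family and column name.

import java.util.Arrays;

import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReportJobSetup
{
    // Roughly our input-side setup; keyspace, column family and column
    // names are placeholders for the real ones.
    public static Job configure(Configuration conf) throws Exception
    {
        Job job = new Job(conf, "report_generator");
        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        ConfigHelper.setColumnFamily(job.getConfiguration(), "Reports", "Events");
        SlicePredicate predicate = new SlicePredicate()
                .setColumn_names(Arrays.asList("body".getBytes()));
        ConfigHelper.setSlicePredicate(job.getConfiguration(), predicate);
        return job;
    }
}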

Thanks and regards,
Jairam Chandar
