I've written a number of MapReduce jobs using the CQL3 driver that allows 
input/output from/to Cassandra column families.


The output from the Reducer has always been a Map<String, ByteBuffer> for the 
primary key(s) and a List<ByteBuffer> for the values. This works fine for all 
data types that can be converted easily to a ByteBuffer with 
"org.apache.cassandra.utils.ByteBufferUtil.bytes()", namely double, float, int, 
String, etc.
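
To make the output shape concrete, here is a minimal sketch of the key map and 
value list for one word. The column name "word" and the two values (count and 
average sentence length) are made up for illustration, and the local bytes() 
helpers are only stand-ins for ByteBufferUtil.bytes() so that the sketch runs 
without Cassandra on the classpath.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReducerOutputSketch {

    // Stand-ins for ByteBufferUtil.bytes(String), bytes(int) and bytes(double);
    // in the real job the Cassandra utility would be called instead.
    static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    static ByteBuffer bytes(int i) {
        ByteBuffer b = ByteBuffer.allocate(4);
        b.putInt(0, i); // absolute put, so position stays at 0
        return b;
    }

    static ByteBuffer bytes(double d) {
        ByteBuffer b = ByteBuffer.allocate(8);
        b.putDouble(0, d); // absolute put, so position stays at 0
        return b;
    }

    // Primary key part: column name -> serialized value.
    // "word" is a hypothetical column name.
    static Map<String, ByteBuffer> keys(String word) {
        Map<String, ByteBuffer> keys = new LinkedHashMap<>();
        keys.put("word", bytes(word));
        return keys;
    }

    // Non-key values, in the order the output query expects them.
    static List<ByteBuffer> values(int count, double avgSentenceLength) {
        List<ByteBuffer> values = new ArrayList<>();
        values.add(bytes(count));
        values.add(bytes(avgSentenceLength));
        return values;
    }

    public static void main(String[] args) {
        Map<String, ByteBuffer> k = keys("cassandra");
        List<ByteBuffer> v = values(42, 17.5);
        // In the real reducer this pair would go to context.write(k, v).
        System.out.println(k.keySet() + " -> " + v.size() + " values");
    }
}
```

Each simple value becomes exactly one ByteBuffer in the list, which is why a 
collection type such as "map" does not obviously fit this scheme.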


Now I'd like to output data to a column in Cassandra that has the datatype 
"map", but I'm not sure if I should still pass it as an item in the List of 
ByteBuffers and, if so, how I'd correctly serialize it to bytes.


My problem is like the traditional WordCount problem, only I need to output 
more than one bit of data about the words (imagine I was storing, for each 
word, the number of times it appeared in the text, the average length of the 
sentences it appears in, and the date of publication of the oldest text it 
appears in). I can conceive of a solution with more than one column family, but 
Cassandra appears to provide the map datatype to avoid this.


Is there a way to output to a Cassandra column of datatype Map, or a way to 
avoid having to do so?


Cheers,


Andrew
