BulkOutputFormat and CQL3

James Campbell Tue, 22 Apr 2014 06:45:08 -0700

Hi Cassandra Users-

I have a Hadoop job that follows the pattern in Cassandra 2.0.6's 
hadoop_cql3_word_count example to load data from HDFS into Cassandra.  Having 
read that BulkOutputFormat may significantly increase write throughput from 
Hadoop to Cassandra, I am considering testing against that pattern 
(http://www.datastax.com/dev/blog/improved-hadoop-output, 
http://shareitexploreit.blogspot.com/2012/03/bulkloadto-cassandra-with-hadoop.html).


Is it possible/supported/recommended to use the BulkOutputFormat to load data 
from Hadoop to a CQL3 table in Cassandra?

I see several examples of building composite keys using Hector (e.g. 
http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1, 
http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html), 
but the changes made to support CQL3 have left a lot of different 
documentation out there for different versions, so it's not clear to me what 
the "proper" way is to build the requisite ByteBuffer, List<Mutation> pairs 
that ColumnFamilyOutputFormat (and so BulkOutputFormat) needs.
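
To make the question concrete, here is a minimal sketch of the composite byte 
layout as I understand it from the CompositeType encoding: each component is 
a 2-byte length, the value bytes, then one end-of-component byte. It uses only 
java.nio. The class name CompositeSketch, the helper names, and the sample 
values are hypothetical, and whether a CQL3 cell name should be assembled this 
way for BulkOutputFormat is an assumption, which is exactly the part I am 
unsure about.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: hand-rolls the composite layout with plain java.nio.
// Not taken from Cassandra or Hector; names here are hypothetical.
class CompositeSketch {

    // Writes each component as: 2-byte big-endian length, the raw bytes,
    // then a single end-of-component byte (0 for an ordinary value).
    // Assumes every component is shorter than 65536 bytes.
    static ByteBuffer encode(ByteBuffer... components) {
        int size = 0;
        for (ByteBuffer c : components) {
            size += 2 + c.remaining() + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (ByteBuffer c : components) {
            out.putShort((short) c.remaining());
            out.put(c.duplicate()); // duplicate so the caller's position is untouched
            out.put((byte) 0);
        }
        out.flip(); // make the buffer readable from the start
        return out;
    }

    // Convenience helper for text components.
    static ByteBuffer utf8(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Assumed shape of a cell name: one clustering value plus a column name.
        ByteBuffer name = encode(utf8("2014-04-22"), utf8("count"));
        System.out.println(name.remaining() + " bytes");
    }
}
```

If that layout is right, the resulting buffer would presumably become the 
column name inside each Mutation, with the partition key as the ByteBuffer 
half of the pair.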

James
