As the subject implies, I am trying to dump Cassandra rows into Hadoop.
What is the easiest way for me to accomplish this? Thanks.
Should I be looking into pig for something like this?
Depends on what you mean by dumping into Hadoop.
If you want to read them from a Hadoop job, you can use either native
Hadoop or Pig. See the contrib/word_count and contrib/pig examples.
If you want to copy the data into a Hadoop File System install then I guess
almost anything that can
Is there an easy way to retrieve all values from a CF, similar to a
dump?
How about retrieving all columns for a particular key?
In the second use case a simple iteration would work using a start and
finish but how would this be accomplished across all keys for a
particular CF when you
sstable2json, discussed here: http://wiki.apache.org/cassandra/Operations, may be
what you are after, or the snapshot feature. I am not sure what you want to use
the dump for.
If you do not know the keys in the CF in advance, take a look at
get_range_slices (http://wiki.apache.org/cassandra/API) it
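
For what it is worth, the usual way to walk every key with a range call like get_range_slices is to page: ask for N rows starting at an empty key, then restart from the last key you saw and drop the repeated row. Below is a rough Python sketch of that loop only. It does not call the real Thrift API; `fetch_range`, `make_fake_store` and `iter_all_rows` are hypothetical names made up for illustration, and the in-memory store stands in for a CF. It also assumes the start key is inclusive and that rows come back in a stable order.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# A row is (key, {column_name: value}). This shape is an assumption for the sketch.
Row = Tuple[str, Dict[str, str]]


def make_fake_store(rows: Dict[str, Dict[str, str]]) -> Callable[[str, int], List[Row]]:
    """Build a hypothetical stand-in for a range query against one CF.

    Rows are returned in sorted key order to keep the sketch simple. A real
    cluster using the random partitioner returns rows in token order instead,
    but the paging loop below only needs the order to be stable.
    """
    ordered = sorted(rows.items())

    def fetch_range(start_key: str, count: int) -> List[Row]:
        # Inclusive start key, at most `count` rows.
        matching = [row for row in ordered if row[0] >= start_key]
        return matching[:count]

    return fetch_range


def iter_all_rows(fetch_range: Callable[[str, int], List[Row]],
                  page_size: int = 100) -> Iterator[Row]:
    """Yield every row by paging, restarting each page at the last key seen."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")

    start_key = ""  # empty start key means "from the beginning"
    while True:
        page = fetch_range(start_key, page_size)
        # Every page after the first begins with the row we already yielded.
        if page and start_key and page[0][0] == start_key:
            rows = page[1:]
        else:
            rows = page
        for row in rows:
            yield row
        if len(page) < page_size:
            return  # short page: nothing left to read
        start_key = page[-1][0]


if __name__ == "__main__":
    store = make_fake_store({
        "alice": {"city": "Oslo"},
        "bob": {"city": "Lima"},
        "carol": {"city": "Pune"},
    })
    for key, columns in iter_all_rows(store, page_size=2):
        print(key, columns)
```

Each row could be written out to a file as it is yielded, which is one way to produce something you can then copy into HDFS. For large CFs a Hadoop job reading Cassandra directly is probably the better fit.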