Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "HadoopSupport" page has been changed by SilvereLestang.
The comment on this change is: Fix and add URLs.
http://wiki.apache.org/cassandra/HadoopSupport?action=diff&rev1=30&rev2=31

--------------------------------------------------

  <<Anchor(Overview)>>
  == Overview ==
- Cassandra 0.6+ enables certain Hadoop functionality against Cassandra's data store. Specifically, support has been added for [[http://hadoop.apache.org/mapreduce/|MapReduce]], [[http://pig.apache.org|Pig]] and [[http://hive.apache.org/|Hive]].
+ Cassandra 0.6+ enables certain [[http://hadoop.apache.org/|Hadoop]] functionality against Cassandra's data store. Specifically, support has been added for [[http://hadoop.apache.org/mapreduce/|MapReduce]], [[http://pig.apache.org|Pig]] and [[http://hive.apache.org/|Hive]].

  [[#Top|Top]]

@@ -22, +22 @@

  == MapReduce ==
  ==== Input from Cassandra ====
- Cassandra 0.6+ adds support for retrieving data from Cassandra. This is based on implementations of [[http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/InputSplit.html|InputSplit]], [[http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/InputFormat.html|InputFormat]], and [[http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/RecordReader.html|RecordReader]] so that Hadoop !MapReduce jobs can retrieve data from Cassandra. For an example of how this works, see the contrib/word_count example in 0.6 or later. Cassandra rows or row fragments (that is, pairs of key + `SortedMap` of columns) are input to Map tasks for processing by your job, as specified by a `SlicePredicate` that describes which columns to fetch from each row.
+ Cassandra 0.6+ adds support for retrieving data from Cassandra. This is based on implementations of [[http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/InputSplit.html|InputSplit]], [[http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/InputFormat.html|InputFormat]], and [[http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/RecordReader.html|RecordReader]] so that Hadoop !MapReduce jobs can retrieve data from Cassandra. For an example of how this works, see the contrib/word_count example in 0.6 or later. Cassandra rows or row fragments (that is, pairs of key + `SortedMap` of columns) are input to Map tasks for processing by your job, as specified by a `SlicePredicate` that describes which columns to fetch from each row.

  Here's how this looks in the word_count example, which selects just one configurable columnName from each row:
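
The contrib/word_count source remains the reference for the real calls. As a rough sketch of the same idea in plain Java, assuming nothing beyond the JDK, the block below models a row as a key plus a `SortedMap` of columns, keeps a single configurable column, and hands the resulting fragment to a map step. `SliceSketch`, `slice` and `countWords` are hypothetical names made up for illustration. They are not part of the Cassandra or Hadoop APIs; a real job would go through the InputFormat and `SlicePredicate` classes described above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-ins for illustration only; not Cassandra or Hadoop classes.
final class SliceSketch {

    // Plays the role of a slice predicate: keep only the named columns of a row.
    static SortedMap<String, String> slice(SortedMap<String, String> row, Set<String> columnNames) {
        SortedMap<String, String> fragment = new TreeMap<>();
        for (Map.Entry<String, String> column : row.entrySet()) {
            if (columnNames.contains(column.getKey())) {
                fragment.put(column.getKey(), column.getValue());
            }
        }
        return fragment;
    }

    // Plays the role of the map step: each (key, fragment) pair contributes word counts.
    static SortedMap<String, Integer> countWords(Map<String, SortedMap<String, String>> rows,
                                                 String columnName) {
        SortedMap<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, SortedMap<String, String>> row : rows.entrySet()) {
            SortedMap<String, String> fragment =
                    slice(row.getValue(), Collections.singleton(columnName));
            for (String text : fragment.values()) {
                for (String word : text.trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        counts.merge(word, 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        SortedMap<String, String> first = new TreeMap<>();
        first.put("text", "hello world");
        first.put("author", "a");

        SortedMap<String, String> second = new TreeMap<>();
        second.put("text", "hello cassandra");
        second.put("author", "b");

        Map<String, SortedMap<String, String>> rows = new LinkedHashMap<>();
        rows.put("key1", first);
        rows.put("key2", second);

        // Only the "text" column is counted; "author" is filtered out by the slice.
        System.out.println(countWords(rows, "text"));
    }
}
```

Filtering before the map step mirrors the point made in the text: the job only sees the columns the predicate asks for, so unrelated columns never reach the Map task.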
