Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "HadoopSupport" page has been changed by jeremyhanna:
http://wiki.apache.org/cassandra/HadoopSupport?action=diff&rev1=51&rev2=52

  <<Anchor(Overview)>>
  == Overview ==
- Starting way back in Cassandra 0.6 enabled certain [[http://hadoop.apache.org/|Hadoop]] functionality against Cassandra's data store. Specifically, support has been added for [[http://hadoop.apache.org/mapreduce/|MapReduce]], [[http://pig.apache.org|Pig]] and [[http://hive.apache.org/|Hive]]. Cassandra's Hadoop support implements the same interface as HDFS to achieve data locality. That means, that if you have task trackers running on Cassandra nodes, you'll get input data locality. See the section on [[#ClusterConfig|cluster configuration]] for details on how to best configure Hadoop with Cassandra as well as how to split your analytic load out from your realtime read load.
+ [[http://hadoop.apache.org/|Hadoop]] integration was added way back in version 0.6 of Cassandra. It began with [[http://hadoop.apache.org/mapreduce/|MapReduce]] support. Since then the support has matured significantly and now includes native support for [[http://pig.apache.org|Apache Pig]] and [[http://hive.apache.org/|Apache Hive]]. Cassandra's Hadoop support implements the same interface as HDFS to achieve input data locality (see [[#ClusterConfig|cluster configuration]] for details on data locality and how to split your analytic and realtime read loads). !DataStax, a company that creates products around Cassandra, has created a simplified way to use Hadoop with Cassandra and built it into its !DataStax Enterprise product. For details on DSE, see [[http://www.datastax.com/products/enterprise|this product page]].
