Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "HadoopSupport" page has been changed by jeremyhanna:
http://wiki.apache.org/cassandra/HadoopSupport?action=diff&rev1=51&rev2=52

  <<Anchor(Overview)>>
  
  == Overview ==
- Starting way back in Cassandra 0.6 enabled certain [[http://hadoop.apache.org/|Hadoop]] functionality against Cassandra's data store.  Specifically, support has been added for [[http://hadoop.apache.org/mapreduce/|MapReduce]], [[http://pig.apache.org|Pig]] and [[http://hive.apache.org/|Hive]].  Cassandra's Hadoop support implements the same interface as HDFS to achieve data locality.  That means, that if you have task trackers running on Cassandra nodes, you'll get input data locality.  See the section on [[#ClusterConfig|cluster configuration]] for details on how to best configure Hadoop with Cassandra as well as how to split your analytic load out from your realtime read load.
+ [[http://hadoop.apache.org/|Hadoop]] integration was added way back in version 0.6 of Cassandra.  It began with [[http://hadoop.apache.org/mapreduce/|MapReduce]] support.  Since then the support has matured significantly and now includes native support for [[http://pig.apache.org|Apache Pig]] and [[http://hive.apache.org/|Apache Hive]].  Cassandra's Hadoop support implements the same interface as HDFS to achieve input data locality (see [[#ClusterConfig|cluster configuration]] for details on data locality and how to split your analytic and realtime read loads).
  
  !DataStax, a company that creates products around Cassandra, has created a simplified way to use Hadoop with Cassandra and built it into its !DataStax Enterprise product.  For details on DSE, see [[http://www.datastax.com/products/enterprise|this product page]].
  
