[ https://issues.apache.org/jira/browse/CASSANDRA-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851442#action_12851442 ]

Jonathan Ellis commented on CASSANDRA-913:
------------------------------------------

Starting points:

The Cassandra InputFormat for Hadoop is 
org.apache.cassandra.hadoop.ColumnFamilyInputFormat; the record reader and 
input split are in the same package.  There's an example of using these in 
contrib/word_count, and Pig integration in contrib/pig.

You can look at the .7 patch to HIVE-705 to see how HBase support was added.  
Unfortunately, the patch is not split into "Hive infrastructure refactoring" 
and "HBase support"; the two are mixed together.

> Add Hive support
> ----------------
>
>                 Key: CASSANDRA-913
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-913
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Contrib
>            Reporter: Jonathan Ellis
>
> http://hadoop.apache.org/hive/ is a project that runs SQL queries against 
> Hadoop map/reduce clusters.  (For analytics; it is too high-latency to run 
> applications against Hive directly.)  HIVE-705 added support for backends 
> other than HDFS, with HBase as the first.  Cassandra support should now be 
> doable too.
> The Hive storage backends are described in 
> http://wiki.apache.org/hadoop/Hive/StorageHandlers and the HBase backend 
> specifically in http://wiki.apache.org/hadoop/Hive/HBaseIntegration.
> I also note that John Sichi, author of the HBase backend, seems like a 
> helpful guy and I imagine would be totally cool with answering questions 
> about implementation details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
