[
https://issues.apache.org/jira/browse/CASSANDRA-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851442#action_12851442
]
Jonathan Ellis commented on CASSANDRA-913:
------------------------------------------
Starting points:
The Cassandra InputFormat for Hadoop is
org.apache.cassandra.hadoop.ColumnFamilyInputFormat; the record reader and
input split are in the same package. There's an example of using these in
contrib/word_count, and the Pig integration is in contrib/pig.
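For orientation, here is a rough sketch of how an input format, an input split, and a record reader divide the work. Every name in it (InputFormatSketch, KeyRangeSplit, getSplits, readSplit, countRows) is a hypothetical stand-in invented for illustration, and a TreeMap plays the role of the data store; this is not the API of the classes in org.apache.cassandra.hadoop, so check the real code in that package and in contrib/word_count before relying on any detail.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical, simplified stand-ins. These are NOT the Cassandra or Hadoop classes.
public class InputFormatSketch {

    /** Plays the role of an input split: one slice of the key space, [startKey, endKey). */
    public static final class KeyRangeSplit {
        public final String startKey;
        public final String endKey;

        public KeyRangeSplit(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Plays the role of the input format: carve the key space into one split per range. */
    public static List<KeyRangeSplit> getSplits(List<String> boundaries) {
        List<KeyRangeSplit> splits = new ArrayList<>();
        for (int i = 0; i + 1 < boundaries.size(); i++) {
            splits.add(new KeyRangeSplit(boundaries.get(i), boundaries.get(i + 1)));
        }
        return splits;
    }

    /** Plays the role of the record reader: iterate the rows that fall inside one split. */
    public static Iterator<Map.Entry<String, String>> readSplit(
            SortedMap<String, String> store, KeyRangeSplit split) {
        // subMap is inclusive of the start key and exclusive of the end key.
        return store.subMap(split.startKey, split.endKey).entrySet().iterator();
    }

    /** Convenience helper: count the rows a reader would hand to one map task. */
    public static int countRows(SortedMap<String, String> store, KeyRangeSplit split) {
        int rows = 0;
        Iterator<Map.Entry<String, String>> it = readSplit(store, split);
        while (it.hasNext()) {
            it.next();
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) {
        SortedMap<String, String> store = new TreeMap<>();
        store.put("apple", "1");
        store.put("banana", "2");
        store.put("mango", "3");
        store.put("pear", "4");

        // Each split would be processed by its own map task in a real job.
        for (KeyRangeSplit split : getSplits(List.of("a", "m", "z"))) {
            System.out.println("[" + split.startKey + ", " + split.endKey + "): "
                    + countRows(store, split) + " rows");
        }
    }
}
```

A Hive storage handler would presumably sit on top of these same three roles, which is why the existing input format is the natural starting point.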
You can look at the .7 patch to HIVE-705 to see how HBase support was added.
Unfortunately it is not split into "Hive infrastructure refactoring" and
"HBase support"; the two are mixed together.
> Add Hive support
> ----------------
>
> Key: CASSANDRA-913
> URL: https://issues.apache.org/jira/browse/CASSANDRA-913
> Project: Cassandra
> Issue Type: New Feature
> Components: Contrib
> Reporter: Jonathan Ellis
>
> http://hadoop.apache.org/hive/ is a project that runs SQL queries against
> Hadoop map/reduce clusters. (It is meant for analytics; it is too
> high-latency to run applications against Hive directly.) HIVE-705 added
> support for backends other than HDFS, with HBase as the first. Cassandra
> support should now be doable too.
> The Hive storage backends are described in
> http://wiki.apache.org/hadoop/Hive/StorageHandlers and the HBase backend
> specifically in http://wiki.apache.org/hadoop/Hive/HBaseIntegration.
> I also note that John Sichi, author of the HBase backend, seems like a
> helpful guy and I imagine would be totally cool with answering questions
> about implementation details.