So, you essentially want to dump HBase tables into sequence files/RC files/text files and read them from Hive?
How do you plan to handle updates, deletes, IVS, etc. if you use the log edits to replicate from HBase to these files? Getting Hive to talk to HFiles gives you the same problem. Isn't it easier to take a snapshot of the table when you actually want to run queries on it?

In my preliminary testing, I did see Hive-HBase full table scans run slower than direct Hive table scans, but I don't remember the numbers offhand.

On Thu, Mar 10, 2011 at 10:43 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> Hi,
>
> Since HBase has a mechanism to replicate edit logs to another HBase cluster, I
> was wondering if people think it would be possible to implement HBase=>Hive
> replication? (and really make the destination pluggable later on)
>
> I'm asking because while one can integrate Hive and HBase by creating external
> tables in Hive that actually point to tables in HBase, apparently Hive queries
> run about x5 slower than queries that go against normal Hive tables.
>
> And because all HBase export options are for 1 table at a time and not point in
> time snapshots of the whole table, exporting data from HBase and importing into
> Hive doesn't sound like a viable option.
>
> Thanks,
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop
> Hadoop ecosystem search :: http://search-hadoop.com/