Hi, I am wondering whether there are existing methods to ETL HBase data into ORC (or another open-source columnar format) files.
I understand that in Hive, "INSERT INTO TABLE Hive_ORC_Table SELECT * FROM HIVE_HBase_Table" can probably get the job done. Is this the common way to do it? Is the performance acceptable, and can it handle delta updates when the HBase table changes?

I googled a bit and found https://community.hortonworks.com/questions/2632/loading-hbase-from-hive-orc-tables.html, which goes in the opposite direction (loading HBase from Hive ORC tables).

Would it perform better, compared to the Hive statement above, to use either replication logic or a snapshot backup to generate ORC files from the HBase tables, with the ability to apply incremental updates?

I hope to have as few dependencies as possible. For ORC, for example, I would like to depend only on the Apache ORC API and not on Hive.

Demai
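
P.S. For concreteness, here is roughly what I mean by the Hive route. This is only a sketch: the table names, the column family "cf", and the columns are made up for illustration, and I am assuming the HBase table is exposed to Hive through the HBase storage handler.

```sql
-- Hypothetical names throughout: my_hbase_table, cf, col1, col2.
-- Expose the existing HBase table to Hive via the HBase storage handler.
CREATE EXTERNAL TABLE hive_hbase_table (
  rowkey STRING,
  col1   STRING,
  col2   STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:col1,cf:col2")
TBLPROPERTIES ("hbase.table.name" = "my_hbase_table");

-- Target table stored as ORC, with the same columns.
CREATE TABLE hive_orc_table (
  rowkey STRING,
  col1   STRING,
  col2   STRING
)
STORED AS ORC;

-- Full copy: scans the HBase table and appends the rows to the ORC table.
INSERT INTO TABLE hive_orc_table
SELECT * FROM hive_hbase_table;
```

As far as I can tell, this scans the whole HBase table on every run, and INSERT INTO only appends rows, so by itself it does not deal with rows that changed in HBase. That is why I am asking about delta updates.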
