storage handler bulk load:

SET hive.hbase.bulk=true;
INSERT OVERWRITE TABLE users SELECT … ;

But for now, you have to do some work and issue multiple Hive commands:

1. Sample source data for range partitioning
2. Save sampling results to a file
3. Run CLUSTER BY query using HiveHFileOutputFormat and TotalOrderPartitioner (sorts data, producing a large number of region files)
4. Import HFiles into HBase

HBase can merge files if necessary.
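To make the sampling and partitioning steps a bit more concrete, here is a minimal Python sketch of range partitioning driven by sampled split keys. This is not Hive or HBase code. The function names (choose_split_points, partition_for, range_partition) and the toy row keys are made up for illustration. It only mimics the general idea behind a total-order partitioner: split keys taken from a sample decide which sorted output range each row lands in.

```python
import bisect
import random


def choose_split_points(sample_keys, num_partitions):
    """Pick num_partitions - 1 evenly spaced split keys from a sample."""
    ordered = sorted(sample_keys)
    step = len(ordered) / num_partitions
    return [ordered[int(step * i)] for i in range(1, num_partitions)]


def partition_for(key, split_points):
    """Binary-search the split points to find the range a key belongs to."""
    return bisect.bisect_right(split_points, key)


def range_partition(rows, split_points):
    """Route (key, value) rows into ranges, then sort each range by key."""
    buckets = [[] for _ in range(len(split_points) + 1)]
    for key, value in rows:
        buckets[partition_for(key, split_points)].append((key, value))
    for bucket in buckets:
        bucket.sort()
    return buckets


# Hypothetical source data: 1000 rows with string row keys, in random order.
rows = [(f"user{n:04d}", n) for n in range(1000)]
random.Random(0).shuffle(rows)

# Steps 1 and 2: sample the keys and derive split points (kept in memory here
# rather than saved to a file).
sample = [key for key, _ in rows[::10]]
split_points = choose_split_points(sample, 4)

# Step 3: partition by range and sort within each range.
buckets = range_partition(rows, split_points)

for index, bucket in enumerate(buckets):
    print(index, len(bucket), bucket[0][0], bucket[-1][0])
```

Because every key in one range sorts before every key in the next, reading the ranges in order yields globally sorted data. That is the property the per-region files need before they can be imported.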
On Sat, Jan 28, 2017 at 11:32 AM, Chetan Khatri <chetan.opensou...@gmail.com> wrote:

> @Ted, I don't think so.
>
> On Thu, Jan 26, 2017 at 6:35 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Does the storage handler provide bulk load capability?
>>
>> Cheers
>>
>> On Jan 25, 2017, at 3:39 AM, Amrit Jangid <amrit.jan...@goibibo.com> wrote:
>>
>> Hi Chetan,
>>
>> If you just need HBase data in Hive, you can use a Hive EXTERNAL TABLE with
>>
>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'.
>>
>> Try this and see if it solves your problem:
>>
>> https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
>>
>> Regards
>>
>> Amrit
>>
>> On Wed, Jan 25, 2017 at 5:02 PM, Chetan Khatri <chetan.opensou...@gmail.com> wrote:
>>
>>> Hello Spark Community Folks,
>>>
>>> Currently I am using HBase 1.2.4 and Hive 1.2.1, and I am looking for
>>> bulk load from HBase to Hive.
>>>
>>> I have seen a couple of good examples in the HBase GitHub repo:
>>> https://github.com/apache/hbase/tree/master/hbase-spark
>>>
>>> If I would like to use HBaseContext with HBase 1.2.4, how can it be done?
>>> Or which version of HBase is more stable with HBaseContext?
>>>
>>> Thanks.