Re: HBaseContext with Spark
storage handler bulk load:

SET hive.hbase.bulk=true;
INSERT OVERWRITE TABLE users SELECT … ;

But for now, you have to do some work and issue multiple Hive commands:

1. Sample source data for range partitioning
2. Save sampling results to a file
3. Run CLUSTER BY query using HiveHFileOutputFormat and TotalOrderPartitioner (sorts data, producing a large number of region files)
4. Import HFiles into HBase

HBase can merge files if necessary.

On Sat, Jan 28, 2017 at 11:32 AM, Chetan Khatri wrote:
> @Ted, I don't think so.
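To illustrate the sampling and range-partitioning steps listed in the message above, here is a minimal Python sketch of the idea behind a total-order partitioner: choose split keys from a sorted sample, then route each row key to the range it falls in. This is not Hive's or Hadoop's implementation; the function names `choose_split_points` and `partition_for` and the `user00042`-style row keys are made up for the example.

```python
import bisect
import random


def choose_split_points(sample_keys, num_partitions):
    """Pick up to num_partitions - 1 split keys, evenly spaced by rank in the sample."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    ordered = sorted(set(sample_keys))
    # Never ask for more ranges than there are distinct sampled keys.
    ranges = min(num_partitions, len(ordered))
    if ranges <= 1:
        return []
    step = len(ordered) / ranges
    return [ordered[int(i * step)] for i in range(1, ranges)]


def partition_for(key, split_points):
    """Return the index of the key range that `key` falls into.

    Keys below the first split point go to range 0; a key equal to a split
    point goes to the range on its right.
    """
    return bisect.bisect_right(split_points, key)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    row_keys = [f"user{n:05d}" for n in range(1000)]  # hypothetical row keys
    splits = choose_split_points(rng.sample(row_keys, 50), 4)
    counts = {}
    for key in row_keys:
        index = partition_for(key, splits)
        counts[index] = counts.get(index, 0) + 1
    print("split points:", splits)
    print("rows per range:", counts)
```

Because every key in range i sorts before every key in range i+1, each range can be written out as its own sorted file, which is what makes the later HFile import possible. A small sample only approximates the real key distribution, so the ranges will not be perfectly even.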
Re: HBaseContext with Spark
@Ted, I don't think so.

On Thu, Jan 26, 2017 at 6:35 AM, Ted Yu wrote:
> Does the storage handler provide bulk load capability?
>
> Cheers
Re: HBaseContext with Spark
Does the storage handler provide bulk load capability?

Cheers

On Jan 25, 2017, at 3:39 AM, Amrit Jangid wrote:
> Hi Chetan,
>
> If you just need HBase data in Hive, you can use a Hive EXTERNAL TABLE with
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'.
>
> Try this and see if it solves your problem:
>
> https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
>
> Regards
> Amrit
Re: HBaseContext with Spark
Hi Chetan,

If you just need HBase data in Hive, you can use a Hive EXTERNAL TABLE with STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'.

Try this and see if it solves your problem:

https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration

Regards
Amrit

On Wed, Jan 25, 2017 at 5:02 PM, Chetan Khatri wrote:
> Hello Spark Community Folks,
>
> Currently I am using HBase 1.2.4 and Hive 1.2.1, and I am looking for a bulk
> load from HBase to Hive.
>
> I have seen a couple of good examples in the HBase GitHub repo:
> https://github.com/apache/hbase/tree/master/hbase-spark
>
> If I would like to use HBaseContext with HBase 1.2.4, how can it be done?
> Or which version of HBase is more stable with HBaseContext?
>
> Thanks.
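As a rough illustration of the EXTERNAL TABLE suggestion above, the following Python sketch renders the kind of Hive DDL described on the linked HBaseIntegration page. The helper `hbase_external_table_ddl`, the table names `users_hbase` and `users`, and the `info` column family are all hypothetical; only the storage handler class name and the `hbase.columns.mapping` / `hbase.table.name` properties come from Hive. For simplicity the sketch assumes every non-key column maps to a single `family:qualifier`.

```python
def hbase_external_table_ddl(hive_table, hbase_table, key_column, columns):
    """Render a CREATE EXTERNAL TABLE statement backed by HBaseStorageHandler.

    key_column: (hive_name, hive_type) for the HBase row key.
    columns: list of (hive_name, hive_type, "family:qualifier") tuples.
    """
    for name, _hive_type, hbase_column in columns:
        family, sep, qualifier = hbase_column.partition(":")
        if not (family and sep and qualifier):
            raise ValueError(f"column {name!r} needs a 'family:qualifier' mapping")
    hive_columns = [f"{key_column[0]} {key_column[1]}"]
    hive_columns += [f"{name} {hive_type}" for name, hive_type, _ in columns]
    # ":key" binds the first Hive column to the HBase row key. Entries are
    # joined without spaces so the mapping string stays unambiguous.
    mapping = ",".join([":key"] + [hbase_column for _, _, hbase_column in columns])
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({', '.join(hive_columns)})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f'WITH SERDEPROPERTIES ("hbase.columns.mapping" = "{mapping}")\n'
        f'TBLPROPERTIES ("hbase.table.name" = "{hbase_table}");'
    )


if __name__ == "__main__":
    print(
        hbase_external_table_ddl(
            "users_hbase",            # hypothetical Hive table name
            "users",                  # hypothetical existing HBase table
            ("rowkey", "STRING"),
            [("name", "STRING", "info:name"), ("age", "INT", "info:age")],
        )
    )
```

Once such a table exists, an ordinary Hive query (for example an INSERT ... SELECT into a native Hive table) reads the HBase rows through the storage handler. As the rest of the thread notes, that is a scan through the handler, not a bulk load.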
Re: HBaseContext with Spark
The references are vendor specific. I suggest contacting the vendor's mailing list about your PR.

My initial interpretation of "HBase repository" was the Apache one.

Cheers

On Wed, Jan 25, 2017 at 7:38 AM, Chetan Khatri wrote:
> @Ted Yu, correct, but the HBase-Spark module available in the HBase
> repository seems too old and the code is not optimized yet; I have already
> submitted a PR for that. I don't know if it is clearly mentioned anywhere
> that it is now part of HBase itself, yet people are committing to the older
> repo, where the original code is still old. [1]
>
> Other sources have updated info. [2]
>
> Ref.
> [1] http://blog.cloudera.com/blog/2015/08/apache-spark-comes-to-apache-hbase-with-hbase-spark-module/
> [2] https://github.com/cloudera-labs/SparkOnHBase ,
> https://github.com/esamson/SparkOnHBase
Re: HBaseContext with Spark
@Ted Yu, correct, but the HBase-Spark module available in the HBase repository seems too old and the code is not optimized yet; I have already submitted a PR for that. I don't know if it is clearly mentioned anywhere that it is now part of HBase itself, yet people are committing to the older repo, where the original code is still old. [1]

Other sources have updated info. [2]

Ref.
[1] http://blog.cloudera.com/blog/2015/08/apache-spark-comes-to-apache-hbase-with-hbase-spark-module/
[2] https://github.com/cloudera-labs/SparkOnHBase , https://github.com/esamson/SparkOnHBase

On Wed, Jan 25, 2017 at 8:13 PM, Ted Yu wrote:
> Though no HBase release has the hbase-spark module, you can find the
> backport patch on HBASE-14160 (for Spark 1.6).
>
> You can build the hbase-spark module yourself.
>
> Cheers
Re: HBaseContext with Spark
Though no HBase release has the hbase-spark module, you can find the backport patch on HBASE-14160 (for Spark 1.6).

You can build the hbase-spark module yourself.

Cheers

On Wed, Jan 25, 2017 at 3:32 AM, Chetan Khatri wrote:
> Hello Spark Community Folks,
>
> Currently I am using HBase 1.2.4 and Hive 1.2.1, and I am looking for a bulk
> load from HBase to Hive.
>
> I have seen a couple of good examples in the HBase GitHub repo:
> https://github.com/apache/hbase/tree/master/hbase-spark
>
> If I would like to use HBaseContext with HBase 1.2.4, how can it be done?
> Or which version of HBase is more stable with HBaseContext?
>
> Thanks.