Thank you Chetan
________________________________
From: Chetan Khatri <chetan.opensou...@gmail.com>
Sent: Friday, January 27, 2017 8:15 PM
To: user@hbase.apache.org
Subject: Re: Writing/Importing large number of records into HBase

Oh. Sorry.
https://github.com/apache/hbase/blob/master/hbase-spark/src/main/java/org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseBulkPutExample.java

On Sat, Jan 28, 2017 at 9:27 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Chetan:
> The link you posted was from a personal repo.
>
> There hasn't been a commit for at least a year.
>
> Meanwhile, the hbase-spark module in the hbase repo is being actively
> maintained.
>
> FYI
>
> > On Jan 27, 2017, at 7:47 PM, Chetan Khatri <chetan.opensou...@gmail.com>
> > wrote:
> >
> > Adding to @Ted: check the Bulk Put example -
> > https://github.com/tmalaska/SparkOnHBase/blob/master/src/main/scala/org/apache/hadoop/hbase/spark/example/hbasecontext/HBaseBulkPutExampleFromFile.scala
> >
> >> On Sat, Jan 28, 2017 at 9:11 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >>
> >> Have you looked at the hbase-spark module (currently in master branch)?
> >>
> >> See hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/example/datasources/AvroSource.scala
> >> and hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/DefaultSourceSuite.scala
> >> for examples.
> >>
> >> There may be other options.
> >>
> >> FYI
> >>
> >> On Fri, Jan 27, 2017 at 7:28 PM, jeff saremi <jeffsar...@hotmail.com>
> >> wrote:
> >>
> >>> Hi
> >>> I'm seeking some pointers/guidance on what we could do to insert
> >>> billions of records that we already have in avro files in hadoop
> >>> into HBase.
> >>>
> >>> I read some articles online and one of them recommended using the
> >>> HFile format. I took a cursory look at the documentation for that.
> >>> Given its complexity, I think that may be the last resort we want
> >>> to pursue, unless some library is out there that easily helps us
> >>> write our files into that format. I didn't see any.
> >>> Assuming that the HBase native client may be our best bet, is there
> >>> any advice around pre-partitioning our records or such techniques
> >>> that we could use?
> >>> thanks
> >>>
> >>> Jeff
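
On the pre-partitioning question at the bottom of the thread: one common approach is to create the table already split into regions, so that a large import spreads across region servers from the first write. The following is only a minimal sketch, assuming row keys begin with a uniformly distributed 8-character hex prefix (for example, taken from a hash of the record id). `SplitKeys` and `hexSplits` are hypothetical names for illustration and are not part of HBase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an HBase class): computes evenly spaced split
// keys for a key space of 8 lowercase hex digits, 00000000..ffffffff.
final class SplitKeys {
    private SplitKeys() {}

    // Returns (regions - 1) boundaries that divide the hex key space into
    // `regions` ranges of equal width.
    static List<String> hexSplits(int regions) {
        if (regions < 1) {
            throw new IllegalArgumentException("regions must be >= 1");
        }
        long space = 1L << 32; // number of distinct 8-hex-digit prefixes
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < regions; i++) {
            long boundary = space * i / regions;
            splits.add(String.format("%08x", boundary));
        }
        return splits;
    }

    public static void main(String[] args) {
        // prints [40000000, 80000000, c0000000]
        System.out.println(hexSplits(4));
    }
}
```

Each string would then be converted to bytes and supplied as a split key when the table is created; the HBase `Admin.createTable` method has an overload that accepts a `byte[][]` of split keys. Whether even splits are appropriate depends on how the real row keys are distributed, so this only applies if the hashed-prefix assumption holds.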