Re: Writing/Importing large number of records into HBase

jeff saremi Sat, 28 Jan 2017 10:57:54 -0800

Thank you Chetan


________________________________
From: Chetan Khatri <chetan.opensou...@gmail.com>
Sent: Friday, January 27, 2017 8:15 PM
To: user@hbase.apache.org
Subject: Re: Writing/Importing large number of records into HBase

Oh. Sorry.
https://github.com/apache/hbase/blob/master/hbase-spark/src/main/java/org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseBulkPutExample.java



On Sat, Jan 28, 2017 at 9:27 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Chetan:
> The link you posted was from a personal repo.
>
> There hasn't been a commit there for at least a year.
>
> Meanwhile, the hbase-spark module in hbase repo is being actively
> maintained.
>
> FYI
>
> > On Jan 27, 2017, at 7:47 PM, Chetan Khatri <chetan.opensou...@gmail.com> wrote:
> >
> > Adding to @Ted: check the Bulk Put example -
> > https://github.com/tmalaska/SparkOnHBase/blob/master/src/main/scala/org/apache/hadoop/hbase/spark/example/hbasecontext/HBaseBulkPutExampleFromFile.scala
> >
> >> On Sat, Jan 28, 2017 at 9:11 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >>
> >> Have you looked at the hbase-spark module (currently in the master branch)?
> >>
> >> See hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/example/datasources/AvroSource.scala
> >> and hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/DefaultSourceSuite.scala
> >> for examples.
> >>
> >> There may be other options.
> >>
> >> FYI
> >>
> >> On Fri, Jan 27, 2017 at 7:28 PM, jeff saremi <jeffsar...@hotmail.com> wrote:
> >>
> >>> Hi
> >>> I'm seeking some pointers/guidance on what we could do to insert billions
> >>> of records that we already have in Avro files in Hadoop into HBase.
> >>>
> >>> I read some articles online, and one of them recommended using the HFile
> >>> format. I took a cursory look at the documentation for that. Given its
> >>> complexity, I think that may be the last resort we want to pursue,
> >>> unless some library is out there that easily helps us write our files
> >>> into that format. I didn't see any.
> >>> Assuming that the HBase native client may be our best bet, is there any
> >>> advice around pre-partitioning our records or similar techniques that we
> >>> could use?
> >>> thanks
> >>>
> >>> Jeff
> >>
>
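
On the pre-partitioning question in the original post: one common approach is to create the table pre-split into regions and to prefix each row key so that writes spread across those regions. The following is only a minimal sketch under stated assumptions. The class name PreSplitSketch, its helper methods, and the 4-hex-digit hash prefix are all hypothetical and do not come from this thread or from HBase itself. The code has no HBase dependency; the resulting byte[][] is the kind of value that could be passed as the splitKeys argument of Admin.createTable.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of HBase): computes split points for a
// table whose row keys start with a 4-hex-digit prefix in 0000..ffff.
final class PreSplitSketch {

    // Returns numRegions - 1 boundaries that divide the prefix space evenly.
    static List<String> hexSplitPoints(int numRegions) {
        if (numRegions < 2 || numRegions > 0x10000) {
            throw new IllegalArgumentException("numRegions must be in [2, 65536]");
        }
        List<String> points = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            // Use long arithmetic so i * 0x10000 cannot overflow.
            int boundary = (int) ((long) i * 0x10000 / numRegions);
            points.add(String.format("%04x", boundary));
        }
        return points;
    }

    // Converts the boundaries to the byte[][] shape used for split keys.
    static byte[][] toSplitKeys(List<String> points) {
        byte[][] keys = new byte[points.size()][];
        for (int i = 0; i < points.size(); i++) {
            keys[i] = points.get(i).getBytes(StandardCharsets.UTF_8);
        }
        return keys;
    }

    // Assumption: salting with the low 16 bits of String.hashCode() is
    // acceptable for the data; a stronger hash may suit real workloads.
    static String saltedKey(String naturalKey) {
        int prefix = naturalKey.hashCode() & 0xffff;
        return String.format("%04x", prefix) + "-" + naturalKey;
    }
}
```

For example, PreSplitSketch.hexSplitPoints(4) yields the boundaries 4000, 8000 and c000, which gives four regions. The trade-off of a hashed prefix is that range scans over the natural key order are no longer contiguous, so this fits write-heavy imports better than scan-heavy reads.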
