No, I had not. I will take a look. Thanks, Ted
________________________________
From: Ted Yu <[email protected]>
Sent: Friday, January 27, 2017 7:41 PM
To: [email protected]
Subject: Re: Writing/Importing large number of records into HBase

Have you looked at the hbase-spark module (currently in the master branch)?

See hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/example/datasources/AvroSource.scala
and hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/DefaultSourceSuite.scala
for examples.

There may be other options.

FYI

On Fri, Jan 27, 2017 at 7:28 PM, jeff saremi <[email protected]> wrote:
> Hi
> I'm seeking some pointers/guidance on what we could do to insert the
> billions of records that we already have in Avro files in Hadoop into
> HBase.
>
> I read some articles online, and one of them recommended using the HFile
> format. I took a cursory look at the documentation for it. Given its
> complexity, I think that may be the last resort we want to pursue, unless
> some library is out there that easily helps us write our files in that
> format. I didn't see any.
> Assuming that the HBase native client may be our best bet, is there any
> advice on pre-partitioning our records or similar techniques we could
> use?
> thanks
>
> Jeff
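For reference, a minimal sketch of what the write path through the hbase-spark
datasource might look like, assuming Spark 2.x with the spark-avro package on
the classpath. The table name "records", the HDFS path, and the column mappings
in the catalog are illustrative placeholders, not details from the thread:

import org.apache.spark.sql.SparkSession
import org.apache.hadoop.hbase.spark.datasources.HBaseTableCatalog

object AvroToHBase {
  // Catalog JSON maps DataFrame columns to an HBase row key and a
  // column family; the schema here is purely illustrative.
  val catalog =
    s"""{
       |  "table": {"namespace": "default", "name": "records"},
       |  "rowkey": "key",
       |  "columns": {
       |    "id":    {"cf": "rowkey", "col": "key",   "type": "string"},
       |    "name":  {"cf": "d",      "col": "name",  "type": "string"},
       |    "value": {"cf": "d",      "col": "value", "type": "string"}
       |  }
       |}""".stripMargin

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("AvroToHBase").getOrCreate()

    // Read the existing Avro files from HDFS (placeholder path).
    val df = spark.read
      .format("com.databricks.spark.avro")
      .load("hdfs:///data/records")

    // Write through the hbase-spark connector. newTable -> "5" asks it
    // to create the table pre-split into 5 regions, which also speaks to
    // the pre-partitioning question in the quoted mail.
    df.write
      .options(Map(
        HBaseTableCatalog.tableCatalog -> catalog,
        HBaseTableCatalog.newTable -> "5"))
      .format("org.apache.hadoop.hbase.spark")
      .save()

    spark.stop()
  }
}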

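And if the table is created up front rather than by the connector, pre-splitting
it is the usual answer to the pre-partitioning question. A sketch against the
HBase 1.x Admin API, under the assumption (not stated in the thread) that row
keys carry a fixed-width hex hash prefix so splits on hex boundaries spread the
load evenly:

import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.util.Bytes

object PreSplitTable {
  def main(args: Array[String]): Unit = {
    val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val admin = conn.getAdmin

    // Table and family names are illustrative placeholders.
    val desc = new HTableDescriptor(TableName.valueOf("records"))
    desc.addFamily(new HColumnDescriptor("d"))

    // 16 initial regions: split points at "10", "20", ..., "f0",
    // assuming row keys start with a two-hex-digit hash prefix.
    val splits: Array[Array[Byte]] =
      (1 until 16).map(i => Bytes.toBytes(f"$i%x0")).toArray

    admin.createTable(desc, splits)
    admin.close()
    conn.close()
  }
}

The same split points would also be the starting point for the HFile bulk-load
route mentioned in the thread, should that ever become necessary.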