Re: Bulk import - is the error general to both MapReduce and non-MapReduce programs?

Stuart White Thu, 02 Apr 2009 14:44:40 -0700

To my understanding, the problem I am facing is not specific to
mapreduce.  So, I would expect that Ryan's code is equally applicable
to your case.
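
For a standalone (non-MapReduce) loader, the retry idea in Ryan's doCommit()
could be sketched roughly as below. This is only an illustration under
assumptions: the class RetryingCommit, the Write interface, and the
maxAttempts / initialBackoffMs parameters are hypothetical names that do not
appear in the thread, and the sketch catches IOException broadly rather than
the specific RetriesExhaustedException in the quoted code. Unlike the quoted
loop, it gives up after a bounded number of attempts and waits between them.

```java
import java.io.IOException;

public class RetryingCommit {

    /** Hypothetical stand-in for a single write, e.g. t.commit(update). */
    @FunctionalInterface
    public interface Write {
        void run() throws IOException;
    }

    /**
     * Runs the write, retrying on IOException with a doubling pause.
     * Returns the number of attempts used, or rethrows the last failure
     * once maxAttempts has been reached.
     */
    public static int commitWithRetry(Write write, int maxAttempts, long initialBackoffMs)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                write.run();
                return attempt;
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the failure to the caller
                }
                Thread.sleep(backoffMs); // pause before the next attempt
                backoffMs *= 2;          // double the pause each time
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated write that fails twice and then succeeds.
        int[] calls = {0};
        int attempts = commitWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated failure");
            }
        }, 5, 10);
        System.out.println("succeeded after " + attempts + " attempts");
    }
}
```

In an import loop, the lambda passed to commitWithRetry would wrap whatever
call performs the actual write; whether a bounded retry suits a given import
depends on how the caller wants to handle rows that still fail.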


On Thu, Apr 2, 2009 at 4:37 PM, Taylor, Ronald C <[email protected]> wrote:
>
> Hello,
>
> I have been following this thread and have a question. I am new to HBase
> coding, and within the past few days I have written a standalone (not
> MapReduce-based) Java program to do a bulk upload into one HBase table. I
> believe that I got the same error that you folks have been talking about. The
> program works fine on small uploads, but fails with the error message you
> mention when moving to imports of tens of thousands of rows. So I wanted to
> ask: has this import error been reported only for MapReduce-based programs,
> or is it indeed more general (in which case I could assume it may be
> affecting my current import program, and I should try using the doCommit()
> code shown below as a fix)?
>  Cheers,
>  Ron Taylor
> ___________________________________________
> Ronald Taylor, Ph.D.
> Computational Biology & Bioinformatics Group
> Pacific Northwest National Laboratory
> 902 Battelle Boulevard
> P.O. Box 999, MSIN K7-90
> Richland, WA  99352 USA
> Office:  509-372-6568
> Email: [email protected]
> www.pnl.gov
>
> -----Original Message-----
> From: Stuart White [mailto:[email protected]]
> Sent: Thursday, April 02, 2009 1:37 PM
> To: [email protected]
> Subject: Re: Bulk import - does sort order of input data affect success rate?
>
> On Thu, Apr 2, 2009 at 3:30 PM, Ryan Rawson <[email protected]> wrote:
>> The last thing - success should not be a function of sort order.
>>
>> However, speed will be related.
>
> How?  Sorted = faster, or Sorted = slower?
>
>>
>> One thing I found I had to do was:
>>    private void doCommit(HTable t, BatchUpdate update) throws IOException {
>>      // Keep retrying the same commit until it succeeds.
>>      boolean committed = false;
>>      while (!committed) {
>>        try {
>>          t.commit(update);
>>          committed = true;
>>        } catch (RetriesExhaustedException e) {
>>          // Client-side retries were exhausted; ignore and try again.
>>        }
>>      }
>>    }
>>
>
> I'm running a mapred job, using TableOutputFormat to write the results to 
> HBase.  For the code you've provided, was that for a custom output format?  
> Or a standalone (non-mapred) application?  I see the point you're making, I 
> just don't understand where I'd put that code.
> Thanks!
>
