Re: Bulk import - does sort order of input data affect success rate?

Ryan Rawson Thu, 02 Apr 2009 13:31:19 -0700

To your last option: success should not be a function of sort order.

Speed, however, will be affected by it.


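One thing I found I had to do was retry the commit until it succeeded:
    private void doCommit(HTable t, BatchUpdate update) throws IOException {
      boolean committed = false;
      // Keep retrying until the commit goes through.
      while (!committed) {
        try {
          t.commit(update);
          committed = true;
        } catch (RetriesExhaustedException e) {
          // The client gave up after its own retries; ignore and try again.
        }
      }
    }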
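
A loop like the one above never gives up if the cluster stays down. If that is a concern, a bounded variant with a pause between attempts might look like the sketch below. It is a standalone illustration and not part of the HBase API: BoundedRetry, IoAction, and runWithRetries are hypothetical names, the 10 second cap is an arbitrary choice, and it assumes the failure surfaces as an IOException.

```java
import java.io.IOException;

final class BoundedRetry {
    /** Hypothetical stand-in for a call such as t.commit(update). */
    @FunctionalInterface
    public interface IoAction {
        void run() throws IOException;
    }

    /**
     * Runs the action until it succeeds or maxAttempts is reached,
     * sleeping between attempts and doubling the delay (capped at 10 s).
     * Returns the number of attempts used; rethrows the last IOException
     * if every attempt fails.
     */
    public static int runWithRetries(IoAction action, int maxAttempts, long initialBackoffMs)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoffMs = initialBackoffMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return attempt;
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Give the server a moment to recover before trying again.
                    Thread.sleep(backoffMs);
                    backoffMs = Math.min(backoffMs * 2, 10_000L);
                }
            }
        }
        throw last; // non-null here: the loop ran at least once and every attempt failed
    }

    public static void main(String[] args) throws Exception {
        // Simulated action (an assumption for the demo): fails twice, then succeeds.
        int[] calls = {0};
        int used = runWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated transient failure");
            }
        }, 5, 10L);
        System.out.println("succeeded after " + used + " attempt(s)");
    }
}
```

Inside doCommit this would be roughly BoundedRetry.runWithRetries(() -> t.commit(update), 10, 100L), with the final IOException left to propagate so the task fails visibly instead of spinning.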

good luck!
-ryan

On Thu, Apr 2, 2009 at 1:28 PM, Stuart White <[email protected]> wrote:

> I, like many others, am having difficulty getting a mapred job that
> bulk imports data into an HBase table to run successfully to
> completion.
>
> At this time, rather than get into specifics of my configuration, the
> exceptions I'm receiving, etc..., I wanted to ask a general question:
>
> Should I expect my bulk import to be more likely to succeed if my data
> is sorted by its key?
> Or should I expect my bulk import to be more likely to succeed if my
> data is randomized?
> Or should I expect the ordering of my input data to have no effect on
> my ability to successfully bulk import records?
>
> Thanks.
>
