I'm also quite interested in whether anyone has feedback on Ryan's reasoning.
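
For anyone weighing the two constraints Ryan mentions below (row keys written in strict order, and output files aligned with existing regions), here is a minimal sketch in plain Java of what they amount to. It deliberately does not use the HBase API: the class name BulkLoadSketch, both method names, and the sample split keys are hypothetical, and keys are compared as strings purely for illustration (HBase compares raw bytes).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of HBase.
public class BulkLoadSketch {

    // Returns the index of the region whose key range contains rowKey.
    // splitKeys are the sorted region start keys, excluding the first
    // region's empty start key (an assumption of this sketch).
    public static int regionFor(String rowKey, List<String> splitKeys) {
        int region = 0;
        for (String split : splitKeys) {
            if (rowKey.compareTo(split) >= 0) {
                region++;
            } else {
                break;
            }
        }
        return region;
    }

    // Sorts the row keys, then groups them into one bucket per region,
    // so each bucket could become one file that fits a single region.
    public static Map<Integer, List<String>> partition(List<String> rowKeys,
                                                       List<String> splitKeys) {
        List<String> sorted = new ArrayList<>(rowKeys);
        Collections.sort(sorted);
        Map<Integer, List<String>> buckets = new TreeMap<>();
        for (String key : sorted) {
            int region = regionFor(key, splitKeys);
            buckets.computeIfAbsent(region, r -> new ArrayList<>()).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> splits = List.of("g", "p");
        List<String> keys = List.of("zebra", "apple", "kiwi", "grape", "banana");
        // prints {0=[apple, banana], 1=[grape, kiwi], 2=[zebra]}
        System.out.println(partition(keys, splits));
    }
}
```

In a real job, as far as I know, HFileOutputFormat.configureIncrementalLoad arranges the equivalent sort and partitioning against the table's current regions, and the completebulkload tool (LoadIncrementalHFiles) hands the finished files to the region servers. Please check the bulk load documentation for your HBase version before relying on that.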

On Mon, Sep 12, 2011 at 1:18 PM, Brush,Ryan <[email protected]> wrote:

> If you'll forgive the slight topic shift, it seems like the pattern of
> writing directly to HFiles rather than the TableOutputFormat would be
> better for several cases. For instance, TableOutputFormat results in
> everything being written to the WAL, and later compacted into HFiles.
> When practical, why not skip that interim state and produce the HFile
> directly, then do a bulk load?
>
>
> Of course not all jobs that use the TableOutputFormat can easily write to
> HFiles; those files require a strict ordering of row keys being output,
> and bulk loads are optimal only if the HFiles align with existing regions.
> But if such requirements are met, it seems like moving away from
> TableOutputFormat could help IO-bound jobs significantly.
>
> Is my reasoning sound?
>
> On 9/12/11 12:40 PM, "Leif Wickland" <[email protected]> wrote:
>
> >Thanks, Bryan.  I'd love to hear any lessons you learn.  I've used that
> >technique successfully at a prototype level, but haven't yet moved it to
> >production.
> >
> >Leif
> >
> >On Mon, Sep 12, 2011 at 10:51 AM, Bryan Keller <[email protected]> wrote:
> >
> >> Ah that is a very interesting solution Leif, this seems optimal to me.
> >> I am going to try this and I'll report back.
> >>
> >> On Sep 12, 2011, at 9:09 AM, Leif Wickland wrote:
> >>
> >> >
> >> > Bryan,
> >> >
> >> > Have you considered writing your MR output to HFileFormat and then
> >> > asking the regions to adopt the result?   That would allow you to avoid
> >> > committing any changes to HBase until you knew that the MR job ran
> >> > successfully.
> >> >
> >> > Leif
> >>
> >>
>
