Also, could you tell us how small these files are? If they are much smaller
than the default 64MB HDFS block size, you'd want to combine them into larger
files before running a MapReduce job.
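
For what it's worth, here is a rough, untested sketch of that packing step,
writing the small files into a single SequenceFile keyed by file name. The
class name, argument handling, and paths are just placeholders, and error
handling is left out:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Packs a directory of small files into one SequenceFile:
// key = file name, value = raw file bytes.
public class SmallFilePacker {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path inputDir = new Path(args[0]);   // directory of small files
    Path packed = new Path(args[1]);     // output SequenceFile

    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, packed, Text.class, BytesWritable.class);
    try {
      for (FileStatus status : fs.listStatus(inputDir)) {
        if (status.isDir()) {
          continue;
        }
        // Files are assumed small, so reading each one fully is fine.
        byte[] content = new byte[(int) status.getLen()];
        FSDataInputStream in = fs.open(status.getPath());
        try {
          in.readFully(0, content);
        } finally {
          in.close();
        }
        writer.append(new Text(status.getPath().getName()),
                      new BytesWritable(content));
      }
    } finally {
      IOUtils.closeStream(writer);
    }
  }
}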

Cheers,
Akash A

On Thu, Sep 15, 2011 at 6:02 PM, Joey Echeverria <[email protected]> wrote:

> It sounds like you're planning to use the HBase shell to insert all of
> this data. If that's correct, I'd recommend against it. I would write
> a simple MapReduce program to insert the data instead. You could run a
> map-only job that reads in the files and writes each one as a row in
> HBase. With the Java APIs you can write the raw bytes pretty easily.
>
> -Joey
>
> On Thu, Sep 15, 2011 at 7:56 AM, Rita <[email protected]> wrote:
> > I have many small files (close to 1 million) and I was thinking of
> > creating a key-value pair for each of them. The file name can be the
> > key and the content can be the value.
> >
> > Would it be better if I do a base64 on the content and load it into
> > HBase, or try to echo the content through the HBase shell?
> >
> > Has anyone done something similar to this?
> >
> >
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
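
For anyone following this thread later, here is a rough, untested sketch of
the map-only load Joey describes, using the 0.90-era HBase and Hadoop
mapreduce APIs. It reads (file name, file bytes) records from a SequenceFile
like the one above and writes each record as one row with the raw bytes; the
table name "files", column family "f", and class names are just placeholders:

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;

public class HBaseLoad {

  // Turns each (file name, file bytes) record into a single Put.
  static class LoadMapper
      extends Mapper<Text, BytesWritable, ImmutableBytesWritable, Put> {
    @Override
    protected void map(Text fileName, BytesWritable content, Context context)
        throws IOException, InterruptedException {
      byte[] row = Bytes.toBytes(fileName.toString());
      Put put = new Put(row);
      // Store the raw bytes directly; no base64 needed.
      put.add(Bytes.toBytes("f"), Bytes.toBytes("content"),
              Arrays.copyOf(content.getBytes(), content.getLength()));
      context.write(new ImmutableBytesWritable(row), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "load-small-files");
    job.setJarByClass(HBaseLoad.class);
    job.setMapperClass(LoadMapper.class);
    job.setInputFormatClass(SequenceFileInputFormat.class);
    SequenceFileInputFormat.addInputPath(job, new Path(args[0]));
    // Map-only: mappers write Puts straight to the "files" table.
    TableMapReduceUtil.initTableReducerJob("files", null, job);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

You would create the target table with a matching column family first, then
run it with something like: hadoop jar your.jar HBaseLoad /path/to/packed-files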
