Re: map reduce range of records from hbase table

stack Wed, 08 Oct 2008 22:11:23 -0700

On Wed, Oct 8, 2008 at 9:01 PM, Jaeyun Noh <[EMAIL PROTECTED]> wrote:


> Thx.
>
> BTW, it seems that the output format (subclass of
> org.apache.hadoop.mapred.OutputFormat) of an MR job can only be a file. Can we
> define our own file format which hbase clients can access?


No, it does not have to be a file.  You can output to anything as long as
you make it implement OutputFormat.  To output to hbase, subclass
TableReduce or see TableOutputFormat.


>
>
> My goal is to implement a filter-enabled table scanner that is run by
> multi-process clients using MR. I'm trying to leverage MR since the
> ClientScanner class of HTable accesses HRegions sequentially and thus
> involves multiple round trips between servers and clients.


I'm not sure I follow.  Perhaps start simple, then see where the bottlenecks
are and optimize there.  Regarding round trips between client and server,
what do you want? A scanner that returns batches rather than a row at a time?
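
To make the batching idea concrete, here is a minimal sketch in plain Java.
It does not use any HBase API: FakeServer, BatchingScanner and
BatchScanSketch are hypothetical names made up for illustration, and
fetch() only simulates one client/server round trip. It assumes the server
can hand back up to N rows per call, and shows how the round-trip count
drops as the batch size grows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a remote server; NOT an HBase class.
class FakeServer {
    private final List<String> rows;
    int roundTrips = 0; // number of simulated RPCs made so far

    FakeServer(List<String> rows) { this.rows = rows; }

    // One simulated round trip: return up to 'limit' rows from 'offset'.
    List<String> fetch(int offset, int limit) {
        roundTrips++;
        int end = Math.min(rows.size(), offset + limit);
        return new ArrayList<>(rows.subList(Math.min(offset, end), end));
    }
}

// Hypothetical client-side scanner that buffers one batch at a time.
class BatchingScanner implements Iterator<String> {
    private final FakeServer server;
    private final int batchSize;
    private List<String> buffer = new ArrayList<>();
    private int pos = 0;           // next unread index in buffer
    private int nextOffset = 0;    // next row offset to ask the server for
    private boolean exhausted = false;

    BatchingScanner(FakeServer server, int batchSize) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize < 1");
        this.server = server;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        if (pos < buffer.size()) return true;   // still rows in the buffer
        if (exhausted) return false;
        buffer = server.fetch(nextOffset, batchSize);
        nextOffset += buffer.size();
        pos = 0;
        // A short batch means the server has no more rows.
        if (buffer.size() < batchSize) exhausted = true;
        return !buffer.isEmpty();
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        return buffer.get(pos++);
    }
}

class BatchScanSketch {
    // Scans rowCount fake rows; returns {rowsSeen, roundTrips}.
    static int[] scan(int rowCount, int batchSize) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) rows.add("row-" + i);
        FakeServer server = new FakeServer(rows);
        BatchingScanner scanner = new BatchingScanner(server, batchSize);
        int seen = 0;
        while (scanner.hasNext()) { scanner.next(); seen++; }
        return new int[] { seen, server.roundTrips };
    }

    public static void main(String[] args) {
        for (int batch : new int[] { 1, 4, 100 }) {
            int[] r = scan(10, batch);
            System.out.println("batch=" + batch + " rows=" + r[0]
                    + " roundTrips=" + r[1]);
        }
    }
}
```

If round trips really are the bottleneck, a scanner along these lines is
likely a simpler first step than moving the whole scan into MR.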

St.Ack





>
>
> On Wed, Oct 8, 2008 at 4:30 PM, stack <[EMAIL PROTECTED]> wrote:
>
> > Jaeyun Noh wrote:
> >
> >> Hi,
> >>
> >> May I ask another question?
> >>
> >> I'm running HBase/Hadoop on a linux server, and implementing a business
> >> application in java, which runs on a different windows machine.
> >> It looks like the MapReduce job runs on a server node. Can I run a
> >> MapReduce job built on the windows client against the existing linux
> >> server? How can we get the result produced by the MapReduce job on
> >> the server?
> >>
> >>
> >
> > You should be able to, yes.  Make sure you use same java on both
> > machines.  This page might help some:
> > http://wiki.apache.org/hadoop/Hbase/MapReduce
> > St.Ack
> >
>
