Maybe run more Thrift gateways? Perhaps one on each host running map tasks, and have the tasks talk to localhost. That way your job doesn't bottleneck on a single Thrift server.
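A minimal sketch of what a streaming mapper could look like under that setup, assuming a Thrift gateway runs on every task host at the default port 9090 and the third-party `happybase` client is available (the table name `solar_cells` is hypothetical):

```python
# Streaming mapper sketch: read RowKey\tColumn\tValue lines from stdin
# and write them through the Thrift gateway on localhost.
import sys

def parse_line(line):
    """Split a RowKey\tColumn\tValue line into put() arguments.
    Only the first two tabs delimit fields, so values may contain tabs."""
    row, column, value = line.rstrip("\n").split("\t", 2)
    return row.encode(), {column.encode(): value.encode()}

def main():
    import happybase  # third-party client: pip install happybase
    conn = happybase.Connection("localhost", 9090)  # per-host gateway
    table = conn.table("solar_cells")  # hypothetical table name
    # Batching puts cuts Thrift round trips per mapper.
    with table.batch(batch_size=1000) as batch:
        for line in sys.stdin:
            batch.put(*parse_line(line))
    conn.close()

if __name__ == "__main__":
    main()
```

Since every mapper talks only to its local gateway, adding task hosts scales the Thrift layer along with the job instead of funneling all puts through one server.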
> Solar Cell Data Management sounds cool.

+1 :)

On Fri, Oct 2, 2015 at 1:19 PM, Stack <[email protected]> wrote:

> On Fri, Oct 2, 2015 at 12:46 PM, Pei Zhao <[email protected]> wrote:
>
> > Hi Stack,
> >
> > In my case, I tried to use the HBase Thrift API, but the Thrift server
> > sometimes crashes during my MapReduce job due to running out of heap
> > memory.
> >
> > Do you have any suggestions on that, please?
>
> Give it more heap?
> What is your client?
>
> Solar Cell Data Management sounds cool.
>
> St.Ack
>
> > Thanks
> >
> > On Fri, Oct 2, 2015 at 3:10 PM, Stack <[email protected]> wrote:
> >
> > > You can't do Hadoop streaming into HBase. Maybe explore the HBase REST
> > > interface and see if you can format puts that HBase REST can digest.
> > > St.Ack
> > >
> > > On Fri, Oct 2, 2015 at 12:00 PM, Pei Zhao <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I am a graduate student doing research in solar cell data management.
> > > > My project uses Hadoop/HBase. Recently we switched our MapReduce jobs
> > > > to Python using Hadoop streaming.
> > > >
> > > > My question is: can I have Hadoop streaming write to stdout in a
> > > > specific format that HBase can pick up and put into tables?
> > > >
> > > > For instance, if I output lines of RowKey\tColumn\tValue, then HBase
> > > > knows how to put them into tables.
> > > >
> > > > Regards
> > > > Pei
>
> --
> Pei (Asher) Zhao
> Electrical Engineering and Computer Science
> Case Western Reserve University
> Cleveland, Ohio 44106
>
> --The man who has made up his mind to win will never say impossible.
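On the REST suggestion above: a minimal sketch of how the RowKey\tColumn\tValue lines could be turned into the JSON documents the HBase REST gateway (Stargate) accepts at PUT /<table>/<row>. In that schema, row keys, column names, and cell values are all base64-encoded. The table name `solar_cells` is an assumption for illustration:

```python
# Build HBase REST (Stargate) JSON bodies from streaming output lines.
import base64
import json

def b64(s):
    """Base64-encode a string the way the REST schema expects."""
    return base64.b64encode(s.encode()).decode()

def to_rest_row(row, column, value):
    """JSON document for one cell: {"Row": [{"key": ..., "Cell": [...]}]}."""
    return {"Row": [{"key": b64(row),
                     "Cell": [{"column": b64(column), "$": b64(value)}]}]}

def line_to_request(line, table="solar_cells"):  # hypothetical table name
    """Turn one RowKey\tColumn\tValue line into a (path, body) pair."""
    row, column, value = line.rstrip("\n").split("\t", 2)
    path = "/%s/%s" % (table, row)
    return path, json.dumps(to_rest_row(row, column, value))
```

A reducer could then send each body to the gateway (default port 8080) with any HTTP client; batching many rows into one document before issuing the PUT cuts round trips considerably.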
