Hi Matei,

OK, thanks, I will try it. Indeed, using saveAsNewAPIHadoopFile was not working, as TableOutputFormat implements Configurable and its setConf method was never called.
BTW, you have done a great job with Spark; it combines so nicely with Scala, the API is clean, and it is really easy to work with. I am impressed =)

Eugen

2013/10/12 Matei Zaharia <[email protected]>

> Hi Eugen,
>
> You should use saveAsHadoopDataset, to which you pass a JobConf object
> that you've configured with TableOutputFormat the same way you would for a
> MapReduce job. The saveAsHadoopFile methods are specifically for output
> formats that go to a filesystem (e.g. HDFS), but HBase isn't a filesystem.
>
> Matei
>
> On Oct 11, 2013, at 8:53 AM, Eugen Cepoi <[email protected]> wrote:
>
> > Hi there,
> >
> > I have a few questions on how best to write to HBase from a Spark job.
> >
> > - If we want to write using TableOutputFormat, are we supposed to use
> > saveAsNewAPIHadoopFile?
> > - Or should we do it by hand (without TableOutputFormat), in a foreach
> > loop for example?
> > - Or should we use HFileOutputFormat with saveAsNewAPIHadoopFile?
> >
> > Thanks,
> > Eugen
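For reference, here is a minimal sketch of the approach Matei describes: a JobConf configured with the old-API (mapred) HBase TableOutputFormat, passed to saveAsHadoopDataset. The table name "my_table" and the column family/qualifier "cf"/"col" are placeholders, and an hbase-site.xml with the correct ZooKeeper quorum is assumed to be on the classpath.

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Put
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapred.TableOutputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.hadoop.mapred.JobConf
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._

    object HBaseWriteSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local", "hbase-write-sketch")

        // Configure the JobConf exactly as you would for a MapReduce job.
        val jobConf = new JobConf(HBaseConfiguration.create())
        jobConf.setOutputFormat(classOf[TableOutputFormat])
        jobConf.set(TableOutputFormat.OUTPUT_TABLE, "my_table") // placeholder table
        jobConf.setOutputKeyClass(classOf[ImmutableBytesWritable])
        jobConf.setOutputValueClass(classOf[Put])

        // Turn the data into (rowkey, Put) pairs, the key/value types
        // that TableOutputFormat expects.
        val puts = sc.parallelize(Seq("row1" -> "v1", "row2" -> "v2")).map {
          case (key, value) =>
            val put = new Put(Bytes.toBytes(key))
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
            (new ImmutableBytesWritable(Bytes.toBytes(key)), put)
        }

        // saveAsHadoopDataset takes the JobConf directly and no output path,
        // which fits HBase: records go through the HBase client rather than
        // to a filesystem, which is why the saveAs*HadoopFile variants (which
        // require a path) are the wrong fit here.
        puts.saveAsHadoopDataset(jobConf)
        sc.stop()
      }
    }

For the third option in the original question, HFileOutputFormat is aimed at bulk loading (writing HFiles to be handed to HBase afterwards); for direct writes from a job, the TableOutputFormat route above is the simpler fit.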
