Hi Eugen,

You should use saveAsHadoopDataset, to which you pass a JobConf object that you've configured with TableOutputFormat the same way you would for a MapReduce job. The saveAsHadoopFile methods are specifically for output formats that go to a filesystem (e.g. HDFS), but HBase isn't a filesystem.
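A rough sketch of what that looks like, assuming HBase's old-API (org.apache.hadoop.hbase.mapred) TableOutputFormat; the table name "mytable" and the column family/qualifier names are just placeholders:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapred.JobConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

val sc = new SparkContext("local", "hbase-write-example")

// Configure the JobConf exactly as you would for a MapReduce job
// that writes to HBase.
val jobConf = new JobConf(HBaseConfiguration.create())
jobConf.setOutputFormat(classOf[TableOutputFormat])
jobConf.set(TableOutputFormat.OUTPUT_TABLE, "mytable") // placeholder table name

// TableOutputFormat expects (ImmutableBytesWritable, Put) pairs,
// so map your records into that shape first.
val records = sc.parallelize(Seq(("row1", "v1"), ("row2", "v2")))
val puts = records.map { case (key, value) =>
  val put = new Put(Bytes.toBytes(key))
  put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
  (new ImmutableBytesWritable(Bytes.toBytes(key)), put)
}

// saveAsHadoopDataset writes via the output format's committer,
// with no filesystem path involved.
puts.saveAsHadoopDataset(jobConf)
```

The key difference from saveAsHadoopFile is that saveAsHadoopDataset takes the fully configured JobConf and never asks for an output path, which is exactly what a non-filesystem sink like HBase needs.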
Matei

On Oct 11, 2013, at 8:53 AM, Eugen Cepoi <[email protected]> wrote:

> Hi there,
>
> I have a few questions on how best to write to HBase from a Spark job.
>
> - If we want to write using TableOutputFormat, are we supposed to use saveAsNewAPIHadoopFile?
> - Or should we do it by hand (without TableOutputFormat), in a foreach loop for example?
> - Or should we use HFileOutputFormat with saveAsNewAPIHadoopFile?
>
> Thanks,
> Eugen
