Hi,

After reading several documents, it seems that saveAsHadoopDataset cannot
be used with HFileOutputFormat.

saveAsHadoopDataset takes a JobConf, so it belongs to the old Hadoop API
(the org.apache.hadoop.mapred package), while HFileOutputFormat lives in
the org.apache.hadoop.hbase.mapreduce package and is written against the
new Hadoop API (org.apache.hadoop.mapreduce).

Am I right?

If so, is there another way to bulk-load into HBase from an RDD?

Thanks.

From: innowireless TaeYun Kim [mailto:taeyun....@innowireless.co.kr]
Sent: Friday, September 19, 2014 7:17 PM
To: user@spark.apache.org
Subject: Bulk-load to HBase

Hi,

Is there a way to bulk-load into HBase from an RDD?

HBase offers the HFileOutputFormat class for bulk loading from a MapReduce
job, but I cannot figure out how to use it with saveAsHadoopDataset.

Thanks.
