Ok... So when the basic tools don't work... How about rolling your own?
Step 1: Take a snapshot and write the file(s) to a different location outside of /hbase (export to local disk on the cluster).

Step 2: Write your own M/R job and control the number of mappers that read from HDFS and write to S3, assuming you want a block-for-block match. If you want to change the number of files (each region would otherwise be a separate file), you could do the write to S3 in the reduce phase, which is what you want here. There's a rough sketch at the bottom of this message.

On Jun 4, 2014, at 7:39 AM, Damien Hardy <[email protected]> wrote:

> Hello,
>
> We are trying to export an HBase table to S3 for backup purposes.
> By default the Export tool runs one map per region, and we want to limit
> the output bandwidth going over the internet (to Amazon S3).
>
> We were thinking of adding some reducers to limit the number of writers,
> but this is explicitly hardcoded to 0 in the Export class:
> ```
> // No reducers. Just write straight to output files.
> job.setNumReduceTasks(0);
> ```
>
> Is there another way (a property?) in Hadoop to limit output bandwidth?
>
> --
> Damien

The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental. Use at your own risk.

Michael Segel
michael_segel (AT) hotmail.com
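Rough, untested sketch of the reducer idea (reading the live table directly rather than the staged snapshot files): an Export-style job that still runs one map per region, but funnels everything through a few reduce tasks that do the S3 writes, so the reducer count caps the number of concurrent uploaders. The table name, bucket path, and reducer count are placeholders, and it assumes an HBase 0.96+ mapreduce setup where TableMapReduceUtil registers the serialization needed to push Result objects through the shuffle, plus S3 credentials already configured on the cluster.

```
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.IdentityTableMapper;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class ThrottledExport {

  /** Pass-through reducer; the (small) number of reduce tasks caps how many tasks write to S3 at once. */
  public static class S3WriterReducer
      extends Reducer<ImmutableBytesWritable, Result, ImmutableBytesWritable, Result> {
    @Override
    protected void reduce(ImmutableBytesWritable key, Iterable<Result> values, Context context)
        throws IOException, InterruptedException {
      for (Result value : values) {
        context.write(key, value);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "hbase-export-to-s3-throttled");
    job.setJarByClass(ThrottledExport.class);

    Scan scan = new Scan();
    scan.setCacheBlocks(false); // full-table scan; don't churn the block cache

    // One map per region reads the table, same as the stock Export tool.
    TableMapReduceUtil.initTableMapperJob(
        "my_table",                       // placeholder table name
        scan,
        IdentityTableMapper.class,
        ImmutableBytesWritable.class,
        Result.class,
        job);

    // Unlike Export (which hardcodes 0 reducers), funnel all output through a
    // handful of reduce tasks so only that many writers hit S3 concurrently.
    job.setReducerClass(S3WriterReducer.class);
    job.setNumReduceTasks(4);             // tune to the upstream bandwidth you can spare

    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(Result.class);
    SequenceFileOutputFormat.setOutputPath(job,
        new Path("s3n://my-bucket/hbase-backup/my_table")); // placeholder bucket/path

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Fewer reducers means fewer simultaneous S3 connections (and a slower overall export), so the reducer count is effectively the bandwidth knob here.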
