[ 
https://issues.apache.org/jira/browse/HBASE-21810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yechao Chen updated HBASE-21810:
--------------------------------
    Description: 
HBase bulkload (HFileOutputFormat2) generates HFiles using the compression 
configured on the table's column family (CF).

It would sometimes be useful to be able to set the compression on the client.

Some cases from our production environment:

1. When bulkloaded HFiles are replicated between data centers over a 
bandwidth-limited link, we could set the compression of the bulkloaded HFiles 
without changing the table compression.

2. When the table compression is gz/zstd/snappy..., writing the bulkload HFiles 
without compression reduces the HFile creation time, and compaction will 
eventually rewrite the HFiles with the table compression.

3. Sometimes the YARN nodes (where the reducers create the HFiles) or the 
client that runs doBulkLoad have no compression library, but the HBase cluster 
does; a client-side setting is useful in this case.

  was:
hbase bulkload (HFileOutputFormat2) generate hfile ,the compression from the 
table(cf) compression,

if the compression can be set on client ,somethings it's useful,

some case in our production:

1、hfile bulkload replication between the data center with bandwidth limit, we 
can set the compression of the bulkload hfile not changing the table compression

2、bulkload hfile not set  compression ,but the table compression is 
gz/zstd/snappy... ,can reduce the hfile created time and compaction will make 
the hfile to compression finally

3、somethings the yarn nodes (hfile created by reduce) /dobulkload client has no 
compression lib,but the hbase cluster has,it's useful for this case


> bulkload  support set hfile compression on client 
> --------------------------------------------------
>
>                 Key: HBASE-21810
>                 URL: https://issues.apache.org/jira/browse/HBASE-21810
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 1.3.3, 1.4.9, 2.1.2, 1.2.10, 2.0.4
>            Reporter: Yechao Chen
>            Assignee: Yechao Chen
>            Priority: Major
>         Attachments: HBASE-21810.branch-1.001.patch, 
> HBASE-21810.branch-1.2.001.patch, HBASE-21810.branch-2.001.patch, 
> HBASE-21810.master.001.patch
>
>
> HBase bulkload (HFileOutputFormat2) generates HFiles using the compression 
> configured on the table's column family (CF).
> It would sometimes be useful to be able to set the compression on the client.
> Some cases from our production environment:
> 1. When bulkloaded HFiles are replicated between data centers over a 
> bandwidth-limited link, we could set the compression of the bulkloaded HFiles 
> without changing the table compression.
> 2. When the table compression is gz/zstd/snappy..., writing the bulkload 
> HFiles without compression reduces the HFile creation time, and compaction 
> will eventually rewrite the HFiles with the table compression.
> 3. Sometimes the YARN nodes (where the reducers create the HFiles) or the 
> client that runs doBulkLoad have no compression library, but the HBase 
> cluster does; a client-side setting is useful in this case.
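
The behaviour requested above could be sketched roughly as follows. This is 
not the attached patch and does not call any HBase API; it is a standalone 
illustration that assumes a client-side property takes precedence over the 
column family's compression. The property name 
example.bulkload.hfile.compression, the class name and the method name are all 
hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class BulkloadCompressionSketch {
    // Hypothetical client-side property name, invented for this illustration.
    static final String OVERRIDE_KEY = "example.bulkload.hfile.compression";

    /**
     * Picks the codec name used when writing the HFiles of one column family.
     * A non-empty client override wins; otherwise the codec configured on the
     * column family is used, falling back to NONE when none is configured.
     */
    static String resolveCompression(Map<String, String> clientConf,
                                     Map<String, String> familyCompression,
                                     String family) {
        String override = clientConf.get(OVERRIDE_KEY);
        if (override != null && !override.trim().isEmpty()) {
            return override.trim().toUpperCase(Locale.ROOT);
        }
        return familyCompression.getOrDefault(family, "NONE");
    }

    public static void main(String[] args) {
        // Table-side setting: column family "cf" is compressed with SNAPPY.
        Map<String, String> families = new HashMap<>();
        families.put("cf", "SNAPPY");

        // No client override: the column family's codec is used.
        Map<String, String> clientConf = new HashMap<>();
        System.out.println(resolveCompression(clientConf, families, "cf")); // SNAPPY

        // Client override (case 2 or 3 above): write uncompressed HFiles and
        // leave recompression to a later compaction on the cluster.
        clientConf.put(OVERRIDE_KEY, "none");
        System.out.println(resolveCompression(clientConf, families, "cf")); // NONE
    }
}
```

With an override of "none", the reducers would not need a native compression 
library, which matches case 3; with an override such as "gz", the HFiles 
shipped between data centers would be compressed, which matches case 1.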



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
