[ https://issues.apache.org/jira/browse/HBASE-21810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yechao Chen updated HBASE-21810:
--------------------------------
Description:
HBase bulk load (HFileOutputFormat2) generates HFiles using the compression
configured on the table's column family (CF).
It would sometimes be useful to be able to set the compression on the client
instead.
Some cases from our production:
1. HFile bulk load replication between data centers with a bandwidth limit: we
can set the compression of the bulk-loaded HFiles without changing the table
compression.
2. The bulk-loaded HFiles are written without compression while the table
compression is gz/zstd/snappy...: this reduces the HFile creation time, and
compaction will eventually rewrite the HFiles with the table compression.
3. Sometimes the YARN nodes (where the reducers create the HFiles) or the
doBulkLoad client have no compression library, but the HBase cluster does; a
client-side setting is useful in this case.
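
A minimal sketch of the precedence this request asks for, assuming a
client-side setting that wins over the table (CF) compression when it is
present. The key name "example.bulkload.hfile.compression" and the
resolveCompression helper are hypothetical and are not taken from the attached
patches or from the HBase API; the sketch uses plain Java collections instead
of an HBase Configuration so that it stays self-contained.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class BulkLoadCompressionSketch {
    // Hypothetical client-side key; NOT a confirmed HBase configuration name.
    static final String CLIENT_OVERRIDE_KEY = "example.bulkload.hfile.compression";

    /**
     * Picks the compression name for the HFiles written for one column family.
     * A non-blank client override wins; otherwise the table (CF) compression
     * is used, which matches the behaviour described in the issue today.
     */
    static String resolveCompression(Map<String, String> clientConf,
                                     String tableCfCompression) {
        String override = clientConf.get(CLIENT_OVERRIDE_KEY);
        if (override != null && !override.trim().isEmpty()) {
            return override.trim().toLowerCase(Locale.ROOT);
        }
        return tableCfCompression.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // No client setting: fall back to the table (CF) compression.
        System.out.println(resolveCompression(conf, "SNAPPY"));

        // Case 2 or 3 from the description: write uncompressed HFiles on the
        // client and let compaction apply the table compression later.
        conf.put(CLIENT_OVERRIDE_KEY, "none");
        System.out.println(resolveCompression(conf, "SNAPPY"));
    }
}
```

Keeping the table compression as the fallback means existing bulk load jobs
would behave as before unless the client sets the override explicitly.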
was:
hbase bulkload (HFileOutputFormat2) generate hfile ,the compression from the
table(cf) compression,
if the compression can be set on client ,somethings it's useful,
some case in our production:
1、hfile bulkload replication between the data center with bandwidth limit, we
can set the compression of the bulkload hfile not changing the table compression
2、bulkload hfile not set compression ,but the table compression is
gz/zstd/snappy... ,can reduce the hfile created time and compaction will make
the hfile to compression finally
3、somethings the yarn nodes (hfile created by reduce) /dobulkload client has no
compression lib,but the hbase cluster has,it's useful for this case
> bulkload support set hfile compression on client
> --------------------------------------------------
>
> Key: HBASE-21810
> URL: https://issues.apache.org/jira/browse/HBASE-21810
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Affects Versions: 1.3.3, 1.4.9, 2.1.2, 1.2.10, 2.0.4
> Reporter: Yechao Chen
> Assignee: Yechao Chen
> Priority: Major
> Attachments: HBASE-21810.branch-1.001.patch,
> HBASE-21810.branch-1.2.001.patch, HBASE-21810.branch-2.001.patch,
> HBASE-21810.master.001.patch
>