Ishan Chhabra created HBASE-10323:
-------------------------------------
Summary: Auto detect data block encoding in HFileOutputFormat
Key: HBASE-10323
URL: https://issues.apache.org/jira/browse/HBASE-10323
Project: HBase
Issue Type: Improvement
Reporter: Ishan Chhabra
Assignee: Ishan Chhabra
Currently, one has to specify the data block encoding of the table explicitly
using the config parameter
"hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk
load. This option is easily missed, is not documented, and works differently
from compression, block size and bloom filter type, which are auto detected.
The solution would be to add support for auto detecting the data block
encoding, as is already done for these other parameters.
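
As a rough illustration of what auto detection means here, the sketch below
builds a per-column-family encoding map from table metadata. It is not the
code from the patch: the Encoding enum and the Family class are hypothetical
stand-ins for the HBase types, and detect() is an assumed helper name.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class EncodingDetectionSketch {
    // Stand-in for HBase's data block encoding enum (assumption, not the real type).
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    // Hypothetical stand-in for one column family's table metadata.
    static final class Family {
        final String name;
        final Encoding encoding; // null means the family does not set an encoding

        Family(String name, Encoding encoding) {
            this.name = name;
            this.encoding = encoding;
        }
    }

    // Reads the encoding of every family from the table metadata,
    // falling back to NONE when a family does not specify one.
    static Map<String, Encoding> detect(Iterable<Family> families) {
        Map<String, Encoding> result = new LinkedHashMap<>();
        for (Family f : families) {
            result.put(f.name, f.encoding == null ? Encoding.NONE : f.encoding);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Encoding> detected = detect(Arrays.asList(
                new Family("cf1", Encoding.FAST_DIFF),
                new Family("cf2", null)));
        System.out.println(detected); // prints {cf1=FAST_DIFF, cf2=NONE}
    }
}
```

With a map like this, the job no longer depends on the user remembering to
set the config parameter for each bulk load.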
The current patch does the following:
1. Automatically detects datablock encoding in HFileOutputFormat.
2. Keeps the legacy option of manually specifying the datablock encoding
around as a way to override auto detection.
3. Moves string conf parsing to the start of the program so that it fails
fast at startup instead of failing during record writes. It also
makes the internals of the program type safe.
4. Adds missing doc strings and unit tests for code serializing and
deserializing config parameters for bloom filter type, block size and
datablock encoding.
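
The points above can be sketched roughly as follows. This is a minimal
illustration under stated assumptions, not the patch itself: the Encoding
enum stands in for the HBase type, the method names are hypothetical, and
the "family=value&family=value" string format is assumed for the example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class EncodingConfSketch {
    // Stand-in for HBase's data block encoding enum (assumption, not the real type).
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    // Parses a config string into a typed value. Calling this once during job
    // setup means an unknown name throws IllegalArgumentException at startup,
    // not in the middle of record writes.
    static Encoding parse(String name) {
        return Encoding.valueOf(name.trim().toUpperCase(Locale.ROOT));
    }

    // The manually specified value, when present, overrides the detected one.
    static Encoding resolve(String override, Encoding detected) {
        return override != null ? parse(override) : detected;
    }

    // Serializes the per-family map as "cf1=DIFF&cf2=NONE" (assumed format).
    // Family names containing '=' or '&' would need escaping, omitted here.
    static String serialize(Map<String, Encoding> byFamily) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Encoding> e : byFamily.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(e.getValue().name());
        }
        return sb.toString();
    }

    // Inverse of serialize(); rejects malformed entries and unknown encodings.
    static Map<String, Encoding> deserialize(String conf) {
        Map<String, Encoding> byFamily = new LinkedHashMap<>();
        if (conf == null || conf.isEmpty()) {
            return byFamily;
        }
        for (String pair : conf.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Malformed entry: " + pair);
            }
            byFamily.put(pair.substring(0, eq), parse(pair.substring(eq + 1)));
        }
        return byFamily;
    }
}
```

Keeping the values as an enum after the initial parse is what gives the
type safety mentioned in item 3, and a serialize/deserialize round trip is
the kind of property the unit tests in item 4 could check.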
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)