Wellington Chevreuil created HBASE-29412:
--------------------------------------------
Summary: Extend date tiered compaction to allow for tiering by
values other than cell timestamp
Key: HBASE-29412
URL: https://issues.apache.org/jira/browse/HBASE-29412
Project: HBase
Issue Type: Task
Components: Compaction
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil
Extend DateTieredCompactor with a CustomTieredCompactor that uses a pluggable
value provider to extract the value used for comparison in this tiered
compaction.
Define a built-in value provider that uses a configurable qualifier value for
comparison in the tiered compaction. Using a qualifier value for grouping data
may require propagating this value to all other cells within the same row key:
* We can use cell tags to append the tiering value to all other cells within a
row. This might be needed by the multi file writer, as cells are appended
individually, and the multi file writer must know to which tier file the
appended cell must be forwarded.
* Finding the tiering value for each row requires scanning back and forth over
the cells of that row, because the tiering value must be known before the row
starts being written to new files.
* If a given row doesn't have the tiering value, treat it as high priority and
tag its cells to be tiered within the highest priority group.
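The points above could be sketched roughly as follows. This is only an
illustration in plain Java, not HBase code: SimpleCell, TaggedCell,
TieringValueProvider and qualifierProvider are hypothetical names, the tiering
value is assumed to be an 8-byte long, and "highest priority" is assumed to
map to Long.MAX_VALUE.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class RowTieringSketch {
    /** Simplified stand-in for an HBase cell; not the real Cell interface. */
    public record SimpleCell(String row, String qualifier, long timestamp, byte[] value) {}

    /** A cell paired with the tiering value propagated from its row (stand-in for a cell tag). */
    public record TaggedCell(SimpleCell cell, long tieringValue) {}

    /** Hypothetical pluggable provider: derives one tiering value from all cells of a row. */
    public interface TieringValueProvider {
        long tieringValue(List<SimpleCell> rowCells);
    }

    /** Assumption: rows lacking the tiering qualifier go to the highest priority group. */
    public static final long HIGHEST_PRIORITY = Long.MAX_VALUE;

    /** Built-in style provider: reads the value of a configured qualifier as a long. */
    public static TieringValueProvider qualifierProvider(String tieringQualifier) {
        return rowCells -> {
            for (SimpleCell c : rowCells) {
                if (c.qualifier().equals(tieringQualifier) && c.value().length == Long.BYTES) {
                    return ByteBuffer.wrap(c.value()).getLong();
                }
            }
            return HIGHEST_PRIORITY;
        };
    }

    /**
     * Buffers a whole row, resolves its tiering value first, then tags every
     * cell so a downstream writer can route each cell individually.
     */
    public static List<TaggedCell> tagRow(List<SimpleCell> rowCells, TieringValueProvider provider) {
        long value = provider.tieringValue(rowCells);
        List<TaggedCell> tagged = new ArrayList<>(rowCells.size());
        for (SimpleCell c : rowCells) {
            tagged.add(new TaggedCell(c, value));
        }
        return tagged;
    }

    public static void main(String[] args) {
        byte[] ts = ByteBuffer.allocate(Long.BYTES).putLong(42L).array();
        List<SimpleCell> row = List.of(
            new SimpleCell("r1", "name", 1L, "x".getBytes()),
            new SimpleCell("r1", "event_ts", 1L, ts));
        List<TaggedCell> tagged = tagRow(row, qualifierProvider("event_ts"));
        System.out.println(tagged.get(0).tieringValue()); // prints 42
    }
}
```

Buffering the full row before tagging is what makes the first cell ("name")
carry the same tiering value as the cell that actually holds it.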
The tiering boundary values (min/max) of each tiering group should be persisted
to the related store file as a KV pair in the file info portion of the file.
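One possible shape for the boundary tracking, again as a plain Java sketch
under assumptions: the key names TIERING_VALUE_MIN and TIERING_VALUE_MAX are
made up for illustration, and a Map<String, Long> stands in for the store
file's file info section, which this sketch does not touch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TierBoundarySketch {
    // Hypothetical file info keys; the issue does not name them.
    public static final String MIN_KEY = "TIERING_VALUE_MIN";
    public static final String MAX_KEY = "TIERING_VALUE_MAX";

    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private boolean seen = false;

    /** Called once per cell appended to this tier's file. */
    public void track(long tieringValue) {
        min = Math.min(min, tieringValue);
        max = Math.max(max, tieringValue);
        seen = true;
    }

    /** KV pairs to persist when the file is closed; empty if no cell was tracked. */
    public Map<String, Long> toFileInfo() {
        Map<String, Long> info = new LinkedHashMap<>();
        if (seen) {
            info.put(MIN_KEY, min);
            info.put(MAX_KEY, max);
        }
        return info;
    }

    public static void main(String[] args) {
        TierBoundarySketch tier = new TierBoundarySketch();
        tier.track(5L);
        tier.track(3L);
        tier.track(9L);
        System.out.println(tier.toFileInfo());
    }
}
```

One tracker per tier file would be enough, since each file only needs its own
min/max recorded.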
--
This message was sent by Atlassian Jira
(v8.20.10#820010)