I created an improvement Jira (PHOENIX-7544) to add an empty column to all
column families. Let me know if you have any questions or concerns about
this Jira. I would like to include it in the next release. Here is the
description of it:

In Phoenix, a mutation is a set of cells with the same row key and
timestamp corresponding to an update on the existing row or insertion of
the first image of the row made by an UPSERT statement. This mutation
always includes an empty column cell which is an internal column/cell not
visible to the user. This cell is used for multiple purposes. It allows a
table schema to include only PK columns. Because PK columns are stored in
the row key, and a row in HBase is a set of cells with the same row key,
at least one cell is needed to represent the row, and forming that cell
requires a non-PK column.

The empty column does not hold any value for data tables but it includes
the row status for covered indexes. This row status indicates if the index
row is verified or unverified (or in other words, committed or
uncommitted). This is the second use of the empty column. The third use is
for finding out the last modification time (the row timestamp) for a row.
Instead of reading the cells for all non-PK columns of a row to find the
last modification time, we just need to read the empty column as the empty
column is always included in row mutations.
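
As a rough illustration of this third use, the sketch below models a row as
a list of cells and compares reading every cell against reading only the
empty column cells. The Cell type, the "_0" placeholder qualifier, and the
function names are assumptions made for this example; they are not Phoenix
classes.

```python
from dataclasses import dataclass

EMPTY_COLUMN = "_0"  # placeholder qualifier for the empty column (assumption)


@dataclass(frozen=True)
class Cell:
    family: str
    qualifier: str
    timestamp: int
    value: bytes = b""


def row_timestamp_full_scan(cells):
    """Last modification time found by reading every cell of the row."""
    return max(c.timestamp for c in cells)


def row_timestamp_from_empty_column(cells):
    """Same answer, looking only at the empty column cells."""
    return max(c.timestamp for c in cells if c.qualifier == EMPTY_COLUMN)


# Two UPSERTs on the same row; each mutation carries one empty column cell.
row = [
    Cell("A", EMPTY_COLUMN, 100),
    Cell("A", "v1", 100, b"x"),
    Cell("B", "v2", 100, b"y"),
    Cell("A", EMPTY_COLUMN, 250),
    Cell("B", "v2", 250, b"z"),
]
```

Both functions agree on this row because every mutation includes an empty
column cell with the mutation's timestamp.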

Phoenix compaction, TTLRegionScanner (used for masking expired rows) and
PHOENIX_ROW_TIMESTAMP() rely on the empty column to determine the row
timestamp. In Phoenix, there is only one empty column for a given row.
This empty column is stored in one of the column families if the row has
multiple column families. This creates challenges for Phoenix compaction.

HBase TTL and thus HBase major compaction expire cells individually.
Phoenix compaction is a customized HBase compaction to make sure that the
unit of expiration is row instead of cell in order to eliminate partial row
expiry (a data integrity issue in Phoenix). Compaction is done at the
column family level in HBase; however, Phoenix compaction needs to fetch
all cells from all column families for a given row to determine whether the
row should expire. This requires region-level scans during major compaction
in addition to the compaction scan done at the column family (that is,
store) level. As a result, the same row is scanned multiple times, once for
each column family, which causes performance issues.
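
To make the partial row expiry problem concrete, here is a deliberately
simplified sketch that contrasts cell-level and row-level expiry. Cells are
modeled as (qualifier, timestamp) pairs, and the row-level rule shown here
(keep the whole row while its newest cell is within the TTL) is an
assumption for illustration, not the actual Phoenix compaction logic.

```python
def expire_by_cell(cells, now, ttl):
    """Cell-level TTL: every cell is checked against the TTL on its own."""
    return [(q, ts) for q, ts in cells if now - ts <= ttl]


def expire_by_row(cells, now, ttl):
    """Row-level TTL: keep or drop the row as a unit, based on its newest cell."""
    latest = max(ts for _, ts in cells)
    return list(cells) if now - latest <= ttl else []


# Column v1 was written at t=100 and column v2 at t=900.
row = [("v1", 100), ("v2", 900)]

partial = expire_by_cell(row, now=1000, ttl=500)  # only v2 survives
whole = expire_by_row(row, now=1000, ttl=500)     # both cells survive
```

With cell-level expiry the row loses v1 while v2 remains, which is the
partial row that row-level expiry is meant to prevent.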

The reason that Phoenix major compaction needs to read cells from all
column families today is to see if the time gap between mutations on a
given row is more than the TTL value for the table. If so, the mutations
beyond such a gap should expire. This gap analysis will not be necessary if
every column family includes an empty column, which requires Phoenix to
insert an empty column cell for each column family into every mutation.
Please note that Phoenix does this for only one of the column families
today.
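
The gap check described above can be sketched as follows. This is one
simplified reading of the paragraph, assuming the check walks mutation
timestamps from newest to oldest and stops at the first gap larger than the
TTL; the function name and the sample timestamps are made up for the
example.

```python
def live_mutations(timestamps, now, ttl):
    """Return the mutation timestamps that survive, newest first."""
    kept = []
    previous = now
    for ts in sorted(timestamps, reverse=True):
        if previous - ts > ttl:
            break  # gap larger than the TTL: this and all older mutations expire
        kept.append(ts)
        previous = ts
    return kept


# Row mutated at t=100 and t=900 in family A, and at t=500 in family B.
family_a_only = live_mutations([100, 900], now=1000, ttl=500)
all_families = live_mutations([100, 500, 900], now=1000, ttl=500)
```

Looking at family A alone shows an 800-unit gap and would expire the
mutation at t=100, while the timestamps from all families show no gap
larger than the TTL. If every mutation left an empty column cell in every
column family, each family would see the full timestamp list on its own.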

This is the reason we propose including an empty column for each column
family in Phoenix. Clearly, this proposal requires inserting these empty
column cells not only for new mutations but also the existing mutations on
a table with multiple column families. This means the proposal requires
upgrading such existing tables. The suggested approach is to have an
upgrade process that uses Phoenix tasks (the SYSTEM.TASK table). For each
table to be upgraded, we can add a specific task to insert empty cells into
existing mutations. This task first disables major compaction on the table,
then adds the empty cells, and finally enables major compaction again.
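
The ordering of the three steps could look like the sketch below. The
function and the three callables passed to it are hypothetical stand-ins,
not Phoenix or HBase APIs. Re-enabling major compaction in a finally block
is a design choice of this sketch, so that a failed backfill does not leave
compaction disabled.

```python
def run_upgrade_task(table, disable_major_compaction, add_empty_cells,
                     enable_major_compaction):
    """Run the three upgrade steps for one table, in order."""
    disable_major_compaction(table)
    try:
        add_empty_cells(table)
    finally:
        # Runs even if the backfill raises, so compaction is always re-enabled.
        enable_major_compaction(table)


# Record the order of the calls with simple stand-in callables.
steps = []
run_upgrade_task(
    "T1",
    lambda t: steps.append(("disable", t)),
    lambda t: steps.append(("add", t)),
    lambda t: steps.append(("enable", t)),
)
```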

Please note that Phoenix compaction is an optional feature as it is
disabled by default. Although this proposal is required only for tables
with multiple column families on which Phoenix compaction is enabled, we
suggest including empty columns for all column families from now on,
since doing so simplifies Phoenix by eliminating the logic to determine
which column family should include the empty column. Even though Phoenix
compaction is an optional feature, it is actually required for tables with
TTL since it prevents data integrity issues due to the cell level TTL
feature in HBase.
