Gabriel Reid created PHOENIX-1108:
-------------------------------------
Summary: Clarify, verify, and document intended behavior from
using HColumnDescriptor.KEEP_DELETED_CELLS
Key: PHOENIX-1108
URL: https://issues.apache.org/jira/browse/PHOENIX-1108
Project: Phoenix
Issue Type: Improvement
Reporter: Gabriel Reid
Assignee: Gabriel Reid
The current default for all Phoenix tables is to enable the KEEP_DELETED_CELLS
flag on all column families. The general functionality of this default should
be reviewed, and it should be verified that it works as intended (particularly
with the ChunkedResultIterator, which uses multiple scans).
The general idea of the KEEP_DELETED_CELLS flag is that it prevents deleted
cells from being permanently removed during a (major) compaction. If the number
of versions to keep for a cell is small (3 is the default) then this won’t
cause a major problem, and it might in fact be needed in order to function correctly
(i.e. to handle deletes and a major compaction occurring while a query is being
run).
On the other hand, if the number of versions to keep for a column family is
large (e.g. Integer.MAX_VALUE), the default of KEEP_DELETED_CELLS=true will
mean that a delete in Phoenix never actually deletes data.
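To make the two cases above concrete, here is a rough toy model of a major
compaction for a single cell. It is only a sketch under simplifying
assumptions and does not use any HBase or Phoenix code: the class and method
names (KeepDeletedCellsModel, majorCompact) are hypothetical, a single delete
marker is assumed to mask every version at or below its timestamp, masked
versions are assumed to count toward the version limit like any other version,
and TTL is ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only (hypothetical names, not HBase code): which versions of one
// cell survive a major compaction, depending on KEEP_DELETED_CELLS.
public class KeepDeletedCellsModel {

    // putTimestamps:    timestamps of the versions written to the cell
    // deleteTimestamp:  a delete marker masking all versions with ts <= this value
    // maxVersions:      the column family's configured number of versions
    // keepDeletedCells: the KEEP_DELETED_CELLS setting being modeled
    public static List<Long> majorCompact(List<Long> putTimestamps, long deleteTimestamp,
                                          int maxVersions, boolean keepDeletedCells) {
        List<Long> newestFirst = new ArrayList<>(putTimestamps);
        newestFirst.sort(Collections.reverseOrder());

        List<Long> survivors = new ArrayList<>();
        for (long ts : newestFirst) {
            boolean masked = ts <= deleteTimestamp;
            if (masked && !keepDeletedCells) {
                continue; // deleted data is physically dropped
            }
            if (survivors.size() < maxVersions) {
                survivors.add(ts); // retained, but still bounded by the version limit
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        List<Long> puts = Arrays.asList(1L, 2L, 3L, 4L, 5L);
        long deleteAt = 5L; // the delete masks every version written so far

        // Small version limit: only the newest 3 deleted versions are retained.
        System.out.println(majorCompact(puts, deleteAt, 3, true));
        // KEEP_DELETED_CELLS=false: the delete removes everything.
        System.out.println(majorCompact(puts, deleteAt, 3, false));
        // Very large version limit: the delete removes nothing.
        System.out.println(majorCompact(puts, deleteAt, Integer.MAX_VALUE, true));
    }
}
```

In this model the third call returns all five versions, which is the situation
described above: with a very large version limit, nothing is ever physically
removed by a delete.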
Tasks to be performed are:
* clear up (and document) the intended behavior of using
KEEP_DELETED_CELLS=true as a default in Phoenix
* add tests to verify that this intended behavior still works with the
ChunkedResultIterator
* document the implications and/or workaround if a large number of versions is
configured for a column family
--
This message was sent by Atlassian JIRA
(v6.2#6252)