[ https://issues.apache.org/jira/browse/PHOENIX-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073006#comment-14073006 ]
Gabriel Reid commented on PHOENIX-1108:
---------------------------------------

Related mail discussion here: http://s.apache.org/FFe

> Clarify, verify, and document intended behavior from using HColumnDescriptor.KEEP_DELETED_CELLS
> -----------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-1108
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1108
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>
> The current default for all Phoenix tables is to enable the
> KEEP_DELETED_CELLS flag on all column families. The general functionality of
> this default should be reviewed, and it should be checked that it works as
> intended (particularly in terms of the ChunkedResultIterator, which uses
> multiple scans).
>
> The general idea of the KEEP_DELETED_CELLS flag is that it prevents deleted
> cells from being permanently removed during a (major) compaction. If the
> number of versions to keep for a cell is small (3 is the default) then this
> won't cause a major problem, and it might be needed in order to function
> correctly (i.e. to handle deletes and a major compaction occurring while a
> query is being run).
>
> On the other hand, if the number of versions to keep for a column family is
> large (e.g. Integer.MAX_VALUE), the default of KEEP_DELETED_CELLS=true will
> mean that a delete in Phoenix never actually deletes data.
>
> Tasks to be performed are:
> * clear up (and document) the intended behavior of using
>   KEEP_DELETED_CELLS=true as a default in Phoenix
> * add tests to verify that this intended behavior still works with the
>   ChunkedResultIterator
> * document the implications and/or workaround if a large number of versions
>   is configured for a column family
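
To make the interaction between KEEP_DELETED_CELLS and the configured number of versions concrete, here is a minimal sketch in Python. It is a toy model of the behavior described in the issue, not HBase or Phoenix code: the `Cell` class, the `major_compact` function, and the simplified rule that a delete marker masks every version at or below its timestamp are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One stored version of a single column (hypothetical, for illustration)."""
    timestamp: int
    value: str


def major_compact(cells, delete_ts, keep_deleted_cells, max_versions):
    """Return the versions that survive a major compaction in this toy model.

    cells: all stored versions of one column
    delete_ts: timestamp of a delete marker masking versions <= delete_ts,
               or None if the column was never deleted
    """
    newest_first = sorted(cells, key=lambda c: c.timestamp, reverse=True)
    if not keep_deleted_cells and delete_ts is not None:
        # Without the flag, versions masked by the delete marker are purged.
        newest_first = [c for c in newest_first if c.timestamp > delete_ts]
    # In both cases, versions beyond the configured maximum are dropped.
    return newest_first[:max_versions]


# Five versions with timestamps 1..5, then a delete issued at timestamp 5.
history = [Cell(ts, f"v{ts}") for ts in range(1, 6)]

small = major_compact(history, delete_ts=5, keep_deleted_cells=True, max_versions=3)
large = major_compact(history, delete_ts=5, keep_deleted_cells=True, max_versions=2**31 - 1)
purged = major_compact(history, delete_ts=5, keep_deleted_cells=False, max_versions=3)

print(len(small), len(large), len(purged))
```

In this model a small version limit bounds how much deleted data is retained, while a limit of Integer.MAX_VALUE (2**31 - 1 here) retains every deleted version, which is the "delete never actually deletes data" case raised in the issue.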