Hi Gabriel, For (1) ensure that all data types have an explicit way of writing null values, you can write an empty byte array as the value and it'll work the way you want it to already. You'd need to conditionally generate a Put with this empty byte array instead of a Delete, but I believe it's isolated to a few lines of code in PRowImpl. You'd want to make sure that indexes behave correctly as well, but I believe if a Put was generated for the data table update, that a corresponding Put would be generated for the index updates. Perhaps you could add a new config param to control the global default (it'd need to be both on the client and server hbase-sites.xml) and/or SYSTEM.CATALOG column to control this on a per table basis. This would be generally useful for Phoenix to have this mode, as the transaction library we're considering using does not deal with Delete markers, so we need this functionality.
Not sure if the semantics match what you'd want with KEEP_DELETED_CELLS off and a delete of a row occurring, but I think it would, though, since the row deletions would always sort before the puts. Thanks, James On Fri, Jan 9, 2015 at 5:46 AM, Gabriel Reid <gabriel.r...@gmail.com> wrote: > Hi, > > We currently have a use case where we want to be able to do flashback > queries (i.e. provide a CurrentSCN on a Phoenix connection to read > previous versions of records). In the same table where we're doing > this, we are also regularly adding and deleting records. What we want > to do is have versioning at row level, but allow permanently deleting > a row (i.e. if a row is deleted, all versions of all columns should > also be deleted, and no longer be visible in flashback queries). > > The issue at hand is that if we set a cell's value to null (e.g. > "UPSERT INTO mytab (id, name) values (1, null)") then all the history > of that one column for that one record is lost after the next major > compaction. > > This can be worked around by making use of the KEEP_DELETED_CELLS > attribute in the table definition (enabled by default, although as of > PHOENIX-1498 this will no longer be the case). However, > KEEP_DELETED_CELLS also translates to keeping all rows that have been > deleted. In other words, "DELETE FROM mytab WHERE id = 1" means that > the underlying record won't really be deleted, ever. > > I see two ways of facilitating this kind of row-level versioning that > still supports row-level deletes: > > 1. Turn KEEP_DELETED_CELLS off, and ensure that all data types have an > explicit way of writing null values. This would require changing a > number (or all) of PDataTypes so that they all have an explicit null > representation that can be explicitly stored. > > 2. Turn KEEP_DELETED_CELLS on, and have a RegionCoprocessor that > implements preCompact to return a custom InternalScanner that omits > everything per row that is older than the most recent delete of the > row (a delete of a row would be interpreted as a delete marker on the > QueryConstants.EMPTY_COLUMN_NAME cell). > > The first question: is this something that would be useful for Phoenix > in general, or is it too specific? I guess it falls under the case of > general row versioning (as opposed to full table content versioning), > which may or may not be interesting to others. > > The second question: any comments on either of these approaches? > > This is something that we need regardless, so if this sounds like it > may be interesting to add into Phoenix proper then I'll create a JIRA > ticket and patches. If not, then it looks like option 2 is the one > real option without needing to customize Phoenix. > > Thanks in advance for any feedback on this! > > - Gabriel