[ https://issues.apache.org/jira/browse/PHOENIX-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabriel Reid updated PHOENIX-1578:
----------------------------------
    Attachment: PHOENIX-1578-docs.patch
                PHOENIX-1578.2.patch

Thanks for the pointers [~jamestaylor].

Here's an updated patch (as well as a documentation patch for the website) with 
the MetaDataProtocol.MIN_SYSTEM_TABLE_TIMESTAMP incremented and the addition of 
the STORE_NULLS column as part of the auto-upgrade path.

{quote}One addition that would be nice IMO to your patch is to provide a config 
property that controls the default value of STORE_NULLS. In that way, a new 
installation could set that to true and not have to remember to always include 
it in CREATE TABLE calls, and existing installations could adopt it also 
without calling ALTER TABLE on all existing tables. Perhaps the config property 
would just control the value that gets set in PTableImpl by default for 
storeNulls?{quote}

In this patch I've added a config parameter (phoenix.table.default.store.nulls) 
to set the default value of the STORE_NULLS flag at table creation time. 

However, if I'm understanding your suggestion correctly, you were saying that 
it would be good to have this setting alter the behavior of existing tables as 
well, which sounds to me like it could be problematic. If this is purely a 
config setting (and not strictly set in the catalog table), then all it would 
take is one person connecting with an out-of-sync config file and setting a 
field to null, and that would wipe out the history of that field for good. It 
seems better (or at least acceptable) to me that existing installations would 
need to explicitly issue an ALTER TABLE statement in order to adopt this 
behavior, as opposed to making sure that this setting is synced over all config 
files. What do you think?
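
To make the distinction concrete, here is a minimal toy model (not code from the patch) of the behavior argued for above: the config default is consulted only when a table is created, and afterwards the value persisted in the catalog is the single source of truth. The class name, the map standing in for the catalog table, and all method names are hypothetical; only the property name phoenix.table.default.store.nulls and the STORE_NULLS flag come from this comment.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: a map stands in for the catalog table (hypothetical).
class StoreNullsDefaultSketch {
    // table name -> persisted STORE_NULLS flag
    private final Map<String, Boolean> catalog = new HashMap<>();

    // At creation time an explicit STORE_NULLS option wins; otherwise the
    // creating client's phoenix.table.default.store.nulls value is persisted.
    void createTable(String name, Boolean explicitStoreNulls, boolean clientConfigDefault) {
        boolean value = (explicitStoreNulls != null) ? explicitStoreNulls : clientConfigDefault;
        catalog.put(name, value);
    }

    // Models an explicit ALTER TABLE that changes the flag on an existing table.
    void alterTable(String name, boolean storeNulls) {
        if (!catalog.containsKey(name)) {
            throw new IllegalArgumentException("unknown table: " + name);
        }
        catalog.put(name, storeNulls);
    }

    // For an existing table only the catalog value counts; the connecting
    // client's config default is deliberately ignored.
    boolean storeNulls(String name, boolean clientConfigDefault) {
        Boolean persisted = catalog.get(name);
        if (persisted == null) {
            throw new IllegalArgumentException("unknown table: " + name);
        }
        return persisted;
    }

    public static void main(String[] args) {
        StoreNullsDefaultSketch c = new StoreNullsDefaultSketch();
        c.createTable("T1", null, true);
        // A client with an out-of-sync config (default=false) still sees the catalog value.
        System.out.println(c.storeNulls("T1", false));
    }
}
```

In this model a client with a mismatched config file cannot change how nulls are written to an existing table; only an explicit ALTER TABLE can.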


> Support explicit storage of null values
> ---------------------------------------
>
>                 Key: PHOENIX-1578
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1578
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>         Attachments: PHOENIX-1578-docs.patch, PHOENIX-1578.2.patch, 
> PHOENIX-1578.patch
>
>
> Null values are currently represented implicitly by a lack of a KeyValue for 
> a given field. This is implemented by using an HBase delete to remove cells 
> when a given field is set to null via an upsert statement.
> However, this method of setting values to null causes all previous versions 
> of the given field to be removed on the next major compaction, which prevents 
> doing flashback queries for the given field.
> One workaround for this is to enable KEEP_DELETED_CELLS on the underlying 
> HBase table -- however, this means that SQL deletes (i.e. DELETE FROM TABLE) 
> will never actually remove the data.
> This ticket is to propose a flag (defined at table level) which specifies 
> that null values are to be explicitly stored in HBase. This flag should not 
> change the behavior of a SQL {{DELETE}} statement, i.e. a SQL {{DELETE}} will 
> still cause a record to be permanently deleted (including historical data).
> The use of this flag in combination with KEEP_DELETED_CELLS=false and 
> VERSIONS=unlimited will allow Phoenix to provide true row-level versioning.
> Additional background in this mail thread: http://s.apache.org/kwz
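
The difference described in the ticket can be sketched with a small in-memory model of a single column's version history. This is an illustration under stated assumptions, not Phoenix or HBase code: the class and method names are hypothetical, an explicitly stored null is modeled as an empty value, and major compaction is modeled with KEEP_DELETED_CELLS=false, so a delete marker removes itself and every version at or before its timestamp.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one column's versions (hypothetical, not HBase internals).
class NullStorageSketch {
    static final class Version {
        final long ts;
        final String value;         // "" models an explicitly stored null
        final boolean deleteMarker; // true models an HBase delete marker

        Version(long ts, String value, boolean deleteMarker) {
            this.ts = ts;
            this.value = value;
            this.deleteMarker = deleteMarker;
        }
    }

    private final List<Version> versions = new ArrayList<>();
    private final boolean storeNulls;

    NullStorageSketch(boolean storeNulls) {
        this.storeNulls = storeNulls;
    }

    // Upsert a value; null is written either as an empty cell or as a delete marker.
    void upsert(long ts, String value) {
        if (value != null) {
            versions.add(new Version(ts, value, false));
        } else if (storeNulls) {
            versions.add(new Version(ts, "", false));
        } else {
            versions.add(new Version(ts, null, true));
        }
    }

    // Major compaction with KEEP_DELETED_CELLS=false: drop each delete marker
    // and every version it masks (timestamp at or before the newest marker).
    void majorCompact() {
        boolean anyMarker = false;
        long newestMarker = Long.MIN_VALUE;
        for (Version v : versions) {
            if (v.deleteMarker) {
                anyMarker = true;
                newestMarker = Math.max(newestMarker, v.ts);
            }
        }
        if (!anyMarker) {
            return;
        }
        final long cutoff = newestMarker;
        versions.removeIf(v -> v.deleteMarker || v.ts <= cutoff);
    }

    // Flashback read: the value as of asOfTs, or null if unset or null at that time.
    String readAsOf(long asOfTs) {
        Version latest = null;
        for (Version v : versions) {
            if (v.ts <= asOfTs && (latest == null || v.ts >= latest.ts)) {
                latest = v;
            }
        }
        if (latest == null || latest.deleteMarker || latest.value.isEmpty()) {
            return null;
        }
        return latest.value;
    }
}
```

With storeNulls=false, writing "a" at timestamp 1 and null at timestamp 2 followed by a compaction leaves nothing to read as of timestamp 1. With storeNulls=true the same sequence still returns "a" as of timestamp 1 and null as of timestamp 2.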



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
