[ https://issues.apache.org/jira/browse/HBASE-23678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Kyle Purtell updated HBASE-23678:
----------------------------------------
Description:
Lars designed the combination of VERSIONS, TTL, MIN_VERSIONS, and
KEEP_DELETED_CELLS with a maximum of flexibility. There is a lot of nuance
regarding their usage. Almost all combinations of these four settings make
sense for some use cases (exceptions are MIN_VERSIONS > 0 without TTL, and
KEEP_DELETED_CELLS=TTL without TTL). There should be a way to make the behavior
with TTL easier to conceive when creating the schema. This could take the form
of a literate builder API for ColumnDescriptor or an extension to an existing
one.
Let me give you a motivating example: We may want to retain all versions for a
given TTL, and then only a specific number of versions. This can be achieved
with VERSIONS=INT_MAX, TTL=_retention_interval_, KEEP_DELETED_CELLS=TTL,
MIN_VERSIONS=_num_versions_. This is not intuitive though because VERSIONS has
been used to specify the number of versions to retain (_num_versions_ in this
example) since HBase version 0.1, so this is going to be a source of confusion
- I've seen it in practice.
A literate builder API, by way of its method names, could let a user describe
more or less in plain language how they want version retention to work, and
internally the builder API could set the low-level schema attributes.
was:
Lars designed the combination of VERSIONS, TTL, MIN_VERSIONS, and
KEEP_DELETED_CELLS with a maximum of flexibility. There is a lot of nuance
regarding their usage. Almost all combinations of these four settings make
sense for some use cases (exceptions are MIN_VERSIONS > 0 without TTL, and
KEEP_DELETED_CELLS=TTL without TTL). There should be a way to make the behavior
with TTL easier to conceive when creating the schema. This could take the form
of a literate builder API for ColumnDescriptor or an extension to an existing
one.
Let me give you a motivating example: We may want to retain all versions for a
given TTL, and then only a specific number of versions. This can be achieved
with VERSIONS=INT_MAX, TTL=_retention_interval_, KEEP_DELETED_CELLS=TTL,
MIN_VERSIONS=_num_versions_. This is not intuitive though because VERSIONS has
been used to specify _num_versions_ in this example since version 0.1.
A literate builder API, by way of its method names, could let a user describe
more or less in plain language how they want version retention to work, and
internally the builder API could set the low-level schema attributes.
> Literate builder API for version management in schema
> -----------------------------------------------------
>
> Key: HBASE-23678
> URL: https://issues.apache.org/jira/browse/HBASE-23678
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Kyle Purtell
> Priority: Major
>
> Lars designed the combination of VERSIONS, TTL, MIN_VERSIONS, and
> KEEP_DELETED_CELLS with a maximum of flexibility. There is a lot of nuance
> regarding their usage. Almost all combinations of these four settings make
> sense for some use cases (exceptions are MIN_VERSIONS > 0 without TTL, and
> KEEP_DELETED_CELLS=TTL without TTL). There should be a way to make the
> behavior with TTL easier to conceive when creating the schema. This could
> take the form of a literate builder API for ColumnDescriptor or an extension
> to an existing one.
> Let me give you a motivating example: We may want to retain all versions for
> a given TTL, and then only a specific number of versions. This can be
> achieved with VERSIONS=INT_MAX, TTL=_retention_interval_,
> KEEP_DELETED_CELLS=TTL, MIN_VERSIONS=_num_versions_. This is not intuitive
> though because VERSIONS has been used to specify the number of versions to
> retain (_num_versions_ in this example) since HBase version 0.1, so this is
> going to be a source of confusion - I've seen it in practice.
> A literate builder API, by way of its method names, could let a user describe
> more or less in plain language how they want version retention to work,
> and internally the builder API could set the low-level schema attributes.
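To make the idea concrete, here is a minimal sketch of what such a literate builder might look like for the motivating example. Everything in it is an assumption for illustration: the class name VersionRetention and the methods keepAllVersionsFor, thenKeepNewest, and toSchemaAttributes are hypothetical and not part of the HBase client API. Only the four attribute names and the mapping (VERSIONS=INT_MAX, TTL, KEEP_DELETED_CELLS=TTL, MIN_VERSIONS) come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical literate builder for version retention; not an HBase class. */
final class VersionRetention {
    private final int ttlSeconds;
    private final int versionsAfterTtl;

    private VersionRetention(int ttlSeconds, int versionsAfterTtl) {
        this.ttlSeconds = ttlSeconds;
        this.versionsAfterTtl = versionsAfterTtl;
    }

    /** First step: retain all versions for the given TTL (in seconds). */
    static AfterTtl keepAllVersionsFor(int ttlSeconds) {
        if (ttlSeconds <= 0) {
            throw new IllegalArgumentException("TTL must be positive");
        }
        return new AfterTtl(ttlSeconds);
    }

    /** Second step: only reachable once a TTL has been given. */
    static final class AfterTtl {
        private final int ttlSeconds;

        private AfterTtl(int ttlSeconds) {
            this.ttlSeconds = ttlSeconds;
        }

        /** After the TTL, retain only this many versions. */
        VersionRetention thenKeepNewest(int numVersions) {
            if (numVersions < 0) {
                throw new IllegalArgumentException("numVersions must be >= 0");
            }
            return new VersionRetention(ttlSeconds, numVersions);
        }
    }

    /** Translates the plain-language policy into the low-level attributes. */
    Map<String, String> toSchemaAttributes() {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("VERSIONS", String.valueOf(Integer.MAX_VALUE));
        attrs.put("TTL", String.valueOf(ttlSeconds));
        attrs.put("KEEP_DELETED_CELLS", "TTL");
        attrs.put("MIN_VERSIONS", String.valueOf(versionsAfterTtl));
        return attrs;
    }

    public static void main(String[] args) {
        // Keep everything for 7 days, then only the newest 3 versions.
        Map<String, String> attrs = VersionRetention
            .keepAllVersionsFor(7 * 24 * 3600)
            .thenKeepNewest(3)
            .toSchemaAttributes();
        System.out.println(attrs);
    }
}
```

Because thenKeepNewest is only reachable through keepAllVersionsFor, the two combinations called out as not making sense (MIN_VERSIONS > 0 without TTL, and KEEP_DELETED_CELLS=TTL without TTL) cannot be expressed with this sketch. A real implementation would presumably set these values on the column family descriptor rather than return a map.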