[
https://issues.apache.org/jira/browse/CASSANDRA-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349824#comment-16349824
]
Paulo Motta commented on CASSANDRA-14092:
-----------------------------------------
Thanks for the quick turnaround [~beobal]! See follow-up below:
{quote}The wording of the NEWS.txt entry is good, I do wonder if we should
maybe place it right at the top of the file rather than just in the 3.0.16
section for extra emphasis. Any thoughts on that?
{quote}
Good idea. I did this and also updated the text to cover the possibility
of data loss before this patch and how to recover from it with scrub:
{noformat}
MAXIMUM TTL EXPIRATION DATE NOTICE
-----------------------------------
The maximum expiration timestamp that can be represented by the storage engine
is 2038-01-19T03:14:06+00:00,
which means that inserts with TTL that expire after this date are not currently
supported.
Prior to 3.0.16 in the 3.0.X series and 3.11.2 in the 3.11 series, there was no
protection against INSERTS
with TTL expiring after the maximum supported date, causing the expiration time
field to overflow and the
records to expire immediately. Expired records due to overflow may have been
removed permanently after a
compaction. The 2.1.X and 2.2.X series are not subject to data loss due to this
issue if assertions are enabled,
since an AssertionError is thrown during INSERT when the expiration time field
overflows on these versions.
In practice this issue will affect only users that use very large TTLs, close
to the maximum allowed value of
630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time
progresses, the maximum supported
TTL will be gradually reduced as the maximum expiration date approaches.
For instance, a user on an affected
version on 2028-01-19T03:14:06 with a TTL of 10 years will be affected by this
bug, so we urge users of very
large TTLs to upgrade to a version where this issue is addressed as soon as
possible.
Potentially affected users should inspect their SSTables and search for
negative min local deletion times to
detect this issue. SSTables in this state must be backed up immediately, as
they are subject to data loss
during auto-compactions, and may be recovered by running the sstablescrub tool
from versions 3.0.16+ and/or 3.11.2+.
The Cassandra project plans to fix this limitation in newer versions, but while
the fix is not available, operators
can decide which policy to apply when dealing with inserts with TTL exceeding
the maximum supported expiration date:
- REJECT: this is the default policy and will reject any requests with
expiration date timestamp after 2038-01-19T03:14:06+00:00.
- CAP: any insert with TTL expiring after 2038-01-19T03:14:06+00:00 will
expire on 2038-01-19T03:14:06+00:00 and the client will receive a warning.
- CAP_NOWARN: same as previous, except that the client warning will not be
emitted.
These policies may be specified via the
-Dcassandra.expiration_date_overflow_policy=POLICY startup option which can be
set in the jvm.options file.
See CASSANDRA-14092 for more details about this issue.
{noformat}
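To make the arithmetic in the notice concrete, here is a minimal Python sketch of the overflow. It is an illustration only, not code from the patch: the function names are hypothetical, and the 32-bit wraparound is simulated by hand because Python integers do not overflow. The only values taken from the notice are the maximum expiration date and the 20-year TTL limit of 630720000 seconds.

```python
from datetime import datetime, timezone

# Values taken from the notice above.
MAX_EXPIRATION = int(
    datetime(2038, 1, 19, 3, 14, 6, tzinfo=timezone.utc).timestamp()
)
MAX_TTL_SECONDS = 20 * 365 * 24 * 60 * 60  # 630720000 seconds (20 years)


def to_int32(value):
    """Wrap an integer the way a signed 32-bit field wraps on overflow."""
    return (value + 2**31) % 2**32 - 2**31


def local_deletion_time(now_in_sec, ttl_in_sec):
    """Hypothetical helper: expiration time in epoch seconds, kept in 32 bits."""
    return to_int32(now_in_sec + ttl_in_sec)


def max_supported_ttl(now_in_sec):
    """Hypothetical helper: largest TTL that still expires by MAX_EXPIRATION."""
    return min(MAX_TTL_SECONDS, MAX_EXPIRATION - now_in_sec)


# An insert on 2018-02-01 with the maximum TTL of 20 years.
now = int(datetime(2018, 2, 1, tzinfo=timezone.utc).timestamp())
overflowed = local_deletion_time(now, MAX_TTL_SECONDS)

print("wrapped to a negative value:", overflowed < 0)
print("largest safe TTL on that date:", max_supported_ttl(now), "seconds")
```

A negative result is the same symptom the notice tells operators to look for in the min local deletion time of their SSTables.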
Please let me know what you think of the updated text. We should probably
also publish this text (or a subset of it) in the release announcement
e-mail.
While writing the text above, I realized that there is also a remote possibility
of data loss in 2.1/2.2 if assertions are disabled. I didn't backport the
scrub recovery, since the backport was not straightforward and I didn't think
it was worth the effort right now. We can always do that later if necessary;
the most important thing right now is to ship the policies. To reflect this, I
updated the 4th paragraph for 2.1 and 2.2 to:
{noformat}
2.1.X / 2.2.X users in the conditions above should not be subject to data loss
unless assertions are disabled, in which
case the suspect SSTables must be backed up immediately and manually recovered,
as they are subject to data loss
during auto-compaction.
{noformat}
{quote}I also have one piece of feedback on the policies; I don't see any
benefit in being able to turn off logging of capped expirations (especially
since we're using NoSpamLogger) but I do think the client warning is useful.
{quote}
I agree and updated the patch with this suggestion, but at the same time I
think advanced operators may want to control the periodicity of the logging, so
I created a property
{{cassandra.expiration_overflow_warning_interval_minutes=5}} to control this.
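As a rough illustration of what an interval-limited warning means, here is a minimal Python sketch. It is not the NoSpamLogger implementation and not code from the patch: the class name and its behaviour are assumptions made for this example, with the 5-minute default mirroring the property value above.

```python
import time


class RateLimitedWarner:
    """Hypothetical helper: emit a warning at most once per interval."""

    def __init__(self, interval_minutes=5, clock=time.monotonic):
        self.interval_sec = interval_minutes * 60
        self.clock = clock          # injectable so the sketch is testable
        self.last_emitted = None    # time of the last emitted warning
        self.emitted = []           # stands in for a real log sink

    def warn(self, message):
        """Record the message unless one was emitted within the interval."""
        now = self.clock()
        if (self.last_emitted is not None
                and now - self.last_emitted < self.interval_sec):
            return False
        self.last_emitted = now
        self.emitted.append(message)
        return True


# Usage with a fake clock: the second call falls inside the 5-minute window.
fake_time = [0.0]
warner = RateLimitedWarner(interval_minutes=5, clock=lambda: fake_time[0])
warner.warn("TTL capped")   # emitted
fake_time[0] = 60.0
warner.warn("TTL capped")   # suppressed
fake_time[0] = 300.0
warner.warn("TTL capped")   # emitted again
print(len(warner.emitted))
```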
{quote}I also noticed that the logging of a parse error/invalid value for the
policy sysprop is at DEBUG in the current patches, but it might be sensible to
draw a bit more attention to that if it happens.
{quote}
Agreed, changed the logging to WARN.
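For readers who want to see how the three policies behave, here is a minimal Python sketch of the semantics described in the NEWS.txt text. It is an illustration under stated assumptions, not the patch code: the function names are hypothetical, and both the case-insensitive parsing and the fallback to the default REJECT policy on an invalid value are assumptions of this sketch (the comment above only confirms that an invalid value is logged at WARN).

```python
import logging
from datetime import datetime, timezone
from enum import Enum

logger = logging.getLogger("expiration_policy_sketch")

# Maximum expiration date quoted in the NEWS.txt text, in epoch seconds.
MAX_DELETION_TIME = int(
    datetime(2038, 1, 19, 3, 14, 6, tzinfo=timezone.utc).timestamp()
)


class Policy(Enum):
    REJECT = "REJECT"
    CAP = "CAP"
    CAP_NOWARN = "CAP_NOWARN"


def parse_policy(raw, default=Policy.REJECT):
    """Parse the property value; on an invalid value, warn and use the
    default (the fallback is an assumption of this sketch)."""
    if raw is None:
        return default
    try:
        return Policy[raw.strip().upper()]
    except KeyError:
        logger.warning("Invalid overflow policy %r, using %s",
                       raw, default.name)
        return default


def compute_deletion_time(now_in_sec, ttl_in_sec, policy):
    """Return (deletion_time, client_warning or None); raise under REJECT."""
    expires_at = now_in_sec + ttl_in_sec
    if expires_at <= MAX_DELETION_TIME:
        return expires_at, None
    if policy is Policy.REJECT:
        raise ValueError("TTL expires after the maximum supported date")
    warning = None
    if policy is Policy.CAP:  # CAP_NOWARN caps silently
        warning = "TTL capped to the maximum supported expiration date"
    return MAX_DELETION_TIME, warning
```

With `Policy.CAP`, an insert whose TTL would expire after the limit gets `MAX_DELETION_TIME` plus a warning string; with `Policy.CAP_NOWARN` the warning is `None`; with `Policy.REJECT` the call raises.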
I finished cleaning up the patch and provided a version for every branch.
The 2.1 and 2.2 versions are nearly identical, as are the 3.0, 3.11 and trunk
versions, apart from some minor conflicts. Please find below a short
summary of the changes per branch:
* 2.1:
** Add REJECT and CAP expiration date overflow policies and tests
** Cap max default TTL at 20 years and tests
** Add NEWS.txt entry
* 2.2:
** Same as 2.1, with a few minor import conflicts
* 3.0:
** Add REJECT, CAP and CAP_NOWARN expiration date overflow policies and tests
** Add ability to scrub to fix negative localDeletionTime and tests with
broken SSTables
** Add ability to sstablemetadata to show minLocalDeletionTime
** Add expiration date overflow policies to jvm.options file
** Add NEWS.txt entry
* 3.11:
** Same as 3.0, with a few minor conflicts during merge
* trunk:
** Same as 3.11, with a few minor conflicts during merge
** Remove ability of scrub to fix SSTables with negative localDeletionTime
and tests
* dtest:
** Test all policies on CQL for default and user supplied TTL
** Test cap policy on thrift for default and user supplied TTL
** Check that offline scrub recovers sstable with negative localDeletionTime
I submitted a preliminary round of CI with the non-cleaned up patch and the
results looked good. I will submit again for all the branches below and post
the results here when they are ready.
||2.1||2.2||3.0||3.11||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-14092-v5]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-14092-v5]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14092-v5]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.11...pauloricardomg:3.11-14092-v5]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v5]|[branch|https://github.com/apache/cassandra-dtest/compare/master...pauloricardomg:14092-v5]|
> Max ttl of 20 years will overflow localDeletionTime
> ---------------------------------------------------
>
> Key: CASSANDRA-14092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14092
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Paulo Motta
> Assignee: Paulo Motta
> Priority: Blocker
> Fix For: 2.1.20, 2.2.12, 3.0.16, 3.11.2
>
>
> CASSANDRA-4771 added a max value of 20 years for ttl to protect against [year
> 2038 overflow bug|https://en.wikipedia.org/wiki/Year_2038_problem] for
> {{localDeletionTime}}.
> It turns out that next year the {{localDeletionTime}} will start overflowing
> with the maximum ttl of 20 years ({{System.currentTimeMillis() + ttl(20
> years) > Integer.MAX_VALUE}}), so we should remove this limitation.