[
https://issues.apache.org/jira/browse/CASSANDRA-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348101#comment-16348101
]
Paulo Motta edited comment on CASSANDRA-14092 at 2/1/18 6:45 AM:
-----------------------------------------------------------------
On second thought, I agree that trying to automatically recover lost data is a
bad idea, not only because users may already have reacted to the absence of the
data, but also because the negative {{localDeletionTime}} means the lost data
will likely be purged instantly during compaction, so after a while there will
not be much data left to recover and the auto-recovery behavior will be largely
undefined. Furthermore, we would need to keep the recovery logic in the storage
engine at least until 4.0, for limited benefit.
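For context, the overflow itself is easy to reproduce: {{localDeletionTime}} is a signed 32-bit seconds-since-epoch value, so a far-enough-future expiration silently wraps negative. A minimal standalone sketch (illustrative names only, not Cassandra's actual code):

```java
// Illustrative sketch (not Cassandra's actual code) of how a 20-year TTL
// overflows a signed 32-bit localDeletionTime into a negative value.
public class TtlOverflow {
    // 20 years expressed in seconds (ignoring leap years, as a rough bound)
    static final int TWENTY_YEARS_SECONDS = 20 * 365 * 24 * 60 * 60;

    // localDeletionTime is stored as seconds since the epoch in a signed int,
    // so the narrowing cast silently wraps past Integer.MAX_VALUE (year 2038).
    static int localDeletionTime(long nowMillis, int ttlSeconds) {
        return (int) (nowMillis / 1000 + ttlSeconds);
    }

    public static void main(String[] args) {
        long feb2018 = 1_517_472_000_000L; // 2018-02-01T00:00:00Z in millis
        // A short TTL stays in range...
        System.out.println(localDeletionTime(feb2018, 3600));
        // ...but a 20-year TTL wraps negative, so compaction sees data that
        // "expired" decades ago and purges it immediately.
        System.out.println(localDeletionTime(feb2018, TWENTY_YEARS_SECONDS));
    }
}
```

This also shows why the purged data is hard to get back: once the stored value is negative, nothing distinguishes it from a legitimately long-expired cell.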
The only way a user who set a TTL 20 years ahead can reliably recover their
data is if they have {{incremental_backups}} enabled. In that case, they can
easily find out which SSTables have a negative {{minLocalDeletionTime}} and run
scrub on them to fix the overflow. I'm therefore strongly leaning towards my
previous approach of providing a way for scrub to fix negative timestamps,
rather than embedding this recovery logic in the storage engine itself, which
would arguably be useful in only a handful of cases.
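The core of a scrub-side repair can be very small. A hedged sketch (method names are hypothetical assumptions for illustration, not the actual patch): a negative seconds-since-epoch value can only come from the 2038 overflow, so scrub can detect affected SSTables and rewrite overflowed cells with a capped expiration.

```java
// Hypothetical sketch of the scrub-based repair: a negative seconds-since-epoch
// localDeletionTime can only come from the 2038 int overflow, so scrub can
// detect affected cells and rewrite them with a capped expiration.
public class OverflowScrubSketch {
    // Latest representable expiration: 2038-01-19T03:14:06Z
    static final int MAX_DELETION_TIME = Integer.MAX_VALUE - 1;

    /** An SSTable needs scrubbing if its minimum localDeletionTime wrapped negative. */
    static boolean needsScrub(int minLocalDeletionTime) {
        return minLocalDeletionTime < 0;
    }

    /** Rewrite an overflowed deletion time to the maximum supported expiration. */
    static int repair(int localDeletionTime) {
        return localDeletionTime < 0 ? MAX_DELETION_TIME : localDeletionTime;
    }
}
```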
Regarding what to do with TTLs exceeding the maximum expiry date until we have
a permanent solution to the problem, I think it's fair to expose the
cap-vs-reject trade-off as a flag to the operator, so I introduced a
{{-Dcassandra.expiration_date_overflow_policy={REJECT, CAP_NOWARN, CAP_WARN}}}
option, which defaults to the conservative REJECT behavior and has the
following meaning, explained in NEWS.txt:
{noformat}
- REJECT: this is the default policy and will reject any requests with
expiration date timestamp after 2038-01-19T03:14:06+00:00.
- CAP_NOWARN: any insert with TTL expiring after
2038-01-19T03:14:06+00:00 will expire on 2038-01-19T03:14:06+00:00.
- CAP_WARN: same as previous, but will log a warning message when
inserted data is capped.
{noformat}
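To make the semantics concrete, here is a minimal sketch of how such a policy could be applied at write time. The enum, constant, and method names are illustrative assumptions for this example, not the actual patch:

```java
import java.time.Instant;

// Illustrative sketch of applying an expiration-date overflow policy at write
// time; names are assumptions for this example, not Cassandra's internals.
public class ExpirationPolicySketch {
    enum Policy { REJECT, CAP_NOWARN, CAP_WARN }

    // Latest representable expiration: 2038-01-19T03:14:06Z
    static final long MAX_EXPIRATION_SECONDS = Integer.MAX_VALUE - 1;

    /** Returns the expiration to store, or throws when the policy rejects the write. */
    static int apply(long expirationSeconds, Policy policy) {
        if (expirationSeconds <= MAX_EXPIRATION_SECONDS)
            return (int) expirationSeconds;
        switch (policy) {
            case REJECT:
                throw new IllegalArgumentException("Expiration after "
                        + Instant.ofEpochSecond(MAX_EXPIRATION_SECONDS) + " rejected");
            case CAP_WARN:
                System.err.println("Capping expiration to "
                        + Instant.ofEpochSecond(MAX_EXPIRATION_SECONDS));
                // fall through to the capping behavior
            case CAP_NOWARN:
            default:
                return (int) MAX_EXPIRATION_SECONDS;
        }
    }
}
```

Defaulting to REJECT means a failed write instead of silently shortened data, which is the conservative choice described above.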
Any feedback and suggestions for alternative, more meaningful naming are welcome.
One downside of defaulting to REJECT is that applications will start to break
as time progresses and TTLs reach the maximum date more often. But given that
we plan to fix the underlying issue in upcoming versions, and that users have a
way to change the policy, I don't think it should be a big problem. I don't
feel strongly about it, though, so if anyone does please chime in.
The patch below implements and tests the expiration_date_overflow_policy, and
also adds a new flag {{cassandra.recover_overflowed_expiration_best_effort}} to
perform the auto-recovery, which I intend to remove for the reasons explained
above and replace with the [scrub-based
recovery|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14092]
(unless someone objects).
I plan to clean up these details later today, but if anyone can take a look and
give early feedback, that would be nice, as I'm hoping to get this in as soon
as we can so we can start a vote.
The 2.1 and 3.0 patches are functionally equivalent, except that the 2.1 patch
adds a missing 20-year cap for the default_time_to_live option, and the 3.0
patch adds the to-be-removed
{{cassandra.recover_overflowed_expiration_best_effort}} option. The versions
for the other branches are mostly derived from those, so I will prepare them
after the final review round.
||2.1||3.0||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-14092-v3]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14092-v3]|[branch|https://github.com/apache/cassandra-dtest/compare/master...pauloricardomg:14092]|
> Max ttl of 20 years will overflow localDeletionTime
> ---------------------------------------------------
>
> Key: CASSANDRA-14092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14092
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Paulo Motta
> Assignee: Paulo Motta
> Priority: Blocker
> Fix For: 2.1.20, 2.2.12, 3.0.16, 3.11.2
>
>
> CASSANDRA-4771 added a max value of 20 years for ttl to protect against the
> [year 2038 overflow bug|https://en.wikipedia.org/wiki/Year_2038_problem] for
> {{localDeletionTime}}.
> It turns out that next year the {{localDeletionTime}} will start overflowing
> with the maximum ttl of 20 years ({{System.currentTimeMillis() / 1000 +
> ttl(20 years) > Integer.MAX_VALUE}}), so we should remove this limitation.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)