[jira] [Commented] (CASSANDRA-7237) Optimize batchlog manager to avoid full scans

Aleksey Yeschenko (JIRA) Wed, 05 Aug 2015 13:06:06 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658790#comment-14658790
 ]


Aleksey Yeschenko commented on CASSANDRA-7237:
----------------------------------------------

Rebased and fixed the build 
[here|https://github.com/iamaleksey/cassandra/commits/7237]. Also fixed some 
very minor nits myself to not waste your time:
- naming of constants in {{SystemKeyspace}} being inconsistent with the rest of 
the defined tables there
- {{replayAllFailedBatches()}} doesn't actually throw any declared exceptions 
anymore; replaced the use of {{WrappedRunnable}} with Java 8 method references
- made some methods that could be static {{static}} there, while we are editing 
it anyway, to satisfy the annoying IDEA inspections
- copy-paste-ish code in {{calculatePageSize()}} was still referring to hints

There are two issues with the patch:
- batches created in 3.0 will not be understood by 2.1/2.2 nodes (new table), 
breaking upgrades
- batches created in 2.1/2.2 will be written to the deprecated table and not 
noticed/replayed until the next node restart, when conversion will happen again

However, I suggest not fixing these issues here, since that would duplicate 
[~Stefania]'s work on CASSANDRA-9673, which already has to deal with 
compatibility in both directions.

If you don't mind my (extremely minor) changes, and letting Stefania handle the 
upgrade issue, I'm going to commit as is as soon as cassci is happy 
([testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-7237-testall/], 
[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-7237-dtest/]).

Two more things, for a follow-up ticket, if reasonable:
1. We can remember the uuid of the last replayed batch and only scan from there 
to (now - timeout). Or maybe add some correction for error and start with (last 
- timeout).
2. If we only scan from (last - timeout) to (now - timeout), instead of the 
pre-3.0 scan (allthethings), then we might consider replaying more often than 
every 60 seconds (make it 10, or come up with some other number).
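
A rough sketch of the window arithmetic from the two points above, in Java. The 
names here ({{ReplayWindow}}, {{next()}}) are made up for illustration and are 
not part of the patch; it works on plain millisecond timestamps, and real code 
would presumably convert the bounds to time-UUIDs before querying the batchlog 
table.

```java
// Hypothetical helper, not from the patch: computes the range of batch
// timestamps to scan on one replay pass.
final class ReplayWindow {
    final long startMillis; // inclusive lower bound
    final long endMillis;   // exclusive upper bound

    private ReplayWindow(long startMillis, long endMillis) {
        this.startMillis = startMillis;
        this.endMillis = endMillis;
    }

    // lastReplayedMillis: timestamp of the last replayed batch, 0 if none yet.
    // The start is pulled back by one timeout as a correction for error;
    // the end excludes batches that have not timed out yet.
    static ReplayWindow next(long lastReplayedMillis, long nowMillis, long timeoutMillis) {
        long start = Math.max(0L, lastReplayedMillis - timeoutMillis);
        long end = Math.max(start, nowMillis - timeoutMillis);
        return new ReplayWindow(start, end);
    }

    // True when there is nothing old enough to replay on this pass.
    boolean isEmpty() {
        return startMillis >= endMillis;
    }

    public static void main(String[] args) {
        ReplayWindow w = ReplayWindow.next(100_000L, 200_000L, 10_000L);
        System.out.println(w.startMillis + " .. " + w.endMillis);
    }
}
```

With nothing replayed yet ({{lastReplayedMillis}} of 0) the window starts at 0, 
which degrades to the full scan; after that each pass only covers roughly one 
replay interval plus one timeout, which is what would make a shorter interval 
affordable.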

> Optimize batchlog manager to avoid full scans
> ---------------------------------------------
>
>                 Key: CASSANDRA-7237
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7237
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Branimir Lambov
>            Priority: Minor
>             Fix For: 3.0.0 rc1
>
>
> Now that we use time-UUIDs for batchlog ids, and given that w/ local strategy 
> the partitions are ordered in time-order here, we can optimize the scanning 
> by limiting the range to replay taking the last replayed batch's id as the 
> beginning of the range, and uuid(now+timeout) as its end.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
