[
https://issues.apache.org/jira/browse/CASSANDRA-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688423#comment-17688423
]
Benedict Elliott Smith commented on CASSANDRA-18134:
----------------------------------------------------
I think the fact that this touches an existing path isn't a good reason not to
introduce fuzz tests. The project has historically been very bad about this,
and I think we had broadly agreed to take a better approach going forwards,
with each new change to areas that introduce a risk of breaking things being an
opportunity to expand our coverage to where it should always have been.
I haven't looked closely at this patch, but I will note it is a lot bigger than
I would expect given the description, and touches a lot of code (no doubt
cleaning it up a great deal!), and that maybe this is a good signal some
additional testing is warranted. I am not undertaking a review right now
though, so this isn't a well-informed or binding opinion.
It _might_ be that a meta discuss thread is warranted to discuss the standard
of testing we as a project might want for changes _like_ this. We've made a
number of commitments to improve testing, but we haven't had much _specific_
opportunity to discuss what that means to each of us, and this might be a good
test case. Again, just an idea for improving consensus, not a demand (and
somebody else could always start that thread).
One thing I will say, though, is that our new policy of supporting live
downgrades means that we need to either make this change backwards compatible
or an optional upgrade. I think it isn't currently very easy to make this
backwards compatible, and I don't think it is necessary - optionally upgrading
to this format would be fine. But I think it would be possible to make it
backwards compatible if we wanted, and we could take this as an opportunity to
make backwards compatible changes more easy going forwards. We could for
instance write the additional information for the slice at the end of the file,
and introduce some additional metadata indicating if this extra information is
available.
> Improve handling of min/max clustering in sstable
> -------------------------------------------------
>
> Key: CASSANDRA-18134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18134
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/SSTable
> Reporter: Jacek Lewandowski
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 5.x
>
>
> This patch improves the following things:
> # SSTable metadata will store a covered slice instead of min/max clusterings.
> The difference is that for slices there is available the type of a bound
> rather than just a clustering. In particular it will provide the information
> whether the lower and upper bound of an sstable is opened or closed.
> # SSTable metadata will store a flag whether the SSTable contains any
> partition level deletions or not
> # The above two changes required to introduce a new major format for SSTables
> - {{oa}}
> # Single partition read command makes use of the above changes. In particular
> an sstable can be skipped when it does not intersect with the column filter,
> does not have partition level deletions and does not have statics; In case
> there are partition level deletions, but the other conditions are satisfied,
> only the partition header needs to be accessed (tests attached)
> # Skipping sstables assuming those three conditions are satisfied has been
> implemented also for partition range queries (tests attached). Also added
> minor separate statistics to record the number of accessed sstables in
> partition reads because now not all of them need to be accessed. That
> statistics is also needed in tests to confirm skipping.
> # Artificial lower bound marker is now an object on its own and is not
> implemented as a special case of range tombstone bound. Instead it sorts
> right before the lowest available bound in the data
> # Extended the lower bound optimization usage due the 1 and 2
> # Do not initialize iterator just to get a cached partition and associated
> columns index. The purpose of using lower bound optimization was to avoid
> opening an iterator of an sstable if possible.
> See also CASSANDRA-14861
> The changes in this patch include work of [~blambov], [~slebresne],
> [~jakubzytka] and [~jlewandowski]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]