[
https://issues.apache.org/jira/browse/CASSANDRA-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688164#comment-17688164
]
C. Scott Andreas commented on CASSANDRA-18134:
----------------------------------------------
A discuss thread for this would be good.
I like the idea that the patch proposes. I worry about a major SSTable version
rev in a minor release of the database. Lots of folks have tooling that
generates or reads SSTables, and it would need to be modified to read a new
major version, which might restrict adoption of the change.
It would also be excellent to introduce randomized/fuzz test coverage for
SSTable format changes via Harry or similar to catch potential edge cases not
exercised by unit tests with static inputs.
> Improve handling of min/max clustering in sstable
> -------------------------------------------------
>
> Key: CASSANDRA-18134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18134
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/SSTable
> Reporter: Jacek Lewandowski
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 4.x
>
>
> This patch improves the following things:
> # SSTable metadata will store a covered slice instead of min/max clusterings.
> The difference is that a slice records the type of each bound rather than
> just a clustering. In particular, it tells whether the lower and upper
> bounds of an sstable are open or closed.
> # SSTable metadata will store a flag indicating whether the SSTable contains
> any partition level deletions
> # The above two changes required introducing a new major format for SSTables
> - {{oa}}
> # The single partition read command makes use of the above changes. In
> particular, an sstable can be skipped when it does not intersect with the
> column filter, does not have partition level deletions and does not have
> statics. If there are partition level deletions but the other conditions are
> satisfied, only the partition header needs to be accessed (tests attached)
> # Skipping sstables when those three conditions are satisfied has also been
> implemented for partition range queries (tests attached). Also added a
> separate minor statistic that records the number of sstables accessed in
> partition reads, because not all of them need to be accessed now. That
> statistic is also needed in tests to confirm skipping.
> # The artificial lower bound marker is now an object of its own and is no
> longer implemented as a special case of a range tombstone bound. Instead, it
> sorts right before the lowest available bound in the data
> # Extended the use of the lower bound optimization due to 1 and 2
> # Do not initialize an iterator just to get a cached partition and the
> associated columns index. The purpose of the lower bound optimization is to
> avoid opening an sstable iterator where possible.
> See also CASSANDRA-14861
> The changes in this patch include work of [~blambov], [~slebresne],
> [~jakubzytka] and [~jlewandowski]
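
The skip rule described in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: the class and member names (SSTableSkipSketch, Bound, Slice, SSTableMeta, decide) are hypothetical, clusterings are reduced to a single int column, and an sstable with statics is conservatively treated as needing a full read.

```java
public class SSTableSkipSketch {
    /** What a read has to do with a given sstable. */
    enum Access { SKIP, PARTITION_HEADER_ONLY, FULL_READ }

    /** One end of a slice over a single int clustering column (hypothetical simplification). */
    static final class Bound {
        final int value;
        final boolean inclusive; // true = closed bound, false = open bound
        Bound(int value, boolean inclusive) {
            this.value = value;
            this.inclusive = inclusive;
        }
    }

    /** A slice, used both for the sstable's covered slice and for the requested slice. */
    static final class Slice {
        final Bound start;
        final Bound end;
        Slice(Bound start, Bound end) {
            this.start = start;
            this.end = end;
        }

        boolean intersects(Slice other) {
            return !endsBefore(this.end, other.start) && !endsBefore(other.end, this.start);
        }

        /** True when a slice ending at 'end' lies entirely before a slice starting at 'start'. */
        private static boolean endsBefore(Bound end, Bound start) {
            if (end.value != start.value)
                return end.value < start.value;
            // Same value: the two slices share that point only if both bounds are closed.
            return !(end.inclusive && start.inclusive);
        }
    }

    /** Hypothetical stand-in for the metadata items named in the description. */
    static final class SSTableMeta {
        final Slice coveredSlice;
        final boolean hasPartitionLevelDeletions;
        final boolean hasStatics;
        SSTableMeta(Slice coveredSlice, boolean hasPartitionLevelDeletions, boolean hasStatics) {
            this.coveredSlice = coveredSlice;
            this.hasPartitionLevelDeletions = hasPartitionLevelDeletions;
            this.hasStatics = hasStatics;
        }
    }

    static Access decide(SSTableMeta meta, Slice requested) {
        // Assumption: statics or an intersecting slice always force a full read.
        if (meta.hasStatics || meta.coveredSlice.intersects(requested))
            return Access.FULL_READ;
        // No intersecting rows and no statics: only a partition deletion can still matter.
        return meta.hasPartitionLevelDeletions ? Access.PARTITION_HEADER_ONLY : Access.SKIP;
    }
}
```

In this sketch, an sstable covering [10, 20) can be skipped for a request starting at a closed bound of 20, whereas one covering [10, 20] cannot. Min/max clusterings alone would report 20 as the maximum in both cases, which is the kind of information the bound type adds.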