[
https://issues.apache.org/jira/browse/CASSANDRA-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912409#comment-17912409
]
Michael Semb Wever edited comment on CASSANDRA-18504 at 1/15/25 7:50 AM:
-------------------------------------------------------------------------
bq. if this was a "just in case" then maybe scrub is the best option for 5.0
upgrade?
I think we should floor 5.0 compatibility to `minimum_version=me`, encouraging
users to address this problem before they do their upgrade to 4.x. This means
using tooling from 3.x or 4.x.
A scrub alone is not enough; you need to use the header fix option in
sstablescrub.
All sstables before `me` are most likely affected, while `me` sstables may be
affected if they were written by 3.0.x.
If (in cassandra-5.0 and trunk) we set `minimum_version=me` then we fail-fast
the first node that attempts an upgrade. This importantly tells the user that
they must deal with any potential header fix issues before performing the
upgrade. This would also remove the header fix functionality from the
sstablescrub tooling in cassandra-5.0 and trunk, as minimum_version is shared
between server runtime and tooling.
The alternative is to restore SSTableHeaderFix into the tooling classpath, and
in the server add a custom fail-fast on < `me` sstables (and also possibly a
warning on `me` sstables). This would effectively mean minimum_version only
applies to the tooling. This can kinda make sense, since for an online upgrade
we're expecting >4.0 where any `me` sstables would likely already have been
fixed (having previously caused issues).
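To make the alternative concrete, here is a minimal sketch of what a server-side check could look like: fail on anything below `me`, warn on `me`, accept anything newer. This is not the actual Cassandra code. The class name `SSTableVersionGate`, the `Decision` enum and the `check` method are hypothetical, and the sketch assumes sstable version identifiers are two lowercase letters that order lexicographically (e.g. `md` < `me` < `na`).

```java
// Hypothetical sketch only: none of these names exist in Cassandra.
final class SSTableVersionGate {

    /** Outcome of inspecting one sstable's format version at startup. */
    enum Decision { FAIL, WARN, OK }

    /** Assumed floor for illustration: the `me` boundary discussed above. */
    static final String FLOOR = "me";

    /**
     * Assumes version identifiers are two lowercase letters that compare
     * lexicographically, so "md" sorts before "me" and "me" before "na".
     */
    static Decision check(String version) {
        if (version == null || !version.matches("[a-z]{2}")) {
            throw new IllegalArgumentException("unexpected sstable version: " + version);
        }
        int cmp = version.compareTo(FLOOR);
        if (cmp < 0) {
            return Decision.FAIL;   // older than `me`: header fix must be applied first
        }
        if (cmp == 0) {
            return Decision.WARN;   // `me` itself: may still need the header fix
        }
        return Decision.OK;         // newer than `me`
    }
}
```

With this sketch, `SSTableVersionGate.check("md")` yields `FAIL`, `check("me")` yields `WARN`, and `check("na")` yields `OK`.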
While it makes sense to honour our full offline sstable format upgrade
compatibility commitment, this particular bug is painful and forces the
operator to think about the `me` boundary and apply the header fix. Our
testing matrix also focuses mostly on our online upgrade compatibility paths. So
there's an argument here to break on `me`. Offline upgradability is still just
as possible using multiple steps with different versions of the tooling, and
multiple steps might be unavoidable here anyway…
idk 🤷
> Added support for type VECTOR<type, dimension>
> ----------------------------------------------
>
> Key: CASSANDRA-18504
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18504
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Cluster/Schema, CQL/Syntax
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Fix For: 5.0-alpha1, 5.0
>
> Time Spent: 20h 40m
> Remaining Estimate: 0h
>
> Based off several mailing list threads (see "[POLL] Vector type for ML”,
> "[DISCUSS] New data type for vector search”, and "Adding vector search to SAI
> with heirarchical navigable small world graph index”), it's desirable to add a
> new type “VECTOR” that has the following properties
> 1) fixed length array
> 2) elements may not be null
> 3) flattened array (aka multi-cell = false)