[ 
https://issues.apache.org/jira/browse/CASSANDRA-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912409#comment-17912409
 ] 

Michael Semb Wever edited comment on CASSANDRA-18504 at 1/15/25 7:54 AM:
-------------------------------------------------------------------------

bq.  if this was a "just in case" then maybe scrub is the best option for 5.0 
upgrade?

I think we should floor 5.0 compatibility to `minimum_version=me`, encouraging 
users to address this problem before they do their upgrade to 4.x. This means 
using tooling from 3.x or 4.x.

A scrub alone is not enough; you need to use the header fix option in 
sstablescrub. 

All sstables before `me` are most likely affected, while `me` sstables may be 
affected if they were written by 3.0.x.

If (in cassandra-5.0 and trunk) we set `minimum_version=me` then the first 
node that attempts an upgrade fails fast.  Importantly, this tells the user 
that they must deal with any potential header fix issues before performing the 
upgrade.  This would also remove the header fix functionality from the 
sstablescrub tooling in cassandra-5.0 and trunk, as minimum_version is shared 
between server runtime and tooling.

The alternative is to restore SSTableHeaderFix into the tooling classpath, and 
in the server add a custom fail-fast on < `me` sstables (and also possibly a 
warning on `me` sstables).  This would effectively mean minimum_version only 
applies to the tooling.  This can kinda make sense, since for an online 
upgrade we're expecting >4.0 where any `me` sstables would likely already have 
been fixed (having previously caused issues).
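
To make the server-side half of this alternative concrete, here is a minimal 
sketch of the check: fail fast below `me`, warn on `me`, accept anything 
newer. It is an illustration only, not Cassandra code. The names `FLOOR`, 
`Action` and `check_sstable_version` are hypothetical, and it assumes sstable 
format versions are two lowercase letters that order lexicographically.

```python
from enum import Enum

# Hypothetical floor, matching the `me` boundary discussed above.
FLOOR = "me"


class Action(Enum):
    FAIL_FAST = "fail-fast"  # refuse to start: sstable predates the floor
    WARN = "warn"            # at the floor: may still need the header fix
    ACCEPT = "accept"        # newer than the floor


def check_sstable_version(version: str, floor: str = FLOOR) -> Action:
    """Decide what a node would do with an sstable of this format version.

    Assumption: versions are two lowercase ASCII letters (e.g. "mc", "me",
    "na") and plain string comparison gives their ordering.
    """
    well_formed = (
        len(version) == 2
        and version.isascii()
        and version.isalpha()
        and version.islower()
    )
    if not well_formed:
        raise ValueError(f"unexpected sstable version: {version!r}")
    if version < floor:
        return Action.FAIL_FAST
    if version == floor:
        return Action.WARN
    return Action.ACCEPT
```

Under the first option (flooring `minimum_version=me` everywhere) the `WARN` 
branch would not exist and the same floor would apply to the tooling as well.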

While it makes sense to honour our full offline sstable format upgrade 
compatibility commitment, this particular bug is painful and forces the 
operator to think about the `me` boundary and apply the header fix.  Our 
testing matrix also focuses most on our online upgrade compatibility paths.  So 
there's an argument here to break on `me`.  Offline upgradability is still just 
as possible using multiple steps with different versions of the tooling, and 
multiple steps might be unavoidable here anyway…

idk 🤷 


> Added support for type VECTOR<type, dimension>
> ----------------------------------------------
>
>                 Key: CASSANDRA-18504
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18504
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Schema, CQL/Syntax
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>             Fix For: 5.0-alpha1, 5.0
>
>          Time Spent: 20h 40m
>  Remaining Estimate: 0h
>
> Based off several mailing list threads (see "[POLL] Vector type for ML", 
> "[DISCUSS] New data type for vector search", and "Adding vector search to SAI 
> with heirarchical navigable small world graph index"), it's desirable to add a 
> new type "VECTOR" that has the following properties:
> 1) fixed length array
> 2) elements may not be null
> 3) flatten array (aka multi-cell = false)


