Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-19 Thread Brandon Williams
On Fri, Mar 17, 2023 at 2:38 PM Mick Semb Wever  wrote:
> So is there appetite for such a patch to fail or warn (guardrail?) to prevent 
> a node running on a new version that does not support sstable formats 
> existing on other nodes?

I think this makes sense, whatever the mechanism.


Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-17 Thread Mick Semb Wever
On Fri, 17 Mar 2023 at 17:24, Brandon Williams  wrote:

> On Fri, Mar 17, 2023 at 9:25 AM Mick Semb Wever  wrote:
> > Question/Suggestion: should we improve gossip to include what the oldest
> > format a node has, and ensure newer versioned node joining fail/warn if it
> > does not support that older format?  That is, should we give a clear
> > signal back to operators that their rolling upgrade is not going to work
> > smoothly, that they are going to hit nodes they will need to stop and do
> > upgradesstables on (leaving them in a state of mix-versions and nodes busy
> > upgrading…)
>
> We already have this (even in 3.0!) to facilitate dropping compact
> storage:
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/gms/ApplicationState.java#L59



Nice!
So is there appetite for such a patch to fail or warn (guardrail?) to
prevent a node running on a new version that does not support sstable
formats existing on other nodes?


Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-17 Thread Josh McKenzie
> we (including me) have done a lot of stupid shit over the years on this 
> project. Half the time “this is how we’ve historically done X” to me is a 
> strong argument to start doing things differently. 
Oof. The truth (when applied to myself) hurts doesn't it? :)

> I suggest we should have a way to read/write from/to all sstable versions, I 
> absolutely agree this is useful (e.g. backups in storage). And we should be 
> better at thorough testing. 
Having an external library that both C* and other tools could rely on that 
handles SSTable reading and writing could actually be a very clean solution to 
helping encourage a broader ecosystem of things that want to interface with 
Cassandra but don't necessarily want to go through the StorageEngine to do it. 
Never mind the value it'd bring to the table internally in terms of supporting 
longer upgrade cycles in C*, making what Claude is wrestling with on downgrades 
a lot simpler, etc.

It would probably also be much cleaner to test thoroughly, and less overhead 
to keep supporting over the longer term.

On Fri, Mar 17, 2023, at 12:28 PM, Jeremiah D Jordan wrote:
> > As for precedent - we (including me) have done a lot of stupid shit over 
> > the years on this project. Half the time “this is how we’ve historically 
> > done X” to me is a strong argument to start doing things differently. This 
> > is one such case.
> 
> +1.  I definitely agree that this is one area of precedent that we should not 
> be following.  The project has historically been fairly hostile towards 
> longer upgrade timelines, I am glad to see all the recent conversations where 
> this seems to be improving.
> 
> -Jeremiah


Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-17 Thread Jeremiah D Jordan
> As for precedent - we (including me) have done a lot of stupid shit over the 
> years on this project. Half the time “this is how we’ve historically done X” 
> to me is a strong argument to start doing things differently. This is one 
> such case.

+1.  I definitely agree that this is one area of precedent that we should not 
be following.  The project has historically been fairly hostile towards longer 
upgrade timelines, I am glad to see all the recent conversations where this 
seems to be improving.

-Jeremiah

Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-17 Thread Brandon Williams
On Fri, Mar 17, 2023 at 9:25 AM Mick Semb Wever  wrote:
> Question/Suggestion: should we improve gossip to include what the oldest 
> format a node has, and ensure newer versioned node joining fail/warn if it 
> does not support that older format?  That is, should we give a clear signal 
> back to operators that their rolling upgrade is not going to work smoothly, 
> that they are going to hit nodes they will need to stop and do 
> upgradesstables on (leaving them in a state of mix-versions and nodes busy 
> upgrading…)

We already have this (even in 3.0!) to facilitate dropping compact
storage: 
https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/gms/ApplicationState.java#L59


Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-17 Thread Aleksey Yeshchenko
> Saying that there is never any complexity and we should keep formats in 
> perpetuity, and I'm sitting here having a heart attack, srsly. 

Nobody is claiming that. Don’t let a straw man give you a heart attack.

> Though I _always_ recommend users upgrade all sstables, before and after 
> every major upgrade.  But I recognise how easy it is to forget or err in that 
> process, and we don't need to punish operators unnecessarily. 

Bit arrogant to call this an error. An operator might have good reasons to not 
do this. Clusters can be quite large, after all.

> Again, I would always recommend a backup before each major upgrade, and I 
> would think this has become standard advice.  On sstables residing in 
> storage, and the need to do a full backup, that's another good point, but 
> which I think we might solve in a smarter way (see below).

See above. Also, not seeing the follow up to “see below”.

> I beg to differ on this. We don't test it, and upgrade code gets limited 
> production time.

The code to read -m* sstables has been heavily battle-tested. Luckily for the 
project, there are users who test this *very* thoroughly before upgrading, who 
are incentivised to file bug reports, and are very much capable of fixing them 
while at it.

As for precedent - we (including me) have done a lot of stupid shit over the 
years on this project. Half the time “this is how we’ve historically done X” to 
me is a strong argument to start doing things differently. This is one such 
case.
 
—
AY

> On 17 Mar 2023, at 14:24, Mick Semb Wever  wrote:
> 
> 
> Ok ok, there's a number of strong arguments to keep sstable formats around 
> for much longer than the previous major Cassandra version, I will unset 
> fixVersion on 18312  :-)   
> 
> 
> Taking a look at the history of sstable formats. They were first introduced 
> in version 0.7, and minor versions introduced in version 1.0.3 with hb.
> 
> Looking at when we have dropped support and cleaned up the code for past 
> formats.
> 
>  - Versions before 1.2.5: formats <=ib; were removed in CASSANDRA-5511
> https://github.com/apache/cassandra/commit/7f2c3a8e40f97c626def5c510d77c1da3d9ae926
> 
>  - Version 1.2.5: format ic; was removed in CASSANDRA-6869
> https://github.com/apache/cassandra/commit/8e172c8563a995808a72a1a7e81a06f3c2a355ce
> 
>  - All pre-3.0 formats were removed in CASSANDRA-12716 
> https://github.com/apache/cassandra/commit/4a2464192e9e69457f5a5ecf26c094f9298bf069
>  
> 
> Saying that dropping the m* formats right now is such a small reduction in 
> code, roughly double the size of 6869's patch, I agree with.  Saying that 
> there is never any complexity and we should keep formats in perpetuity, and 
> I'm sitting here having a heart attack, srsly.  I can also appreciate coming 
> up with a good rule of thumb in advance is difficult when we just don't know 
> how many formats there will be and what they will introduce.
> 
> 
> From Aleksey:
> > But it’s one thing to require a two rolling restarts (3.0 to 4.0, 4.0 to 
> > 5.0), it’s another to require the operator to upgrade every single m* 
> > sstable to n*. 
> 
> 
> Good point. 
> 
> Though I _always_ recommend users upgrade all sstables, before and after 
> every major upgrade.  But I recognise how easy it is to forget or err in that 
> process, and we don't need to punish operators unnecessarily.  Also worth 
> noting since 4.x we have `automatic_sstable_upgrade` (which is wisely false 
> by default).
> 
> Question/Suggestion: should we improve gossip to include what the oldest 
> format a node has, and ensure newer versioned node joining fail/warn if it 
> does not support that older format?  That is, should we give a clear signal 
> back to operators that their rolling upgrade is not going to work smoothly, 
> that they are going to hit nodes they will need to stop and do 
> upgradesstables on (leaving them in a state of mix-versions and nodes busy 
> upgrading…)
> 
> 
> From Scott:
>> To expand on the final point he makes re: requiring SSTables be fully 
>> rewritten prior to rev'ing from 4.x to 5.x (if the cluster previously ran 
>> 3.x) –
>> 
>> This would also invalidate incremental backups. Operators would either be 
>> required to perform a full snapshot backup of each cluster to object storage 
>> prior to upgrading from 4.x to 5.x; or to enumerate the contents of all 
>> snapshots from an incremental backup series to ensure that no m*-series 
>> SSTables were present prior to upgrading.
>> 
>> If one failed to take on the work to do so, incremental backup snapshots 
>> would not be restorable to a 5.x cluster if an m*-series SSTable were 
>> present.
> 
> Again, I would always recommend a backup before each major upgrade, and I 
> would think this has become standard advice.  On sstables residing in 
> storage, and the need to do a full backup, that's another good point, but 
> which I think we might solve in a smarter way (see below).
> 
>  
> From Aleksey:
>>> 2. It’s very stable and 

Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-17 Thread Mick Semb Wever
Ok ok, there's a number of strong arguments to keep sstable formats around
for much longer than the previous major Cassandra version, I will unset
fixVersion on 18312  :-)


Taking a look at the history of sstable formats. They were first introduced
in version 0.7, and minor versions introduced in version 1.0.3 with hb.

Looking at when we have dropped support and cleaned up the code for past
formats.

 - Versions before 1.2.5: formats <=ib; were removed in CASSANDRA-5511
https://github.com/apache/cassandra/commit/7f2c3a8e40f97c626def5c510d77c1da3d9ae926

 - Version 1.2.5: format ic; was removed in CASSANDRA-6869
https://github.com/apache/cassandra/commit/8e172c8563a995808a72a1a7e81a06f3c2a355ce

 - All pre-3.0 formats were removed in CASSANDRA-12716
https://github.com/apache/cassandra/commit/4a2464192e9e69457f5a5ecf26c094f9298bf069


Saying that dropping the m* formats right now is such a small reduction in
code, roughly double the size of 6869's patch, I agree with.  Saying that
there is never any complexity and we should keep formats in perpetuity, and
I'm sitting here having a heart attack, srsly.  I can also appreciate
coming up with a good rule of thumb in advance is difficult when we just
don't know how many formats there will be and what they will introduce.


From Aleksey:
> But it’s one thing to require two rolling restarts (3.0 to 4.0, 4.0 to
> 5.0), it’s another to require the operator to upgrade every single m*
> sstable to n*.


Good point.

Though I _always_ recommend users upgrade all sstables, before and after
every major upgrade.  But I recognise how easy it is to forget or err in
that process, and we don't need to punish operators unnecessarily.  Also
worth noting since 4.x we have `automatic_sstable_upgrade` (which is wisely
false by default).

Question/Suggestion: should we improve gossip to include what the oldest
format a node has, and ensure newer versioned node joining fail/warn if it
does not support that older format?  That is, should we give a clear signal
back to operators that their rolling upgrade is not going to work smoothly,
that they are going to hit nodes they will need to stop and do
upgradesstables on (leaving them in a state of mix-versions and nodes busy
upgrading…)
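
A minimal sketch of the check proposed above, for illustration only. It assumes each node's oldest on-disk sstable version is already known (modelled here as a plain dict, not any real gossip API), that versions are the two-letter strings used in this thread ("mc", "na", ...), and that they order lexicographically. The function names and the `fail` switch are hypothetical.

```python
from typing import Dict, List


def unsupported_peers(oldest_format_by_node: Dict[str, str],
                      min_supported: str) -> List[str]:
    """Return the nodes whose oldest sstable version predates min_supported.

    Assumption: two-letter version strings compare lexicographically,
    so "mc" < "na" < "nb".
    """
    return sorted(node for node, version in oldest_format_by_node.items()
                  if version < min_supported)


def check_startup(oldest_format_by_node: Dict[str, str],
                  min_supported: str,
                  fail: bool = False) -> str:
    """Warn (or fail) when any peer still holds sstables this node cannot read."""
    offenders = unsupported_peers(oldest_format_by_node, min_supported)
    if not offenders:
        return "ok"
    message = ("nodes with sstables older than %s: %s; "
               "run upgradesstables there first"
               % (min_supported, ", ".join(offenders)))
    if fail:
        # Hypothetical "fail" mode: refuse to continue.
        raise RuntimeError(message)
    # Hypothetical "warn" mode: report and carry on.
    return "warn: " + message
```

Whether such a check should warn or fail is exactly the open question; the `fail` flag only shows that both behaviours hang off the same comparison.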


From Scott:

> To expand on the final point he makes re: requiring SSTables be fully
> rewritten prior to rev'ing from 4.x to 5.x (if the cluster previously ran
> 3.x) –
>
> This would also invalidate incremental backups. Operators would either be
> required to perform a full snapshot backup of each cluster to object
> storage prior to upgrading from 4.x to 5.x; or to enumerate the contents of
> all snapshots from an incremental backup series to ensure that no m*-series
> SSTables were present prior to upgrading.
>
> If one failed to take on the work to do so, incremental backup snapshots
> would not be restorable to a 5.x cluster if an m*-series SSTable were
> present.
>
>
Again, I would always recommend a backup before each major upgrade, and I
would think this has become standard advice.  On sstables residing in
storage, and the need to do a full backup, that's another good point, but
which I think we might solve in a smarter way (see below).


From Aleksey:

> 2. It’s very stable and battle tested at this point
>
>

I beg to differ on this. We don't test it, and upgrade code gets limited
production time.  And I bet operators are less incentivised to file bug
reports on upgrade issues so long as they get through the upgrade one way
or another (and I bet many issues pop up way too late, like the numerous
range tombstone issues over many 3.11.x versions).

We could be testing it more, and IMHO we should…



> 5. There are third-party tools that I know of which benefit from a single
> C* jar that can read all relevant stable versions, and relevant here
> includes 3.0 ones
>
>

I suggest we should have a way to read/write from/to all sstable versions,
I absolutely agree this is useful (e.g. backups in storage). And we should
be better at thorough testing.

With such use-cases applying only to node-local and offline scenarios,
we can tackle this cross-branch, i.e. take the best of both worlds: simpler
_tested_ code, and forward (and hopefully backward) compatibility _into
perpetuity_.

One example of this is if we could stream sstableupgrades, e.g.
```
   # read from disk any l* sstables, write to disk latest m format
   sstableupgrade-3.11 --stream-output -f jb-1-big-Data.db |
       sstableupgrade-5.0 --stream-input
```
Sure, this is no longer "single C* jar", but that seems a minor trade-off
to get something better. The idea of cross-branch functionality and testing
is nothing new to us (e.g. jvm dtests). Note, this approach would likely be
slower unless you threw cpu+mem at it. And it is applicable regardless of
which format compatibility policy we decide on…

The suggestion, even if it's only a strawman, raises some other questions …

- Why doesn't sstableupgrade today upgrade sstables in parallel, 

Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-14 Thread Josh McKenzie
It's always seemed a little odd to me that we drop all the "read old format" 
code given how little maintenance that code takes over time. The ability to 
have a C* node read older format SSTables into perpetuity *seems* like a pretty 
compelling usability feature to me (for some of the reasons mentioned in this 
thread).

So personally, -1 to removing the code for 3.0, and generally think we should 
reconsider how long we maintain support specifically for reading older format 
files as we move forward given this is long-lived infra software. Probably 
pain-points I'm not thinking of here, but worth a deeper discussion IMO.

On Tue, Mar 14, 2023, at 12:02 PM, Brandon Williams wrote:
> On Mon, Mar 13, 2023 at 5:54 PM Mick Semb Wever  wrote:
> 
> > Personally I am not in favour of keeping, or recommending users use,
> > code we don't test.
> 
> How much effort would it be to have some simple smoke tests?  I think
> we should make sure nothing gets indirectly broken if we're going to
> keep it around.
> 


Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-14 Thread Brandon Williams
On Mon, Mar 13, 2023 at 5:54 PM Mick Semb Wever  wrote:

> Personally I am not in favour of keeping, or recommending users use,
> code we don't test.

How much effort would it be to have some simple smoke tests?  I think
we should make sure nothing gets indirectly broken if we're going to
keep it around.


Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-14 Thread Aleksey Yeshchenko
I’m not sure we have an explicit rule at the moment. If we had one, it would 
probably be based on calendar time in addition to release versions.

Addressing these on a case-by-case basis for now is fine IMO.

I’d say the general principle should be that extended compatibility (between 
releases in general and subcomponents) is treated more as a feature and 
less as a liability. In this particular case you have compatibility by default 
by doing nothing, it’s not a huge complexity burden, it’s a tiny amount of 
code, it has clear value, and you have to go out of your way to actually 
cripple it and remove the compatibility. Bad idea.

> On 14 Mar 2023, at 15:04, Jacek Lewandowski  
> wrote:
> 
> Do we consider it an occasional exception to the rule, or will we define a 
> rule which explicitly says how many versions the user can expect to be 
> supported?



Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-14 Thread Jacek Lewandowski
Hi,

Do we consider it an occasional exception to the rule, or will we define
a rule which explicitly says how many versions the user can expect to be
supported?

I lean slightly towards keeping the support for Mx versions just because the
time gap between 3.11 and 4.0 was very long. I suppose many people are
still on 3.11 and we should not make their life harder when they consider
upgrading directly to 5.0. Though clear rules for that would help the users
and us.

thanks,


- - -- --- -  -
Jacek Lewandowski


On Tue, 14 Mar 2023 at 15:36, C. Scott Andreas wrote:

> I agree with Aleksey's view here.
>
> To expand on the final point he makes re: requiring SSTables be fully
> rewritten prior to rev'ing from 4.x to 5.x (if the cluster previously ran
> 3.x) –
>
> This would also invalidate incremental backups. Operators would either be
> required to perform a full snapshot backup of each cluster to object
> storage prior to upgrading from 4.x to 5.x; or to enumerate the contents of
> all snapshots from an incremental backup series to ensure that no m*-series
> SSTables were present prior to upgrading.
>
> If one failed to take on the work to do so, incremental backup snapshots
> would not be restorable to a 5.x cluster if an m*-series SSTable were
> present.
>
> – Scott
>
> On Mar 14, 2023, at 4:38 AM, Aleksey Yeshchenko  wrote:
>
>
> Raising messaging service minimum, I have a less strong opinion on, but on
> dropping m* sstable code I’m strongly -1.
>
> 1. This is code on a rarely touched path
> 2. It’s very stable and battle tested at this point
> 3. Removing it doesn’t reduce much complexity at all, just a few branches
> are affected
> 4. Removing code comes with risk
> 5. There are third-party tools that I know of which benefit from a single
> C* jar that can read all relevant stable versions, and relevant here
> includes 3.0 ones
>
> Removing a little of battle-tested reliable code and a tinier amount of
> complexity is not, to me, a benefit enough to justify intentionally
> breaking perfectly good and useful functionality.
>
> Oh, to add to that - if an operator wishes to upgrade from 3.0 to 5.0, and
> we don’t support it directly, I think most of us are fine with the
> requirement to go through a 4.X release first. But it’s one thing to
> require a two rolling restarts (3.0 to 4.0, 4.0 to 5.0), it’s another to
> require the operator to upgrade every single m* sstable to n*. Especially
> when we have perfectly working code to read those. That’s incredibly
> wasteful.
>
> AY
>
> On 13 Mar 2023, at 22:54, Mick Semb Wever  wrote:
>
> If we do not recommend and do not test direct upgrades from 3.x to
> 5.x, we can clean up a fair bit by removing code related to sstable
> formats m*, as Cassandra versions 4.x and 5.0 are all on sstable
> formats n*.
>
> We don't allow mixed-version streaming, so it's not possible today to
> stream any such older sstable format between nodes. This
> compatibility-break impacts only node-local and/or offline.
>
> Some arguments raised to keep m* sstable formats are:
> - offline cluster upgrade, e.g. direct from 3.x to 5.0,
> - single-invocation sstableupgrade usage
> - third-party tools based on the above
>
> Personally I am not in favour of keeping, or recommending users use,
> code we don't test.
>
> An _example_ of the code that can be cleaned up is in the patch
> attached to the ticket:
> CASSANDRA-18312 – Drop support for sstable formats before `na`
>
> What do you think?


Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-14 Thread J. D. Jordan
Agreed. I also think it is worthwhile to keep that code around. Given how 
widespread C* 3.x use is, I do not think it is worthwhile dropping support for 
those sstable formats at this time.

-Jeremiah

> On Mar 14, 2023, at 9:36 AM, C. Scott Andreas  wrote:
> 
> 
> I agree with Aleksey's view here.
> 
> To expand on the final point he makes re: requiring SSTables be fully 
> rewritten prior to rev'ing from 4.x to 5.x (if the cluster previously ran 
> 3.x) –
> 
> This would also invalidate incremental backups. Operators would either be 
> required to perform a full snapshot backup of each cluster to object storage 
> prior to upgrading from 4.x to 5.x; or to enumerate the contents of all 
> snapshots from an incremental backup series to ensure that no m*-series 
> SSTables were present prior to upgrading.
> 
> If one failed to take on the work to do so, incremental backup snapshots 
> would not be restorable to a 5.x cluster if an m*-series SSTable were present.
> 
> – Scott
> 
>> On Mar 14, 2023, at 4:38 AM, Aleksey Yeshchenko  wrote:
>> 
>> 
>> Raising messaging service minimum, I have a less strong opinion on, but on 
>> dropping m* sstable code I’m strongly -1.
>> 
>> 1. This is code on a rarely touched path
>> 2. It’s very stable and battle tested at this point
>> 3. Removing it doesn’t reduce much complexity at all, just a few branches 
>> are affected
>> 4. Removing code comes with risk
>> 5. There are third-party tools that I know of which benefit from a single C* 
>> jar that can read all relevant stable versions, and relevant here includes 
>> 3.0 ones
>> 
>> Removing a little of battle-tested reliable code and a tinier amount of 
>> complexity is not, to me, a benefit enough to justify intentionally breaking 
>> perfectly good and useful functionality.
>> 
>> Oh, to add to that - if an operator wishes to upgrade from 3.0 to 5.0, and 
>> we don’t support it directly, I think most of us are fine with the 
>> requirement to go through a 4.X release first. But it’s one thing to require 
>> a two rolling restarts (3.0 to 4.0, 4.0 to 5.0), it’s another to require the 
>> operator to upgrade every single m* sstable to n*. Especially when we have 
>> perfectly working code to read those. That’s incredibly wasteful.
>> 
>> AY
>> 
>>> On 13 Mar 2023, at 22:54, Mick Semb Wever  wrote:
>>> 
>>> If we do not recommend and do not test direct upgrades from 3.x to
>>> 5.x, we can clean up a fair bit by removing code related to sstable
>>> formats m*, as Cassandra versions 4.x and 5.0 are all on sstable
>>> formats n*.
>>> 
>>> We don't allow mixed-version streaming, so it's not possible today to
>>> stream any such older sstable format between nodes. This
>>> compatibility-break impacts only node-local and/or offline.
>>> 
>>> Some arguments raised to keep m* sstable formats are:
>>> - offline cluster upgrade, e.g. direct from 3.x to 5.0,
>>> - single-invocation sstableupgrade usage
>>> - third-party tools based on the above
>>> 
>>> Personally I am not in favour of keeping, or recommending users use,
>>> code we don't test.
>>> 
>>> An _example_ of the code that can be cleaned up is in the patch
>>> attached to the ticket:
>>> CASSANDRA-18312 – Drop support for sstable formats before `na`
>>> 
>>> What do you think?


Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-14 Thread C. Scott Andreas

I agree with Aleksey's view here.

To expand on the final point he makes re: requiring SSTables be fully
rewritten prior to rev'ing from 4.x to 5.x (if the cluster previously ran
3.x) –

This would also invalidate incremental backups. Operators would either be
required to perform a full snapshot backup of each cluster to object
storage prior to upgrading from 4.x to 5.x; or to enumerate the contents of
all snapshots from an incremental backup series to ensure that no m*-series
SSTables were present prior to upgrading.

If one failed to take on the work to do so, incremental backup snapshots
would not be restorable to a 5.x cluster if an m*-series SSTable were
present.

– Scott

On Mar 14, 2023, at 4:38 AM, Aleksey Yeshchenko wrote:

> Raising messaging service minimum, I have a less strong opinion on, but on
> dropping m* sstable code I’m strongly -1.
>
> 1. This is code on a rarely touched path
> 2. It’s very stable and battle tested at this point
> 3. Removing it doesn’t reduce much complexity at all, just a few branches
> are affected
> 4. Removing code comes with risk
> 5. There are third-party tools that I know of which benefit from a single
> C* jar that can read all relevant stable versions, and relevant here
> includes 3.0 ones
>
> Removing a little of battle-tested reliable code and a tinier amount of
> complexity is not, to me, a benefit enough to justify intentionally
> breaking perfectly good and useful functionality.
>
> Oh, to add to that - if an operator wishes to upgrade from 3.0 to 5.0, and
> we don’t support it directly, I think most of us are fine with the
> requirement to go through a 4.X release first. But it’s one thing to
> require a two rolling restarts (3.0 to 4.0, 4.0 to 5.0), it’s another to
> require the operator to upgrade every single m* sstable to n*. Especially
> when we have perfectly working code to read those. That’s incredibly
> wasteful.
>
> AY
>
>> On 13 Mar 2023, at 22:54, Mick Semb Wever wrote:
>>
>> If we do not recommend and do not test direct upgrades from 3.x to
>> 5.x, we can clean up a fair bit by removing code related to sstable
>> formats m*, as Cassandra versions 4.x and 5.0 are all on sstable
>> formats n*.
>>
>> We don't allow mixed-version streaming, so it's not possible today to
>> stream any such older sstable format between nodes. This
>> compatibility-break impacts only node-local and/or offline.
>>
>> Some arguments raised to keep m* sstable formats are:
>> - offline cluster upgrade, e.g. direct from 3.x to 5.0,
>> - single-invocation sstableupgrade usage
>> - third-party tools based on the above
>>
>> Personally I am not in favour of keeping, or recommending users use,
>> code we don't test.
>>
>> An _example_ of the code that can be cleaned up is in the patch
>> attached to the ticket:
>> CASSANDRA-18312 – Drop support for sstable formats before `na`
>>
>> What do you think?
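
A rough sketch of what "enumerate the contents of all snapshots" could look like in practice, offered only as an illustration. It assumes the 3.x/4.x data file naming `<version>-<generation>-<format>-Data.db` (for example `mc-12-big-Data.db`) and lexicographic ordering of the two-letter version; other naming layouts are not handled. The function names and the `min_supported` default are hypothetical.

```python
import re
from pathlib import Path
from typing import Iterable, List

# Assumed naming: <version>-<generation>-<format>-Data.db, e.g. "mc-12-big-Data.db".
SSTABLE_DATA_FILE = re.compile(
    r"^(?P<version>[a-z]{2})-(?P<generation>\d+)-(?P<format>[a-z]+)-Data\.db$")


def old_format_sstables(names: Iterable[str],
                        min_supported: str = "na") -> List[str]:
    """Return the data files whose version sorts before min_supported."""
    old = []
    for name in names:
        match = SSTABLE_DATA_FILE.match(Path(name).name)
        if match and match.group("version") < min_supported:
            old.append(name)
    return sorted(old)


def scan_snapshot(root: str, min_supported: str = "na") -> List[str]:
    """Walk a snapshot directory tree and list old-format data files."""
    paths = (str(p) for p in Path(root).rglob("*-Data.db"))
    return old_format_sstables(paths, min_supported)
```

An empty result from `scan_snapshot` for every snapshot in an incremental series would be the precondition described above; a non-empty one lists the files that would need rewriting first.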

Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-14 Thread Aleksey Yeshchenko
Raising messaging service minimum, I have a less strong opinion on, but on 
dropping m* sstable code I’m strongly -1.

1. This is code on a rarely touched path
2. It’s very stable and battle tested at this point
3. Removing it doesn’t reduce much complexity at all, just a few branches are 
affected
4. Removing code comes with risk
5. There are third-party tools that I know of which benefit from a single C* 
jar that can read all relevant stable versions, and relevant here includes 3.0 
ones

Removing a little battle-tested, reliable code and an even tinier amount of 
complexity is not, to me, enough of a benefit to justify intentionally breaking 
perfectly good and useful functionality.

Oh, to add to that - if an operator wishes to upgrade from 3.0 to 5.0, and we 
don’t support it directly, I think most of us are fine with the requirement to 
go through a 4.X release first. But it’s one thing to require two rolling 
restarts (3.0 to 4.0, 4.0 to 5.0), it’s another to require the operator to 
upgrade every single m* sstable to n*. Especially when we have perfectly 
working code to read those. That’s incredibly wasteful.

AY

> On 13 Mar 2023, at 22:54, Mick Semb Wever  wrote:
> 
> If we do not recommend and do not test direct upgrades from 3.x to
> 5.x, we can clean up a fair bit by removing code related to sstable
> formats m*, as Cassandra versions 4.x and  5.0 are all on sstable
> formats n*.
> 
> We don't allow mixed-version streaming, so it's not possible today to
> stream any such older sstable format between nodes. This
> compatibility-break impacts only node-local and/or offline.
> 
> Some arguments raised to keep m* sstable formats are:
> - offline cluster upgrade, e.g. direct from 3.x to 5.0,
> - single-invocation sstableupgrade usage
> - third-party tools based on the above
> 
> Personally I am not in favour of keeping, or recommending users use,
> code we don't test.
> 
> An _example_ of the code that can be cleaned up is in the patch
> attached to the ticket:
> CASSANDRA-18312 – Drop support for sstable formats before `na`
> 
> What do you think?



[DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-13 Thread Mick Semb Wever
If we do not recommend and do not test direct upgrades from 3.x to
5.x, we can clean up a fair bit by removing code related to sstable
formats m*, as Cassandra versions 4.x and  5.0 are all on sstable
formats n*.

We don't allow mixed-version streaming, so it's not possible today to
stream any such older sstable format between nodes. This
compatibility-break impacts only node-local and/or offline.

Some arguments raised to keep m* sstable formats are:
 - offline cluster upgrade, e.g. direct from 3.x to 5.0,
 - single-invocation sstableupgrade usage
 - third-party tools based on the above

Personally I am not in favour of keeping, or recommending users use,
code we don't test.

An _example_ of the code that can be cleaned up is in the patch
attached to the ticket:
CASSANDRA-18312 – Drop support for sstable formats before `na`

What do you think?