> why not implement backwards write compatibility?
+1 to this from a philosophical perspective. Keeping prior releases completely 
in the dark about a new release's SSTable formats is a clean approach, and we 
should already have the code around to serialize and deserialize the prior 
version's data on the next version.
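
As a rough illustration of what backwards write compatibility could look like: the sketch below picks the on-disk encoding from a settings set and, when writing the older format, caps the expiration timestamp at 2038, as described for the TTL patch in the quoted message. It is not Cassandra's actual serializer. The format names, the field widths (32-bit vs 64-bit seconds) and the `WriterSettings` class are assumptions made up for this example.

```python
import struct
from dataclasses import dataclass

# Largest value a signed 32-bit seconds field can hold: 2038-01-19T03:14:07Z.
MAX_INT32_SECONDS = 2**31 - 1

# Hypothetical format identifiers, for illustration only.
LEGACY_FORMAT = "legacy"    # assumed: expiration stored as signed 32-bit seconds
CURRENT_FORMAT = "current"  # assumed: expiration stored as signed 64-bit seconds


@dataclass(frozen=True)
class WriterSettings:
    """Hypothetical "settings set" that selects which format a node writes."""
    write_format: str


def serialize_expiration(expires_at: int, settings: WriterSettings) -> bytes:
    """Encode an expiration time (epoch seconds) in the configured format."""
    if settings.write_format == LEGACY_FORMAT:
        # Old readers only understand 32 bits, so cap instead of overflowing.
        capped = min(expires_at, MAX_INT32_SECONDS)
        return struct.pack(">i", capped)
    return struct.pack(">q", expires_at)


def deserialize_expiration(data: bytes, fmt: str) -> int:
    """Decode an expiration time written in the given format."""
    if fmt == LEGACY_FORMAT:
        return struct.unpack(">i", data)[0]
    return struct.unpack(">q", data)[0]
```

Under these assumptions, finishing an upgrade only means replacing the settings set so that `write_format` points at the current format. The prior release never needs to learn anything about the new encoding.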

On Wed, Feb 22, 2023, at 10:07 AM, Jeff Jirsa wrote:
> When people are serious about this requirement, they’ll build the downgrade 
> equivalents of the upgrade tests and run them automatically, often, so people 
> understand what the real gap is and when something new makes it break.
> 
> Until those tests exist, I think collectively we should all stop pretending 
> like this is dogma. Best effort is best effort. 
> 
> 
> 
>> On Feb 22, 2023, at 6:57 AM, Branimir Lambov <branimir.lam...@datastax.com> 
>> wrote:
>> 
>> > 1. Major SSTable changes should begin with forward-compatibility in a 
>> > prior release.
>> 
>> This requires "feature" changes, i.e. new non-trivial code for previous 
>> patch releases. It also entails porting over any further format modification.
>> 
>> Instead of this, in combination with your second point, why not implement 
>> backwards write compatibility? The opt-in is then clearer to define: 
>> upgrades start with, e.g., a "4.1-compatible" set of settings that includes 
>> file format compatibility and disables new features, while new nodes start 
>> with the "current" set of settings. When the upgrade completes and the user 
>> is happy with the result, the set of settings can be replaced.
>> 
>> Doesn't this achieve what you want (and we all agree is a worthy goal) with 
>> much less effort for everyone? Supporting backwards-compatible writing is 
>> trivial, and we even have a proof-of-concept in the stats metadata 
>> serializer. It also considerably reduces the amount of work and 
>> thinking required when a format improvement is implemented -- e.g. the 
>> TTL patch can just address this in exactly the way the problem was addressed 
>> in earlier versions of the format, by capping to 2038, without any need to 
>> specify, obey or test any configuration flags.
>> 
>> >> It’s a commitment, and it requires every contributor to consider it as 
>> >> part of work they produce.
>> 
>> > But it shouldn't be a burden. Ability to downgrade is a testable problem, 
>> > so I see this work as a function of the suite of tests the project is 
>> > willing to agree on supporting.
>> 
>> I fully agree with this sentiment, and I feel that the current "try to not 
>> introduce breaking changes" approach is adding the burden, but not the 
>> benefits -- because the latter cannot be proven, and are most likely already 
>> broken.
>> 
>> Regards,
>> Branimir
>> 
>> On Wed, Feb 22, 2023 at 1:01 AM Abe Ratnofsky <a...@aber.io> wrote:
>>> Some interesting existing work on this subject is "Understanding and 
>>> Detecting Software Upgrade Failures in Distributed Systems" - 
>>> https://dl.acm.org/doi/10.1145/3477132.3483577,
>>> also summarized by Andrey Satarin here: 
>>> https://asatarin.github.io/talks/2022-09-upgrade-failures-in-distributed-systems/
>>> 
>>> They specifically tested Cassandra upgrades, and have a solid list of 
>>> defects that they found. They also describe their testing mechanism 
>>> DUPTester, which includes a component that confirms that the leftover state 
>>> from one version can start up on the next version. There is a wider scope 
>>> of upgrade defects highlighted in the paper, beyond SSTable version support.
>>> 
>>> I believe the project would benefit from expanding our test suite 
>>> similarly, by parametrizing more tests on upgrade version pairs.
>>> 
>>> Also, per Benedict's comment:
>>> 
>>> > It’s a commitment, and it requires every contributor to consider it as 
>>> > part of work they produce.
>>> 
>>> But it shouldn't be a burden. Ability to downgrade is a testable problem, 
>>> so I see this work as a function of the suite of tests the project is 
>>> willing to agree on supporting.
>>> 
>>> Specifically - I agree with Scott's proposal to emulate the HDFS 
>>> upgrade-then-finalize approach. I would also support automatic finalization 
>>> based on a time threshold or similar, to balance the priorities of safe and 
>>> straightforward upgrades. Users need to be aware of the range of SSTable 
>>> formats supported by a given version, and how to handle cases where their 
>>> SSTables would not be supported by an upcoming upgrade.
>>> 
>>> --
>>> Abe
>> 
>> 
>> --
>> Branimir Lambov
>> e. branimir.lam...@datastax.com
>> w. www.datastax.com
>> 
