Re: [DISCUSS] Merging incremental feature work

Derek Chen-Becker Fri, 03 Feb 2023 09:48:28 -0800

I think Henrik has a lot of good points. However, I want to point out that
JDK upgrades are non-optional over the fullness of time, so it might be
worth carving out a specific process for that work. In a similar vein,
security patches also Just Need to Merge™, so I'm a little hesitant when I
see that discussed alongside CEPs. Perhaps the work is done on a feature
branch to *defer* impact to trunk, but at some point we have to pay the
piper and deal with whatever the merge fallout is, whether that's
complexity or performance that needs to be addressed.


Cheers,

Derek

On Fri, Feb 3, 2023 at 9:24 AM Ekaterina Dimitrova <e.dimitr...@gmail.com>
wrote:

> I am all in for incremental changes that are safe to include in a release.
> In the case of JDK 17, I know of dependencies which need to be updated for
> Java 17 but which still work with Java 8 and 11, so it is fine to update
> them before the switch.
> So while some blockers are being removed, I do the updates in parallel to trunk.
>
> I also like the points made about volunteers who get pulled away, and about
> not blocking ourselves on unfinished features which are partially committed.
>
> Best regards,
> Ekaterina
>
>
> On Fri, 3 Feb 2023 at 9:37, Josh McKenzie <jmcken...@apache.org> wrote:
>
>> *Suddenly, you get to be doubly managerial for making _two_
>> inconsequential decisions - the wrong one _and_ the right one.*
>>
>> Something to aspire to. :)
>>
>> On Fri, Feb 3, 2023, at 9:20 AM, Henrik Ingo wrote:
>>
>> I've been an unusually active debater recently, so it might be
>> appropriate to start with a reminder/disclaimer that I'm not actually a
>> Cassandra contributor in any way(*), but I choose to share some thoughts
>> where I feel that sharing experiences from other open source database
>> projects can be of use to the discussion.
>>
>> *) No. Even if I agree with the idea that there are many types of
>> contributions other than code, the only thing I have contributed is
>> opinions :-D
>>
>>
>> As I think I mentioned in an Accord related thread, there are two
>> desirable goals at odds here:
>> 1. It is indeed better to merge smaller increments of work into trunk.
>> And to do this more frequently.
>> 2. On the other hand if we want to have an always shippable trunk with a
>> fixed date for feature freeze, it implies that only complete, or at least
>> harmless/inactive units of work can be merged to trunk.
>>
>> To give a specific example, I'll tell the story of the MySQL
>> transactional storage engine called Falcon... (Cue Disney soundtrack
>> featuring a harp, and blurring image...)
>>
>> As Oracle acquired the only transactional storage engine for MySQL,
>> InnoDB, it became a strategic priority to develop a new, in house engine
>> that could replace InnoDB in functionality. This project was codenamed
>> Falcon. Since it was the top priority for MySQL 6.0, it was developed in
>> the "main" v6.0 branch, because it was the definition of MySQL 6. The
>> release date would be whenever Falcon v1 is ready. Over time other
>> features, such as partitioning, transactional backups... were also built
>> into the main feature branch, and they depended on Falcon or were
>> Falcon-only.
>>
>> There was only one problem: Falcon never worked. In the end the v6.0
>> development branch just had to be abandoned and to this date MySQL has
>> never released a version 6. They had to skip a major version number because
>> of this.
>>
>> (As an epilogue that at least I personally was very amused with: When
>> Oracle had to argue their case with the EU Commission that they should be
>> allowed to acquire MySQL, one of the commitments Oracle's lawyers made was to
>> continue to develop *and release* MySQL version 6.0. I read the
>> almost-legally-binding statement, and thought yeah good luck with that!)
>>
>>
>> I wasn't involved in the active development, but possibly a similar
>> example could be the TPC architecture introduced in DSE 6.0. (A Cassandra
>> derived proprietary product that I worked on some time ago.) At least by
>> the time I was involved, I can't say that the performance resulting from
>> that work would have been a net positive to the full population of
>> workloads. But because it had been developed directly in the main 6.0
>> branch, and because it is so invasive and core to everything else, it also
>> wasn't possible to roll back its introduction.
>>
>> Finally, in an open source project, it's good to remember that we are all
>> volunteers here, in some sense, and sometimes it could happen that
>> development of a feature stops half way because its developer disappears
>> and nobody else picks it up.
>>
>>
>>
>> So, returning to this day and this database... Basically what you
>> want to avoid is painting yourself into a corner, and particularly the
>> wrong corner. So the way I would answer this question is that large bodies
>> of work should be handled as follows:
>>  - Refactoring that is a) harmless, and/or b) improves the codebase
>> anyway, should be merged early into trunk.
>>  - The main body of the new functionality should be developed in a
>> feature branch up until some kind of MVP stage. This means that by the time
>> it is proposed for merge, it has been tested both to be of good quality
>> and to actually provide the benefit it set out to deliver. Merging it
>> to trunk will then always be a net improvement.
>>  - After that first big MVP merge, additional functionality of course
>> could be developed directly against trunk.
>>  - For patches that are very clean and self contained, for example in
>> their own Java package, it doesn't really matter, because they are easy to
>> roll back if needed. They can be developed against trunk.
>>
>> So applying this to Josh's examples:
>>
>> 1) I assume JDK17 support is invasive, so that would suggest a feature
>> branch. However, the next question is: is there any risk involved in this
>> work (like Falcon for MySQL)? Hypothetically it could be that Java 17 has
>> worse performance than Java 11, or some other blocking problem is
>> encountered. But in practice we probably estimate that this risk is small.
>> In such a case JDK17 support could indeed be developed with small patches
>> directly against trunk, but this would be an exception to the rule!
>>
>> 2) To take an example of an approved CEP, surprisingly enough, the
>> humongous Accord patch is actually very clean and self contained, and would
>> be easy to remove (or disable with a feature flag, which has been done). So
>> it could have been developed against trunk. (But I'm not sure that was
>> obvious in the beginning of development?)
>>
>> 3) The work on SSTable tries and Memtable tries was even explicitly split
>> into separate CEPs for the API refactor and the new functionality.
>>
>>
>> Perhaps Linus Torvalds said the above more succinctly than me:
>>
>>
>>
>> *So the name of the game is to _avoid_ decisions, at least the big and
>> painful ones. Making small and non-consequential decisions is fine, and
>> makes you look like you know what you're doing, so what a kernel manager
>> needs to do is to turn the big and painful ones into small things where
>> nobody really cares. It helps to realize that the key difference between a
>> big decision and a small one is whether you can fix your decision
>> afterwards. Any decision can be made small by just always making sure that
>> if you were wrong (and you _will_ be wrong), you can always undo the damage
>> later by backtracking. Suddenly, you get to be doubly managerial for making
>> _two_ inconsequential decisions - the wrong one _and_ the right one.*
>>
>>
>>
>> https://www.openlife.cc/onlinebook/epilogue-linux-kernel-management-style-linus-torvalds
>>
>>
>> (I particularly like the last sentence!)
>>
>> henrik
>>
>>
>> On Fri, Feb 3, 2023 at 2:06 PM Josh McKenzie <jmcken...@apache.org>
>> wrote:
>>
>>
>> The topic of how we handle merging large complex bodies of work came up
>> recently with the CEP-15 merge and JDK17, and we've faced this question in
>> the past as well (CASSANDRA-8099 comes to mind).
>>
>> The times we've done large bodies of work separately from trunk and then
>> merged them in have their own benefits and costs, and the examples I can
>> think of where we've merged in work to trunk incrementally with something
>> flagged experimental have markedly different costs and benefits. Further,
>> the two approaches have shaped the *way* we approached the work quite
>> differently, in how we architected and tested things.
>>
>> My current thinking: I'd like to propose we all agree to move to merge
>> work into trunk incrementally if it's either:
>> 1) New JDK support
>> 2) An approved CEP
>>
>> The bar for merging anything into trunk should remain:
>> 1) 2 +1's from committers
>> 2) Green CI (presently circle or ASF, in the future ideally ASF or an ASF
>> analog env)
>>
>> I don't know if this is a generally held opinion and we just haven't
>> discussed it and switched our general behavior yet, or if this is more
>> controversial, so I won't burden this email with enumerating pros and cons
>> of the two approaches until I get a gauge of the community's temperature.
>>
>> So - what do we think?
>>
>>
>>
>> --
>>
>>
>>
>> *Henrik Ingo*
>>
>> *c*. +358 40 569 7354
>>
>> *w*. *www.datastax.com <http://www.datastax.com>*
>>
>> <https://www.facebook.com/datastax>  <https://twitter.com/datastax>
>> <https://www.linkedin.com/company/datastax/>
>> <https://github.com/datastax/>
>>
>>
>>

-- 
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+
