Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

Berenguer Blasi Thu, 15 Feb 2024 01:27:24 -0800

On the discussion about merging failing tests: I _do_ spend the time checking whether my patch caused them, and I certainly enforce that in the reviews I do. The current failures are a manageable number to check against Butler/Jenkins/Circle/Jira, so I was under the impression everybody else was also doing it.

Thanks for bringing up the CI discussion. I have been advocating internally to cut down CircleCI usage for a year, and I am happy to see the concern is shared. We also run the same dtest at least 4 times: vnodes, no vnodes,... cqlsh a number of times... unit tests the same... We're well beyond the 4.0 release, when I remember seeing failures in junit-compression that were not found in the other variations. That was meaningful in those days. But I can't remember when I last found a failure _specific_ to a particular test flavor: cdc, compression,... I think it would be better ROI to let those (nowadays super-rare) failures be caught by nightly runs.


On 15/2/24 8:53, Jacek Lewandowski wrote:

I fully understand you. Although I have the luxury of using more containers, I simply feel that rerunning the same code with different configurations which do not impact that code is just a waste of resources and money.


- - -- --- ----- -------- -------------
Jacek Lewandowski

Thu, 15 Feb 2024 at 08:41 Štefan Miklošovič <stefan.mikloso...@gmail.com> wrote:


    By the way, I am not sure if it is all completely transparent and
    understood by everybody but let me guide you through a typical
    patch which is meant to be applied from 4.0 to trunk (4 branches)
    to see what it looks like.

    I do not have the luxury of running CircleCI on 100 containers, I
    have just 25. So what takes around 2.5h for 100 containers takes
    around 6-7 for 25. That is a typical java11_pre-commit_tests for
    trunk. Then I have to provide builds for java17_pre-commit_tests
    too, that takes around 3-4 hours because it just tests less, let's
    round it up to 10 hours for trunk.

    Then I need to do this for 5.0 as well, basically double the time
    because as I am writing this the difference is not too big between
    these two branches. So 20 hours.

    Then I need to build 4.1 and 4.0 too, 4.0 is very similar to 4.1
    when it comes to the number of tests, nevertheless, there are
    workflows for Java 8 and Java 11 for each so let's say this takes
    10 hours again. So together I'm 35.

    Scheduling all the builds, triggering them, monitoring their
    progress etc. is work in itself. I am scripting this like crazy to
    not touch the UI in Circle at all: I made custom scripts which
    call the Circle API and trigger the builds from the console to
    speed this up, because as soon as a developer is meant to be
    clicking around all day, needing to track the progress, it gets
    old pretty quickly.
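
For readers who have not scripted CircleCI before, here is a minimal sketch of what triggering builds from the console could look like. It is not the script described in the message above. It assumes the CircleCI API v2 "trigger a new pipeline" endpoint and a personal API token; the project slug, branch names and token placeholder are hypothetical. The requests are only built, not sent.

```python
import json
import urllib.request

CIRCLE_API = "https://circleci.com/api/v2"


def build_trigger_request(project_slug, branch, token, parameters=None):
    """Build (but do not send) a POST request that triggers a pipeline on a branch."""
    body = {"branch": branch}
    if parameters:
        body["parameters"] = parameters
    return urllib.request.Request(
        url=f"{CIRCLE_API}/project/{project_slug}/pipeline",
        data=json.dumps(body).encode("utf-8"),
        headers={"Circle-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical fork slug and branch names, for illustration only.
BRANCHES = ["my-patch-4.0", "my-patch-4.1", "my-patch-5.0", "my-patch-trunk"]
requests_to_send = [
    build_trigger_request("gh/example-user/cassandra", branch, token="<personal-api-token>")
    for branch in BRANCHES
]

for req in requests_to_send:
    # Sending is left out on purpose; urllib.request.urlopen(req) would submit it.
    print(req.get_method(), req.full_url)
```

One request per branch keeps the four-branch patch described above to a single loop; approving and monitoring the resulting workflows would need further calls that are not sketched here.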

    Thank god this is just a patch from 4.0, when it comes to 3.0 and
    3.11 just add more hours to that.

    So all in all, a typical 4.0 - trunk patch is tested for two days
    at least, that's when all is nice and I do not need to rework it
    and rerun it again ... Does this all sound flexible and speedy
    enough for people?

    If we dropped the formal necessity to build various jvms it would
    significantly speed up the development.


    On Thu, Feb 15, 2024 at 8:10 AM Jacek Lewandowski
    <lewandowski.ja...@gmail.com> wrote:

            Excellent point, I have been saying for some time that
            IMHO we can reduce CI, at least pre-commit, to:
            1) build J11
            2) build J17
            3) run tests with build 11 + runtime 11
            4) run tests with build 11 and runtime 17.


        Ekaterina, I was thinking more about:
        1) build J11
        2) build J17
        3) run tests with build J11 + runtime J11
        4) run smoke tests with build J17 and runtime J17

        Again, I don't see value in running build J11 with J17 runtime
        in addition to J11 runtime - just pick one unless we change
        something specific to the JVM

        If we need to decide whether to test the latest or default, I
        think we should pick the latest because this is actually
        Cassandra 5.0 defined as a set of new features that will shine
        on the website.

        Also - we have configurations which test some features but
        they are more like dimensions:
        - commit log compression
        - sstable compression
        - CDC
        - Trie memtables
        - Trie SSTable format
        - Extended deletion time
        ...

        Currently, what we call the default configuration is tested
        with:
        - no compression, no CDC, no extended deletion time
        - *commit log compression + sstable compression*, no cdc, no
        extended deletion time
        - no compression, *CDC enabled*, no extended deletion time
        - no compression, no CDC, *enabled extended deletion time*

        This applies only to unit tests of course
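
To make the "dimensions" framing concrete, the sketch below compares the full cross product of the dimensions listed above with the four variants currently run for unit tests. It is only an illustration: each dimension is modelled as a plain on/off switch, the identifier names are made up for this example, and it assumes the trie-related features are off in the default configuration.

```python
from itertools import product

# Dimensions named in the message. The original list ends with "...", so this
# is not exhaustive, and the identifiers are illustrative, not real option names.
DIMENSIONS = [
    "commitlog_compression",
    "sstable_compression",
    "cdc",
    "trie_memtables",
    "trie_sstable_format",
    "extended_deletion_time",
]


def full_matrix(dimensions):
    """Every on/off combination of the given dimensions."""
    return [
        dict(zip(dimensions, values))
        for values in product([False, True], repeat=len(dimensions))
    ]


def variant(**enabled):
    """A configuration with everything off except the named dimensions."""
    config = {name: False for name in DIMENSIONS}
    config.update(enabled)
    return config


# The four variants described above for the default configuration.
TESTED_WITH_DEFAULT = [
    variant(),
    variant(commitlog_compression=True, sstable_compression=True),
    variant(cdc=True),
    variant(extended_deletion_time=True),
]

all_combinations = full_matrix(DIMENSIONS)
print(len(all_combinations), "possible combinations")
print(len(TESTED_WITH_DEFAULT), "variants currently run for unit tests")
```

Even with only six switches the cross product is far larger than what is run today, which is why picking a few meaningful variants per test package matters more than covering every combination.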

        Then, are we going to test all of those scenarios with the
        "latest" configuration? I'm asking because the latest
        configuration is mostly about tries and UCS and has nothing to
        do with compression or CDC. Then why should the default
        configuration be tested more thoroughly than latest, which
        enables essential Cassandra 5.0 features?

        I propose to significantly reduce that stuff. Let's
        distinguish the packages of tests that need to be run with CDC
        enabled / disabled, with commitlog compression enabled /
        disabled, tests that verify sstable formats (mostly io and
        index I guess), and leave other parameters set as with the
        latest configuration - this is the easiest way I think.

        For dtests we have vnodes/no-vnodes, offheap/onheap, and
        nothing about other stuff. To me running no-vnodes makes no
        sense because no-vnodes is just a special case of vnodes=1. On
        the other hand offheap/onheap buffers could be tested in unit
        tests. In short, I'd run dtests only with the default and
        latest configuration.

        Sorry for being too wordy,


        Thu, 15 Feb 2024 at 07:39 Štefan Miklošovič
        <stefan.mikloso...@gmail.com> wrote:

            Something along what Paulo is proposing makes sense to me.
            To sum it up, knowing what workflows we have now:

            java17_pre-commit_tests
            java11_pre-commit_tests
            java17_separate_tests
            java11_separate_tests

            We would have a couple more, together like:

            java17_pre-commit_tests
            java17_pre-commit_tests-latest-yaml
            java11_pre-commit_tests
            java11_pre-commit_tests-latest-yaml
            java17_separate_tests
            java17_separate_tests-default-yaml
            java11_separate_tests
            java11_separate_tests-latest-yaml

            To go over Paulo's plan, his steps 1-3 for 5.0 would
            result in requiring just one workflow

            java11_pre-commit_tests

            when no configuration is touched and two workflows

            java11_pre-commit_tests
            java11_pre-commit_tests-latest-yaml

            when there is some configuration change.

            Now the term "some configuration change" is quite tricky
            and it is not always easy to evaluate whether both default
            and latest yaml workflows need to be executed. It might
            happen that a change does not touch the configuration at
            all but it is still necessary to verify that it works in
            both scenarios. The -latest.yaml config might be such that
            a change makes sense in isolation for the default config
            but does not work with -latest.yaml. I don't know if this
            is just a theoretical problem or not but my gut feeling is
            that we would be safer if we just required both default
            and latest yaml workflows together.
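
The two policies discussed above can be written down as a small rule. This is a sketch only: the function and its parameters are hypothetical, and only the workflow names come from the lists in this message.

```python
def required_workflows(jdk="java11", config_touched=False, always_both=False):
    """Return the pre-commit workflows a patch would need under each policy.

    config_touched: the patch changes configuration (the "two workflows" case).
    always_both:    the stricter policy of always requiring default and latest yaml.
    """
    workflows = [f"{jdk}_pre-commit_tests"]
    if config_touched or always_both:
        workflows.append(f"{jdk}_pre-commit_tests-latest-yaml")
    return workflows


print(required_workflows())                     # no configuration touched
print(required_workflows(config_touched=True))  # some configuration change
print(required_workflows(always_both=True))     # the "safer" variant
```

The difficulty raised above is exactly that `config_touched` is hard to decide for a real patch, which is the argument for the `always_both` variant.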

            Even if we do, we basically replace "two jvms" builds with
            "two yamls" builds but I consider "two yamls" builds to be
            more valuable in general than "two jvms" builds. It would
            take basically the same amount of time, we would just
            reorient our build matrix from different jvms to
            different yamls.

            For releases we would for sure need to just run it across
            jvms too.

            On Thu, Feb 15, 2024 at 7:05 AM Paulo Motta
            <pa...@apache.org> wrote:

                > Perhaps it is also a good opportunity to distinguish
                subsets of tests which make sense to run with a
                configuration matrix.

                Agree. I think we should define a “standard/golden”
                configuration for each branch and minimally require
                precommit tests for that configuration. Assignees and
                reviewers can determine if additional test variants
                are required based on the patch scope.

                Nightly and prerelease tests can be run to catch any
                issues outside the standard configuration based on the
                supported configuration matrix.

                On Wed, 14 Feb 2024 at 15:32 Jacek Lewandowski
                <lewandowski.ja...@gmail.com> wrote:

                    Wed, 14 Feb 2024 at 17:30 Josh McKenzie
                    <jmcken...@apache.org> wrote:

                        When we have failing tests people do not
                        spend the time to figure out if their logic
                        caused a regression and merge, making things
                        more unstable… so when we merge failing tests
                        that leads to people merging even more
                        failing tests...

                        What's the counter position to this Jacek /
                        Berenguer?


                    For how long are we going to deceive ourselves?
                    Are we shipping those features or not? Perhaps it
                    is also a good opportunity to distinguish subsets
                    of tests which make sense to run with a
                    configuration matrix.

                    If we don't add those tests to the pre-commit
                    pipeline, "people do not spend the time to figure
                    out if their logic caused a regression and merge,
                    making things more unstable…"
                    I think it is much more valuable to test those
                    various configurations rather than test against
                    j11 and j17 separately. I can see really little
                    value in doing that.
