[ https://issues.apache.org/jira/browse/CASSANDRA-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17372929#comment-17372929 ]
Brandon Williams commented on CASSANDRA-16785: ---------------------------------------------- Ping [~edimitrova] > Improvements to CircleCI CI/CD pipeline > --------------------------------------- > > Key: CASSANDRA-16785 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16785 > Project: Cassandra > Issue Type: Improvement > Components: CI > Reporter: Johanna Griffin > Priority: Normal > Attachments: DataStax Config Review & Combined Dynamic Config Setup > Workflows (2021) (2).pdf > > > h1. About > DataStax is the company [associated with Apache’s open-source > Cassandra|https://stackoverflow.com/questions/24564725/apache-cassandra-vs-datastax-cassandra] > project. > > The DataStax team requested a configuration review to do a more modern and > general revamp of their pipeline for the 2021 config prepared with the Setup > Workflows feature (setup). > > GitHub Demo Repo for CircleCI Setup Workflows & Dynamic Configuration scripts > [here|https://github.com/CircleCI-Public/blog-dynamic-config-examples]. > > h1. Current Configuration & Workflow > > The current state of their configuration can be found at the [Cassandra > open-source GitHub > repository|https://github.com/datastax/cassandra/tree/ds-trunk/.circleci]. > > > Because Cassandra works with users contributing to the repo who are not a > part of the organization, they expect these users to be able to run tests on > CircleCI. > > There are three tiers: Medium, Large and X-Large test tiers that users may > qualify to run. Right now, the process to create a valid CircleCI > configuration to run is complex in that it involves copying the right config > to the .circleci/config.yml file based on what you’re attempting to run. > > Now: config.yml.LOWRES is the default config. MIDRES and HIGHRES are custom > configs for those who have access to premium CircleCI resources. > h1. Overview > h2. Goals and Desired Outcome > The team is looking for ways to modernize their workflow. Ideally, there is > one config.yml that will run various workflows based on the parameters > provided and contexts allowed to the team member. Updating the config master > should be done via GitHub and opening issues on the JIRA Board. > > Suggestions > # Currently the process is manual to trigger steps within the build. We'll > investigate the best ways to provide a more automotive experience to cut down > on developer time. > Solution: setup workflows, dynamic configuration/conditional workflows and > [restricted > contexts|https://circleci.com/docs/2.0/contexts/#running-workflows-with-a-restricted-context] > > # Easy User Management: The ability for their team to add users through the > UI instead of needing to submit a support ticket. > Solution: DevOps Customer Engineering worked with the product team in 2020 to > add shared organization management functionality to the UI. On our roadmap > for 2021 is to add users via the UI. See > # Tracking Test Results: Tracking test results across runs to get a > historical look at build times, flaky builds, etc. > h2. Q&A Pause > # Using the LOWRES config as an example which then can be translated to the > other config tiers, jobs with environment variables that could be stored in > [Contexts as environment variables and restricted or unrestricted based on > user > team/tier|https://circleci.com/docs/2.0/contexts/#restrict-a-context-to-a-security-group-or-groups] > or passed as a pipeline parameters: > > * Java 8 JVM Upgrade Distributed Tests (j8_jvm_upgrade_dtests) > * - ANT_HOME: /usr/share/ant > * - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64 > * - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64 > * - LANG: en_US.UTF-8 > * - KEEP_TEST_DIR: true > * - DEFAULT_DIR: /home/cassandra/cassandra-dtest > * - PYTHONIOENCODING: utf-8 > * - PYTHONUNBUFFERED: true > * - CASS_DRIVER_NO_EXTENSIONS: true > * - CASS_DRIVER_NO_CYTHON: true > * - CASSANDRA_SKIP_SYNC: true > * - DTEST_REPO: git://github.com/apache/cassandra-dtest.git > * - DTEST_BRANCH: trunk > * - CCM_MAX_HEAP_SIZE: 1024M > * - CCM_HEAP_NEWSIZE: 256M > * - REPEATED_UTEST_TARGET: testsome > * - REPEATED_UTEST_CLASS: null > * - REPEATED_UTEST_METHODS: null > * - REPEATED_UTEST_COUNT: 100 > * - REPEATED_UTEST_STOP_ON_FAILURE: false > * - REPEATED_DTEST_NAME: null > * - REPEATED_DTEST_VNODES: false > * - REPEATED_DTEST_COUNT: 100 > * - REPEATED_DTEST_STOP_ON_FAILURE: false > * - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64 > * - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64 > * Java 8 cqlsh Distributed Tests python 2 with VNodes > [j8_cqlsh-dtests-py2-with-vnodes] > * - ANT_HOME: /usr/share/ant > * - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64 > * - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64 > * - LANG: en_US.UTF-8 > * - KEEP_TEST_DIR: true > * - DEFAULT_DIR: /home/cassandra/cassandra-dtest > * - PYTHONIOENCODING: utf-8 > * - PYTHONUNBUFFERED: true > * - CASS_DRIVER_NO_EXTENSIONS: true > * - CASS_DRIVER_NO_CYTHON: true > * - CASSANDRA_SKIP_SYNC: true > * - DTEST_REPO: git://github.com/apache/cassandra-dtest.git > * - DTEST_BRANCH: trunk > * - CCM_MAX_HEAP_SIZE: 1024M > * - CCM_HEAP_NEWSIZE: 256M > * - REPEATED_UTEST_TARGET: testsome > * - REPEATED_UTEST_CLASS: null > * - REPEATED_UTEST_METHODS: null > * - REPEATED_UTEST_COUNT: 100 > * - REPEATED_UTEST_STOP_ON_FAILURE: false > * - REPEATED_DTEST_NAME: null > * - REPEATED_DTEST_VNODES: false > * - REPEATED_DTEST_COUNT: 100 > * - REPEATED_DTEST_STOP_ON_FAILURE: false > * - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64 > * - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64 > * Java 11 Unit Tests [j11_unit_tests] > * - ANT_HOME: /usr/share/ant > * - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64 > * - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64 > * - LANG: en_US.UTF-8 > * - KEEP_TEST_DIR: true > * - DEFAULT_DIR: /home/cassandra/cassandra-dtest > * - PYTHONIOENCODING: utf-8 > * - PYTHONUNBUFFERED: true > * - CASS_DRIVER_NO_EXTENSIONS: true > * - CASS_DRIVER_NO_CYTHON: true > * - CASSANDRA_SKIP_SYNC: true > * - DTEST_REPO: git://github.com/apache/cassandra-dtest.git > * - DTEST_BRANCH: trunk > * - CCM_MAX_HEAP_SIZE: 1024M > * - CCM_HEAP_NEWSIZE: 256M > * - REPEATED_UTEST_TARGET: testsome > * - REPEATED_UTEST_CLASS: null > * - REPEATED_UTEST_METHODS: null > * - REPEATED_UTEST_COUNT: 100 > * - REPEATED_UTEST_STOP_ON_FAILURE: false > * - REPEATED_DTEST_NAME: null > * - REPEATED_DTEST_VNODES: false > * - REPEATED_DTEST_COUNT: 100 > * - REPEATED_DTEST_STOP_ON_FAILURE: false > * - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64 > * - JDK_HOME: /usr/lib/jvm/java-11-openjdk-amd64 > * - CASSANDRA_USE_JDK11: true > * Java 8 cqlsh Distributed Tests python 3.8 NO VNodes > [j8_cqlsh-dtests-py38-no-vnodes] > * [etc] > h2. Image, Resource Class, and Executor Choice > A number of executors and resource sizes are defined, including default > options. Passing these along as part of a parameter allows for extra > flexibility. > _We recommend using our API to help get insights. A_ [_blog post > here_|https://circleci.com/blog/announcing-new-insights-endpoints-in-circleci-s-api-v2/] > _gives details on how it can be used, and may allow you some suggestions on > how to focus your efforts._ > _Reference for resources available can be found_ _here._ __ > h2. Parallelism > There are several jobs where parallelism is enabled and split by timings used > in line with our best practice recommendations. Consider increasing > parallelism as needed. > h2. Caching Strategies > > There are no [save_cache or restore_cache > steps|https://github.com/annapamma/spring-petclinic#caching-dependencies] in > your configuration. We normally suggest using a three-layer approach to > restoring cache if your team decides to implement this feature. Here is an > example from our documentation: > - restore_cache: > keys: > - gem-cache-v1-\{{ arch }}-{\{ .Branch }}-{\{ checksum > "Gemfile.lock" }} > - gem-cache-v1-\{{ arch }}-\{{ .Branch }} > - gem-cache-v1 > > This means that it attempts to load the cache in this order: > # A cache that matches the architecture, branch, and specific checksum of > the .lock file. > # A cache that matches the architecture and branch, but not the specific > checksum of the .lock file. > # A cache that simply matches the name defined, being the most generic > option. > > This allows branches to have their own caching options more efficiently and > may improve performance further. In your case, this would require a bit of > reworking. For a simpler but still efficient solution, I would recommend > something like the following: > > - restore_cache: > Keys: > - v3-npm-{{checksum "yarn.lock"}} > - v3-npm > > This allows at least some caching to be present even if the yarn file, as a > general example, has changed. > > _Adding an option to restore cache from a more “generic” source might improve > performance. Otherwise, everything is handled as per our best practices._ > > h2. Secrets Management > One way to limit access to running certain workflows is to create contexts > that are only available to certain teams. See > [https://circleci.com/docs/2.0/contexts/#restricting-a-context] > > If a user does not have access to the context used in the workflow, the > workflow will not run. > h2. Orbs > The Cassandra project lends itself to setting up custom Orbs nicely given the > amount of scripting within the config itself. > > You could aim to have every workflow essentially simply call on an > appropriate Orb, this is a very effective way of making use of them. Right > now your workflows essentially follow the same steps with only a few minor > variations that can be resolved with parameters. > > Suggestions for orb creation: > * Determine which unit tests to run > * Log environment information > * Run unit tests > > Please see if these work for your config: > * [Shallow Clone > Orb|https://circleci.com/developer/orbs/orb/guitarrapc/git-shallow-clone] > consideration if needed > * [Python Orb|https://circleci.com/developer/orbs/orb/circleci/python] and > creating your own orb if needed for configure virtualenv and python > Dependencies > > _A list of publicly accessible orbs can be found_ > [_here_|https://circleci.com/developer/orbs]_. An introduction to the > differences between private and public orbs, as well as instructions on how > to author an orb, can be found_ [_at this > link_|https://circleci.com/docs/2.0/orb-intro/#private-orbs-vs-public-orbs]_._ > h2. Reusable Config Elements > I highly suggest looking into Reusable Config Elements and Orbs based on the > repetitive nature of the Cassandra project config as it stands today. > More on using reusable configuration elements can be [found > here|https://circleci.com/docs/2.0/reusing-config/?section=configuration#authoring-reusable-commands]. > h3. Insights > h4. Internal DCE Dashboard - Alert prototyping > h2. Tracking Test Results via UI > Please add the step[ > store_test_results|https://circleci.com/docs/2.0/language-python/#upload-and-store-test-results] > to your config.yml file > Please reference the Insights homepage on the CircleCI Cloud application once > this is completed: > [https://app.circleci.com/pipelines/github/apache/cassandra] > Useful Links > * Project Optimisations: > [https://circleci.com/docs/2.0/optimizations/#section=projects] > * [Monitor and optimize your CI/CD pipeline with Insights from > CircleCI|https://circleci.com/blog/monitor-and-optimize-your-ci-cd-pipeline-with-insights-from-circleci/] > * [2020 State of DevOps Report: Presented by Puppet and > CircleCI|https://circleci.com/resources/state-of-devops-report-2020/] > h1. Summary > h3. *2021 Dynamic Configuration for Cassandra* > Our recommendation is to move into using the `setup` workflow key: [Dynamic > config|https://circleci.com/blog/building-cicd-pipelines-using-dynamic-config/] > lets you maintain multiple config.yml files in a single code repository, to > selectively identify and execute your primary config.yml files. This feature > offers a wide range of powerful capabilities to easily specify and execute a > variety of dynamic pipeline workloads. Please note: when using dynamic > configuration, a “continue” job from the > [continuation|https://circleci.com/developer/orbs/orb/circleci/continuation] > [orb|https://circleci.com/docs/2.0/orb-intro/] must be called at the end of > the setup workflow. > h3. *Automated Triggering of Builds* > See V2 API > Can trigger build through a [cron > job|https://circleci.com/docs/2.0/workflows/]. > > h3. *Tracking Test Results* > Insights API > DataStax Build Times: 15-20 minutes builds, also one an hour > Goal was moving it from an hour to 15-20 minutes > h2. Resources > Orb: [CircleCI Developer Hub - > circleci/continuation|https://circleci.com/developer/orbs/orb/circleci/continuation] > > Discuss Post: [Dynamic config via setup workflows now available to all > users|https://discuss.circleci.com/t/dynamic-config-via-setup-workflows-now-available-to-all-users/39909] > > Documentation: [Pipeline > Variables|https://circleci.com/docs/2.0/pipeline-variables/#conditional-workflows] > > > h2. Q&A -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org