[jira] [Updated] (CASSANDRA-16785) Improvements to CircleCI CI/CD pipeline

Johanna Griffin (Jira) Thu, 01 Jul 2021 09:54:05 -0700


     [ 
https://issues.apache.org/jira/browse/CASSANDRA-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Johanna Griffin updated CASSANDRA-16785:
----------------------------------------
    Description: 
h1. About

DataStax is the company [associated with Apache’s open-source 
Cassandra|https://stackoverflow.com/questions/24564725/apache-cassandra-vs-datastax-cassandra]
 project. 

 

The DataStax team requested a configuration review to do a more modern and 
general revamp of their pipeline for the 2021 config prepared with the Setup 
Workflows feature (setup).

 

GitHub Demo Repo for CircleCI Setup Workflows & Dynamic Configuration scripts 
[here|https://github.com/CircleCI-Public/blog-dynamic-config-examples].

 
h1. Current Configuration & Workflow

 

The current state of their configuration can be found at the [Cassandra 
open-source GitHub 
repository|https://github.com/datastax/cassandra/tree/ds-trunk/.circleci].

 

 

Because Cassandra works with users contributing to the repo who are not a part 
of the organization, they expect these users to be able to run tests on 
CircleCI. 

 

There are three tiers: Medium, Large and X-Large test tiers that users may 
qualify to run. Right now, the process to create a valid CircleCI configuration 
to run is complex in that it involves copying the right config to the 
.circleci/config.yml file based on what you’re attempting to run. 

 

Now: config.yml.LOWRES is the default config. MIDRES and HIGHRES are custom 
configs for those who have access to premium CircleCI resources.
h1. Overview
h2. Goals and Desired Outcome

The team is looking for ways to modernize their workflow. Ideally, there is one 
config.yml that will run various workflows based on the parameters provided and 
contexts allowed to the team member. Updating the config master should be done 
via GitHub and opening issues on the JIRA Board. 

 

Suggestions
 # Currently the process is manual to trigger steps within the build. We'll 
investigate the best ways to provide a more automotive experience to cut down 
on developer time.

Solution: setup workflows, dynamic configuration/conditional workflows and 
[restricted 
contexts|https://circleci.com/docs/2.0/contexts/#running-workflows-with-a-restricted-context]
  
 # Easy User Management: The ability for their team to add users through the UI 
instead of needing to submit a support ticket. 

Solution: DevOps Customer Engineering worked with the product team in 2020 to 
add shared organization management functionality to the UI. On our roadmap for 
2021 is to add users via the UI. See 
 # Tracking Test Results: Tracking test results across runs to get a historical 
look at build times, flaky builds, etc.

h2. Q&A Pause
 # Using the LOWRES config as an example which then can be translated to the 
other config tiers, jobs with environment variables that could be stored in 
[Contexts as environment variables and restricted or unrestricted based on user 
team/tier|https://circleci.com/docs/2.0/contexts/#restrict-a-context-to-a-security-group-or-groups]
 or passed as a pipeline parameters:

 
 * Java 8 JVM Upgrade Distributed Tests (j8_jvm_upgrade_dtests)
 *     - ANT_HOME: /usr/share/ant
 *     - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - LANG: en_US.UTF-8
 *     - KEEP_TEST_DIR: true
 *     - DEFAULT_DIR: /home/cassandra/cassandra-dtest
 *     - PYTHONIOENCODING: utf-8
 *     - PYTHONUNBUFFERED: true
 *     - CASS_DRIVER_NO_EXTENSIONS: true
 *     - CASS_DRIVER_NO_CYTHON: true
 *     - CASSANDRA_SKIP_SYNC: true
 *     - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
 *     - DTEST_BRANCH: trunk
 *     - CCM_MAX_HEAP_SIZE: 1024M
 *     - CCM_HEAP_NEWSIZE: 256M
 *     - REPEATED_UTEST_TARGET: testsome
 *     - REPEATED_UTEST_CLASS: null
 *     - REPEATED_UTEST_METHODS: null
 *     - REPEATED_UTEST_COUNT: 100
 *     - REPEATED_UTEST_STOP_ON_FAILURE: false
 *     - REPEATED_DTEST_NAME: null
 *     - REPEATED_DTEST_VNODES: false
 *     - REPEATED_DTEST_COUNT: 100
 *     - REPEATED_DTEST_STOP_ON_FAILURE: false
 *     - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64

 * Java 8 cqlsh Distributed Tests python 2 with VNodes 
[j8_cqlsh-dtests-py2-with-vnodes]
 *     - ANT_HOME: /usr/share/ant
 *     - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - LANG: en_US.UTF-8
 *     - KEEP_TEST_DIR: true
 *     - DEFAULT_DIR: /home/cassandra/cassandra-dtest
 *     - PYTHONIOENCODING: utf-8
 *     - PYTHONUNBUFFERED: true
 *     - CASS_DRIVER_NO_EXTENSIONS: true
 *     - CASS_DRIVER_NO_CYTHON: true
 *     - CASSANDRA_SKIP_SYNC: true
 *     - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
 *     - DTEST_BRANCH: trunk
 *     - CCM_MAX_HEAP_SIZE: 1024M
 *     - CCM_HEAP_NEWSIZE: 256M
 *     - REPEATED_UTEST_TARGET: testsome
 *     - REPEATED_UTEST_CLASS: null
 *     - REPEATED_UTEST_METHODS: null
 *     - REPEATED_UTEST_COUNT: 100
 *     - REPEATED_UTEST_STOP_ON_FAILURE: false
 *     - REPEATED_DTEST_NAME: null
 *     - REPEATED_DTEST_VNODES: false
 *     - REPEATED_DTEST_COUNT: 100
 *     - REPEATED_DTEST_STOP_ON_FAILURE: false
 *     - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64

 * Java 11 Unit Tests [j11_unit_tests]
 *     - ANT_HOME: /usr/share/ant
 *     - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - LANG: en_US.UTF-8
 *     - KEEP_TEST_DIR: true
 *     - DEFAULT_DIR: /home/cassandra/cassandra-dtest
 *     - PYTHONIOENCODING: utf-8
 *     - PYTHONUNBUFFERED: true
 *     - CASS_DRIVER_NO_EXTENSIONS: true
 *     - CASS_DRIVER_NO_CYTHON: true
 *     - CASSANDRA_SKIP_SYNC: true
 *     - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
 *     - DTEST_BRANCH: trunk
 *     - CCM_MAX_HEAP_SIZE: 1024M
 *     - CCM_HEAP_NEWSIZE: 256M
 *     - REPEATED_UTEST_TARGET: testsome
 *     - REPEATED_UTEST_CLASS: null
 *     - REPEATED_UTEST_METHODS: null
 *     - REPEATED_UTEST_COUNT: 100
 *     - REPEATED_UTEST_STOP_ON_FAILURE: false
 *     - REPEATED_DTEST_NAME: null
 *     - REPEATED_DTEST_VNODES: false
 *     - REPEATED_DTEST_COUNT: 100
 *     - REPEATED_DTEST_STOP_ON_FAILURE: false
 *     - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - JDK_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - CASSANDRA_USE_JDK11: true

 * Java 8 cqlsh Distributed Tests python 3.8 NO VNodes 
[j8_cqlsh-dtests-py38-no-vnodes]
 * [etc]

h2. Image, Resource Class, and Executor Choice

A number of executors and resource sizes are defined, including default 
options. Passing these along as part of a parameter allows for extra 
flexibility.

_We recommend using our API to help get insights. A_ [_blog post 
here_|https://circleci.com/blog/announcing-new-insights-endpoints-in-circleci-s-api-v2/]
 _gives details on how it can be used, and may allow you some suggestions on 
how to focus your efforts._

_Reference for resources available can be found_ _here._ __ 
h2. Parallelism

There are several jobs where parallelism is enabled and split by timings used 
in line with our best practice recommendations. Consider increasing parallelism 
as needed.
h2. Caching Strategies

 

There are no [save_cache or restore_cache 
steps|https://github.com/annapamma/spring-petclinic#caching-dependencies] in 
your configuration. We normally suggest using a three-layer approach to 
restoring cache if your team decides to implement this feature. Here is an 
example from our documentation:
 - restore_cache:

          keys:

            - gem-cache-v1-\{{ arch }}-{\{ .Branch }}-{\{ checksum 
"Gemfile.lock" }}

            - gem-cache-v1-\{{ arch }}-\{{ .Branch }}

            - gem-cache-v1

 

This means that it attempts to load the cache in this order:
 # A cache that matches the architecture, branch, and specific checksum of the 
.lock file.
 # A cache that matches the architecture and branch, but not the specific 
checksum of the .lock file.
 # A cache that simply matches the name defined, being the most generic option.

 

This allows branches to have their own caching options more efficiently and may 
improve performance further. In your case, this would require a bit of 
reworking. For a simpler but still efficient solution, I would recommend 
something like the following:

 

     - restore_cache:

          Keys:

 - v3-npm-{{checksum "yarn.lock"}}

            - v3-npm

 

This allows at least some caching to be present even if the yarn file, as a 
general example, has changed.

 

_Adding an option to restore cache from a more “generic” source might improve 
performance. Otherwise, everything is handled as per our best practices._

 
h2. Secrets Management

One way to limit access to running certain workflows is to create contexts that 
are only available to certain teams. See 
[https://circleci.com/docs/2.0/contexts/#restricting-a-context]

 

If a user does not have access to the context used in the workflow, the 
workflow will not run.
h2. Orbs

The Cassandra project lends itself to setting up custom Orbs nicely given the 
amount of scripting within the config itself. 

 

You could aim to have every workflow essentially simply call on an appropriate 
Orb, this is a very effective way of making use of them. Right now your 
workflows essentially follow the same steps with only a few minor variations 
that can be resolved with parameters.

 

Suggestions for orb creation:
 * Determine which unit tests to run
 * Log environment information
 * Run unit tests

 

Please see if these work for your config:
 * [Shallow Clone 
Orb|https://circleci.com/developer/orbs/orb/guitarrapc/git-shallow-clone] 
consideration if needed
 * [Python Orb|https://circleci.com/developer/orbs/orb/circleci/python] and 
creating your own orb if needed for configure virtualenv and python Dependencies

 

_A list of publicly accessible orbs can be found_ 
[_here_|https://circleci.com/developer/orbs]_. An introduction to the 
differences between private and public orbs, as well as instructions on how to 
author an orb, can be found_ [_at this 
link_|https://circleci.com/docs/2.0/orb-intro/#private-orbs-vs-public-orbs]_._
h2. Reusable Config Elements

I highly suggest looking into Reusable Config Elements and Orbs based on the 
repetitive nature of the Cassandra project config as it stands today.

More on using reusable configuration elements can be [found 
here|https://circleci.com/docs/2.0/reusing-config/?section=configuration#authoring-reusable-commands].
h3. Insights 
h4. Internal DCE Dashboard - Alert prototyping
h2. Tracking Test Results via UI 

Please add the step[ 
store_test_results|https://circleci.com/docs/2.0/language-python/#upload-and-store-test-results]
 to your config.yml file

Please reference the Insights homepage on the CircleCI Cloud application once 
this is completed:

[https://app.circleci.com/pipelines/github/apache/cassandra] 

Useful Links
 * Project Optimisations: 
[https://circleci.com/docs/2.0/optimizations/#section=projects] 
 * [Monitor and optimize your CI/CD pipeline with Insights from 
CircleCI|https://circleci.com/blog/monitor-and-optimize-your-ci-cd-pipeline-with-insights-from-circleci/]
 * [2020 State of DevOps Report: Presented by Puppet and 
CircleCI|https://circleci.com/resources/state-of-devops-report-2020/]

h1. Summary
h3. *2021 Dynamic Configuration for Cassandra*

Our recommendation is to move into using the `setup` workflow key: [Dynamic 
config|https://circleci.com/blog/building-cicd-pipelines-using-dynamic-config/] 
lets you maintain multiple config.yml files in a single code repository, to 
selectively identify and execute your primary config.yml files. This feature 
offers a wide range of powerful capabilities to easily specify and execute a 
variety of dynamic pipeline workloads. Please note: when using dynamic 
configuration, a “continue” job from the 
[continuation|https://circleci.com/developer/orbs/orb/circleci/continuation] 
[orb|https://circleci.com/docs/2.0/orb-intro/] must be called at the end of the 
setup workflow.
h3. *Automated Triggering of Builds*

See V2 API

Can trigger build through a [cron job|https://circleci.com/docs/2.0/workflows/].

 
h3. *Tracking Test Results*

Insights API

DataStax Build Times: 15-20 minutes builds, also one an hour 

Goal was moving it from an hour to 15-20 minutes
h2. Resources 

Orb: [CircleCI Developer Hub - 
circleci/continuation|https://circleci.com/developer/orbs/orb/circleci/continuation]
 

Discuss Post: [Dynamic config via setup workflows now available to all 
users|https://discuss.circleci.com/t/dynamic-config-via-setup-workflows-now-available-to-all-users/39909]
 

Documentation: [Pipeline 
Variables|https://circleci.com/docs/2.0/pipeline-variables/#conditional-workflows]
 

 
h2. Q&A

  was:
h1. About

DataStax is the company [associated with Apache’s open-source 
Cassandra|https://stackoverflow.com/questions/24564725/apache-cassandra-vs-datastax-cassandra]
 project. 

 

The DataStax team requested a configuration review to do a more modern and 
general revamp of their pipeline for the 2021 config prepared with the Setup 
Workflows feature (setup).

 

GitHub Demo Repo for CircleCI Setup Workflows & Dynamic Configuration scripts 
[here|https://github.com/CircleCI-Public/blog-dynamic-config-examples].

 
h1. Current Configuration & Workflow

 

The current state of their configuration can be found at the [Cassandra 
open-source GitHub 
repository|https://github.com/datastax/cassandra/tree/ds-trunk/.circleci].

 

 

Because Cassandra works with users contributing to the repo who are not a part 
of the organization, they expect these users to be able to run tests on 
CircleCI. 

 

There are three tiers: Medium, Large and X-Large test tiers that users may 
qualify to run. Right now, the process to create a valid CircleCI configuration 
to run is complex in that it involves copying the right config to the 
.circleci/config.yml file based on what you’re attempting to run. 

 

Now: config.yml.LOWRES is the default config. MIDRES and HIGHRES are custom 
configs for those who have access to premium CircleCI resources.













h1. Overview
h2. Goals and Desired Outcome

The team is looking for ways to modernize their workflow. Ideally, there is one 
config.yml that will run various workflows based on the parameters provided and 
contexts allowed to the team member. Updating the config master should be done 
via GitHub and opening issues on the JIRA Board. 

 

Suggestions
 #  Currently the process is manual to trigger steps within the build. We'll 
investigate the best ways to provide a more automotive experience to cut down 
on developer time.

Solution: setup workflows, dynamic configuration/conditional workflows and 
[restricted 
contexts|https://circleci.com/docs/2.0/contexts/#running-workflows-with-a-restricted-context]
  
 #  Easy User Management: The ability for their team to add users through the 
UI instead of needing to submit a support ticket. 

Solution: DevOps Customer Engineering worked with the product team in 2020 to 
add shared organization management functionality to the UI. On our roadmap for 
2021 is to add users via the UI. See 
 #  Tracking Test Results: Tracking test results across runs to get a 
historical look at build times, flaky builds, etc.








h2. Q&A Pause














 # Using the LOWRES config as an example which then can be translated to the 
other config tiers, jobs with environment variables that could be stored in 
[Contexts as environment variables and restricted or unrestricted based on user 
team/tier|https://circleci.com/docs/2.0/contexts/#restrict-a-context-to-a-security-group-or-groups]
 or passed as a pipeline parameters:

 
 * Java 8 JVM Upgrade Distributed Tests (j8_jvm_upgrade_dtests)
 *     - ANT_HOME: /usr/share/ant
 *     - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - LANG: en_US.UTF-8
 *     - KEEP_TEST_DIR: true
 *     - DEFAULT_DIR: /home/cassandra/cassandra-dtest
 *     - PYTHONIOENCODING: utf-8
 *     - PYTHONUNBUFFERED: true
 *     - CASS_DRIVER_NO_EXTENSIONS: true
 *     - CASS_DRIVER_NO_CYTHON: true
 *     - CASSANDRA_SKIP_SYNC: true
 *     - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
 *     - DTEST_BRANCH: trunk
 *     - CCM_MAX_HEAP_SIZE: 1024M
 *     - CCM_HEAP_NEWSIZE: 256M
 *     - REPEATED_UTEST_TARGET: testsome
 *     - REPEATED_UTEST_CLASS: null
 *     - REPEATED_UTEST_METHODS: null
 *     - REPEATED_UTEST_COUNT: 100
 *     - REPEATED_UTEST_STOP_ON_FAILURE: false
 *     - REPEATED_DTEST_NAME: null
 *     - REPEATED_DTEST_VNODES: false
 *     - REPEATED_DTEST_COUNT: 100
 *     - REPEATED_DTEST_STOP_ON_FAILURE: false
 *     - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64


 * Java 8 cqlsh Distributed Tests python 2 with VNodes 
[j8_cqlsh-dtests-py2-with-vnodes]
 *     - ANT_HOME: /usr/share/ant
 *     - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - LANG: en_US.UTF-8
 *     - KEEP_TEST_DIR: true
 *     - DEFAULT_DIR: /home/cassandra/cassandra-dtest
 *     - PYTHONIOENCODING: utf-8
 *     - PYTHONUNBUFFERED: true
 *     - CASS_DRIVER_NO_EXTENSIONS: true
 *     - CASS_DRIVER_NO_CYTHON: true
 *     - CASSANDRA_SKIP_SYNC: true
 *     - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
 *     - DTEST_BRANCH: trunk
 *     - CCM_MAX_HEAP_SIZE: 1024M
 *     - CCM_HEAP_NEWSIZE: 256M
 *     - REPEATED_UTEST_TARGET: testsome
 *     - REPEATED_UTEST_CLASS: null
 *     - REPEATED_UTEST_METHODS: null
 *     - REPEATED_UTEST_COUNT: 100
 *     - REPEATED_UTEST_STOP_ON_FAILURE: false
 *     - REPEATED_DTEST_NAME: null
 *     - REPEATED_DTEST_VNODES: false
 *     - REPEATED_DTEST_COUNT: 100
 *     - REPEATED_DTEST_STOP_ON_FAILURE: false
 *     - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64


 * Java 11 Unit Tests [j11_unit_tests]
 *     - ANT_HOME: /usr/share/ant
 *     - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 *     - LANG: en_US.UTF-8
 *     - KEEP_TEST_DIR: true
 *     - DEFAULT_DIR: /home/cassandra/cassandra-dtest
 *     - PYTHONIOENCODING: utf-8
 *     - PYTHONUNBUFFERED: true
 *     - CASS_DRIVER_NO_EXTENSIONS: true
 *     - CASS_DRIVER_NO_CYTHON: true
 *     - CASSANDRA_SKIP_SYNC: true
 *     - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
 *     - DTEST_BRANCH: trunk
 *     - CCM_MAX_HEAP_SIZE: 1024M
 *     - CCM_HEAP_NEWSIZE: 256M
 *     - REPEATED_UTEST_TARGET: testsome
 *     - REPEATED_UTEST_CLASS: null
 *     - REPEATED_UTEST_METHODS: null
 *     - REPEATED_UTEST_COUNT: 100
 *     - REPEATED_UTEST_STOP_ON_FAILURE: false
 *     - REPEATED_DTEST_NAME: null
 *     - REPEATED_DTEST_VNODES: false
 *     - REPEATED_DTEST_COUNT: 100
 *     - REPEATED_DTEST_STOP_ON_FAILURE: false
 *     - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - JDK_HOME: /usr/lib/jvm/java-11-openjdk-amd64
 *     - CASSANDRA_USE_JDK11: true


 * Java 8 cqlsh Distributed Tests python 3.8 NO VNodes 
[j8_cqlsh-dtests-py38-no-vnodes]
 * [etc]




h2. Image, Resource Class, and Executor Choice

A number of executors and resource sizes are defined, including default 
options. Passing these along as part of a parameter allows for extra 
flexibility.

_We recommend using our API to help get insights. A_ [_blog post 
here_|https://circleci.com/blog/announcing-new-insights-endpoints-in-circleci-s-api-v2/]
 _gives details on how it can be used, and may allow you some suggestions on 
how to focus your efforts._

_Reference for resources available can be found_ _here._ __ 
h2. Parallelism

There are several jobs where parallelism is enabled and split by timings used 
in line with our best practice recommendations. Consider increasing parallelism 
as needed.
h2. Caching Strategies

 

There are no [save_cache or restore_cache 
steps|https://github.com/annapamma/spring-petclinic#caching-dependencies] in 
your configuration. We normally suggest using a three-layer approach to 
restoring cache if your team decides to implement this feature. Here is an 
example from our documentation:





- restore_cache:

          keys:

            - gem-cache-v1-\{{ arch }}-\{{ .Branch }}-\{{ checksum 
"Gemfile.lock" }}

            - gem-cache-v1-\{{ arch }}-\{{ .Branch }}

            - gem-cache-v1

 

This means that it attempts to load the cache in this order:
 # A cache that matches the architecture, branch, and specific checksum of the 
.lock file.
 # A cache that matches the architecture and branch, but not the specific 
checksum of the .lock file.
 # A cache that simply matches the name defined, being the most generic option.

 

This allows branches to have their own caching options more efficiently and may 
improve performance further. In your case, this would require a bit of 
reworking. For a simpler but still efficient solution, I would recommend 
something like the following:

 

     - restore_cache:

          Keys:

 - v3-npm-\{{checksum "yarn.lock"}}

            - v3-npm

 

This allows at least some caching to be present even if the yarn file, as a 
general example, has changed.

 

_Adding an option to restore cache from a more “generic” source might improve 
performance. Otherwise, everything is handled as per our best practices._

 
h2. Secrets Management

One way to limit access to running certain workflows is to create contexts that 
are only available to certain teams. See 
https://circleci.com/docs/2.0/contexts/#restricting-a-context

 

If a user does not have access to the context used in the workflow, the 
workflow will not run.





































h2. Orbs

The Cassandra project lends itself to setting up custom Orbs nicely given the 
amount of scripting within the config itself. 

 

You could aim to have every workflow essentially simply call on an appropriate 
Orb, this is a very effective way of making use of them. Right now your 
workflows essentially follow the same steps with only a few minor variations 
that can be resolved with parameters.

 

Suggestions for orb creation:
 * Determine which unit tests to run
 * Log environment information
 * Run unit tests

 

Please see if these work for your config:
 * [Shallow Clone 
Orb|https://circleci.com/developer/orbs/orb/guitarrapc/git-shallow-clone] 
consideration if needed
 * [Python Orb|https://circleci.com/developer/orbs/orb/circleci/python] and 
creating your own orb if needed for configure virtualenv and python Dependencies

 

_A list of publicly accessible orbs can be found_ 
[_here_|https://circleci.com/developer/orbs]_. An introduction to the 
differences between private and public orbs, as well as instructions on how to 
author an orb, can be found_ [_at this 
link_|https://circleci.com/docs/2.0/orb-intro/#private-orbs-vs-public-orbs]_._
h2. Reusable Config Elements

I highly suggest looking into Reusable Config Elements and Orbs based on the 
repetitive nature of the Cassandra project config as it stands today.

More on using reusable configuration elements can be [found 
here|https://circleci.com/docs/2.0/reusing-config/?section=configuration#authoring-reusable-commands].




h3. Insights 
h4. Internal DCE Dashboard - Alert prototyping







h2. Tracking Test Results via UI 


Please add the step[ 
store_test_results|https://circleci.com/docs/2.0/language-python/#upload-and-store-test-results]
 to your config.yml file


Please reference the Insights homepage on the CircleCI Cloud application once 
this is completed:

https://app.circleci.com/insights/github/datastax





















Useful Links
 * Project Optimisations: 
[https://circleci.com/docs/2.0/optimizations/#section=projects] 
 * [Monitor and optimize your CI/CD pipeline with Insights from 
CircleCI|https://circleci.com/blog/monitor-and-optimize-your-ci-cd-pipeline-with-insights-from-circleci/]
 * [2020 State of DevOps Report: Presented by Puppet and 
CircleCI|https://circleci.com/resources/state-of-devops-report-2020/]











h1. Summary
h3. *2021 Dynamic Configuration for Cassandra*

Our recommendation is to move into using the `setup` workflow key: [Dynamic 
config|https://circleci.com/blog/building-cicd-pipelines-using-dynamic-config/] 
lets you maintain multiple config.yml files in a single code repository, to 
selectively identify and execute your primary config.yml files. This feature 
offers a wide range of powerful capabilities to easily specify and execute a 
variety of dynamic pipeline workloads. Please note: when using dynamic 
configuration, a “continue” job from the 
[continuation|https://circleci.com/developer/orbs/orb/circleci/continuation] 
[orb|https://circleci.com/docs/2.0/orb-intro/] must be called at the end of the 
setup workflow.
h3. *Automated Triggering of Builds*

See V2 API

Can trigger build through a [cron job|https://circleci.com/docs/2.0/workflows/].

 
h3. *Tracking Test Results*

Insights API

DataStax Build Times: 15-20 minutes builds, also one an hour 

Goal was moving it from an hour to 15-20 minutes
h2. Resources 

Orb: [CircleCI Developer Hub - 
circleci/continuation|https://circleci.com/developer/orbs/orb/circleci/continuation]
 

Discuss Post: [Dynamic config via setup workflows now available to all 
users|https://discuss.circleci.com/t/dynamic-config-via-setup-workflows-now-available-to-all-users/39909]
 

Documentation: [Pipeline 
Variables|https://circleci.com/docs/2.0/pipeline-variables/#conditional-workflows]
 

 
h2. Q&A


> Improvements to CircleCI CI/CD pipeline
> ---------------------------------------
>
>                 Key: CASSANDRA-16785
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16785
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CI
>            Reporter: Johanna Griffin
>            Priority: Normal
>         Attachments: DataStax Config Review & Combined Dynamic Config Setup 
> Workflows (2021) (2).pdf
>
>
> h1. About
> DataStax is the company [associated with Apache’s open-source 
> Cassandra|https://stackoverflow.com/questions/24564725/apache-cassandra-vs-datastax-cassandra]
>  project. 
>  
> The DataStax team requested a configuration review to do a more modern and 
> general revamp of their pipeline for the 2021 config prepared with the Setup 
> Workflows feature (setup).
>  
> GitHub Demo Repo for CircleCI Setup Workflows & Dynamic Configuration scripts 
> [here|https://github.com/CircleCI-Public/blog-dynamic-config-examples].
>  
> h1. Current Configuration & Workflow
>  
> The current state of their configuration can be found at the [Cassandra 
> open-source GitHub 
> repository|https://github.com/datastax/cassandra/tree/ds-trunk/.circleci].
>  
>  
> Because Cassandra works with users contributing to the repo who are not a 
> part of the organization, they expect these users to be able to run tests on 
> CircleCI. 
>  
> There are three tiers: Medium, Large and X-Large test tiers that users may 
> qualify to run. Right now, the process to create a valid CircleCI 
> configuration to run is complex in that it involves copying the right config 
> to the .circleci/config.yml file based on what you’re attempting to run. 
>  
> Now: config.yml.LOWRES is the default config. MIDRES and HIGHRES are custom 
> configs for those who have access to premium CircleCI resources.
> h1. Overview
> h2. Goals and Desired Outcome
> The team is looking for ways to modernize their workflow. Ideally, there is 
> one config.yml that will run various workflows based on the parameters 
> provided and contexts allowed to the team member. Updating the config master 
> should be done via GitHub and opening issues on the JIRA Board. 
>  
> Suggestions
>  # Currently the process is manual to trigger steps within the build. We'll 
> investigate the best ways to provide a more automotive experience to cut down 
> on developer time.
> Solution: setup workflows, dynamic configuration/conditional workflows and 
> [restricted 
> contexts|https://circleci.com/docs/2.0/contexts/#running-workflows-with-a-restricted-context]
>   
>  # Easy User Management: The ability for their team to add users through the 
> UI instead of needing to submit a support ticket. 
> Solution: DevOps Customer Engineering worked with the product team in 2020 to 
> add shared organization management functionality to the UI. On our roadmap 
> for 2021 is to add users via the UI. See 
>  # Tracking Test Results: Tracking test results across runs to get a 
> historical look at build times, flaky builds, etc.
> h2. Q&A Pause
>  # Using the LOWRES config as an example which then can be translated to the 
> other config tiers, jobs with environment variables that could be stored in 
> [Contexts as environment variables and restricted or unrestricted based on 
> user 
> team/tier|https://circleci.com/docs/2.0/contexts/#restrict-a-context-to-a-security-group-or-groups]
>  or passed as a pipeline parameters:
>  
>  * Java 8 JVM Upgrade Distributed Tests (j8_jvm_upgrade_dtests)
>  *     - ANT_HOME: /usr/share/ant
>  *     - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
>  *     - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
>  *     - LANG: en_US.UTF-8
>  *     - KEEP_TEST_DIR: true
>  *     - DEFAULT_DIR: /home/cassandra/cassandra-dtest
>  *     - PYTHONIOENCODING: utf-8
>  *     - PYTHONUNBUFFERED: true
>  *     - CASS_DRIVER_NO_EXTENSIONS: true
>  *     - CASS_DRIVER_NO_CYTHON: true
>  *     - CASSANDRA_SKIP_SYNC: true
>  *     - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
>  *     - DTEST_BRANCH: trunk
>  *     - CCM_MAX_HEAP_SIZE: 1024M
>  *     - CCM_HEAP_NEWSIZE: 256M
>  *     - REPEATED_UTEST_TARGET: testsome
>  *     - REPEATED_UTEST_CLASS: null
>  *     - REPEATED_UTEST_METHODS: null
>  *     - REPEATED_UTEST_COUNT: 100
>  *     - REPEATED_UTEST_STOP_ON_FAILURE: false
>  *     - REPEATED_DTEST_NAME: null
>  *     - REPEATED_DTEST_VNODES: false
>  *     - REPEATED_DTEST_COUNT: 100
>  *     - REPEATED_DTEST_STOP_ON_FAILURE: false
>  *     - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
>  *     - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64
>  * Java 8 cqlsh Distributed Tests python 2 with VNodes 
> [j8_cqlsh-dtests-py2-with-vnodes]
>  *     - ANT_HOME: /usr/share/ant
>  *     - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
>  *     - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
>  *     - LANG: en_US.UTF-8
>  *     - KEEP_TEST_DIR: true
>  *     - DEFAULT_DIR: /home/cassandra/cassandra-dtest
>  *     - PYTHONIOENCODING: utf-8
>  *     - PYTHONUNBUFFERED: true
>  *     - CASS_DRIVER_NO_EXTENSIONS: true
>  *     - CASS_DRIVER_NO_CYTHON: true
>  *     - CASSANDRA_SKIP_SYNC: true
>  *     - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
>  *     - DTEST_BRANCH: trunk
>  *     - CCM_MAX_HEAP_SIZE: 1024M
>  *     - CCM_HEAP_NEWSIZE: 256M
>  *     - REPEATED_UTEST_TARGET: testsome
>  *     - REPEATED_UTEST_CLASS: null
>  *     - REPEATED_UTEST_METHODS: null
>  *     - REPEATED_UTEST_COUNT: 100
>  *     - REPEATED_UTEST_STOP_ON_FAILURE: false
>  *     - REPEATED_DTEST_NAME: null
>  *     - REPEATED_DTEST_VNODES: false
>  *     - REPEATED_DTEST_COUNT: 100
>  *     - REPEATED_DTEST_STOP_ON_FAILURE: false
>  *     - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
>  *     - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64
>  * Java 11 Unit Tests [j11_unit_tests]
>  *     - ANT_HOME: /usr/share/ant
>  *     - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
>  *     - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
>  *     - LANG: en_US.UTF-8
>  *     - KEEP_TEST_DIR: true
>  *     - DEFAULT_DIR: /home/cassandra/cassandra-dtest
>  *     - PYTHONIOENCODING: utf-8
>  *     - PYTHONUNBUFFERED: true
>  *     - CASS_DRIVER_NO_EXTENSIONS: true
>  *     - CASS_DRIVER_NO_CYTHON: true
>  *     - CASSANDRA_SKIP_SYNC: true
>  *     - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
>  *     - DTEST_BRANCH: trunk
>  *     - CCM_MAX_HEAP_SIZE: 1024M
>  *     - CCM_HEAP_NEWSIZE: 256M
>  *     - REPEATED_UTEST_TARGET: testsome
>  *     - REPEATED_UTEST_CLASS: null
>  *     - REPEATED_UTEST_METHODS: null
>  *     - REPEATED_UTEST_COUNT: 100
>  *     - REPEATED_UTEST_STOP_ON_FAILURE: false
>  *     - REPEATED_DTEST_NAME: null
>  *     - REPEATED_DTEST_VNODES: false
>  *     - REPEATED_DTEST_COUNT: 100
>  *     - REPEATED_DTEST_STOP_ON_FAILURE: false
>  *     - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64
>  *     - JDK_HOME: /usr/lib/jvm/java-11-openjdk-amd64
>  *     - CASSANDRA_USE_JDK11: true
>  * Java 8 cqlsh Distributed Tests python 3.8 NO VNodes 
> [j8_cqlsh-dtests-py38-no-vnodes]
>  * [etc]
> h2. Image, Resource Class, and Executor Choice
> A number of executors and resource sizes are defined, including default 
> options. Passing these along as part of a parameter allows for extra 
> flexibility.
> _We recommend using our API to help get insights. A_ [_blog post 
> here_|https://circleci.com/blog/announcing-new-insights-endpoints-in-circleci-s-api-v2/]
>  _gives details on how it can be used, and may allow you some suggestions on 
> how to focus your efforts._
> _Reference for resources available can be found_ _here._ __ 
> h2. Parallelism
> There are several jobs where parallelism is enabled and split by timings used 
> in line with our best practice recommendations. Consider increasing 
> parallelism as needed.
> h2. Caching Strategies
>  
> There are no [save_cache or restore_cache 
> steps|https://github.com/annapamma/spring-petclinic#caching-dependencies] in 
> your configuration. We normally suggest using a three-layer approach to 
> restoring cache if your team decides to implement this feature. Here is an 
> example from our documentation:
>  - restore_cache:
>           keys:
>             - gem-cache-v1-\{{ arch }}-{\{ .Branch }}-{\{ checksum 
> "Gemfile.lock" }}
>             - gem-cache-v1-\{{ arch }}-\{{ .Branch }}
>             - gem-cache-v1
>  
> This means that it attempts to load the cache in this order:
>  # A cache that matches the architecture, branch, and specific checksum of 
> the .lock file.
>  # A cache that matches the architecture and branch, but not the specific 
> checksum of the .lock file.
>  # A cache that simply matches the name defined, being the most generic 
> option.
>  
> This allows branches to have their own caching options more efficiently and 
> may improve performance further. In your case, this would require a bit of 
> reworking. For a simpler but still efficient solution, I would recommend 
> something like the following:
>  
>      - restore_cache:
>           Keys:
>  - v3-npm-{{checksum "yarn.lock"}}
>             - v3-npm
>  
> This allows at least some caching to be present even if the yarn file, as a 
> general example, has changed.
>  
> _Adding an option to restore cache from a more “generic” source might improve 
> performance. Otherwise, everything is handled as per our best practices._
>  
> h2. Secrets Management
> One way to limit access to running certain workflows is to create contexts 
> that are only available to certain teams. See 
> [https://circleci.com/docs/2.0/contexts/#restricting-a-context]
>  
> If a user does not have access to the context used in the workflow, the 
> workflow will not run.
> h2. Orbs
> The Cassandra project lends itself to setting up custom Orbs nicely given the 
> amount of scripting within the config itself. 
>  
> You could aim to have every workflow essentially simply call on an 
> appropriate Orb, this is a very effective way of making use of them. Right 
> now your workflows essentially follow the same steps with only a few minor 
> variations that can be resolved with parameters.
>  
> Suggestions for orb creation:
>  * Determine which unit tests to run
>  * Log environment information
>  * Run unit tests
>  
> Please see if these work for your config:
>  * [Shallow Clone 
> Orb|https://circleci.com/developer/orbs/orb/guitarrapc/git-shallow-clone] 
> consideration if needed
>  * [Python Orb|https://circleci.com/developer/orbs/orb/circleci/python] and 
> creating your own orb if needed for configure virtualenv and python 
> Dependencies
>  
> _A list of publicly accessible orbs can be found_ 
> [_here_|https://circleci.com/developer/orbs]_. An introduction to the 
> differences between private and public orbs, as well as instructions on how 
> to author an orb, can be found_ [_at this 
> link_|https://circleci.com/docs/2.0/orb-intro/#private-orbs-vs-public-orbs]_._
> h2. Reusable Config Elements
> I highly suggest looking into Reusable Config Elements and Orbs based on the 
> repetitive nature of the Cassandra project config as it stands today.
> More on using reusable configuration elements can be [found 
> here|https://circleci.com/docs/2.0/reusing-config/?section=configuration#authoring-reusable-commands].
> h3. Insights 
> h4. Internal DCE Dashboard - Alert prototyping
> h2. Tracking Test Results via UI 
> Please add the step[ 
> store_test_results|https://circleci.com/docs/2.0/language-python/#upload-and-store-test-results]
>  to your config.yml file
> Please reference the Insights homepage on the CircleCI Cloud application once 
> this is completed:
> [https://app.circleci.com/pipelines/github/apache/cassandra] 
> Useful Links
>  * Project Optimisations: 
> [https://circleci.com/docs/2.0/optimizations/#section=projects] 
>  * [Monitor and optimize your CI/CD pipeline with Insights from 
> CircleCI|https://circleci.com/blog/monitor-and-optimize-your-ci-cd-pipeline-with-insights-from-circleci/]
>  * [2020 State of DevOps Report: Presented by Puppet and 
> CircleCI|https://circleci.com/resources/state-of-devops-report-2020/]
> h1. Summary
> h3. *2021 Dynamic Configuration for Cassandra*
> Our recommendation is to move into using the `setup` workflow key: [Dynamic 
> config|https://circleci.com/blog/building-cicd-pipelines-using-dynamic-config/]
>  lets you maintain multiple config.yml files in a single code repository, to 
> selectively identify and execute your primary config.yml files. This feature 
> offers a wide range of powerful capabilities to easily specify and execute a 
> variety of dynamic pipeline workloads. Please note: when using dynamic 
> configuration, a “continue” job from the 
> [continuation|https://circleci.com/developer/orbs/orb/circleci/continuation] 
> [orb|https://circleci.com/docs/2.0/orb-intro/] must be called at the end of 
> the setup workflow.
> h3. *Automated Triggering of Builds*
> See V2 API
> Can trigger build through a [cron 
> job|https://circleci.com/docs/2.0/workflows/].
>  
> h3. *Tracking Test Results*
> Insights API
> DataStax Build Times: 15-20 minutes builds, also one an hour 
> Goal was moving it from an hour to 15-20 minutes
> h2. Resources 
> Orb: [CircleCI Developer Hub - 
> circleci/continuation|https://circleci.com/developer/orbs/orb/circleci/continuation]
>  
> Discuss Post: [Dynamic config via setup workflows now available to all 
> users|https://discuss.circleci.com/t/dynamic-config-via-setup-workflows-now-available-to-all-users/39909]
>  
> Documentation: [Pipeline 
> Variables|https://circleci.com/docs/2.0/pipeline-variables/#conditional-workflows]
>  
>  
> h2. Q&A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-16785) Improvements to CircleCI CI/CD pipeline

Reply via email to