ekaterinadimitrova2 commented on code in PR #2554:
URL: https://github.com/apache/cassandra/pull/2554#discussion_r1298743261
##########
.build/config/README.md:
##########
@@ -0,0 +1,125 @@
+Declarative Test Suite Configuration
+-------------------------------------------
+
+Pipeline and test suite configurations are declarative so other CI
implementations can build
+durable, reactive systems based on changes to the upstream OSS C* CI.
Additions to `jobs.cfg` and
+`pipelines.cfg` can be picked up programmatically by CI implementations
+without requiring human intervention.
+
+Concepts
+---------------------
+
+### Pipeline
+A [pipeline](cassandra_ci.yaml) is a collection of jobs. For a given pipeline
to be considered
+successful,
+all
+jobs listed in the pipeline must run to completion without error using the
constraints, commands,
+and environment specified for the job in the config.
+
+### Job
+A [job](jobs.yaml) contains a collection of parameters that inform a CI system
on both what needs to
+run, how to run it, and the constraints of the environment in which it should
execute. We
+provide these limits to reflect what's available in our reference ASF CI
implementation so other
+CI environments are able to limit themselves to our resourcing upstream and
thus not destabilize
+ASF CI.
+
+Examples of jobs include unit tests, python dtests, in-jvm dtests, etc.
+
+Jobs include the following parameters:
+
+* `parent:` Another job defined in the file this job inherits parameters from,
potentially
+ overwriting any declared in duplication
+* `description:` Text based description of this job's purpose
+* `cmd:` The command a shell should run to execute the test job
+* `testlist:` A command that will create a text file listing all the test
files to be run for
+ this
+ suite
+* `env:` Space delimited list of environment variables to be set for this
suite. Duplicates for
+ params are allowed and later declarations should supersede the former.
+* `cpu:` Max cpu count allowed for a testing suite
+* `memory:` Max memory (in GB) allowable for a suite
+* `storage:` Max allowable storage (in GB) allowable for a suite to access
+
+Jobs can be split up and parallelized in whatever manner best suits the
environment in which they're
+orchestraed.
Review Comment:
```suggestion
orchestrated.
```
##########
.build/config/README.md:
##########
@@ -0,0 +1,125 @@
+Declarative Test Suite Configuration
+-------------------------------------------
+
+Pipeline and test suite configurations are declarative so other CI
implementations can build
+durable, reactive systems based on changes to the upstream OSS C* CI.
Additions to `jobs.cfg` and
+`pipelines.cfg` can be picked up programmatically by CI implementations
+without requiring human intervention.
+
+Concepts
+---------------------
+
+### Pipeline
+A [pipeline](cassandra_ci.yaml) is a collection of jobs. For a given pipeline
to be considered
+successful,
+all
+jobs listed in the pipeline must run to completion without error using the
constraints, commands,
+and environment specified for the job in the config.
+
+### Job
+A [job](jobs.yaml) contains a collection of parameters that inform a CI system
on both what needs to
+run, how to run it, and the constraints of the environment in which it should
execute. We
+provide these limits to reflect what's available in our reference ASF CI
implementation so other
+CI environments are able to limit themselves to our resourcing upstream and
thus not destabilize
+ASF CI.
+
+Examples of jobs include unit tests, python dtests, in-jvm dtests, etc.
+
+Jobs include the following parameters:
+
+* `parent:` Another job defined in the file this job inherits parameters from,
potentially
+ overwriting any declared in duplication
+* `description:` Text based description of this job's purpose
+* `cmd:` The command a shell should run to execute the test job
+* `testlist:` A command that will create a text file listing all the test
files to be run for
+ this
+ suite
+* `env:` Space delimited list of environment variables to be set for this
suite. Duplicates for
+ params are allowed and later declarations should supersede the former.
+* `cpu:` Max cpu count allowed for a testing suite
+* `memory:` Max memory (in GB) allowable for a suite
+* `storage:` Max allowable storage (in GB) allowable for a suite to access
+
+Jobs can be split up and parallelized in whatever manner best suits the
environment in which they're
+orchestraed.
+
+Configuration Files
+---------------------
+
+[pipelines.cfg](./cassandra_ci.yaml): Contains pipelines for CI jobs for
Apache Cassandra
+
+[jobs.cfg](./jobs.yaml): Contains reference CI jobs for Apache Cassandra
+
+Existing Pipelines
+---------------------
+
+As outlined in the `pipelines.cfg` file, we primarily have 3 pipelines:
+### pre-commit:
+* must run and pass on the lowest supported JDK before a committer merges any
code
+### post-commit:
+* will run on the upstream ASF repo after a commit is merged, matrixed across
more axes and including configurations expected to fail or diverge only rarely
+### nightly:
+* run nightly. Longer term, infra, very stable areas of code.
+
+Adding a new job to CI
+---------------------
+
+To add a new job to CI, you need to do 2 things:
+1. Determine which pipeline it will be a part of. Add the job name to that
pipeline (or create a
+new pipeline with that job)
+
+2. Add a new entry to [jobs.cfg](./jobs.yaml). For example:
+```
+job:my-new-job
+ parent:base
+ description:new test suite that does important new things
+ cmd:ant new_job_name
+ testlist:find test/new_test_type -name '*Test.java' | sort
+ memory:12
+ cpu:4
+ storage:20
+ env:PARAM_ONE=val1 PARAM_TWO=val2 PARAM_THREE=val3
+ env:PARAM_FOUR=val4 PARAM_FIVE=val5
+```
+
+**NOTE**:
+
+You will also need to ensure the necessary values exist in
[build.xml](../../build.xml) (timeouts,
+etc).
+For now, there is duplication between the declarative declaration of test
suites here and `build.
+xml`
+
+Building a Testing Environment
+-------------------------------------
+[ci_config_parser.sh](./ci_config_parser.sh) contains several methods to parse
out pipelines, jobs,
+and
+job parameters:
+
+* `populate_pipelines`: populates a global array named `pipelines` with the
names of all valid
+ pipelines from the given input file
+* `populate_jobs`: populates all the required jobs for a given pipeline.
Useful for determining
+ / breaking down and iterating through jobs needed for a given pipeline
+* `parse_job_params`: populates some key global variables (see details in
[ci_config_parser.sh](.
+ /ci_config_parser.sh) that can be used to build out constraints, commands,
and details in a
+ programmatic CI pipeline config builder.
+
+The workflow for building CI programmatically from the config might look
something like this:
+* `populate_pipelines` to determine what pipelines you need to build out
+* For each pipeline:
+ 1. `populate_jobs` to determine which jobs you need to write out config for
+ 2. for each job:
+ 1. `clear_job_params` to ensure nothing is left over from previous runs
+ 2. `parse_job_params` to set up the params needed for the job
+ 2. Write out the current job's params in whatever CI config format
you're using in your
Review Comment:
```suggestion
3. Write out the current job's params in whatever CI config format
you're using in your
```
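For readers building a downstream config generator, a minimal sketch of the loop this README describes might look like the following. Only the function names and the `pipelines` / `pipeline_jobs` globals come from ci_config_parser.sh; the argument shapes and the final "emit" step are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: populate_pipelines/populate_jobs/clear_job_params/parse_job_params
# and the `pipelines`/`pipeline_jobs` globals are from ci_config_parser.sh; the
# argument shapes and the emit step are assumptions, not the documented API.
. ./ci_config_parser.sh

populate_pipelines cassandra_ci.yaml            # fills the global `pipelines` array
for pipeline in "${pipelines[@]}"; do
  populate_jobs "$pipeline"                     # fills the global `pipeline_jobs` array
  for job in "${pipeline_jobs[@]}"; do
    clear_job_params                            # reset globals left over from the previous job
    parse_job_params "$job"                     # set constraint/command globals for this job
    # A real implementation would write out a CircleCI job, Jenkins stage, etc. here.
    echo "would emit config for ${pipeline}/${job}"
  done
done
```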
##########
.build/config/cassandra_ci.yaml:
##########
@@ -0,0 +1,355 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Contains definitions of all pipelines and jobs (test suites) in Apache
Cassandra's CI.
Review Comment:
Maybe add a big title so it is easy to see where the license stops and the docs start?
##########
.build/config/cassandra_ci.yaml:
##########
@@ -0,0 +1,355 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Contains definitions of all pipelines and jobs (test suites) in Apache
Cassandra's CI.
+
+# CI consists of:
+# 1. job: a set of commands to run against a list of files containing tests
+# 2. pipeline: a list of jobs that can be run in arbitrary order
+# pipelines contain a list of JDK's they have to be run across to
certify correctness
+
+#-----------------------------------------------------------------------------
+# IMPLEMENTATION REQUIRED PARAMETERS:
+#-----------------------------------------------------------------------------
+# We do not provide a mechanism to transform the contents of $TEST_LIST_FILE
into $TEST_SPLIT_FILE. Implementations
+# must provide that mechanism and set that environment variable or "job->run:"
operations will fail, unable to find a test split.
+#
+# EXPECTED FLOW ON AN AGENT:
+# 1. Populate contents of $TEST_LIST_FILE for a given job using
"job->test_list_cmd:" piped through "job->TEST_FILTER:"
+# 2. Split up $TEST_LIST_FILE using "job->num_split_cmd:"
+# 3. Populate $TEST_SPLIT_FILE with a given split (CI implementation specific)
+# 3. Execute "job->run:" to run the given $TEST_SPLIT_FILE
Review Comment:
```suggestion
# 4. Execute "job->run:" to run the given $TEST_SPLIT_FILE
```
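As extra context for implementers, a rough shell sketch of that four-step agent flow follows. The round-robin split and the `test_list_cmd` / `num_split_cmd` / `run_cmd` / `AGENT_INDEX` variable names are assumptions; the yaml only defines the keys and deliberately leaves step 3 to the CI system.

```bash
#!/usr/bin/env bash
# Rough sketch, not the documented contract: assumes the job's test_list_cmd,
# num_split_cmd and run values were already read into shell variables by the CI
# system, and that AGENT_INDEX (0..NUM_SPLITS-1) identifies this agent.
export TEST_LIST_FILE="${DIST_DIR}/test_list.txt"
export SPLIT_SIZE=20

eval "$test_list_cmd"    # 1. write the full, filtered test list to $TEST_LIST_FILE
eval "$num_split_cmd"    # 2. derive NUM_SPLITS from the list size and SPLIT_SIZE

# 3. populate $TEST_SPLIT_FILE with this agent's slice (round-robin, purely illustrative)
export TEST_SPLIT_FILE="${DIST_DIR}/test_split_${AGENT_INDEX}.txt"
awk -v n="$NUM_SPLITS" -v i="$AGENT_INDEX" 'NR % n == i' "$TEST_LIST_FILE" > "$TEST_SPLIT_FILE"

eval "$run_cmd"          # 4. execute the job's run: command against $TEST_SPLIT_FILE
```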
##########
.build/config/assert.sh:
##########
@@ -0,0 +1,266 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Borrowed from https://github.com/torokmark/assert.sh/blob/main/assert.sh
+
+#####################################################################
+##
+## title: Assert Extension
+##
+## description:
+## Assert extension of shell (bash, ...)
+## with the common assert functions
+## Function list based on:
+## http://junit.sourceforge.net/javadoc/org/junit/Assert.html
+## Log methods : inspired by
+## - https://natelandau.com/bash-scripting-utilities/
+## author: Mark Torok
+##
+## date: 07. Dec. 2016
+##
+## license: MIT
+##
+#####################################################################
+
+. functions.sh
+
+if command -v tput &>/dev/null && tty -s; then
+ RED=$(tput setaf 1)
+ GREEN=$(tput setaf 2)
+ MAGENTA=$(tput setaf 5)
+ NORMAL=$(tput sgr0)
+ BOLD=$(tput bold)
+else
+ RED=$(echo -en "\e[31m")
+ GREEN=$(echo -en "\e[32m")
+ MAGENTA=$(echo -en "\e[35m")
+ NORMAL=$(echo -en "\e[00m")
+ BOLD=$(echo -en "\e[01m")
+fi
+
+log_header() {
Review Comment:
I haven't looked at the script in detail, but I like the idea. Seems neat.
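For anyone else skimming: if the upstream torokmark API is kept as-is, usage in the test scripts should be roughly like the block below. This is unverified against this copy and purely illustrative.

```bash
# Assumes the upstream assert.sh signature (assert_eq expected actual message);
# illustrative only, not taken from this repository's tests.
. ./assert.sh

jobs_found="$(printf 'unit\njvm-dtest\n' | wc -l | tr -d ' ')"
assert_eq "2" "$jobs_found" "expected two jobs to be listed"
```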
##########
.build/config/cassandra_ci.yaml:
##########
@@ -0,0 +1,355 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Contains definitions of all pipelines and jobs (test suites) in Apache
Cassandra's CI.
+
+# CI consists of:
+# 1. job: a set of commands to run against a list of files containing tests
+# 2. pipeline: a list of jobs that can be run in arbitrary order
+# pipelines contain a list of JDK's they have to be run across to
certify correctness
+
+#-----------------------------------------------------------------------------
+# IMPLEMENTATION REQUIRED PARAMETERS:
+#-----------------------------------------------------------------------------
+# We do not provide a mechanism to transform the contents of $TEST_LIST_FILE
into $TEST_SPLIT_FILE. Implementations
+# must provide that mechanism and set that environment variable or "job->run:"
operations will fail, unable to find a test split.
+#
+# EXPECTED FLOW ON AN AGENT:
Review Comment:
```suggestion
#-----------------------------------------------------------------------------
# EXPECTED FLOW ON AN AGENT:
#-----------------------------------------------------------------------------
```
##########
.build/config/functions.sh:
##########
@@ -0,0 +1,53 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+##############################################################################
+# Helper functions for use in our build scripting
+##############################################################################
+
+# Confirm that a given variable exists
+# $1: Message to print on error
+# $2: Variable to check for definition
+check_argument() {
Review Comment:
Maybe I would change the sh script name to Validations or something; functions doesn't ring a bell to me :-)
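For reference, based only on the doc comment shown in this hunk, usage would presumably look something like the block below. Whether `$2` takes the variable's name or its value isn't visible here, so treat this as a guess.

```bash
# Guess from the doc comment: $1 = message to print on error,
# $2 = variable to check for definition (assumed to be passed by name).
. ./functions.sh

TEST_LIST_FILE="${DIST_DIR:-build}/test_list.txt"
check_argument "TEST_LIST_FILE must be set before splitting tests" "TEST_LIST_FILE"
```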
##########
.build/config/cassandra_ci.yaml:
##########
@@ -0,0 +1,355 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Contains definitions of all pipelines and jobs (test suites) in Apache
Cassandra's CI.
+
+# CI consists of:
+# 1. job: a set of commands to run against a list of files containing tests
+# 2. pipeline: a list of jobs that can be run in arbitrary order
+# pipelines contain a list of JDK's they have to be run across to
certify correctness
+
+#-----------------------------------------------------------------------------
+# IMPLEMENTATION REQUIRED PARAMETERS:
+#-----------------------------------------------------------------------------
+# We do not provide a mechanism to transform the contents of $TEST_LIST_FILE
into $TEST_SPLIT_FILE. Implementations
+# must provide that mechanism and set that environment variable or "job->run:"
operations will fail, unable to find a test split.
+#
+# EXPECTED FLOW ON AN AGENT:
+# 1. Populate contents of $TEST_LIST_FILE for a given job using
"job->test_list_cmd:" piped through "job->TEST_FILTER:"
+# 2. Split up $TEST_LIST_FILE using "job->num_split_cmd:"
+# 3. Populate $TEST_SPLIT_FILE with a given split (CI implementation specific)
+# 3. Execute "job->run:" to run the given $TEST_SPLIT_FILE
+
+#-----------------------------------------------------------------------------
+# SOURCES
+#-----------------------------------------------------------------------------
+# You can configure the different sources you're using for your CI stack here;
we default to HEAD on a given branch
+# and you should print out what SHA you checked out and built against for
reproducibility in a subsequent investigation.
+repos:
+ cassandra:
+ url: https://github.com/apache/cassandra
+ branch: trunk
+ sha: HEAD
+ python_dtest:
+ url: &python_dtest_url https://github.com/apache/cassandra-dtest
+ branch: &python_dtest_branch trunk
+ sha: HEAD
+ cassandra-harry:
+ url: https://github.com/apache/cassandra-harry
+ branch: trunk
+ sha: HEAD
+
+
+#-----------------------------------------------------------------------------
+# PIPELINES
+#-----------------------------------------------------------------------------
+pipelines:
+ # All jobs in the pre-commit pipeline must run within constraints and pass
+ # before a commit is merged upstream. Committers are expected to validate
+ # and sign off on this if using non-reference CI environments.
+ #
+ # Failure to do so can lead to commits being reverted.
+ - name: pre-commit
+ jdk:
+ - 11
+ jobs:
+ - unit
+ - jvm-dtest
+ - python-dtest
+ - dtest
+ - dtest-large
+ - dtest-upgrade
+ - dtest-upgrade-large
+ - long-test
+ - cqlsh-test
+
+ # The post-commit pipeline is a larger set of tests that include all
supported JDKs.
+ # We expect different JDKs and variations on test suites to fail very rarely.
+ #
+ # Failures in these tests will be made visible on JIRA tickets shortly after
+ # test run on reference CI and committers are expected to prioritize
+ # rectifying any failures introduced by their work.
+ - name: post-commit
+ jdk:
+ - 11
+ - 17
Review Comment:
We added java.supported only in 5.0+
##########
.build/config/README.md:
##########
@@ -0,0 +1,125 @@
+Declarative Test Suite Configuration
+-------------------------------------------
+
+Pipeline and test suite configurations are declarative so other CI
implementations can build
+durable, reactive systems based on changes to the upstream OSS C* CI.
Additions to `jobs.cfg` and
+`pipelines.cfg` can be picked up programmatically by CI implementations
+without requiring human intervention.
+
+Concepts
+---------------------
+
+### Pipeline
+A [pipeline](cassandra_ci.yaml) is a collection of jobs. For a given pipeline
to be considered
+successful,
+all
+jobs listed in the pipeline must run to completion without error using the
constraints, commands,
+and environment specified for the job in the config.
+
+### Job
+A [job](jobs.yaml) contains a collection of parameters that inform a CI system
on both what needs to
+run, how to run it, and the constraints of the environment in which it should
execute. We
+provide these limits to reflect what's available in our reference ASF CI
implementation so other
+CI environments are able to limit themselves to our resourcing upstream and
thus not destabilize
+ASF CI.
+
+Examples of jobs include unit tests, python dtests, in-jvm dtests, etc.
+
+Jobs include the following parameters:
+
+* `parent:` Another job defined in the file this job inherits parameters from,
potentially
+ overwriting any declared in duplication
+* `description:` Text based description of this job's purpose
+* `cmd:` The command a shell should run to execute the test job
+* `testlist:` A command that will create a text file listing all the test
files to be run for
+ this
+ suite
+* `env:` Space delimited list of environment variables to be set for this
suite. Duplicates for
+ params are allowed and later declarations should supersede the former.
+* `cpu:` Max cpu count allowed for a testing suite
+* `memory:` Max memory (in GB) allowable for a suite
+* `storage:` Max allowable storage (in GB) allowable for a suite to access
+
+Jobs can be split up and parallelized in whatever manner best suits the
environment in which they're
+orchestraed.
+
+Configuration Files
+---------------------
+
+[pipelines.cfg](./cassandra_ci.yaml): Contains pipelines for CI jobs for
Apache Cassandra
+
+[jobs.cfg](./jobs.yaml): Contains reference CI jobs for Apache Cassandra
+
+Existing Pipelines
+---------------------
+
+As outlined in the `pipelines.cfg` file, we primarily have 3 pipelines:
+### pre-commit:
+* must run and pass on the lowest supported JDK before a committer merges any
code
+### post-commit:
+* will run on the upstream ASF repo after a commit is merged, matrixed across
more axes and including configurations expected to fail or diverge only rarely
+### nightly:
+* run nightly. Longer term, infra, very stable areas of code.
+
+Adding a new job to CI
+---------------------
+
+To add a new job to CI, you need to do 2 things:
+1. Determine which pipeline it will be a part of. Add the job name to that
pipeline (or create a
+new pipeline with that job)
+
+2. Add a new entry to [jobs.cfg](./jobs.yaml). For example:
+```
+job:my-new-job
+ parent:base
+ description:new test suite that does important new things
+ cmd:ant new_job_name
+ testlist:find test/new_test_type -name '*Test.java' | sort
+ memory:12
+ cpu:4
+ storage:20
+ env:PARAM_ONE=val1 PARAM_TWO=val2 PARAM_THREE=val3
+ env:PARAM_FOUR=val4 PARAM_FIVE=val5
+```
+
+**NOTE**:
+
+You will also need to ensure the necessary values exist in
[build.xml](../../build.xml) (timeouts,
+etc).
+For now, there is duplication between the declarative declaration of test
suites here and `build.
+xml`
+
+Building a Testing Environment
+-------------------------------------
+[ci_config_parser.sh](./ci_config_parser.sh) contains several methods to parse
out pipelines, jobs,
+and
+job parameters:
+
+* `populate_pipelines`: populates a global array named `pipelines` with the
names of all valid
+ pipelines from the given input file
+* `populate_jobs`: populates all the required jobs for a given pipeline.
Useful for determining
+ / breaking down and iterating through jobs needed for a given pipeline
+* `parse_job_params`: populates some key global variables (see details in
[ci_config_parser.sh](.
+ /ci_config_parser.sh) that can be used to build out constraints, commands,
and details in a
+ programmatic CI pipeline config builder.
+
+The workflow for building CI programmatically from the config might look
something like this:
+* `populate_pipelines` to determine what pipelines you need to build out
+* For each pipeline:
+ 1. `populate_jobs` to determine which jobs you need to write out config for
+ 2. for each job:
+ 1. `clear_job_params` to ensure nothing is left over from previous runs
+ 2. `parse_job_params` to set up the params needed for the job
+ 2. Write out the current job's params in whatever CI config format
you're using in your
+ env (circle, jenkinsfile, etc)
+
+As new entries are added to [pipelines.cfg](./cassandra_ci.yaml) and
[jobs.cfg](./jobs.yaml), your
+scripts should pick those up and integrate them into your configuration
environment.
+
+Testing the in-tree config parsing scripts
+---------------------------------------------
+Currently testing is manual on the first addition of this declarative
structure. As we integrate
+it into our reference CI, we will integrate testing in as a new target.
+
+To run tests, execute [test_config.sh](./test/test_config.sh) from a terminal
and inspect the
+output.
Review Comment:
I am not sure I understand this section. The title is about parsing, but then we talk about a new task and provide a script name.
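If it helps, my reading of the intended manual flow for now is just the following (paths per the links in that section; output is inspected by hand for failures):

```bash
# Run the in-tree config parsing tests manually and inspect the assert output.
cd .build/config
./test/test_config.sh
```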
##########
.build/config/cassandra_ci.yaml:
##########
@@ -0,0 +1,355 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Contains definitions of all pipelines and jobs (test suites) in Apache
Cassandra's CI.
+
+# CI consists of:
+# 1. job: a set of commands to run against a list of files containing tests
+# 2. pipeline: a list of jobs that can be run in arbitrary order
+# pipelines contain a list of JDK's they have to be run across to
certify correctness
+
+#-----------------------------------------------------------------------------
+# IMPLEMENTATION REQUIRED PARAMETERS:
+#-----------------------------------------------------------------------------
+# We do not provide a mechanism to transform the contents of $TEST_LIST_FILE
into $TEST_SPLIT_FILE. Implementations
+# must provide that mechanism and set that environment variable or "job->run:"
operations will fail, unable to find a test split.
Review Comment:
I would expect there to be some default split that can be used if you do not have your own implementation?
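For illustration, the simplest possible transformation a downstream system could use if it does not shard at all (hypothetical; the yaml deliberately leaves this step to the implementation) is to treat the whole list as a single split:

```bash
# Hypothetical no-op default: one split containing every test in the list.
export TEST_SPLIT_FILE="${TEST_LIST_FILE%.txt}_split.txt"
cp "$TEST_LIST_FILE" "$TEST_SPLIT_FILE"
```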
##########
.build/config/ci_config_parser.sh:
##########
@@ -0,0 +1,174 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+. functions.sh
+
+# This script relies on yq: https://github.com/mikefarah/yq
+# License: MIT: https://github.com/jmckenzie-dev/yq/blob/master/LICENSE
+# Plain binary install: wget
https://github.com/mikefarah/yq/releases/download/${VERSION}/${BINARY} -O
/usr/bin/yq &&\ chmod +x /usr/bin/yq
+# Brew install: "brew install yq"
+
+# Text array of all known pipelines found in processed config file
+export pipelines=()
+
+# Text array of all known jobs found in the jobs config
+export pipeline_jobs=()
+
+# The keys for various properties as defined in the test jobs.yaml file
+KEY_PARENT="parent"
+KEY_CMD="cmd"
+KEY_ENV="env"
+KEY_TESTLIST_CMD="testlist"
+KEY_MEM="memory"
+KEY_CPU="cpu"
+KEY_STORAGE="storage"
Review Comment:
Probably we could add the unit as a suffix?
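For context on how these keys get read: with mikefarah yq v4 and the layout shown in cassandra_ci.yaml, queries look roughly like the block below. This is illustrative only; the parser's real queries against jobs.yaml may differ. Note the memory value already carries its unit in the value itself ("6g").

```bash
# Illustrative yq v4 queries against cassandra_ci.yaml (not the parser's actual code).
yq '.pipelines[].name' cassandra_ci.yaml
# -> pre-commit, post-commit, nightly

# explode(.) resolves the *medium_executor style anchors before navigating
yq 'explode(.) | .jobs[] | select(.name == "unit") | .resources.memory' cassandra_ci.yaml
# -> 6g
```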
##########
.build/config/cassandra_ci.yaml:
##########
@@ -0,0 +1,355 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Contains definitions of all pipelines and jobs (test suites) in Apache
Cassandra's CI.
+
+# CI consists of:
+# 1. job: a set of commands to run against a list of files containing tests
+# 2. pipeline: a list of jobs that can be run in arbitrary order
+# pipelines contain a list of JDK's they have to be run across to
certify correctness
+
+#-----------------------------------------------------------------------------
+# IMPLEMENTATION REQUIRED PARAMETERS:
+#-----------------------------------------------------------------------------
+# We do not provide a mechanism to transform the contents of $TEST_LIST_FILE
into $TEST_SPLIT_FILE. Implementations
+# must provide that mechanism and set that environment variable or "job->run:"
operations will fail, unable to find a test split.
+#
+# EXPECTED FLOW ON AN AGENT:
+# 1. Populate contents of $TEST_LIST_FILE for a given job using
"job->test_list_cmd:" piped through "job->TEST_FILTER:"
+# 2. Split up $TEST_LIST_FILE using "job->num_split_cmd:"
+# 3. Populate $TEST_SPLIT_FILE with a given split (CI implementation specific)
+# 3. Execute "job->run:" to run the given $TEST_SPLIT_FILE
+
+#-----------------------------------------------------------------------------
+# SOURCES
+#-----------------------------------------------------------------------------
+# You can configure the different sources you're using for your CI stack here;
we default to HEAD on a given branch
+# and you should print out what SHA you checked out and built against for
reproducibility in a subsequent investigation.
+repos:
+ cassandra:
+ url: https://github.com/apache/cassandra
+ branch: trunk
+ sha: HEAD
+ python_dtest:
+ url: &python_dtest_url https://github.com/apache/cassandra-dtest
+ branch: &python_dtest_branch trunk
+ sha: HEAD
+ cassandra-harry:
+ url: https://github.com/apache/cassandra-harry
+ branch: trunk
+ sha: HEAD
+
+
+#-----------------------------------------------------------------------------
+# PIPELINES
+#-----------------------------------------------------------------------------
+pipelines:
+ # All jobs in the pre-commit pipeline must run within constraints and pass
+ # before a commit is merged upstream. Committers are expected to validate
+ # and sign off on this if using non-reference CI environments.
+ #
+ # Failure to do so can lead to commits being reverted.
+ - name: pre-commit
+ jdk:
+ - 11
+ jobs:
+ - unit
+ - jvm-dtest
+ - python-dtest
+ - dtest
+ - dtest-large
+ - dtest-upgrade
+ - dtest-upgrade-large
+ - long-test
+ - cqlsh-test
+
+ # The post-commit pipeline is a larger set of tests that include all
supported JDKs.
+ # We expect different JDKs and variations on test suites to fail very rarely.
+ #
+ # Failures in these tests will be made visible on JIRA tickets shortly after
+ # test run on reference CI and committers are expected to prioritize
+ # rectifying any failures introduced by their work.
+ - name: post-commit
+ jdk:
+ - 11
+ - 17
+ jobs:
+ - unit
+ - unit-cdc
+ - compression
+ - test-oa
+ - test-system-keyspace-directory
+ - test-tries
+ - jvm-dtest
+ - jvm-dtest-upgrade
+ - dtest
+ - dtest-novnode
+ - dtest-offheap
+ - dtest-large
+ - dtest-large-novnode
+ - dtest-upgrade
+ - dtest-upgrade-large
+ - long-test
+ - cqlsh-test
+
+ # These are longer-term, much more rarely changing pieces of infrastructure
or
+ # testing. We expect these to fail even more rarely than post-commit.
+ - name: nightly
+ jdk:
+ - 11
+ - 17
+ jobs:
+ - stress-test
+ - fqltool-test
+ - test-burn
+
+#-----------------------------------------------------------------------------
+# RESOURCE LIMITS, ALIASES, AND DEFAULT ENV VARS
+#-----------------------------------------------------------------------------
+# Downstream test orchestration needs to use <= the following values when
running tests.
+# Increasing these values indicates a change in resource allocation
https://ci-cassandra.apache.org/ and should not be done downstream.
+small_executor: &small_executor {cpu: 4, memory: 1g, storage: 5g}
+medium_executor: &medium_executor {cpu: 4, memory: 6g, storage: 25g}
+large_executor: &large_executor {cpu: 4, memory: 16g, storage: 50g}
+
+# On test addition or change, we repeat the job many times to try and suss out
flakes. Instead of having it be bespoke
+# per job, we want to provide some general guidelines for folks to default to
and provide guidance on each test suite.
+repeat_default: &repeat_many 500
+repeat_less: &repeat_moderate 100
+repeat_tiny: &repeat_few 25
+
+# Default to at least one split
+default_split_num: &default_split_num let NUM_SPLITS=$(( $(wc -l <
"$TEST_LIST_FILE") / $SPLIT_SIZE )); if [ "$NUM_SPLITS" -eq 0 ]; then
NUM_SPLITS=1; fi
Review Comment:
I do not understand... shouldn't we have this as the default and make num_splits a variable instead of a constant? 1 by default, whatever you feel like otherwise... And you can plug in your own way of doing splits, as was mentioned earlier.
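To spell out the arithmetic in that expression with concrete numbers (my numbers, not from the config):

```bash
# e.g. a 250-line test list with SPLIT_SIZE=20 gives 250 / 20 = 12 splits
# (integer division); a 5-line list gives 0, which the guard bumps back up to 1.
SPLIT_SIZE=20
TEST_LIST_FILE=$(mktemp)
printf 'SomeTest%s.java\n' {1..250} > "$TEST_LIST_FILE"
let NUM_SPLITS=$(( $(wc -l < "$TEST_LIST_FILE") / $SPLIT_SIZE )); if [ "$NUM_SPLITS" -eq 0 ]; then NUM_SPLITS=1; fi
echo "$NUM_SPLITS"   # -> 12
```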
##########
.build/config/cassandra_ci.yaml:
##########
@@ -0,0 +1,355 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Contains definitions of all pipelines and jobs (test suites) in Apache
Cassandra's CI.
+
+# CI consists of:
+# 1. job: a set of commands to run against a list of files containing tests
+# 2. pipeline: a list of jobs that can be run in arbitrary order
+# pipelines contain a list of JDK's they have to be run across to
certify correctness
+
+#-----------------------------------------------------------------------------
+# IMPLEMENTATION REQUIRED PARAMETERS:
+#-----------------------------------------------------------------------------
+# We do not provide a mechanism to transform the contents of $TEST_LIST_FILE
into $TEST_SPLIT_FILE. Implementations
+# must provide that mechanism and set that environment variable or "job->run:"
operations will fail, unable to find a test split.
+#
+# EXPECTED FLOW ON AN AGENT:
+# 1. Populate contents of $TEST_LIST_FILE for a given job using
"job->test_list_cmd:" piped through "job->TEST_FILTER:"
+# 2. Split up $TEST_LIST_FILE using "job->num_split_cmd:"
+# 3. Populate $TEST_SPLIT_FILE with a given split (CI implementation specific)
+# 3. Execute "job->run:" to run the given $TEST_SPLIT_FILE
+
+#-----------------------------------------------------------------------------
+# SOURCES
+#-----------------------------------------------------------------------------
+# You can configure the different sources you're using for your CI stack here;
we default to HEAD on a given branch
+# and you should print out what SHA you checked out and built against for
reproducibility in a subsequent investigation.
+repos:
+ cassandra:
+ url: https://github.com/apache/cassandra
+ branch: trunk
+ sha: HEAD
+ python_dtest:
+ url: &python_dtest_url https://github.com/apache/cassandra-dtest
+ branch: &python_dtest_branch trunk
+ sha: HEAD
+ cassandra-harry:
+ url: https://github.com/apache/cassandra-harry
+ branch: trunk
+ sha: HEAD
+
+
+#-----------------------------------------------------------------------------
+# PIPELINES
+#-----------------------------------------------------------------------------
+pipelines:
+ # All jobs in the pre-commit pipeline must run within constraints and pass
+ # before a commit is merged upstream. Committers are expected to validate
+ # and sign off on this if using non-reference CI environments.
+ #
+ # Failure to do so can lead to commits being reverted.
+ - name: pre-commit
+ jdk:
+ - 11
+ jobs:
+ - unit
+ - jvm-dtest
+ - python-dtest
+ - dtest
+ - dtest-large
+ - dtest-upgrade
+ - dtest-upgrade-large
+ - long-test
+ - cqlsh-test
+
+ # The post-commit pipeline is a larger set of tests that include all
supported JDKs.
+ # We expect different JDKs and variations on test suites to fail very rarely.
+ #
+ # Failures in these tests will be made visible on JIRA tickets shortly after
+ # test run on reference CI and committers are expected to prioritize
+ # rectifying any failures introduced by their work.
+ - name: post-commit
+ jdk:
+ - 11
+ - 17
+ jobs:
+ - unit
+ - unit-cdc
+ - compression
+ - test-oa
+ - test-system-keyspace-directory
+ - test-tries
+ - jvm-dtest
+ - jvm-dtest-upgrade
+ - dtest
+ - dtest-novnode
+ - dtest-offheap
+ - dtest-large
+ - dtest-large-novnode
+ - dtest-upgrade
+ - dtest-upgrade-large
+ - long-test
+ - cqlsh-test
+
+ # These are longer-term, much more rarely changing pieces of infrastructure
or
+ # testing. We expect these to fail even more rarely than post-commit.
+ - name: nightly
+ jdk:
+ - 11
+ - 17
+ jobs:
+ - stress-test
+ - fqltool-test
+ - test-burn
+
+#-----------------------------------------------------------------------------
+# RESOURCE LIMITS, ALIASES, AND DEFAULT ENV VARS
+#-----------------------------------------------------------------------------
+# Downstream test orchestration needs to use <= the following values when
running tests.
+# Increasing these values indicates a change in resource allocation
https://ci-cassandra.apache.org/ and should not be done downstream.
+small_executor: &small_executor {cpu: 4, memory: 1g, storage: 5g}
+medium_executor: &medium_executor {cpu: 4, memory: 6g, storage: 25g}
+large_executor: &large_executor {cpu: 4, memory: 16g, storage: 50g}
+
+# On test addition or change, we repeat the job many times to try and suss out
flakes. Instead of having it be bespoke
+# per job, we want to provide some general guidelines for folks to default to
and provide guidance on each test suite.
+repeat_default: &repeat_many 500
+repeat_less: &repeat_moderate 100
+repeat_tiny: &repeat_few 25
+
+# Default to at least one split
+default_split_num: &default_split_num let NUM_SPLITS=$(( $(wc -l <
"$TEST_LIST_FILE") / $SPLIT_SIZE )); if [ "$NUM_SPLITS" -eq 0 ]; then
NUM_SPLITS=1; fi
+
+# These env vars are required for tests to complete successfully given the
run: commands, however downstream implementations
+# are welcome to change them as needed to setup their env
+default_env_vars: &default_env_vars
+ ANT_HOME: /usr/share/ant
+ KEEP_TEST_DIR: true
+ CASSANDRA_DIR: /home/cassandra/cassandra
+ CASSANDRA_DTEST_DIR: /home/cassandra/cassandra-dtest
+ CCM_CONFIG_DIR: ${DIST_DIR}/.ccm
+ TMPDIR: "$(mktemp -d)"
+ DIST_DIR: "${CASSANDRA_DIR}/build"
+ # Default to test.timeout as found in build.xml; should parse out of there
in building local env and set this env var based on job
+ TEST_TIMEOUT: 480000
+ # Whether the repeated test iterations should stop on the first failure by
default.
+ REPEATED_TESTS_STOP_ON_FAILURE: false
+
+# Anything specified in the required env vars SHOULD NOT BE CHANGED except for
ASF CI; these are expected to have a
+# material impact on test correctness and changes to them on a downstream
system will likely destabilize our reference
+# CI implementation
+required_env_vars: &required_env_vars
+ LANG: en_US.UTF-8
+ PYTHONIOENCODING: "utf-8"
+ PYTHONUNBUFFERED: true
+ CASS_DRIVER_NO_EXTENSIONS: true
+ CASS_DRIVER_NO_CYTHON: true
+ #Skip all syncing to disk to avoid performance issues in flaky CI
environments
+ CASSANDRA_SKIP_SYNC: true
+ CCM_MAX_HEAP_SIZE: "1024M"
+ CCM_HEAP_NEWSIZE: "512M"
+ PYTEST_OPTS: "-vv --log-cli-level=DEBUG
--junit-xml=${DIST_DIR}/test/output/nosetests.xml
--junit-prefix=${DTEST_TARGET} -s"
+
+#-----------------------------------------------------------------------------
+# JOBS
+#
+# By convention, anything in caps in the .yaml should be exported to an env
var during the test run cycle.
+#
+# Parameters:
+# job: the name
+# resources: cpu: memory: storage: max allowable for the suite.
+# SPLIT_SIZE: This indicates how many tests to include in a given split. Can
raise or lower as needed in your env.
+# REPEAT_COUNT: Number of times to repeat a test when multiplexing. *Do not
lower below upstream config default.*
+# env:
+# TYPE: The type of test; this should translate into tests found under
${CASSANDRA_DIR}/test/${type} in $TEST_SPLIT_FILE
+# TEST_FILTER: filter to run after test_list_cmd to narrow down tests
(splits, upgrade vs. non, etc)
+# test_list_cmd: command to run in shell to generate full list of tests to
run. By default, randomizes test list by file name
+# num_split_cmd: Calculation that populates NUM_SPLITS based on suite,
count, weighting.
+# *If you make changes to this value, they must be >= the default value*
+# run: command to run in shell to execute tests
+#
+#-----------------------------------------------------------------------------
+
+#-----------------------------------------------------------------------------
+# Single node JVM tests
+#-----------------------------------------------------------------------------
+jobs:
+ - &job_unit
+ name: unit
+ resources: *medium_executor
+ REPEAT_COUNT: *repeat_many
+ SPLIT_SIZE: 20
+ env:
+ <<: *default_env_vars
+ <<: *required_env_vars
+ ANT_TEST_OPTS: -Dno-build-test=true
+ # type lines up with the various targets in build.xml for usage by the
<testclasslist> target
+ TYPE: unit
+ TEST_FILTER: ""
+ TEST_LIST_FILE: ${DIST_DIR}/test_list.txt
+ test_list_cmd: find "test/${type}" -name "*Test.java" ${TEST_FILTER:-} |
sed "s;^test/${type}/;;" | sort -R > ${TEST_LIST_FILE}
+ num_split_cmd: *default_split_num
Review Comment:
And what do we do instead?
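For illustration only, a downstream override that would still respect the ">= the default value" note above could look something like this (hypothetical, not a recommendation):

```bash
# Keep the default calculation from the config as-is ...
let NUM_SPLITS=$(( $(wc -l < "$TEST_LIST_FILE") / $SPLIT_SIZE )); if [ "$NUM_SPLITS" -eq 0 ]; then NUM_SPLITS=1; fi
# ... and only ever scale it up downstream, never below the default.
NUM_SPLITS=$(( NUM_SPLITS * 2 ))
```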