[ 
https://issues.apache.org/jira/browse/CASSANDRA-20311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-20311:
-------------------------------------------
    Description: 
Using CASSANDRA-20157, analyse:
#  which splits are problematic
#  which test types are configured with too many splits, or too few


*Analysis Process*

Download the Jenkins consoleText logs we want to analyse:
{noformat}
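for i in $(seq $first_build $last_build) ; do
  wget -q https://ci-cassandra.apache.org/job/Cassandra-5.0/$i/consoleText
  # keep the field after the first ']' of each timing line, de-duplicated
  # (run directly rather than via bash -c "...", where the outer shell would expand $2)
  grep '] Time ' consoleText | awk -F']' '{print $2}' | sort -u > Cassandra-5.0_${i}_timings.txt
  rm consoleText
done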
{noformat}
Alternatively, if using the nightlies.a.o webdav mount (the consoleText.xz files have 
been manually downloaded and put into place):
{noformat}
for i in $(seq $first_build $last_build) ; do
  # same extraction as above, reading the compressed log straight from the mount
  xzgrep '] Time ' <nightlies_webdav_mount>/cassandra/Cassandra-5.0/${i}/consoleText.xz \
    | awk -F']' '{print $2}' | sort -u > Cassandra-5.0_${i}_timings.txt
done
{noformat}
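Because `wget -q` is silent, a failed download still leaves an empty timings file behind. A minimal sketch of a sanity check before running the analysis, assuming the file naming used above; the function name `empty_timings` is hypothetical:

```shell
# empty_timings FIRST LAST
# Print the build numbers whose timings file is missing or empty.
# Assumes the Cassandra-5.0_<build>_timings.txt naming from the loops above.
empty_timings() {
  for i in $(seq "$1" "$2"); do
    [ -s "Cassandra-5.0_${i}_timings.txt" ] || echo "$i"
  done
}
```

For example, `empty_timings $first_build $last_build` lists the builds worth re-downloading or excluding.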

(1)
{code}
for i in $(seq $first_build $last_build) ; do
  grep " 01:0" Cassandra-5.0_${i}_timings.txt | awk '{print $3}'
done | sort | uniq -c | sort -rn
{code}
Target those individual splits that have timed out (duration of 1 hour) the 
most.

(2a)
For test types that have too few splits (are timing out at the 1 hour mark too 
often).
{code}
for i in $(seq $first_build $last_build) ; do
  grep " 01:0" Cassandra-5.0_${i}_timings.txt | awk '{print $3}'
done | awk -F_ '{print $1}' | sort | uniq -c | sort -rn
{code}

(2b)
For test types that have too many splits (are finishing faster than ten 
minutes).
{code}
for i in $(seq $first_build $last_build) ; do
  grep " 00:0" Cassandra-5.0_${i}_timings.txt | awk '{print $3}'
done | awk -F_ '{print $1}' | sort | uniq -c | sort -rn
{code}
This output is weighted towards test types that have more splits (but 
that's ok because that's where we can save time / improve throughput most).
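If the weighting gets in the way, the counts can be normalised per test type. The following is only a sketch: it assumes, as the commands above do, that the third whitespace-separated field of each timings line is the split name and that the duration appears as ` HH:MM:SS`. The function name `fast_ratio` is hypothetical.

```shell
# fast_ratio FILE...
# Per test type, print "<fast> <total> <percent> <test_type>", where <fast>
# counts splits that finished in under ten minutes (duration " 00:0x:xx").
fast_ratio() {
  awk '{
    split($3, parts, "_")          # test type is the text before the first "_"
    total[parts[1]]++
    if ($0 ~ / 00:0/) fast[parts[1]]++
  }
  END {
    for (t in total)
      printf "%d %d %.0f%% %s\n", fast[t], total[t], 100 * fast[t] / total[t], t
  }' "$@" | sort -k3,3 -rn
}
```

Running `fast_ratio Cassandra-5.0_*_timings.txt` then ranks test types by the share of their splits that finish quickly, rather than by raw count.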


*Results 5.0*
builds 352 - 385
(1)
{noformat}
   5 test-cdc_jdk11_python_3.8_no_amd64_2_8:
   4 test-latest_jdk17_python_3.8_no_amd64_2_8:
   4 dtest-upgrade_jdk11_python_3.8_no_amd64_40_64:
   4 dtest-upgrade_jdk11_python_3.8_no_amd64_37_64:
   3 test-system-keyspace-directory_jdk11_python_3.8_no_amd64_7_8:
   3 test-oa_jdk17_python_3.8_no_amd64_5_8:
   3 test-oa_jdk11_python_3.8_no_amd64_3_8:
   3 test-compression_jdk17_python_3.8_no_amd64_8_8:
   3 test-compression_jdk17_python_3.8_no_amd64_7_8:
   3 test-compression_jdk17_python_3.8_no_amd64_2_8:
   3 test-cdc_jdk11_python_3.8_no_amd64_4_8:
   3 dtest-upgrade_jdk11_python_3.8_no_amd64_54_64:
   3 dtest-upgrade_jdk11_python_3.8_no_amd64_53_64:
   3 dtest-upgrade_jdk11_python_3.8_no_amd64_28_64:
   3 dtest-upgrade-novnode_jdk11_python_3.8_no_amd64_10_64:
   3 dtest-upgrade-large_jdk11_python_3.8_no_amd64_12_32:
…
{noformat}
(format here is `<test_type>_<jdk>_python_<python_version>_<cython>_<arch>_<split>_<splits>:`)
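The split names in the results above are underscore-separated, so the individual fields can be pulled apart with awk. A small sketch, assuming every name follows the layout seen in those results; the helper name `split_fields` is hypothetical:

```shell
# split_fields NAME
# Print "<test_type> <jdk> <split>/<splits>" for one split name, e.g.
# test-cdc_jdk11_python_3.8_no_amd64_2_8:
split_fields() {
  printf '%s\n' "$1" | awk -F_ '{
    sub(/:$/, "", $NF)              # drop the trailing colon
    print $1, $2, $(NF-1) "/" $NF   # type, jdk, split index / total splits
  }'
}
```

This relies on test type names containing hyphens but no underscores, which holds for all the names listed here.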
(2a)
{noformat}
  54 dtest-upgrade
  43 dtest-upgrade-novnode
  18 test-oa
  18 test-compression
  15 test-cdc
  15 dtest-upgrade-large
  14 test-system-keyspace-directory
  14 test-latest
  14 dtest-upgrade-novnode-large
…
{noformat}
(2b)
{noformat}
 535 dtest-latest
 495 dtest
 341 long-test
 306 dtest-novnode
 167 jvm-dtest-upgrade
 102 cqlsh-test
  90 dtest-upgrade-large
  90 dtest-large-latest
  89 dtest-large
  77 dtest-upgrade-novnode-large
  49 stress-test
  49 fqltool-test
  34 dtest-large-novnode
  24 simulator-dtest
…
{noformat}

*Results trunk*
builds 1988 - 2021
(1)
{noformat}
{noformat}
(2a)
{noformat}
{noformat}
(2b)
{noformat}
{noformat}


> Adjust  5.0 and trunk Jenkinsfile's splits configuration
> --------------------------------------------------------
>
>                 Key: CASSANDRA-20311
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20311
>             Project: Apache Cassandra
>          Issue Type: Task
>          Components: CI
>            Reporter: Michael Semb Wever
>            Assignee: Michael Semb Wever
>            Priority: Normal


