hi! a few of us have been continuously iterating on the dtest-on-pytest branch since the 2nd and we’ve run the dtests close to 600 times in ci. ariel has been working his way through a formal review (three cheers for ariel!)
flaky tests are a real thing and despite a few dozen totally green test runs, the vast majority of runs still reliably hit roughly 1-3 test failures. in a world where we can now run the dtests in 20 minutes instead of 13 hours it’s at least possible to keep finding these flaky tests and fixing them one by one...

i haven’t gotten a huge amount of feedback overall and i really want to hear it! ultimately this work is driven by the desire to 1) have *all* our tests run on *every* commit; 2) be able to trust the results; 3) make our testing story so amazing that even the most casual weekend warrior who wants to work on the project can (and will want to!) use it.

i’m *not* a python guy (although luckily i know and work with many who are). thankfully i’ve been able to defer to them for much of this largely python-based effort... i’m sure there are a few more people working on the project who do consider themselves python experts and i’d especially appreciate your feedback!

finally, a lot of my effort was focused on improving the end user’s experience (getting bootstrapped, running the tests, improving the debuggability story, etc). i’d really appreciate it if people could try running the pytest branch and following the install instructions to figure out what could be improved. is there any existing behavior i’ve inadvertently removed that’s going to make someone’s life miserable? 😅

thanks! looking forward to hearing any and all feedback from the community!

best,
kjellman

On Jan 3, 2018, at 8:08 AM, Michael Kjellman <mkjell...@internalcircle.com> wrote:

no, i’m not. i just figured i should target python 3.6 if i was doing this work in the first place. the current Ubuntu LTS was pulling in a pretty old version. any concerns with using 3.6?

On Jan 3, 2018, at 1:51 AM, Stefan Podkowinski <s...@apache.org> wrote:

The latest updates to your branch fixed the logging issue, thanks!
Tests now seem to execute fine locally using pytest. I was looking at the dockerfile and noticed that you explicitly use python 3.6 there. Are you aware of any issues with older python3 versions, e.g. 3.5? Do I have to use 3.6 locally as well, and do we have to do the same for jenkins?

On 02.01.2018 22:42, Michael Kjellman wrote:

I reproduced the NOTSET log issue locally... got a fix... i'll push a commit up in a moment.

On Jan 2, 2018, at 11:24 AM, Michael Kjellman <mkjell...@internalcircle.com> wrote:

Comments inline. Thanks for giving this a go!!

On Jan 2, 2018, at 6:10 AM, Stefan Podkowinski <s...@apache.org> wrote:

I was giving this a try today with some mixed results. First of all, running pytest locally would fail with a "ccmlib.common.ArgumentError: Unknown log level NOTSET" error for each test, although I created a new virtualenv for that as described in the readme (thanks for updating!) and used both your dtest and cassandra branches. But I haven't patched ccm as described in the ticket; maybe that's why? Can you publish a patched ccm branch to gh?

99% sure this is an issue parsing the logging level passed to pytest to the python logger... could you paste the exact command you're using to invoke pytest? should be a small change - i'm sure i just missed an invocation case.

The updated circle.yml is now using docker, which seems to be a good idea to reduce clutter in the yaml file and gives us more control over the test environment. Can you add the Dockerfile to the .circleci directory as well? I couldn't find it when I was trying to solve the pytest error mentioned above.

This is already tracked in a separate repo: https://github.com/mkjellman/cassandra-test-docker/blob/master/Dockerfile

Next thing I did was to push your trunk_circle branch to my gh repo to start a CircleCI run. Finishing all dtests in 15 minutes sounds exciting, but requires a paid tier plan to get that kind of parallelization.
Looks like the dtests have even been deliberately disabled for non-paid accounts, so I couldn't test this any further.

the plan of action (i already mentioned this in previous emails) is to get dtests working for the free circleci oss accounts as well. part of this work (already included in this pytest effort) is to have fixtures that look at the system resources and dynamically include tests as possible.

Running dtests from the pytest branch on builds.apache.org did not work either. At least the run_dtests.py arguments will need to be updated in cassandra-builds. We currently only use a single cassandra-dtest.sh script for all builds. Maybe we should create a new job template that would use an updated script with the wip-pytest dtest branch, to make this work and testable in parallel.

yes, i didn't touch cassandra-builds yet... focused on getting circleci and local runs working first... once we're happy with that and it's stable we can make the changes to the jenkins configs pretty easily...

On 21.12.2017 11:13, Michael Kjellman wrote:

I just created https://issues.apache.org/jira/browse/CASSANDRA-14134 which includes tons of details (and a patch available for review) covering my efforts to migrate the dtests from nosetest to pytest (which ultimately ended up also including porting the code from python 2.7 to python 3). I'd love if people could pitch in in any way to help get this reviewed and committed so we can reduce the natural drift that will occur with a huge patch like this against the changes going into master. I apologize for sending this so close to the holidays, but I really have been working non-stop trying to get things into a completed and stable state. The latest CircleCI runs I did took roughly 15 minutes to run all the dtests with only 6 failures remaining (when run with vnodes) and 12 failures remaining (when run without vnodes).
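As an illustration of the resource-aware fixtures mentioned above (dynamically including tests based on system resources), a conftest.py could detect total RAM and skip `@pytest.mark.resource_intensive` tests on underpowered hosts. This is a hedged sketch, not the branch's actual code: the 32 GB threshold and all names here are illustrative assumptions.

```python
import os

# Assumed cutoff for running resource_intensive tests (illustrative value).
MIN_MEM_BYTES = 32 * 1024 ** 3

def total_memory_bytes():
    """Best-effort total RAM via POSIX sysconf; returns 0 if undetectable."""
    try:
        page = os.sysconf("SC_PAGE_SIZE")
        pages = os.sysconf("SC_PHYS_PAGES")
    except (ValueError, OSError, AttributeError):
        return 0  # non-POSIX platform or unknown sysconf name
    if page <= 0 or pages <= 0:
        return 0
    return page * pages

def can_run_resource_intensive():
    return total_memory_bytes() >= MIN_MEM_BYTES

# In conftest.py, a collection hook would then look roughly like:
#
# def pytest_collection_modifyitems(config, items):
#     if can_run_resource_intensive():
#         return
#     skip = pytest.mark.skip(reason="insufficient system resources")
#     for item in items:
#         if item.get_closest_marker("resource_intensive"):
#             item.add_marker(skip)
```

A fixture like this is what lets the same test tree run on a beefy CI box and a free-tier container without hand-maintained exclude lists.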
For comparison, the last ASF Jenkins dtest job to successfully complete took nearly 10 hours (9:51) and had 36 test failures. Of note, while working on this and trying to determine a baseline for the existing tests, I found that the ASF Jenkins jobs were incorrectly configured due to a typo: the no-vnodes job is actually running with vnodes (meaning the no-vnodes job is identical to the with-vnodes ASF Jenkins job). There are some bootstrap tests that will 100% reliably hang both nosetest and pytest on test cleanup; however, these tests only run in the no-vnodes configuration. I've debugged and fixed a lot of these cases across many test cases over the past few weeks and I no longer know of any tests that can hang CI.

Thanks, and I'm optimistic about making testing great for the project and most importantly for the OSS C* community!

best,
kjellman

Some highlights that I quickly thought of (in no particular order): {also included in the JIRA}
-Migrate dtests from executing using the nosetest framework to pytest
-Port the entire code base from Python 2.7 to Python 3.6
-Update run_dtests.py to work with pytest
-Add a --dtest-print-tests-only option to run_dtests.py to get an easily parsable list of all available collected tests
-Update README.md for executing the dtests with pytest
-Add a new debugging tips section to README.md to help with some basics of debugging python3 and pytest
-Migrate all existing environment variable usage (as a means to control dtest operation modes) to argparse command line options, with documented help on each toggle's intended usage
-Migrate the old unittest- and nose-based test structure to the modern pytest fixture approach
-Automatically detect physical system resources to determine if @pytest.mark.resource_intensive annotated tests should be collected and run on the system where they are being executed
-New pytest fixture replacements for the @since and @pytest.mark.upgrade_test annotations
-Migrate to the python logging framework
-Upgrade thrift bindings to the latest version with full python3 compatibility
-Remove deprecated cql and pycassa dependencies and migrate any remaining tests to fully remove those dependencies
-Fix dozens of tests that would hang the pytest framework forever when run in CI environments
-Ran the code nearly 300 times in CircleCI during the migration to find, identify, and fix any tests capable of hanging CI
-Upgrade tests do not yet run in CI and still need additional migration work (although all upgrade test classes compile successfully)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
---------------------------------------------------------------------
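The "@since" fixture replacement listed in the highlights could be sketched roughly as follows. This is an assumption-laden illustration, not the branch's actual implementation: the marker name, the `--cassandra-version` option, and the version helpers are all hypothetical.

```python
def parse_version(v):
    """'3.11.1' -> (3, 11, 1); ignores non-numeric characters in each part."""
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)

def version_satisfies(current, required):
    """True if the cluster version under test meets the test's minimum."""
    return parse_version(current) >= parse_version(required)

# In conftest.py, an autouse fixture could then gate tests marked with an
# assumed @pytest.mark.since("X.Y") marker, roughly like:
#
# @pytest.fixture(autouse=True)
# def since_check(request):
#     marker = request.node.get_closest_marker("since")
#     if marker and not version_satisfies(
#             request.config.getoption("--cassandra-version"), marker.args[0]):
#         pytest.skip("requires Cassandra >= %s" % marker.args[0])
```

Moving the check into a fixture means the skip shows up in pytest's normal skip reporting instead of being hidden inside a decorator, which fits the debuggability goals described earlier in the thread.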