i had done some limited testing on the medium size and didn't see behavior quite as bad as you were seeing... :\
i added a test fixture (sufficient_system_resources_for_resource_intensive_tests) that currently does a very basic free-memory check and deselects tests marked with @pytest.mark.resource_intensive if the current system doesn't have enough resources. my short/medium term thinking was that we could expand on this and dynamically skip tests for whatever physical resource constraints we're working with -- with the ultimate goal of dynamically running as many tests reliably as possible given what we have.

any chance you'd mind changing your circleci config to set CCM_MAX_HEAP_SIZE under resource_constrained_env_vars to 769MB and kicking off another run to get us a baseline? i see a ton of the failures were from tests that run stress to pre-fill the cluster for the test.. do you know if we have a way to control the heap settings of stress when it's invoked via ccm.node as we do in the dtests?

On Jan 10, 2018, at 1:04 PM, Stefan Podkowinski <s...@apache.org> wrote:

I was giving this another try today to see how long it would take to finish on an OSS account. But I've canceled the job after some hours as tests started to fail almost constantly.

https://circleci.com/gh/spodkowinski/cassandra/176

Looks like the 2CPU/4096MB (medium) limit for each container isn't really adequate for dtests. Yours seem to be running on xlarge.

On 10.01.18 21:05, Michael Kjellman wrote:

the plan of action is to continue running everything on asf jenkins. additionally, all developers (just like today) will be free to run the unit tests and as many of the dtests as possible against their local test branches in circleci. circleci offers a free OSS account with 4 containers. while it will be slow, it will run. additionally, anyone who wants more speed is obviously free to upgrade their account.

does that plan resolve any concerns you have?
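[editor's note: the deselection the sufficient_system_resources_for_resource_intensive_tests fixture performs, as described above, can be sketched roughly like this -- a hypothetical illustration only; the threshold, hook shape, and helper name are assumptions, not code from the branch:]

```python
import pytest

# Assumed threshold, purely for illustration.
MIN_AVAILABLE_MEM_BYTES = 9 * 1024 ** 3

def has_sufficient_resources(available_bytes, required_bytes=MIN_AVAILABLE_MEM_BYTES):
    # Pure predicate so the policy itself is trivially unit-testable.
    return available_bytes >= required_bytes

def pytest_collection_modifyitems(config, items):
    # The real fixture would measure actual free memory (e.g. via psutil);
    # here we read an injected value to keep the sketch self-contained.
    available = getattr(config, "available_memory_bytes", 0)
    if has_sufficient_resources(available):
        return
    skip_marker = pytest.mark.skip(reason="insufficient system resources")
    for item in items:
        if item.get_closest_marker("resource_intensive"):
            item.add_marker(skip_marker)
```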
On Jan 10, 2018, at 12:01 PM, Josh McKenzie <jmcken...@apache.org> wrote:

> 1) have *all* our tests run on *every* commit

Have we discussed the cost / funding aspect of this? I know we as a project have run into infra-donation cost issues in the past with differentiating between ASF as a whole and cassandra as a project, so I'm not sure how that'd work in terms of sponsors funding circleci containers just for this project's use, for instance.

This is a huge improvement in runtime (understatement of the day award...), so great work on that front.

On Tue, Jan 9, 2018 at 11:04 PM, Nate McCall <zznat...@gmail.com> wrote:

Making these tests more accessible and reliable is super huge. There are a lot of folks in our community who are not well versed in python (myself included). I wholly support *any* efforts we can make for the dtest process to be easy. Thanks a bunch for taking this on. I think it will pay off quickly.

On Wed, Jan 10, 2018 at 4:55 PM, Michael Kjellman <kjell...@apple.com> wrote:

hi! a few of us have been continuously iterating on the dtest-on-pytest branch since the 2nd, and we've run the dtests close to 600 times in ci. ariel has been working his way thru a formal review (three cheers for ariel!)

flaky tests are a real thing, and despite a few dozen totally green test runs, the vast majority of runs still reliably hit roughly 1-3 test failures. in a world where we can now run the dtests in 20 minutes instead of 13 hours, it's now at least possible to keep finding these flaky tests and fixing them one by one...

i haven't gotten a huge amount of feedback overall and i really want to hear it! ultimately this work is driven by the desire to 1) have *all* our tests run on *every* commit; 2) be able to trust the results; 3) make our testing story so amazing that even the most casual weekend warrior who wants to work on the project can (and will want to!) use it.

i'm *not* a python guy (although luckily i know and work with many who are).
thankfully i've been able to defer to them for much of this largely python-based effort... i'm sure there are a few more people working on the project who do consider themselves python experts, and i'd especially appreciate your feedback!

finally, a lot of my effort was focused on improving the end-user experience (getting bootstrapped, running the tests, improving the debuggability story, etc). i'd really appreciate it if people could try running the pytest branch and following the install instructions to figure out what could be improved. is there any existing behavior i've inadvertently removed that's going to make someone's life miserable? 😅

thanks! looking forward to hearing any and all feedback from the community!

best,
kjellman

On Jan 3, 2018, at 8:08 AM, Michael Kjellman <mkjell...@internalcircle.com> wrote:

no, i'm not. i just figured i should target python 3.6 if i was doing this work in the first place. the current Ubuntu LTS was pulling in a pretty old version. any concerns with using 3.6?

On Jan 3, 2018, at 1:51 AM, Stefan Podkowinski <s...@apache.org> wrote:

The latest updates to your branch fixed the logging issue, thanks! Tests now seem to execute fine locally using pytest.

I was looking at the dockerfile and noticed that you explicitly use python 3.6 there. Are you aware of any issues with older python3 versions, e.g. 3.5? Do I have to use 3.6 locally as well, and do we have to do the same for jenkins?

On 02.01.2018 22:42, Michael Kjellman wrote:

I reproduced the NOTSET log issue locally... got a fix.. i'll push a commit up in a moment.

On Jan 2, 2018, at 11:24 AM, Michael Kjellman <mkjell...@internalcircle.com> wrote:

Comments inline. Thanks for giving this a go!!

On Jan 2, 2018, at 6:10 AM, Stefan Podkowinski <s...@apache.org> wrote:

I was giving this a try today with some mixed results.
First of all, running pytest locally would fail with a "ccmlib.common.ArgumentError: Unknown log level NOTSET" error for each test, although I created a new virtualenv for that as described in the readme (thanks for updating!) and used both your dtest and cassandra branches. But I haven't patched ccm as described in the ticket; maybe that's why? Can you publish a patched ccm branch to gh?

99% sure this is an issue parsing the logging level passed from pytest to the python logger... could you paste the exact command you're using to invoke pytest? should be a small change - i'm sure i just missed an invocation case.

The updated circle.yml is now using docker, which seems to be a good idea to reduce clutter in the yaml file and gives us more control over the test environment. Can you add the Dockerfile to the .circleci directory as well? I couldn't find it when I was trying to solve the pytest error mentioned above.

this is already tracked in a separate repo: https://github.com/mkjellman/cassandra-test-docker/blob/master/Dockerfile

Next thing I did was to push your trunk_circle branch to my gh repo to start a circleCI run. Finishing all dtests in 15 minutes sounds exciting, but requires a paid tier plan to get that kind of parallelization. Looks like the dtests have even been deliberately disabled for non-paid accounts, so I couldn't test this any further.

the plan of action (as i already mentioned in previous emails) is to get dtests working for the free circleci oss accounts as well. part of this work (already included in this pytest effort) is to have fixtures that look at the system resources and dynamically include tests as possible.

Running dtests from the pytest branch on builds.apache.org did not work either. At least the run_dtests.py arguments will need to be updated in cassandra-builds. We currently only use a single cassandra-dtest.sh script for all builds.
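[editor's note: the "Unknown log level NOTSET" error above is consistent with reading an unconfigured logger's own level attribute, which defaults to 0 ("NOTSET"), and forwarding that name to ccm. A speculative sketch of a defensive conversion -- the function name and fallback default are assumptions, not the actual fix:]

```python
import logging

def level_name_for_ccm(logger, default="INFO"):
    # An unconfigured logger's own level is NOTSET (0); logging.getLevelName
    # turns that into the string "NOTSET", which ccm rejects. Fall back to a
    # sane default instead of forwarding it.
    name = logging.getLevelName(logger.level)
    return default if name == "NOTSET" else name
```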
Maybe we should create a new job template that would use an updated script with the wip-pytest dtest branch, to make this work testable in parallel.

yes, i didn't touch cassandra-builds yet.. focused on getting circleci and local runs working first... once we're happy with that and it's stable, we can make the changes to the jenkins configs pretty easily...

On 21.12.2017 11:13, Michael Kjellman wrote:

I just created https://issues.apache.org/jira/browse/CASSANDRA-14134, which includes tons of details (and a patch available for review) on my efforts to migrate the dtests from nosetest to pytest (which ultimately ended up also including porting the code from python 2.7 to python 3). I'd love it if people could pitch in in any way to help get this reviewed and committed, so we can reduce the natural drift that will occur with a huge patch like this against the changes going into master.

I apologize for sending this so close to the holidays, but I really have been working non-stop trying to get things into a completed and stable state.

The latest CircleCI runs I did took roughly 15 minutes to run all the dtests, with only 6 failures remaining (when run with vnodes) and 12 failures remaining (when run without vnodes). For comparison, the last ASF Jenkins Dtest job to successfully complete took nearly 10 hours (9:51) and had 36 test failures.

Of note, while I was working on this and trying to determine a baseline for the existing tests, I found that the ASF Jenkins jobs were incorrectly configured due to a typo: the no-vnodes job is actually running with vnodes (meaning the no-vnodes job is identical to the with-vnodes ASF Jenkins job). There are some bootstrap tests that will 100% reliably hang both nosetest and pytest on test cleanup; however, these tests only run in the no-vnodes configuration. I've debugged and fixed a lot of these cases across many test cases over the past few weeks, and I no longer know of any tests that can hang CI.
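[editor's note: one generic way to keep a hung test from stalling CI forever is to fence it with a timeout. A stdlib-only sketch of the idea, illustrative only -- not the mechanism used in the branch, where a plugin such as pytest-timeout would be the more idiomatic choice:]

```python
import threading

def run_with_timeout(fn, timeout_s):
    """Run fn in a daemon thread; return (result, finished_in_time)."""
    outcome = {}

    def target():
        outcome["value"] = fn()

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        # The worker is abandoned (it's a daemon thread), so a hung
        # test body cannot keep the process alive indefinitely.
        return None, False
    return outcome.get("value"), True
```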
Thanks, and I'm optimistic about making testing great for the project and, most importantly, for the OSS C* community!

best,
kjellman

Some highlights that I quickly thought of (in no particular order) {also included in the JIRA}:

- Migrate dtests from executing using the nosetest framework to pytest
- Port the entire code base from Python 2.7 to Python 3.6
- Update run_dtests.py to work with pytest
- Add --dtest-print-tests-only option to run_dtests.py to get an easily parsable list of all available collected tests
- Update README.md for executing the dtests with pytest
- Add new debugging tips section to README.md to help with some basics of debugging python3 and pytest
- Migrate all existing environment variable usage as a means to control dtest operation modes to argparse command line options, with documented help on each toggle's intended usage
- Migration of the old unittest and nose based test structure to a modern pytest fixture approach
- Automatic detection of physical system resources to determine if @pytest.mark.resource_intensive annotated tests should be collected and run on the system where they are being executed
- New pytest fixture replacements for the @since and @pytest.mark.upgrade_test annotations
- Migration to the python logging framework
- Upgrade thrift bindings to the latest version with full python3 compatibility
- Remove deprecated cql and pycassa dependencies and migrate any remaining tests to fully remove those dependencies
- Fixed dozens of tests that would hang the pytest framework forever when run in CI environments
- Ran the code nearly 300 times in CircleCI during the migration to find, identify, and fix any tests capable of hanging CI
- Upgrade tests do not yet run in CI and still need additional migration work (although all upgrade test classes compile successfully)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional
commands, e-mail: dev-h...@cassandra.apache.org
---------------------------------------------------------------------