another thought is to have the sufficient_system_resources_for_resource_intensive_tests fixture dynamically figure out the number of threads to run stress with. it seems reasonable that we should significantly lower our concurrency when we're resource constrained.
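something like this rough sketch is what i have in mind (psutil and the
thresholds here are illustrative guesses, not code from the branch):

    # rough sketch: derive a stress thread count from what the host actually
    # has instead of hardcoding one (the 8GB threshold is made up)
    import multiprocessing

    import psutil  # assumed available in the dtest environment
    import pytest


    @pytest.fixture(scope="session")
    def stress_thread_count():
        cpus = multiprocessing.cpu_count()
        available_gb = psutil.virtual_memory().available / 1024 ** 3
        if available_gb < 8:
            # resource constrained (e.g. a circleci medium container):
            # back way off on concurrency
            return max(1, cpus // 2)
        return cpus

tests that invoke stress could then take stress_thread_count as a fixture
argument and pass it through as -rate threads=N instead of a fixed count.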
> On Jan 10, 2018, at 1:53 PM, Michael Kjellman <mkjell...@internalcircle.com> wrote:
>
> i had done some limited testing on the medium size and didn't see quite as
> bad behavior as you were seeing... :\
>
> i added a test fixture
> (sufficient_system_resources_for_resource_intensive_tests) that currently
> just does a very basic free memory check and deselects tests annotated with
> @pytest.mark.resource_intensive if the current system doesn't have enough
> resources.
>
> my short/medium term thinking was that we could expand on this and
> dynamically skip tests for whatever physical resource constraints we're
> working with -- with the ultimate goal of dynamically running as many tests
> reliably as possible given what we have.
>
> any chance you'd mind changing your circleci config to set CCM_MAX_HEAP_SIZE
> under resource_constrained_env_vars to 769MB and kicking off another run to
> get us a baseline? i see a ton of the failures were from tests that run
> stress to pre-fill the cluster for the test... do you know if we have a way
> to control the heap settings of stress when it's invoked via ccm.node as we
> do in the dtests?
>
> On Jan 10, 2018, at 1:04 PM, Stefan Podkowinski <s...@apache.org> wrote:
>
> I was giving this another try today to see how long it would take to finish
> on an OSS account, but I've canceled the job after some hours as tests
> started to fail almost constantly.
>
> https://circleci.com/gh/spodkowinski/cassandra/176
>
> Looks like the 2CPU/4096MB (medium) limit for each container isn't really
> adequate for dtests. Yours seem to be running on xlarge.
>
> On 10.01.18 21:05, Michael Kjellman wrote:
>
> plan of action is to continue running everything on asf jenkins.
>
> in addition, all developers (just like today) will be free to run the unit
> tests and as many of the dtests as possible against their local test
> branches in circleci. circleci offers a free OSS account with 4 containers.
> while it will be slow, it will run. additionally, anyone who wants more
> speed is obviously free to upgrade their account.
>
> does that plan resolve any concerns you have?
>
> On Jan 10, 2018, at 12:01 PM, Josh McKenzie <jmcken...@apache.org> wrote:
>
>> 1) have *all* our tests run on *every* commit
>
> Have we discussed the cost / funding aspect of this? I know we as a project
> have run into infra-donation cost issues in the past with differentiating
> between ASF as a whole and cassandra as a project, so I'm not sure how
> that'd work in terms of sponsors funding circleci containers just for this
> project's use, for instance.
>
> This is a huge improvement in runtime (understatement of the day award...),
> so great work on that front.
>
> On Tue, Jan 9, 2018 at 11:04 PM, Nate McCall <zznat...@gmail.com> wrote:
>
> Making these tests more accessible and reliable is super huge. There are a
> lot of folks in our community who are not well versed in python (myself
> included). I wholly support *any* efforts we can make to keep the dtest
> process easy.
>
> Thanks a bunch for taking this on. I think it will pay off quickly.
>
> On Wed, Jan 10, 2018 at 4:55 PM, Michael Kjellman <kjell...@apple.com> wrote:
>
> hi!
>
> a few of us have been continuously iterating on the dtest-on-pytest branch
> now since the 2nd and we've run the dtests close to 600 times in ci. ariel
> has been working his way through a formal review (three cheers for ariel!)
>
> flaky tests are a real thing, and despite a few dozen totally green test
> runs, the vast majority of runs are still reliably hitting roughly 1-3 test
> failures. in a world where we can now run the dtests in 20 minutes instead
> of 13 hours, it's now at least possible to keep finding these flaky tests
> and fixing them one by one...
>
> i haven't gotten a huge amount of feedback overall and i really want to
> hear it! ultimately this work is driven by the desire to 1) have *all* our
> tests run on *every* commit; 2) be able to trust the results; 3) make our
> testing story so amazing that even the most casual weekend warrior who
> wants to work on the project can (and will want to!) use it.
>
> i'm *not* a python guy (although luckily i know and work with many who
> are). thankfully i've been able to defer to them for much of this largely
> python-based effort... i'm sure there are a few more people working on the
> project who do consider themselves python experts, and i'd especially
> appreciate your feedback!
>
> finally, a lot of my effort was focused on improving the end user's
> experience (getting bootstrapped, running the tests, improving the
> debuggability story, etc). i'd really appreciate it if people could try
> running the pytest branch and following the install instructions to figure
> out what could be improved on. any existing behavior i've inadvertently
> removed that's going to make someone's life miserable? 😅
>
> thanks! looking forward to hearing any and all feedback from the community!
>
> best,
> kjellman
>
> On Jan 3, 2018, at 8:08 AM, Michael Kjellman <mkjell...@internalcircle.com> wrote:
>
> no, i'm not. i just figured i should target python 3.6 if i was doing this
> work in the first place. the current Ubuntu LTS was pulling in a pretty old
> version. any concerns with using 3.6?
>
> On Jan 3, 2018, at 1:51 AM, Stefan Podkowinski <s...@apache.org> wrote:
>
> The latest updates to your branch fixed the logging issue, thanks! Tests
> now seem to execute fine locally using pytest.
>
> I was looking at the dockerfile and noticed that you explicitly use python
> 3.6 there. Are you aware of any issues with older python3 versions, e.g.
> 3.5? Do I have to use 3.6 locally as well, and do we have to do the same
> for jenkins?
>
> On 02.01.2018 22:42, Michael Kjellman wrote:
>
> I reproduced the NOTSET log issue locally... got a fix... i'll push a
> commit up in a moment.
>
> On Jan 2, 2018, at 11:24 AM, Michael Kjellman <mkjell...@internalcircle.com> wrote:
>
> Comments inline. Thanks for giving this a go!!
>
> On Jan 2, 2018, at 6:10 AM, Stefan Podkowinski <s...@apache.org> wrote:
>
>> I was giving this a try today with some mixed results. First of all,
>> running pytest locally would fail with a "ccmlib.common.ArgumentError:
>> Unknown log level NOTSET" error for each test, even though I created a new
>> virtualenv for it as described in the readme (thanks for updating!) and
>> used both your dtest and cassandra branches. But I haven't patched ccm as
>> described in the ticket; maybe that's why? Can you publish a patched ccm
>> branch to gh?
>
> 99% sure this is an issue passing the logging level given to pytest through
> to the python logger... could you paste the exact command you're using to
> invoke pytest? should be a small change - i'm sure i just missed an
> invocation case.
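The fix being described amounts to normalizing the level string before it
ever reaches ccm. A minimal sketch, assuming a hypothetical helper named
parse_log_level (this is not the actual code that landed on the branch):

    import logging
    from typing import Optional


    def parse_log_level(value: Optional[str]) -> int:
        # hypothetical helper: an unset level -- or the literal "NOTSET" an
        # unconfigured logger reports -- falls back to INFO instead of being
        # handed to ccm, which rejects "NOTSET" as an unknown log level
        if not value or value.upper() == "NOTSET":
            return logging.INFO
        return getattr(logging, value.upper(), logging.INFO)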
>
>> The updated circle.yml is now using docker, which seems to be a good idea
>> to reduce clutter in the yaml file and gives us more control over the test
>> environment. Can you add the Dockerfile to the .circleci directory as
>> well? I couldn't find it when I was trying to solve the pytest error
>> mentioned above.
>
> this is already tracked in a separate repo:
> https://github.com/mkjellman/cassandra-test-docker/blob/master/Dockerfile
>
>> Next thing I did was to push your trunk_circle branch to my gh repo to
>> start a circleCI run. Finishing all dtests in 15 minutes sounds exciting,
>> but requires a paid tier plan to get that kind of parallelization. Looks
>> like the dtests have even been deliberately disabled for non-paid
>> accounts, so I couldn't test this any further.
>
> the plan of action (i already mentioned this in previous emails) is to get
> dtests working for the free circleci oss accounts as well. part of this
> work (already included in this pytest effort) is to have fixtures that look
> at the system resources and dynamically include as many tests as possible
> (sketched below).
>
>> Running dtests from the pytest branch on builds.apache.org did not work
>> either. At least the run_dtests.py arguments will need to be updated in
>> cassandra-builds. We currently only use a single cassandra-dtest.sh script
>> for all builds. Maybe we should create a new job template that would use
>> an updated script with the wip-pytest dtest branch, to make this work and
>> testable in parallel.
>
> yes, i didn't touch cassandra-builds yet... focused on getting circleci and
> local runs working first... once we're happy with that and stable, we can
> make the changes to jenkins configs pretty easily...
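The "fixtures that look at the system resources" mentioned above boil down
to gating test collection on a resource check. A minimal sketch of the idea
as a conftest.py collection hook, with an illustrative threshold (the
branch's actual fixture may work differently):

    # conftest.py -- illustrative sketch, not the branch's implementation
    import psutil  # assumed available in the dtest environment

    MIN_AVAILABLE_MEMORY_GB = 16  # made-up threshold for illustration


    def sufficient_system_resources_for_resource_intensive_tests():
        available = psutil.virtual_memory().available
        return available >= MIN_AVAILABLE_MEMORY_GB * 1024 ** 3


    def pytest_collection_modifyitems(config, items):
        # on small machines, deselect @pytest.mark.resource_intensive tests
        # so the rest of the suite still runs reliably
        if sufficient_system_resources_for_resource_intensive_tests():
            return
        kept, dropped = [], []
        for item in items:
            if item.get_closest_marker("resource_intensive"):
                dropped.append(item)
            else:
                kept.append(item)
        if dropped:
            config.hook.pytest_deselected(items=dropped)
            items[:] = kept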
>
> On 21.12.2017 11:13, Michael Kjellman wrote:
>
> I just created https://issues.apache.org/jira/browse/CASSANDRA-14134, which
> includes tons of details (and a patch available for review) on my efforts
> to migrate the dtests from nosetest to pytest (which ultimately ended up
> also including porting the code from python 2.7 to python 3).
>
> I'd love it if people could pitch in in any way to help get this reviewed
> and committed, so we can reduce the natural drift that will occur with a
> huge patch like this against the changes going into master. I apologize for
> sending this so close to the holidays, but I really have been working
> non-stop trying to get things into a completed and stable state.
>
> The latest CircleCI runs I did took roughly 15 minutes to run all the
> dtests, with only 6 failures remaining (when run with vnodes) and 12
> failures remaining (when run without vnodes). For comparison, the last ASF
> Jenkins dtest job to successfully complete took nearly 10 hours (9:51) and
> had 36 test failures. Of note, while I was working on this and trying to
> determine a baseline for the existing tests, I found that the ASF Jenkins
> jobs were incorrectly configured due to a typo: the no-vnodes job is
> actually running with vnodes (meaning the no-vnodes job is identical to the
> with-vnodes ASF Jenkins job). There are some bootstrap tests that will 100%
> reliably hang both nosetest and pytest on test cleanup; however, these
> tests only run in the no-vnodes configuration. I've debugged and fixed a
> lot of these cases across many test cases over the past few weeks, and I no
> longer know of any tests that can hang CI.
>
> Thanks, and I'm optimistic about making testing great for the project and,
> most importantly, for the OSS C* community!
>
> best,
> kjellman
>
> Some highlights that I quickly thought of (in no particular order):
> {also included in the JIRA}
> - Migrate dtests from executing using the nosetest framework to pytest
> - Port the entire code base from Python 2.7 to Python 3.6
> - Update run_dtests.py to work with pytest
> - Add a --dtest-print-tests-only option to run_dtests.py to get an easily
>   parsable list of all available collected tests
> - Update README.md for executing the dtests with pytest
> - Add a new debugging tips section to README.md to help with some basics of
>   debugging python3 and pytest
> - Migrate all existing environment variable usage as a means to control
>   dtest operation modes to argparse command line options, with documented
>   help on each toggle's intended usage
> - Migrate the old unittest and nose based test structure to the modern
>   pytest fixture approach
> - Automatically detect physical system resources to determine if
>   @pytest.mark.resource_intensive annotated tests should be collected and
>   run on the system where they are being executed
> - New pytest fixture replacements for the @since and
>   @pytest.mark.upgrade_test annotations (see the sketch below)
> - Migrate to the python logging framework
> - Upgrade thrift bindings to the latest version with full python3
>   compatibility
> - Remove the deprecated cql and pycassa dependencies and migrate any
>   remaining tests to fully remove those dependencies
> - Fixed dozens of tests that would hang the pytest framework forever when
>   run in CI environments
> - Ran the code nearly 300 times in CircleCI during the migration to find,
>   identify, and fix any tests capable of hanging CI
> - Upgrade tests do not yet run in CI and still need additional migration
>   work (although all upgrade test classes compile successfully)
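To illustrate the fixture replacement for the old @since decorator mentioned
in the highlights, here is a minimal self-contained sketch; the names
(fixture_since, cassandra_version) and the mechanics are assumptions for
illustration, not necessarily what the branch implements:

    import pytest
    from distutils.version import LooseVersion


    @pytest.fixture(scope="session")
    def cassandra_version():
        # stand-in: the real suite would derive this from the cluster under test
        return "3.11.1"


    @pytest.fixture(autouse=True)
    def fixture_since(request, cassandra_version):
        # skip any test marked @pytest.mark.since("X.Y") when the version
        # under test is older than X.Y
        marker = request.node.get_closest_marker("since")
        if marker and LooseVersion(cassandra_version) < LooseVersion(marker.args[0]):
            pytest.skip("requires Cassandra >= {}".format(marker.args[0]))


    @pytest.mark.since("4.0")
    def test_new_feature():
        # collected everywhere, but skipped on clusters older than 4.0
        assert True

A test opts in with the marker instead of the old decorator, and the skip
decision is made at setup time against the version actually under test.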
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org