> we should at least write extensive documentation on how to use/modify in-jvm
> dtest framework before deprecating python dtests.

We should have this for all our testing frameworks, period: in-jvm dtest, python dtest, and ccm. They're woefully under-documented IMO.

On Tue, Mar 29, 2022, at 6:11 AM, Paulo Motta wrote:
> To elaborate a bit on the steep learning curve point, when mentoring new
> contributors on a couple of occasions I told them to "just write a python
> dtest" because we had no idea how to test that functionality on in-jvm
> tests while the python dtest was fairly straightforward to implement (I can't
> recall exactly which feature it was but I can dig if necessary).
>
> While we might be already familiar with the in-jvm dtest framework due to our
> exposure to it, we shouldn't neglect that there is a significant learning
> curve associated with it for new contributors which IMO is much lower for
> python dtests. So we should at least write extensive documentation on how to
> use/modify in-jvm dtest framework before deprecating python dtests.
>
> On Tue, 29 Mar 2022 at 06:58, Paulo Motta <pauloricard...@gmail.com>
> wrote:
>> > They use a separate implementation of instance initialization and thus
>> > they test the test server rather than the real node.
>>
>> I also have this concern. When adding a new service on CASSANDRA-16789 we
>> had to explicitly modify the in-jvm dtest server to match the behavior from
>> the actual server [1] (this is just a minor example but I remember having to
>> do something similar on other tickets).
>>
>> Besides having a steep learning curve since users need to be familiar with
>> the in-jvm dtest framework in order to add new functionality not supported
>> by it, this is potentially unsafe, since the implementations can diverge
>> without being caught by tests.
>>
>> Is there any way we could avoid duplicating functionality on the test server
>> and use the same initialization code on in-jvm dtests?
>>
>> [1] -
>> https://github.com/apache/cassandra/commit/ad249424814836bd00f47931258ad58bfefb24fd#diff-321b52220c5bd0aaadf275a845143eb208c889c2696ba0d48a5fc880551131d8R735
>>
>> On Tue, 29 Mar 2022 at 04:22, Benjamin Lerer <ble...@apache.org>
>> wrote:
>>>> They use a separate implementation of instance initialization and thus
>>>> they test the test server rather than the real node.
>>>
>>> This is actually my main concern. What is the real gap between the in-JVM
>>> tests server instance and a server as run by python DTests?
>>>
>>> On Tue, 29 Mar 2022 at 00:08, bened...@apache.org <bened...@apache.org>
>>> wrote:
>>>> > Other than that, it can be problematic to test upgrades when the
>>>> > starting version must run with a different Java version than the end
>>>> > release
>>>>
>>>> python upgrade tests seem to be particularly limited (from a quick skim,
>>>> primarily testing major upgrade points that are now long in the past), so
>>>> I’m not sure how much of a penalty this is today in practice - but it
>>>> might well become a problem.
>>>>
>>>> There are several questions to answer, namely how many versions we want
>>>> to:
>>>>
>>>> - test upgrades across
>>>> - maintain backwards compatibility of the in-jvm dtest api across
>>>> - support a given JVM for
>>>>
>>>> However, if we need to, we can probably use RMI to transparently support
>>>> multiple JVMs for tests that require it. Since we already use
>>>> serialization to cross the ClassLoader boundary it might not even be very
>>>> difficult.
>>>>
>>>> From: Jacek Lewandowski <lewandowski.ja...@gmail.com>
>>>> Date: Monday, 28 March 2022 at 22:30
>>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>>>
>>>> Although I like in-jvm DTests for many scenarios, I can see that they do
>>>> not test the production code as it is. They use a separate implementation
>>>> of instance initialization and thus they test the test server rather than
>>>> the real node.
>>>> Other than that, it can be problematic to test upgrades
>>>> when the starting version must run with a different Java version than the
>>>> end release. One more thing I've been observing sometimes is high
>>>> consumption of metaspace, which does not seem to be cleaned after
>>>> individual test cases. Given each started instance uses a dedicated class
>>>> loader there is some amount of trash left and when there are a couple of
>>>> multi-node test cases in a single test class, it sometimes happens that
>>>> the tests fail with an out-of-memory error in metaspace.
>>>>
>>>> Thanks,
>>>> Jacek
>>>>
>>>> On Mon, Mar 28, 2022 at 10:06 PM David Capwell <dcapw...@apple.com>
>>>> wrote:
>>>>> I am back and the work for trunk to support vnode is at the last stage of
>>>>> review; I had not planned to backport the changes to other branches (aka,
>>>>> older branches would only support single token), so if someone would like
>>>>> to pick up this work it is rather low-hanging fruit after 17332 goes in
>>>>> (see trunk patch GH PR: trunk <https://github.com/apache/cassandra/pull/1432>).
>>>>>
>>>>> I am in favor of deprecating python dtests, and agree we should figure
>>>>> out what the gaps are (once vnode support is merged) so we can either
>>>>> shrink them or special case to unfreeze (such as startup changes being
>>>>> allowed).
>>>>>
>>>>>> On Mar 14, 2022, at 6:13 AM, Josh McKenzie <jmcken...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>> vnode support for in-jvm dtests is in flight and fairly
>>>>>> straightforward:
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-17332
>>>>>>
>>>>>> David's OOO right now but I suspect we can get this in sometime in
>>>>>> April.
>>>>>>
>>>>>> On Mon, Mar 14, 2022, at 8:36 AM, bened...@apache.org wrote:
>>>>>>> This is the limitation I mentioned. I think this is solely a question
>>>>>>> of supplying an initial config that uses vnodes, i.e. that specifies
>>>>>>> multiple tokens for each node. It is not really a limitation – I
>>>>>>> believe a dtest could be written today using vnodes, by overriding the
>>>>>>> config’s tokens. It does look like the token handling has been
>>>>>>> refactored since the initial implementation to make this a little
>>>>>>> uglier than should be necessary.
>>>>>>>
>>>>>>> We should make this trivial, anyway, and perhaps offer a way to run all
>>>>>>> of the dtests with vnodes (suitably annotating those that cannot be
>>>>>>> run with vnodes). This should be quite easy.
>>>>>>>
>>>>>>> From: Andrés de la Peña <adelap...@apache.org>
>>>>>>> Date: Monday, 14 March 2022 at 12:28
>>>>>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>>>>>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>>>>>>
>>>>>>> Last time I checked there wasn't support for vnodes on in-jvm dtests,
>>>>>>> which seems an important limitation.
>>>>>>>
>>>>>>> On Mon, 14 Mar 2022 at 12:24, bened...@apache.org <bened...@apache.org>
>>>>>>> wrote:
>>>>>>>> I am strongly in favour of deprecating python dtests in all cases
>>>>>>>> where they are currently superseded by in-jvm dtests. They are
>>>>>>>> environmentally more challenging to work with, causing many problems
>>>>>>>> on local and remote machines. They are harder to debug, slower,
>>>>>>>> flakier, and mostly less sophisticated.
>>>>>>>>
>>>>>>>> > all focus on getting the in-jvm framework robust enough to cover
>>>>>>>> > edge-cases
>>>>>>>>
>>>>>>>> Would be great to collect gaps. I think it’s just vnodes, which is by
>>>>>>>> no means a fundamental limitation?
>>>>>>>> There may also be some stuff to do with
>>>>>>>> startup/shutdown and environmental scripts, that may be a niche we
>>>>>>>> retain something like python dtests for.
>>>>>>>>
>>>>>>>> > people aren’t familiar
>>>>>>>>
>>>>>>>> I would be interested to hear from these folk to understand their
>>>>>>>> concerns or problems using in-jvm dtests, if there is a cohort holding
>>>>>>>> off for this reason.
>>>>>>>>
>>>>>>>> > This is going to require documentation work from some of the
>>>>>>>> > original authors
>>>>>>>>
>>>>>>>> I think a collection of template-like tests we can point people to
>>>>>>>> would be a cheap initial effort. Cutting and pasting an existing test
>>>>>>>> with the required functionality, then editing to suit, should get most
>>>>>>>> people who aren’t familiar off to a quick start.
>>>>>>>>
>>>>>>>> > Labor and process around revving new releases of the in-jvm dtest
>>>>>>>> > API
>>>>>>>>
>>>>>>>> I think we need to revisit how we do this, as it is currently broken.
>>>>>>>> We should consider either using ASF snapshots until we cut new
>>>>>>>> releases of C* itself, or else using git subprojects. This will also
>>>>>>>> become a problem for Accord’s integration over time, and perhaps other
>>>>>>>> subprojects in future, so it is worth better solving this.
>>>>>>>>
>>>>>>>> I think this has been made worse than necessary by moving too many
>>>>>>>> implementation details to the shared API project – some should be
>>>>>>>> retained within the C* tree, with the API primarily serving as the
>>>>>>>> shared API itself to ensure cross-version compatibility.
>>>>>>>> However, this
>>>>>>>> is far from a complete explanation of (or solution to) the problem.
>>>>>>>>
>>>>>>>> From: Josh McKenzie <jmcken...@apache.org>
>>>>>>>> Date: Monday, 14 March 2022 at 12:11
>>>>>>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>>>>>>> Subject: [DISCUSS] Should we deprecate / freeze python dtests
>>>>>>>>
>>>>>>>> I've been wrestling with the python dtests recently and that led to
>>>>>>>> some discussions with other contributors about whether we as a project
>>>>>>>> should be writing new tests in the python dtest framework or the
>>>>>>>> in-jvm framework. This discussion has come up tangentially on some
>>>>>>>> other topics, including the lack of documentation / expertise on the
>>>>>>>> in-jvm framework dis-incentivizing some folks from authoring new tests
>>>>>>>> there vs. the difficulty debugging and maintaining timer-based,
>>>>>>>> sleep-based non-deterministic python dtests, etc.
>>>>>>>>
>>>>>>>> I don't know of a place where we've formally discussed this and made a
>>>>>>>> project-wide call on where we expect new distributed tests to be
>>>>>>>> written; if I've missed an email about this someone please link on the
>>>>>>>> thread here (and stop reading! ;))
>>>>>>>>
>>>>>>>> At this time we don't specify a preference for where you write new
>>>>>>>> multi-node distributed tests on our "development/testing" portion of
>>>>>>>> the site and documentation:
>>>>>>>> https://cassandra.apache.org/_/development/testing.html
>>>>>>>>
>>>>>>>> The primary tradeoffs as I understand them for moving from
>>>>>>>> python-based multi-node testing to jdk-based are:
>>>>>>>>
>>>>>>>> Pros:
>>>>>>>> 1. Better debugging functionality (breakpoints, IDE integration,
>>>>>>>>    etc)
>>>>>>>> 2. Integration with simulator
>>>>>>>> 3. More deterministic runtime (anecdotally; python dtests _should_ be
>>>>>>>>    deterministic but in practice they prove to be very prone to
>>>>>>>>    environmental disruption)
>>>>>>>> 4. Test time visibility to internals of cassandra
>>>>>>>>
>>>>>>>> Cons:
>>>>>>>> 1. The framework is not as mature as the python dtest framework (some
>>>>>>>>    functionality missing)
>>>>>>>> 2. Labor and process around revving new releases of the in-jvm dtest
>>>>>>>>    API
>>>>>>>> 3. People aren't familiar with it yet and there's a learning curve
>>>>>>>>
>>>>>>>> So my bid here: I personally think we as a project should freeze
>>>>>>>> writing new tests in the python dtest framework and all focus on
>>>>>>>> getting the in-jvm framework robust enough to cover edge-cases that
>>>>>>>> might still be causing new tests to be written in the python
>>>>>>>> framework. This is going to require documentation work from some of
>>>>>>>> the original authors of the in-jvm framework as well as folks
>>>>>>>> currently familiar with it and effort from those of us not yet
>>>>>>>> intimately familiar with the API to get to know it, however I believe
>>>>>>>> the long-term benefits to the project will be well worth it.
>>>>>>>>
>>>>>>>> We could institute a pre-commit check that warns on a commit
>>>>>>>> increasing our raw count of python dtests to help provide
>>>>>>>> process-based visibility to this change in direction for the project's
>>>>>>>> testing.
>>>>>>>>
>>>>>>>> So: what do we think?