Re: [DISCUSS] Should we deprecate / freeze python dtests

Ekaterina Dimitrova Tue, 29 Mar 2022 11:12:39 -0700

One thing we can add to the docs is how to update the in-jvm framework and
test patches against it before asking for an in-jvm API release. I assume
not many such updates will be needed, but it is good to have this
documented.


On Tue, 29 Mar 2022 at 13:51, David Capwell <dcapw...@apple.com> wrote:

> They use a separate implementation of instance initialization and thus
> they test the test server rather than the real node.
>
>
> I think we can get rid of this by extending CassandraDaemon, just need to
> add a few hooks to mock out gossip/internode/client (for cases where the
> mocks are desired), and when mocks are not desired just run the real logic.
>
> Too many times I have had to bring the two more in line, and this is hard
> to maintain… we should fix this, and I feel it is 100% fixable.
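
To make the hook idea concrete, here is a minimal, language-neutral sketch
(written in Python only for brevity; the real change would be in Java). The
class and method names are hypothetical and are not taken from
CassandraDaemon. It assumes a single startup sequence that calls overridable
hooks, so a test variant replaces only the pieces it wants mocked and runs
the real logic for everything else.

```python
class Daemon:
    """Hypothetical stand-in for a node daemon with one shared startup path."""

    def __init__(self):
        self.started = []

    def start(self):
        # Single initialization sequence used by both production and tests.
        self.start_gossip()
        self.start_internode_messaging()
        self.start_client_transport()

    def start_gossip(self):
        self.started.append("gossip:real")

    def start_internode_messaging(self):
        self.started.append("internode:real")

    def start_client_transport(self):
        self.started.append("client:real")


class MockedGossipDaemon(Daemon):
    """Test variant: overrides only the gossip hook, inherits the rest."""

    def start_gossip(self):
        self.started.append("gossip:mock")


real = Daemon()
real.start()
mocked = MockedGossipDaemon()
mocked.start()
```

Because the test variant never reimplements `start()`, the startup order
cannot drift away from the real one without a test noticing.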
>
> we shouldn't neglect that there is a significant learning curve associated
> with it for new contributors which IMO is much lower for python dtests
>
>
> I am curious about this comment.  When I first joined I learned jvm-dtest
> within an hour and started walking Repair code in a debugger (and this was
> way before the improvements that let us do things like nodetool)… python
> dtest took weeks to get working correctly (still having issues with the
> MBean library we use… so have to comment out error handling to get some
> tests to pass)….
>
> Maybe we could have some example docs showing how to do the same in both
> tools?  Honestly Cluster.build(3).withConfig(c ->
> c.with(Feature.values())).start() matches 95% of python dtest tests (the
> withConfig logic is a bit cryptic), so I don’t think the docs would be too
> much work.
>
>
> On Mar 29, 2022, at 5:48 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>
> we should at least write extensive documentation on how to use/modify
> in-jvm dtest framework before deprecating python dtests.
>
> We should have this for all our testing frameworks period, in-jvm dtest,
> python dtest, and ccm. They're woefully under-documented IMO.
>
> On Tue, Mar 29, 2022, at 6:11 AM, Paulo Motta wrote:
>
> To elaborate a bit on the steep learning curve point, when mentoring new
> contributors on a couple of occasions I told them to "just write a python
> dtest" because we had no idea how to test that functionality in in-jvm
> tests, while the python dtest was fairly straightforward to implement (I
> can't recall exactly which feature it was, but I can dig if necessary).
>
> While we might be already familiar with the in-jvm dtest framework due to
> our exposure to it, we shouldn't neglect that there is a significant
> learning curve associated with it for new contributors which IMO is much
> lower for python dtests. So we should at least write extensive
> documentation on how to use/modify in-jvm dtest framework before
> deprecating python dtests.
>
> On Tue, 29 Mar 2022 at 06:58, Paulo Motta <
> pauloricard...@gmail.com> wrote:
>
> > They use a separate implementation of instance initialization and thus
> they test the test server rather than the real node.
>
> I also have this concern. When adding a new service on CASSANDRA-16789 we
> had to explicitly modify the in-jvm dtest server to match the behavior from
> the actual server [1] (this is just a minor example but I remember having
> to do something similar on other tickets).
>
> Besides having a steep learning curve since users need to be familiar with
> the in-jvm dtest framework in order to add new functionality not supported
> by it, this is potentially unsafe, since the implementations can diverge
> without being caught by tests.
>
> Is there any way we could avoid duplicating functionality on the test
> server and use the same initialization code on in-jvm dtests?
>
> [1] -
> https://github.com/apache/cassandra/commit/ad249424814836bd00f47931258ad58bfefb24fd#diff-321b52220c5bd0aaadf275a845143eb208c889c2696ba0d48a5fc880551131d8R735
>
> On Tue, 29 Mar 2022 at 04:22, Benjamin Lerer <ble...@apache.org>
> wrote:
>
> They use a separate implementation of instance initialization and thus
> they test the test server rather than the real node.
>
>
> This is actually my main concern. What is the real gap between the in-JVM
> tests server instance and a server as run by python DTests?
>
> On Tue, 29 Mar 2022 at 00:08, bened...@apache.org <bened...@apache.org>
> wrote:
>
> > Other than that, it can be problematic to test upgrades when the
> starting version must run with a different Java version than the end release
>
>
>
> python upgrade tests seem to be particularly limited (from a quick skim,
> primarily testing major upgrade points that are now long in the past), so
> I’m not sure how much of a penalty this is today in practice - but it might
> well become a problem.
>
>
>
> There are several questions to answer, namely how many versions we want to:
>
>
>
> - test upgrades across
>
> - maintain backwards compatibility of the in-jvm dtest api across
>
> - support a given JVM for
>
>
>
> However, if we need to, we can probably use RMI to transparently support
> multiple JVMs for tests that require it. Since we already use serialization
> to cross the ClassLoader boundary it might not even be very difficult.
>
>
>
>
>
> *From: *Jacek Lewandowski <lewandowski.ja...@gmail.com>
> *Date: *Monday, 28 March 2022 at 22:30
> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Subject: *Re: [DISCUSS] Should we deprecate / freeze python dtests
>
> Although I like in-jvm DTests for many scenarios, I can see that they do
> not test the production code as it is. They use a separate
> implementation of instance initialization and thus they test the test
> server rather than the real node. Other than that, it can be problematic to
> test upgrades when the starting version must run with a different Java
> version than the end release. One more thing I've sometimes observed is
> high metaspace consumption that does not seem to be cleaned up after
> individual test cases. Since each started instance uses a dedicated class
> loader, some garbage is left behind, and when a single test class contains
> a couple of multi-node test cases, the tests sometimes fail with an
> out-of-memory error in metaspace.
>
>
>
> Thanks,
>
> Jacek
>
>
>
> On Mon, Mar 28, 2022 at 10:06 PM David Capwell <dcapw...@apple.com> wrote:
>
> I am back and the work for trunk to support vnode is at the last stage of
> review; I had not planned to backport the changes to other branches (aka,
> older branches would only support single token), so if someone would like
> to pick up this work it is rather low-hanging fruit once CASSANDRA-17332
> goes in (see trunk patch GH PR: trunk
> <https://github.com/apache/cassandra/pull/1432>).
>
>
>
> I am in favor of deprecating python dtests, and agree we should figure out
> what the gaps are (once vnode support is merged) so we can either shrink
> them or special case to unfreeze (such as startup changes being allowed).
>
>
> On Mar 14, 2022, at 6:13 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>
>
>
> vnode support for in-jvm dtests is in flight and fairly straightforward:
>
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-17332
>
>
>
> David's OOO right now but I suspect we can get this in some time in April.
>
>
>
> On Mon, Mar 14, 2022, at 8:36 AM, bened...@apache.org wrote:
>
> This is the limitation I mentioned. I think this is solely a question of
> supplying an initial config that uses vnodes, i.e. that specifies multiple
> tokens for each node. It is not really a limitation – I believe a dtest
> could be written today using vnodes, by overriding the config’s tokens. It
> does look like the token handling has been refactored since the initial
> implementation to make this a little uglier than should be necessary.
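
As a rough illustration of "multiple tokens for each node", the sketch below
computes evenly spaced tokens over the Murmur3Partitioner range and groups
them per node as comma-separated strings, the form `initial_token` accepts in
cassandra.yaml. The helper name and the even, interleaved spacing are
assumptions for illustration only; how such a string is passed into the
in-jvm dtest config is deliberately not shown here.

```python
MIN_TOKEN = -(2 ** 63)   # lower bound of the Murmur3Partitioner token range
RING_SIZE = 2 ** 64      # number of distinct tokens in that range


def initial_tokens(node_count, tokens_per_node):
    """Return one comma-separated token string per node.

    Tokens are spaced evenly around the ring and dealt out round-robin,
    so each node's tokens are interleaved with those of the other nodes.
    """
    total = node_count * tokens_per_node
    per_node = [[] for _ in range(node_count)]
    for i in range(total):
        token = MIN_TOKEN + (i * RING_SIZE) // total
        per_node[i % node_count].append(token)
    return [",".join(str(t) for t in tokens) for tokens in per_node]


# Three nodes with four tokens each: twelve distinct tokens in total.
tokens = initial_tokens(node_count=3, tokens_per_node=4)
```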
>
>
>
> We should make this trivial, anyway, and perhaps offer a way to run all of
> the dtests with vnodes (and suitably annotating those that cannot be run
> with vnodes). This should be quite easy.
>
>
>
>
>
> *From: *Andrés de la Peña <adelap...@apache.org>
> *Date: *Monday, 14 March 2022 at 12:28
> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Subject: *Re: [DISCUSS] Should we deprecate / freeze python dtests
>
> Last time I checked there wasn't support for vnodes on in-jvm dtests,
> which seems an important limitation.
>
>
>
> On Mon, 14 Mar 2022 at 12:24, bened...@apache.org <bened...@apache.org>
> wrote:
>
> I am strongly in favour of deprecating python dtests in all cases where
> they are currently superseded by in-jvm dtests. They are environmentally
> more challenging to work with, causing many problems on local and remote
> machines. They are harder to debug, slower, flakier, and mostly less
> sophisticated.
>
>
>
> > all focus on getting the in-jvm framework robust enough to cover
> edge-cases
>
>
>
> Would be great to collect gaps. I think it’s just vnodes, which is by no
> means a fundamental limitation? There may also be some stuff to do with
> startup/shutdown and environmental scripts, which may be a niche we retain
> something like python dtests for.
>
>
>
> > people aren’t familiar
>
>
>
> I would be interested to hear from these folk to understand their concerns
> or problems using in-jvm dtests, if there is a cohort holding off for this
> reason
>
>
>
> > This is going to require documentation work from some of the original
> authors
>
>
>
> I think a collection of template-like tests we can point people to would
> be a cheap initial effort. Cutting and pasting an existing test with the
> required functionality, then editing to suit, should get most people who
> aren’t familiar off to a quick start.
>
>
>
> > Labor and process around revving new releases of the in-jvm dtest API
>
>
>
> I think we need to revisit how we do this, as it is currently broken. We
> should consider either using ASF snapshots until we cut new releases of C*
> itself, or else using git subprojects. This will also become a problem for
> Accord’s integration over time, and perhaps other subprojects in future, so
> it is worth better solving this.
>
>
>
> I think this has been made worse than necessary by moving too many
> implementation details to the shared API project – some should be retained
> within the C* tree, with the API primarily serving as the shared API itself
> to ensure cross-version compatibility. However, this is far from a complete
> explanation of (or solution to) the problem.
>
>
>
>
>
>
>
> *From: *Josh McKenzie <jmcken...@apache.org>
> *Date: *Monday, 14 March 2022 at 12:11
> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Subject: *[DISCUSS] Should we deprecate / freeze python dtests
>
> I've been wrestling with the python dtests recently and that led to some
> discussions with other contributors about whether we as a project should be
> writing new tests in the python dtest framework or the in-jvm framework.
> This discussion has come up tangentially on some other topics, including
> the lack of documentation / expertise on the in-jvm framework
> dis-incentivizing some folks from authoring new tests there vs. the
> difficulty debugging and maintaining timer-based, sleep-based
> non-deterministic python dtests, etc.
>
>
>
> I don't know of a place where we've formally discussed this and made a
> project-wide call on where we expect new distributed tests to be written;
> if I've missed an email about this someone please link on the thread here
> (and stop reading! ;))
>
>
>
> At this time we don't specify a preference for where you write new
> multi-node distributed tests on our "development/testing" portion of the
> site and documentation:
> https://cassandra.apache.org/_/development/testing.html
>
>
>
> The primary tradeoffs as I understand them for moving from python-based
> multi-node testing to jdk-based are:
>
> Pros:
>
>    1. Better debugging functionality (breakpoints, IDE integration, etc)
>    2. Integration with simulator
>    3. More deterministic runtime (anecdotally; python dtests _should_ be
>    deterministic but in practice they prove to be very prone to environmental
>    disruption)
>    4. Test time visibility to internals of cassandra
>
> Cons:
>
>    1. The framework is not as mature as the python dtest framework (some
>    functionality missing)
>    2. Labor and process around revving new releases of the in-jvm dtest
>    API
>    3. People aren't familiar with it yet and there's a learning curve
>
>
>
> So my bid here: I personally think we as a project should freeze writing
> new tests in the python dtest framework and all focus on getting the in-jvm
> framework robust enough to cover edge-cases that might still be causing new
> tests to be written in the python framework. This is going to require
> documentation work from some of the original authors of the in-jvm
> framework as well as folks currently familiar with it, and effort from those
> of us not yet intimately familiar with the API to get to know it. However, I
> believe the long-term benefits to the project will be well worth it.
>
>
>
> We could institute a pre-commit check that warns on a commit increasing
> our raw count of python dtests to help provide process-based visibility to
> this change in direction for the project's testing.
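
A minimal sketch of what such a check could look like, assuming a python
dtest is any function or method whose name starts with `test_` and that a
baseline count is stored somewhere by the project. The function names and
the warning text are hypothetical; wiring this into an actual pre-commit
hook (reading files from the repository, loading the baseline) is left out.

```python
import re

# Assumption: a python dtest is a def whose name starts with "test_".
TEST_DEF = re.compile(r"^\s*def\s+test_\w*\s*\(", re.MULTILINE)


def count_tests(sources):
    """Count test definitions across an iterable of Python source strings."""
    return sum(len(TEST_DEF.findall(src)) for src in sources)


def check(baseline_count, sources):
    """Return a warning string if the count exceeds the baseline, else None."""
    current = count_tests(sources)
    if current > baseline_count:
        return (
            f"WARNING: python dtest count rose from {baseline_count} "
            f"to {current}; consider writing new distributed tests "
            "as in-jvm dtests"
        )
    return None


# Two small source files: one test method, one test function, one helper.
sample = [
    "class TestRepair:\n    def test_simple(self):\n        pass\n",
    "def test_bootstrap():\n    pass\n\ndef helper():\n    pass\n",
]
```

Returning a message rather than failing keeps this a warning, matching the
intent of giving visibility without blocking commits.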
>
>
>
> So: what do we think?
>
>
>
