Folks,

Please share the summary of that Slack conversation here for the record once
you find common ground.

-
Denis


On Thu, Jul 2, 2020 at 3:22 AM Nikolay Izhikov <nizhi...@apache.org> wrote:

> Igniters.
>
> All who are interested in the integration testing framework discussion are
> welcome to join the Slack channel -
> https://join.slack.com/share/zt-fk2ovehf-TcomEAwiXaPzLyNKZbmfzw?cdn_fallback=2
>
>
>
> > On 2 July 2020, at 13:06, Anton Vinogradov <a...@apache.org> wrote:
> >
> > Max,
> > Thanks for joining us.
> >
> > > 1. tiden can deploy artifacts by itself, while ducktape relies on
> > > dependencies being deployed by external scripts.
> > No. It is important to distinguish development, deployment, and
> orchestration.
> > All-in-one solutions have extremely limited usability.
> > As to Ducktests:
> > Docker is responsible for deployments during development.
> > CI/CD is responsible for deployments during release and nightly checks.
> It's up to the team to choose AWS, VMs, bare metal, and even the OS.
> > Ducktape is responsible for orchestration.
> >
> > > 2. tiden can execute actions over remote nodes in real parallel
> fashion,
> > >while ducktape internally does all actions sequentially.
> > No. Ducktape may start any service in parallel. See the PME-free benchmark
> [1] for details.
> >
> > > if we used the ducktape solution, we would instead have to prepare some
> > > deployment scripts to pre-initialize Sberbank hosts, for example with
> > > Ansible or Chef.
> > Sure, because the way of deployment depends on the infrastructure.
> > How can we be sure that the OS we use and the restrictions we have will be
> compatible with Tiden?
> >
> > > You have solved this deficiency with docker by putting all dependencies
> > > into one uber-image ...
> > and
> > > I guess we all know about Docker's hyped ability to run over distributed
> > > virtual networks.
> > It is very important not to confuse test development (the Docker image
> you're talking about) with real deployment.
> >
> > > If we had stopped and started 5 nodes one-by-one, as ducktape does
> > All actions can be performed in parallel.
> > See how Ducktests [2] starts the cluster in parallel, for example.
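> > Roughly, the parallel start is just a fan-out over the cluster's nodes. A
> > minimal sketch only; start_node() below stands for whatever per-node start
> > routine the service defines, it is not the actual ducktests API:
> >
> >     from concurrent.futures import ThreadPoolExecutor
> >
> >     def start_cluster_in_parallel(nodes, start_node):
> >         # Submit every node start at once instead of looping one-by-one.
> >         with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
> >             futures = [pool.submit(start_node, node) for node in nodes]
> >         # The 'with' block waits for completion; re-raise the first failure.
> >         for f in futures:
> >             f.result()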
> >
> > [1]
> https://github.com/apache/ignite/pull/7967/files#diff-59adde2a2ab7dc17aea6c65153dfcda7R84
> > [2]
> https://github.com/apache/ignite/pull/7967/files#diff-d6a7b19f30f349d426b8894a40389cf5R79
> >
> > On Thu, Jul 2, 2020 at 1:00 PM Nikolay Izhikov <nizhi...@apache.org>
> wrote:
> > Hello, Maxim.
> >
> > > 1. tiden can deploy artifacts by itself, while ducktape relies on
> dependencies being deployed by external scripts
> >
> > Why do you think that maintaining deployment scripts coupled with the
> testing framework is an advantage?
> > I thought we wanted to keep deployment scripts separate from
> the testing framework.
> >
> > > 2. tiden can execute actions over remote nodes in real parallel
> fashion, while ducktape internally does all actions sequentially.
> >
> > Could you please clarify what actions you have in mind?
> > And why do we want to execute them concurrently?
> > Ignite node startup and client application execution can be done concurrently
> with the ducktape approach.
> >
> > > If we used the ducktape solution, we would instead have to prepare some
> deployment scripts to pre-initialize Sberbank hosts, for example with
> Ansible or Chef
> >
> > We shouldn’t take a single user's approach as an argument in this discussion.
> Let’s discuss a general approach for all users of Ignite. Anyway, what
> is wrong with the external deployment script approach?
> >
> > We, as a community, should provide several ways to run integration tests
> out-of-the-box AND the ability to customize deployment for the user's
> landscape.
> >
> > > You have solved this deficiency with Docker by putting all
> dependencies into one uber-image, and that looks like a simple and elegant
> solution; however, it effectively limits you to single-host testing.
> >
> > The Docker image should be used only by Ignite developers to test
> something locally.
> It’s not intended for real-world testing.
> >
> > The main issue I see with Tiden is that it has been tested and maintained as
> a closed-source solution.
> > This can lead to hard-to-solve problems when we start using and
> maintaining it as an open-source solution.
> > For example, how many developers have used Tiden? And how many of them were
> not authors of Tiden itself?
> >
> >
> > > On 2 July 2020, at 12:30, Max Shonichev <mshon...@yandex.ru> wrote:
> > >
> > > Anton, Nikolay,
> > >
> > > Let's agree on what we are arguing about: whether it is about "like or
> don't like" or about the technical properties of the suggested solutions.
> > >
> > > If it is about likes and dislikes, then the whole discussion is
> meaningless. However, I hope that together we can carefully analyse the pros
> and cons.
> > >
> > > As far as I can understand now, the two main differences between ducktape
> and tiden are that:
> > >
> > > 1. tiden can deploy artifacts by itself, while ducktape relies on
> dependencies being deployed by external scripts.
> > >
> > > 2. tiden can execute actions over remote nodes in real parallel
> fashion, while ducktape internally does all actions sequentially.
> > >
> > > As for me, these are very important properties for a distributed testing
> framework.
> > >
> > > The first property lets us easily reuse tiden in existing infrastructures.
> For example, during Zookeeper IEP testing at the Sberbank site we used the same
> tiden scripts that we use in our lab; the only change was putting a list of
> hosts into the config.
> > >
> > > If we used the ducktape solution, we would instead have to prepare some
> deployment scripts to pre-initialize Sberbank hosts, for example with
> Ansible or Chef.
> > >
> > >
> > > You have solved this deficiency with Docker by putting all
> dependencies into one uber-image, and that looks like a simple and elegant
> solution;
> > > however, it effectively limits you to single-host testing.
> > >
> > > I guess we all know about Docker's hyped ability to run over distributed
> virtual networks. We went that way once, but quickly found that it is more
> hype than real work. In real environments, there are problems with
> routing, DNS, multicast and broadcast traffic, and many others, that turn a
> Docker-based distributed solution into a fragile, hard-to-maintain monster.
> > >
> > > Please, if you believe otherwise, perform a run of your PoC over at
> least two physical hosts and share the results with us.
> > >
> > > If you consider one physical Docker host to be enough, please don't
> overlook that we want to run real-scale scenarios, with 50-100 cache
> groups, persistence enabled, and millions of keys loaded.
> > >
> > > The practical limit for such configurations is 4-6 nodes per single
> physical host. Otherwise, tests become flaky due to resource starvation.
> > >
> > > Please, if you believe otherwise, perform at least 10 runs of
> your PoC with other tests running on TC (we're targeting TeamCity, right?)
> and share the results so we can check whether the numbers are reproducible.
> > >
> > > I stress this once more: functional integration tests are OK to run in
> Docker and CI, but running benchmarks in Docker is a big NO GO.
> > >
> > >
> > > The second property lets us write tests that require truly parallel actions
> across hosts.
> > >
> > > For example, the agreed scenario for the PME benchmark during the "PME
> optimization stream" was as follows:
> > >
> > >  - 10 server nodes, preloaded with 1M keys
> > >  - 4 client nodes perform transactional load (client nodes physically
> separated from server nodes)
> > >  - during load:
> > >  -- 5 server nodes stopped in parallel
> > >  -- after 1 minute, all 5 nodes are started in parallel
> > >  - load stopped, logs are analysed for exchange times.
> > >
> > > If we had stopped and started 5 nodes one-by-one, as ducktape does,
> then the partition map exchange merge would not happen and we could not have
> measured the PME optimizations for that case.
> > >
> > >
> > > These are limitations of ducktape that we believe are a more important
> > > argument "against" than the arguments you provide "for".
> > >
> > >
> > >
> > >
> > > On 30.06.2020 14:58, Anton Vinogradov wrote:
> > >> Folks,
> > >> First, I've created PR [1] with ducktests improvements.
> > >> The PR contains the following changes:
> > >> - PME-free switch proof benchmark (2.7.6 vs master)
> > >> - Ability to check (compare with) previous releases (e.g. 2.7.6 & 2.8)
> > >> - Global refactoring
> > >> -- benchmark Java code simplification
> > >> -- deduplication of the services' Python and Java classes
> > >> -- fail-fast checks for Java and Python (e.g. an application should
> explicitly report that it finished successfully)
> > >> -- simple results extraction from tests and benchmarks
> > >> -- Java code is now configurable from tests/benchmarks
> > >> -- proper SIGTERM handling in Java code (e.g. it may finish the last
> operation and log results)
> > >> -- the Docker volume is now marked as delegated to increase execution speed
> for Mac & Windows users
> > >> -- the Ignite cluster now starts in parallel (startup speed-up)
> > >> -- Ignite can be configured per test/benchmark
> > >> - full and module assembly scripts added
> > > Great job! But let me remind you of one of the Apache Ignite principles:
> > > a week of thinking saves months of development.
> > >
> > >
> > >> Second, I'd like to propose accepting ducktests [2] (the ducktape
> integration) as the target "PoC check & real topology benchmarking tool".
> > >> Ducktape pros
> > >> - Developed for distributed systems by distributed system developers.
> > > So is Tiden
> > >
> > >> - Developed since 2014, stable.
> > > Tiden is also pretty stable, and the development start date is not a good
> argument; for example, pytest has existed since 2004 and pytest-xdist (a plugin
> for distributed testing) since 2010, but we don't see them as an alternative at
> all.
> > >
> > >> - Usability proven by usage in Kafka.
> > > Tiden's usability is proven by usage in GridGain and Sberbank deployments.
> > > The core, storage, SQL and tx teams use benchmark results provided by
> Tiden on a daily basis.
> > >
> > >> - Dozens upon dozens of tests and benchmarks in Kafka as a great example
> pack.
> > > We'll donate some of our suites to Ignite, as I mentioned in a
> previous letter.
> > >
> > >> - Built-in Docker support for rapid development and checks.
> > > False, there's no specific 'Docker support' in ducktape itself; you
> just wrap it in Docker yourself, because ducktape lacks deployment
> abilities.
> > >
> > >> - Great for CI automation.
> > > False, there are no specific CI-enabled features in ducktape. Tiden, on
> the other hand, provides a generic xUnit reporting format, which is supported
> by both TeamCity and Jenkins. Also, instead of using private keys, Tiden
> can use an SSH agent, which is also great for CI, because both
> > > TeamCity and Jenkins store keys in secret storage available only to the
> ssh-agent and only for the duration of the test.
> > >
> > >
> > >> > As an additional motivation, at least 3 teams
> > >> - IEP-45 team (to check the crash-recovery speed-up (discovery and Zabbix
> speed-up))
> > >> - Ignite SE Plugins team (to check that plugin features do not
> slow down or break AI features)
> > >> - Ignite SE QA team (to append already developed smoke/load/failover
> tests to the AI codebase)
> > >
> > > Please, before recommending your tests to other teams, provide proof
> > > that your tests are reproducible in a real environment.
> > >
> > >
> > >> now wait for the ducktests merge to start checking the cases they are
> working on in the AI way.
> > >> Thoughts?
> > > Let us review both solutions together: we'll try to run your tests in
> our lab, and you'll try to at least check out tiden and see if the same tests
> can be implemented with it?
> > >
> > >
> > >
> > >> [1] https://github.com/apache/ignite/pull/7967
> > >> [2] https://github.com/apache/ignite/tree/ignite-ducktape
> > >> On Tue, Jun 16, 2020 at 12:22 PM Nikolay Izhikov <nizhi...@apache.org> wrote:
> > >>    Hello, Maxim.
> > >>    Thank you for such a detailed explanation.
> > >>    Can we put the content of this discussion somewhere on the wiki?
> > >>    So it doesn’t get lost.
> > >>    I have divided the answer into several parts, from the requirements to
> > >>    the implementation.
> > >>    So, if we agree on the requirements, we can proceed with the
> > >>    discussion of the implementation.
> > >>    1. Requirements:
> > >>    The main goal I want to achieve is *reproducibility* of the tests.
> > >>    I’m sick and tired of the zillions of flaky, rarely failing, and
> > >>    almost-never-failing tests in the Ignite codebase.
> > >>    We should start with the simplest scenarios that will be as
> reliable
> > >>    as steel :)
> > >>    I want to know for sure:
> > >>       - Does this PR make rebalance quicker or not?
> > >>       - Does this PR make PME quicker or not?
> > >>    So, your description of the complex test scenario looks like a next
> > >>    step to me.
> > >>    Anyway, it’s cool that we already have one.
> > >>    The second goal is to have a strict test lifecycle as we have in
> > >>    JUnit and similar frameworks.
> > >>     > It covers production-like deployment and running scenarios over
> > >>    a single database instance.
> > >>    Do you mean «single cluster» or «single host»?
> > >>    2. Existing tests:
> > >>     > A Combinator suite allows running a set of operations concurrently
> > >>    over a given database instance.
> > >>     > A Consumption suite allows running a set of production-like actions
> > >>    over a given set of Ignite/GridGain versions and comparing test metrics
> > >>    across versions
> > >>     > A Yardstick suite
> > >>     > A Stress suite that simulates hardware environment degradation
> > >>     > Ultimate, DR and Compatibility suites that perform functional
> > >>    regression testing
> > >>     > Regression
> > >>    Great news that we already have so many choices for testing!
> > >>    A mature test base is a big +1 for Tiden.
> > >>    3. Comparison:
> > >>     > Criteria: Test configuration
> > >>     > Ducktape: single JSON string for all tests
> > >>     > Tiden: any number of YaML config files, command line option for
> > >>    fine-grained test configuration, ability to select/modify tests
> > >>    behavior based on Ignite version.
> > >>    1. Many YAML files can be hard to maintain.
> > >>    2. In ducktape, you can set parameters via the «--parameters» option.
> > >>    Please, take a look at the docs [1].
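> > >>    A rough sketch of what I mean (the test class and parameter below are
> > >>    made up for illustration; see the doc [1] for the exact --parameters
> > >>    semantics):
> > >>
> > >>        from ducktape.mark import matrix
> > >>        from ducktape.tests.test import Test
> > >>
> > >>        class RebalanceTest(Test):
> > >>            # Run the same test body for several cluster sizes; a JSON
> > >>            # string passed via --parameters can override these values
> > >>            # from the command line, without touching the test code.
> > >>            @matrix(server_count=[2, 3, 4])
> > >>            def test_rebalance(self, server_count):
> > >>                pass  # start server_count nodes, load data, check rebalance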
> > >>     > Criteria: Cluster control
> > >>     > Tiden: additionally can address cluster as a whole and execute
> > >>    remote commands in parallel.
> > >>    It seems we have already implemented this ability in the PoC.
> > >>     > Criteria: Test assertions
> > >>     > Tiden: simple asserts, also few customized assertion helpers.
> > >>     > Ducktape: simple asserts.
> > >>    Can you please be more specific?
> > >>    What helpers do you have in mind?
> > >>    Ducktape has assertion helpers that wait for log file messages or for
> > >>    a process to finish.
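> > >>    For example, a condition-style assertion looks roughly like this (the
> > >>    rebalance_finished() predicate is a made-up placeholder; wait_until is
> > >>    the ducktape utility I have in mind, check its docs for the exact
> > >>    arguments):
> > >>
> > >>        from ducktape.utils.util import wait_until
> > >>
> > >>        # Fail the test if the condition does not hold within 2 minutes.
> > >>        wait_until(lambda: rebalance_finished(),
> > >>                   timeout_sec=120,
> > >>                   backoff_sec=1,
> > >>                   err_msg="Rebalance did not finish in time")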
> > >>     > Criteria: Test reporting
> > >>     > Ducktape: limited to its own text/HTML format
> > >>    Ducktape has:
> > >>    1. A text reporter
> > >>    2. A customizable HTML reporter
> > >>    3. A JSON reporter.
> > >>    We can render the JSON with any template or tool.
> > >>     > Criteria: Provisioning and deployment
> > >>     > Ducktape: can provision subset of hosts from cluster for test
> > >>    needs. However, that means, that test can’t be scaled without test
> > >>    code changes. Does not do any deploy, relies on external means,
> e.g.
> > >>    pre-packaged in docker image, as in PoC.
> > >>    This is not true.
> > >>    1. We can set explicit test parameters (node count) via parameters.
> > >>    We can increase the client count or the cluster size without test code
> > >>    changes (via the --parameters mechanism sketched above).
> > >>    2. We have many choices for the test environment. These choices are
> > >>    tested and used in other projects:
> > >>             * docker
> > >>             * vagrant
> > >>             * private cloud(ssh access)
> > >>             * ec2
> > >>    Please, take a look at Kafka documentation [2]
> > >>     > I can continue more on this, but it should be enough for now:
> > >>    We need to go deeper! :)
> > >>    [1]
> > >>
> https://ducktape-docs.readthedocs.io/en/latest/run_tests.html#options
> > >>    [2]
> https://github.com/apache/kafka/tree/trunk/tests#ec2-quickstart
> > >>     > On 9 June 2020, at 17:25, Max A. Shonichev <mshon...@yandex.ru> wrote:
> > >>     >
> > >>     > Greetings, Nikolay,
> > >>     >
> > >>     > First of all, thank you for your great effort in preparing a PoC of
> > >>    integration testing for the Ignite community.
> > >>     >
> > >>     > It’s a shame Ignite does not have at least some such tests yet;
> > >>    however, GridGain, as a major contributor to Apache Ignite, has had a
> > >>    profound collection of in-house tools to perform integration and
> > >>    performance testing for years already, and while we have been slowly
> > >>    considering sharing our expertise with the community, your initiative
> > >>    makes us drive that process a bit faster, thanks a lot!
> > >>     >
> > >>     > I reviewed your PoC and want to share a little about what we do
> > >>    on our part, why and how; I hope it will help the community take the
> > >>    proper course.
> > >>     >
> > >>     > First I’ll give a brief overview of what decisions we made and what
> > >>    we have in our private code base, next I’ll describe what we have
> > >>    already donated to the public and what we plan to publish next, and then
> > >>    I’ll compare both approaches, highlighting deficiencies in order to
> > >>    spur public discussion on the matter.
> > >>     >
> > >>     > It might seem strange to use Python to run Bash to run Java
> > >>    applications, because that introduces the IT industry's 'best of breed'
> > >>    – the Python dependency hell – to the Java application code base. The
> > >>    only stranger decision one could make is to use Maven to run Docker
> > >>    to run Bash to run Python to run Bash to run Java, but desperate
> > >>    times call for desperate measures, I guess.
> > >>     >
> > >>     > Java-based solutions for integration testing exist,
> > >>    e.g. Testcontainers [1], Arquillian [2], etc., and they might go well
> > >>    for Ignite community CI pipelines by themselves. But we also wanted
> > >>    to run performance tests and benchmarks, like the dreaded PME
> > >>    benchmark, and that is solved by a totally different set of tools in the
> > >>    Java world, e.g. JMeter [3], JMH [4], Gatling [5], etc.
> > >>     >
> > >>     > Speaking specifically about benchmarking, the Apache Ignite community
> > >>    already has Yardstick [6], and there’s nothing wrong with writing a
> > >>    PME benchmark using Yardstick, but we also wanted to be able to run
> > >>    scenarios like this:
> > >>     > - put load X on an Ignite database;
> > >>     > - perform a set of operations Y to check how Ignite copes with
> > >>    operations under load.
> > >>     >
> > >>     > And yes, we also wanted applications under test to be deployed ‘like
> > >>    in production’, e.g. distributed over a set of hosts. This raises
> > >>    questions about provisioning and node affinity which I’ll cover in
> > >>    detail later.
> > >>     >
> > >>     > So we decided to put in a little effort to build a simple tool to
> > >>    cover different integration and performance scenarios, and our QA
> > >>    lab's first attempt was PoC-Tester [7], currently open source for all
> > >>    but the reporting web UI. It’s a quite simple-to-use, 95% Java-based
> > >>    tool targeted to be run at the pre-release QA stage.
> > >>     >
> > >>     > It covers production-like deployment and running scenarios over
> > >>    a single database instance. PoC-Tester scenarios consist of a
> > >>    sequence of tasks running sequentially or in parallel. After all
> > >>    tasks complete, or at any time during the test, the user can run a log
> > >>    collection task; logs are checked for exceptions, and a summary
> > >>    of found issues and task ops/latency statistics is generated at the
> > >>    end of the scenario. One of the main PoC-Tester features is its
> > >>    fire-and-forget approach to task management. That is, you can deploy a
> > >>    grid and leave it running for weeks, periodically firing some tasks
> > >>    at it.
> > >>     >
> > >>     > During the earliest stages of PoC-Tester development it became quite
> > >>    clear that Java application development is a tedious process, and the
> > >>    architecture decisions you take during development are slow and hard
> > >>    to change.
> > >>     > For example, scenarios like this:
> > >>     > - deploy two instances of GridGain with master-slave data
> > >>    replication configured;
> > >>     > - put a load on the master;
> > >>     > - perform checks on the slave,
> > >>     > or like this:
> > >>     > - preload 1 TB of data using your favorite tool of choice into
> > >>    an Apache Ignite of version X;
> > >>     > - run a set of functional tests on Apache Ignite version Y
> > >>    over the preloaded data,
> > >>     > do not fit well into the PoC-Tester workflow.
> > >>     >
> > >>     > So, this is why we decided to use Python as the generic scripting
> > >>    language of choice.
> > >>     >
> > >>     > Pros:
> > >>     > - quicker prototyping and development cycles
> > >>     > - easier to find a DevOps/QA engineer with Python skills than one
> > >>    with Java skills
> > >>     > - used extensively all over the world for DevOps/CI pipelines and
> > >>    thus has a rich set of libraries for all possible integration use
> > >>    cases.
> > >>     >
> > >>     > Cons:
> > >>     > - Nightmare with dependencies. Better to stick to specific
> > >>    language/library versions.
> > >>     >
> > >>     > Comparing alternatives for a Python-based testing framework, we
> > >>    considered the following requirements, somewhat similar to what you’ve
> > >>    mentioned for Confluent [8] previously:
> > >>     > - should be able to run locally or distributed (bare metal or in
> > >>    the cloud)
> > >>     > - should have built-in deployment facilities for applications
> > >>    under test
> > >>     > - should separate test configuration and test code
> > >>     > -- be able to easily reconfigure tests by simple configuration
> > >>    changes
> > >>     > -- be able to easily scale test environment by simple
> > >>    configuration changes
> > >>     > -- be able to perform regression testing by simple switching
> > >>    artifacts under test via configuration
> > >>     > -- be able to run tests with different JDK version by simple
> > >>    configuration changes
> > >>     > - should have human readable reports and/or reporting tools
> > >>    integration
> > >>     > - should allow simple test progress monitoring; one does not want
> > >>    to run a 6-hour test to find out that the application actually crashed
> > >>    during the first hour.
> > >>     > - should allow parallel execution of test actions
> > >>     > - should have clean API for test writers
> > >>     > -- clean API for distributed remote commands execution
> > >>     > -- clean API for deployed applications start / stop and other
> > >>    operations
> > >>     > -- clean API for performing check on results
> > >>     > - should be open source, or at least the source code should allow
> > >>    easy change or extension
> > >>     >
> > >>     > Back then we found no better alternative than to write
> > >>    our own framework, and so Tiden [9] became GridGain's framework of
> > >>    choice for functional integration and performance testing.
> > >>     >
> > >>     > Pros:
> > >>     > - solves all the requirements above
> > >>     > Cons (for Ignite):
> > >>     > - (currently) closed GridGain source
> > >>     >
> > >>     > On top of Tiden we’ve built a set of test suites, some of which
> > >>    you might have heard of already.
> > >>     >
> > >>     > A Combinator suite allows running a set of operations concurrently
> > >>    over a given database instance. Proven to have found at least 30+ race
> > >>    conditions and NPE issues.
> > >>     >
> > >>     > A Consumption suite allows running a set of production-like actions
> > >>    over a given set of Ignite/GridGain versions and comparing test metrics
> > >>    across versions, like heap/disk/CPU consumption and the time to perform
> > >>    actions such as client PME, server PME, rebalancing, data
> > >>    replication, etc.
> > >>     >
> > >>     > A Yardstick suite is a thin layer of Python glue code to run the
> > >>    Apache Ignite pre-release benchmark set. Yardstick itself has
> > >>    mediocre deployment capabilities; Tiden solves this easily.
> > >>     >
> > >>     > A Stress suite that simulates hardware environment degradation
> > >>    during testing.
> > >>     >
> > >>     > Ultimate, DR and Compatibility suites that perform functional
> > >>    regression testing of GridGain Ultimate Edition features like
> > >>    snapshots, security, data replication, rolling upgrades, etc.
> > >>     >
> > >>     > A Regression suite and some IEP testing suites, like IEP-14, IEP-15,
> > >>    etc, etc, etc.
> > >>     >
> > >>     > Most of the suites above use another in-house developed Java
> tool
> > >>    – PiClient – to perform actual loading and miscellaneous operations
> > >>    with the Ignite under test. We use the py4j Python-Java gateway library
> > >>    to control PiClient instances from the tests.
> > >>     >
> > >>     > When we considered CI, we put TeamCity out of scope, because
> > >>    distributed integration and performance tests tend to run for hours
> > >>    and TeamCity agents are a scarce and costly resource. So, bundled with
> > >>    Tiden there are jenkins-job-builder [10] based CI pipelines and
> > >>    Jenkins xUnit reporting. Also, a rich web UI tool, Ward, aggregates test
> > >>    run reports across versions and has built-in visualization support
> > >>    for the Combinator suite.
> > >>     >
> > >>     > All of the above is currently closed source, but we plan to make
> > >>    it public for the community, and publishing the Tiden core [9] is the
> > >>    first step on that path. You can review some examples of using Tiden for
> > >>    tests in my repository [11], for a start.
> > >>     >
> > >>     > Now, let’s compare Ducktape PoC and Tiden.
> > >>     >
> > >>     > Criteria: Language
> > >>     > Tiden: Python, 3.7
> > >>     > Ducktape: Python, presents itself as Python 2.7, 3.6, 3.7
> > >>    compatible, but actually can’t work with Python 3.7 due to a broken
> > >>    ZMQ dependency.
> > >>     > Comment: Python 3.7 has much better support for async-style
> > >>    code, which might be crucial for distributed application testing.
> > >>     > Score: Tiden: 1, Ducktape: 0
> > >>     >
> > >>     > Criteria: Test writers API
> > >>     > Supported integration test framework concepts are basically the
> same:
> > >>     > - a test controller (test runner)
> > >>     > - a cluster
> > >>     > - a node
> > >>     > - an application (a service in Ducktape terms)
> > >>     > - a test
> > >>     > Score: Tiden: 5, Ducktape: 5
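> > >>     > In ducktape terms those concepts map onto roughly the following
> > >>     > skeleton (a sketch only; the Ignite-flavoured names are invented for
> > >>     > illustration, only Test and Service come from ducktape itself):
> > >>     >
> > >>     >     from ducktape.services.service import Service
> > >>     >     from ducktape.tests.test import Test
> > >>     >
> > >>     >     class IgniteAppService(Service):
> > >>     >         # An 'application' deployed on a subset of cluster nodes.
> > >>     >         def __init__(self, context, num_nodes):
> > >>     >             super(IgniteAppService, self).__init__(context,
> > >>     >                                                    num_nodes=num_nodes)
> > >>     >
> > >>     >         def start_node(self, node):
> > >>     >             pass  # start the application on one remote node
> > >>     >
> > >>     >         def stop_node(self, node):
> > >>     >             pass  # stop the application on that node
> > >>     >
> > >>     >         def clean_node(self, node):
> > >>     >             pass  # wipe work/log directories on that node
> > >>     >
> > >>     >     class SmokeTest(Test):
> > >>     >         # A 'test' gets nodes from the runner and drives the service.
> > >>     >         def test_start_stop(self):
> > >>     >             app = IgniteAppService(self.test_context, num_nodes=3)
> > >>     >             app.start()
> > >>     >             app.stop()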
> > >>     >
> > >>     > Criteria: Tests selection and run
> > >>     > Ducktape: suite-package-class-method level selection; an internal
> > >>    scheduler allows running tests in a suite in parallel.
> > >>     > Tiden: also suite-package-class-method level selection;
> > >>    additionally allows selecting a subset of tests by attribute; parallel
> > >>    runs are not built in, but it allows merging test reports from different
> > >>    runs.
> > >>     > Score: Tiden: 2, Ducktape: 2
> > >>     >
> > >>     > Criteria: Test configuration
> > >>     > Ducktape: single JSON string for all tests
> > >>     > Tiden: any number of YaML config files, command line option for
> > >>    fine-grained test configuration, ability to select/modify tests
> > >>    behavior based on Ignite version.
> > >>     > Score: Tiden: 3, Ducktape: 1
> > >>     >
> > >>     > Criteria: Cluster control
> > >>     > Ducktape: allows executing remote commands at node granularity
> > >>     > Tiden: additionally can address the cluster as a whole and execute
> > >>    remote commands in parallel.
> > >>     > Score: Tiden: 2, Ducktape: 1
> > >>     >
> > >>     > Criteria: Logs control
> > >>     > Both frameworks have similar built-in support for remote log
> > >>    collection and grepping. Tiden has a built-in plugin that can zip and
> > >>    collect arbitrary log files from arbitrary locations at
> > >>    test/module/suite granularity and unzip them if needed, plus an
> > >>    application API to search / wait for messages in logs. Ducktape allows
> > >>    each service to declare its log file locations (seemingly does not
> > >>    support log rollback), and a single entry point to collect service logs.
> > >>     > Score: Tiden: 1, Ducktape: 1
> > >>     >
> > >>     > Criteria: Test assertions
> > >>     > Tiden: simple asserts, also few customized assertion helpers.
> > >>     > Ducktape: simple asserts.
> > >>     > Score: Tiden: 2, Ducktape: 1
> > >>     >
> > >>     > Criteria: Test reporting
> > >>     > Ducktape: limited to its own text/html format
> > >>     > Tiden: provides a text report, a YAML report for reporting tool
> > >>    integration, and an XML xUnit report for integration with
> > >>    Jenkins/TeamCity.
> > >>     > Score: Tiden: 3, Ducktape: 1
> > >>     >
> > >>     > Criteria: Provisioning and deployment
> > >>     > Ducktape: can provision a subset of hosts from the cluster for test
> > >>    needs. However, that means a test can’t be scaled without test
> > >>    code changes. Does not do any deployment; relies on external means, e.g.
> > >>    pre-packaged in a Docker image, as in the PoC.
> > >>     > Tiden: given a set of hosts, Tiden uses all of them for the test.
> > >>    Provisioning should be done by external means. However, it provides
> > >>    conventional automated deployment routines.
> > >>     > Score: Tiden: 1, Ducktape: 1
> > >>     >
> > >>     > Criteria: Documentation and Extensibility
> > >>     > Tiden: current API documentation is limited; this should change as we
> > >>    go open source. Tiden is easily extensible via hooks and plugins;
> > >>    see the example Maven plugin and Gatling application at [11].
> > >>     > Ducktape: basic documentation at readthedocs.io. The codebase is
> > >>    rigid, the framework core is tightly coupled and hard to change. The
> > >>    only possible extension mechanism is fork-and-rewrite.
> > >>     > Score: Tiden: 2, Ducktape: 1
> > >>     >
> > >>     > I can continue more on this, but it should be enough for now:
> > >>     > Overall score: Tiden: 22, Ducktape: 14.
> > >>     >
> > >>     > Time for discussion!
> > >>     >
> > >>     > ---
> > >>     > [1] - https://www.testcontainers.org/
> > >>     > [2] - http://arquillian.org/guides/getting_started/
> > >>     > [3] - https://jmeter.apache.org/index.html
> > >>     > [4] - https://openjdk.java.net/projects/code-tools/jmh/
> > >>     > [5] - https://gatling.io/docs/current/
> > >>     > [6] - https://github.com/gridgain/yardstick
> > >>     > [7] - https://github.com/gridgain/poc-tester
> > >>     > [8] -
> > >>
> https://cwiki.apache.org/confluence/display/KAFKA/System+Test+Improvements
> > >>     > [9] - https://github.com/gridgain/tiden
> > >>     > [10] - https://pypi.org/project/jenkins-job-builder/
> > >>     > [11] - https://github.com/mshonichev/tiden_examples
> > >>     >
> > >>     > On 25.05.2020 11:09, Nikolay Izhikov wrote:
> > >>     >> Hello,
> > >>     >>
> > >>     >> A branch with ducktape has been created -
> > >>    https://github.com/apache/ignite/tree/ignite-ducktape
> > >>     >>
> > >>     >> Anyone who is willing to contribute to the PoC is welcome.
> > >>     >>
> > >>     >>
> > >>     >>> On 21 May 2020, at 22:33, Nikolay Izhikov <nizhikov....@gmail.com> wrote:
> > >>     >>>
> > >>     >>> Hello, Denis.
> > >>     >>>
> > >>     >>> There is no rush with these improvements.
> > >>     >>> We can wait for Maxim's proposal and compare the two solutions :)
> > >>     >>>
> > >>     >>>> On 21 May 2020, at 22:24, Denis Magda <dma...@apache.org> wrote:
> > >>     >>>>
> > >>     >>>> Hi Nikolay,
> > >>     >>>>
> > >>     >>>> Thanks for kicking off this conversation and sharing your
> > >>    findings with the
> > >>     >>>> results. That's the right initiative. I do agree that Ignite
> > >>    needs to have
> > >>     >>>> an integration testing framework with capabilities listed by
> you.
> > >>     >>>>
> > >>     >>>> As we discussed privately, I would only check if instead of
> > >>     >>>> Confluent's Ducktape library, we can use an integration
> > >>    testing framework
> > >>     >>>> developed by GridGain for testing of Ignite/GridGain
> clusters.
> > >>    That
> > >>     >>>> framework has been battle-tested and might be more
> convenient for
> > >>     >>>> Ignite-specific workloads. Let's wait for @Maksim Shonichev
> > >>     >>>> <mshonic...@gridgain.com <mailto:mshonic...@gridgain.com>>
> who
> > >>    promised to join this thread once he finishes
> > >>     >>>> preparing the usage examples of the framework. To my
> > >>    knowledge, Max has
> > >>     >>>> already been working on that for several days.
> > >>     >>>>
> > >>     >>>> -
> > >>     >>>> Denis
> > >>     >>>>
> > >>     >>>>
> > >>     >>>> On Thu, May 21, 2020 at 12:27 AM Nikolay Izhikov
> > >>    <nizhi...@apache.org <mailto:nizhi...@apache.org>>
> > >>     >>>> wrote:
> > >>     >>>>
> > >>     >>>>> Hello, Igniters.
> > >>     >>>>>
> > >>     >>>>> I created a PoC [1] for the integration tests of Ignite.
> > >>     >>>>>
> > >>     >>>>> Let me briefly explain the gap I want to cover:
> > >>     >>>>>
> > >>     >>>>> 1. For now, we don’t have a solution for automated testing of
> > >>     >>>>> Ignite on a «real cluster».
> > >>     >>>>> By «real cluster» I mean a cluster «like production»:
> > >>     >>>>>       * client and server nodes deployed on different hosts.
> > >>     >>>>>       * thin clients perform queries from some other hosts
> > >>     >>>>>       * etc.
> > >>     >>>>>
> > >>     >>>>> 2. We don’t have a solution for automated benchmarks of some
> > >>     >>>>> internal Ignite processes:
> > >>     >>>>>       * PME
> > >>     >>>>>       * rebalance.
> > >>     >>>>> This means we don’t know: do we perform rebalance (or PME) in
> > >>     >>>>> 2.7.0 faster or slower than in 2.8.0 for the same cluster?
> > >>     >>>>>
> > >>     >>>>> 3. We don’t have a solution for automated testing of Ignite
> > >>     >>>>> integrations in a real-world environment:
> > >>     >>>>> Ignite-Spark integration can be taken as an example.
> > >>     >>>>> I think some ML solutions should also be tested in real-world
> > >>     >>>>> deployments.
> > >>     >>>>>
> > >>     >>>>> Solution:
> > >>     >>>>>
> > >>     >>>>> I propose to use the ducktape library from Confluent (Apache 2.0
> > >>     >>>>> license).
> > >>     >>>>> I tested it both on a real cluster (Yandex Cloud) and in a local
> > >>     >>>>> environment (Docker), and it works just fine.
> > >>     >>>>>
> > >>     >>>>> The PoC contains the following services:
> > >>     >>>>>
> > >>     >>>>>       * Simple rebalance test:
> > >>     >>>>>               Start 2 server nodes,
> > >>     >>>>>               Create some data with Ignite client,
> > >>     >>>>>               Start one more server node,
> > >>     >>>>>               Wait for rebalance finish
> > >>     >>>>>       * Simple Ignite-Spark integration test:
> > >>     >>>>>               Start 1 Spark master, start 1 Spark worker,
> > >>     >>>>>               Start 1 Ignite server node
> > >>     >>>>>               Create some data with Ignite client,
> > >>     >>>>>               Check data in application that queries it from
> > >>    Spark.
> > >>     >>>>>
> > >>     >>>>> All tests are fully automated.
> > >>     >>>>> Log collection works just fine.
> > >>     >>>>> You can see an example of the test report - [4].
> > >>     >>>>>
> > >>     >>>>> Pros:
> > >>     >>>>>
> > >>     >>>>> * Ability to test local changes (no need to publish changes to
> > >>     >>>>> some remote repository or similar).
> > >>     >>>>> * Ability to parametrize the test environment (run the same tests
> > >>     >>>>> on different JDKs, JVM params, configs, etc.)
> > >>     >>>>> * Isolation by default so system tests are as reliable as
> > >>    possible.
> > >>     >>>>> * Utilities for pulling up and tearing down services easily
> > >>    in clusters in
> > >>     >>>>> different environments (e.g. local, custom cluster, Vagrant,
> > >>    K8s, Mesos,
> > >>     >>>>> Docker, cloud providers, etc.)
> > >>     >>>>> * Easy to write unit tests for distributed systems
> > >>     >>>>> * Adopted and successfully used by another distributed open
> > >>     >>>>> source project - Apache Kafka.
> > >>     >>>>> * Collect results (e.g. logs, console output)
> > >>     >>>>> * Report results (e.g. expected conditions met, performance
> > >>    results, etc.)
> > >>     >>>>>
> > >>     >>>>> WDYT?
> > >>     >>>>>
> > >>     >>>>> [1] https://github.com/nizhikov/ignite/pull/15
> > >>     >>>>> [2] https://github.com/confluentinc/ducktape
> > >>     >>>>> [3]
> https://ducktape-docs.readthedocs.io/en/latest/run_tests.html
> > >>     >>>>> [4] https://yadi.sk/d/JC8ciJZjrkdndg
> >
>
>
>
