Max, Thanks for joining us.

> 1. tiden can deploy artifacts by itself, while ducktape relies on
> dependencies being deployed by external scripts.

No. It is important to distinguish development, deployment, and orchestration. All-in-one solutions have extremely limited usability.

As to Ducktests:
Docker is responsible for deployments during development.
CI/CD is responsible for deployments during release and nightly checks. It's up to the team to choose AWS, VMs, bare metal, and even the OS.
Ducktape is responsible for orchestration.
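For illustration, here is a minimal ducktape-style sketch of what "orchestration only" means, assuming the Ignite artifacts are already deployed on the worker hosts by the Docker image or by CI/CD. IgniteService below is a hypothetical stand-in for the service in the ducktests PR, not its actual code, and the paths are made up:

# A minimal sketch, assuming artifacts are already deployed on the nodes.
# Test, Service and cluster are real ducktape APIs; IgniteService and the
# /opt/ignite paths are hypothetical placeholders.
from ducktape.mark.resource import cluster
from ducktape.services.service import Service
from ducktape.tests.test import Test


class IgniteService(Service):
    """Hypothetical Ignite service: orchestrates a binary that external
    means (Docker image, CI/CD) are assumed to have deployed already."""

    def start_node(self, node):
        # No artifact deployment here, only starting what is in place.
        node.account.ssh("nohup /opt/ignite/bin/ignite.sh > /dev/null 2>&1 &")

    def stop_node(self, node):
        node.account.kill_process("ignite", clean_shutdown=True)

    def clean_node(self, node):
        node.account.ssh("rm -rf /opt/ignite/work", allow_fail=True)


class SmokeTest(Test):
    @cluster(num_nodes=3)
    def test_start_stop(self):
        ignite = IgniteService(self.test_context, num_nodes=3)
        ignite.start()   # orchestration only
        ignite.stop()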
> 2. tiden can execute actions over remote nodes in real parallel fashion,
> while ducktape internally does all actions sequentially.

No. Ducktape may start any service in parallel. See the PME-free benchmark [1] for details.

> if we used ducktape solution we would have to instead prepare some
> deployment scripts to pre-initialize Sberbank hosts, for example, with
> Ansible or Chef.

Sure, because the way of deployment depends on the infrastructure. How can we be sure that the OS we use and the restrictions we have will be compatible with Tiden?

> You have solved this deficiency with docker by putting all dependencies
> into one uber-image ...
and
> I guess we all know about Docker's hyped ability to run over distributed
> virtual networks.

It is very important not to confuse a test's development environment (the Docker image you're talking about) with real deployment.

> If we had stopped and started 5 nodes one-by-one, as ducktape does

All actions can be performed in parallel. See, for example, how Ducktests [2] starts the cluster in parallel.

[1] https://github.com/apache/ignite/pull/7967/files#diff-59adde2a2ab7dc17aea6c65153dfcda7R84
[2] https://github.com/apache/ignite/pull/7967/files#diff-d6a7b19f30f349d426b8894a40389cf5R79

On Thu, Jul 2, 2020 at 1:00 PM Nikolay Izhikov <nizhi...@apache.org> wrote:

> Hello, Maxim.
>
> > 1. tiden can deploy artifacts by itself, while ducktape relies on
> > dependencies being deployed by external scripts
>
> Why do you think that maintaining deploy scripts coupled with the testing framework is an advantage?
> I thought we wanted to see and maintain deployment scripts separately from the testing framework.
>
> > 2. tiden can execute actions over remote nodes in real parallel fashion, while ducktape internally does all actions sequentially.
>
> Can you please clarify what actions you have in mind?
> And why do we want to execute them concurrently?
> Ignite node start and client application execution can be done concurrently with the ducktape approach.
>
> > If we used ducktape solution we would have to instead prepare some deployment scripts to pre-initialize Sberbank hosts, for example, with Ansible or Chef
>
> We shouldn't take one user's approach as an argument in this discussion. Let's discuss a general approach for all users of Ignite. Anyway, what is wrong with the external deployment script approach?
>
> We, as a community, should provide several ways to run integration tests out-of-the-box AND the ability to customize deployment to the user's landscape.
>
> > You have solved this deficiency with docker by putting all dependencies into one uber-image and that looks like a simple and elegant solution; however, that effectively limits you to single-host testing.
>
> The Docker image should be used only by Ignite developers to test something locally.
> It's not intended for real-world testing.
>
> The main issue with Tiden that I see is that it was tested and maintained as a closed-source solution.
> This can lead to hard-to-solve problems when we start using and maintaining it as an open-source solution.
> For example, how many developers have used Tiden? And how many of those developers were not authors of Tiden itself?
>
> > On 2 July 2020, at 12:30, Max Shonichev <mshon...@yandex.ru> wrote:
> >
> > Anton, Nikolay,
> >
> > Let's agree on what we are arguing about: whether it is about likes and dislikes or about the technical properties of the suggested solutions.
> >
> > If it is about likes and dislikes, then the whole discussion is meaningless.
> > However, I hope together we can analyse the pros and cons carefully.
> >
> > As far as I can understand now, the two main differences between ducktape and tiden are that:
> >
> > 1. tiden can deploy artifacts by itself, while ducktape relies on dependencies being deployed by external scripts.
> >
> > 2. tiden can execute actions over remote nodes in real parallel fashion, while ducktape internally does all actions sequentially.
> >
> > As for me, these are very important properties for a distributed testing framework.
> >
> > The first property lets us easily reuse tiden in existing infrastructures. For example, during Zookeeper IEP testing at the Sberbank site we used the same tiden scripts that we use in our lab; the only change was putting a list of hosts into the config.
> >
> > If we used the ducktape solution we would have to instead prepare some deployment scripts to pre-initialize Sberbank hosts, for example, with Ansible or Chef.
> >
> > You have solved this deficiency with docker by putting all dependencies into one uber-image, and that looks like a simple and elegant solution; however, that effectively limits you to single-host testing.
> >
> > I guess we all know about Docker's hyped ability to run over distributed virtual networks. We used to go that way, but quickly found that it is more hype than real work. In real environments, there are problems with routing, DNS, multicast and broadcast traffic, and many others, that turn a Docker-based distributed solution into a fragile, hard-to-maintain monster.
> >
> > Please, if you believe otherwise, perform a run of your PoC over at least two physical hosts and share the results with us.
> >
> > If you consider that one physical Docker host is enough, please don't overlook that we want to run real-scale scenarios, with 50-100 cache groups, persistence enabled, and millions of keys loaded.
> >
> > The practical limit for such configurations is 4-6 nodes per physical host. Otherwise, tests become flaky due to resource starvation.
> >
> > Please, if you believe otherwise, perform at least 10 runs of your PoC with other tests running at TC (we're targeting TeamCity, right?) and share the results so we could check whether the numbers are reproducible.
> >
> > I stress this once more: functional integration tests are OK to run in Docker and CI, but running benchmarks in Docker is a big NO GO.
> >
> > The second property lets us write tests that require real-parallel actions over hosts.
> >
> > For example, the agreed scenario for the PME benchmark during the "PME optimization stream" was as follows:
> >
> > - 10 server nodes, preloaded with 1M keys
> > - 4 client nodes perform transactional load (client nodes physically separated from server nodes)
> > - during load:
> > -- 5 server nodes are stopped in parallel
> > -- after 1 minute, all 5 nodes are started in parallel
> > - load stopped, logs are analysed for exchange times.
> >
> > If we had stopped and started 5 nodes one-by-one, as ducktape does, then the partition map exchange merge would not have happened, and we could not have measured PME optimizations for that case.
> >
> > These are limitations of ducktape that we believe are a more important argument "against" than what you provide "for".
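For reference, the truly parallel stop/start step of such a scenario boils down to something like the following stdlib-only Python sketch. stop_node and start_node stand in for whatever the framework provides (a Tiden API or a ducktape service); they are not a real interface:

# Sketch of the parallel stop/start step from the PME scenario above.
import time
from concurrent.futures import ThreadPoolExecutor


def in_parallel(action, nodes):
    # Fire the same action at all nodes simultaneously, so the resulting
    # partition map exchanges can merge.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        list(pool.map(action, nodes))  # re-raises the first failure, if any


def pme_stop_start_step(server_nodes, stop_node, start_node):
    victims = server_nodes[:5]        # 5 of the 10 preloaded server nodes
    in_parallel(stop_node, victims)   # stop 5 nodes in parallel
    time.sleep(60)                    # keep them down for one minute
    in_parallel(start_node, victims)  # start all 5 back in parallel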
> >
> > On 30.06.2020 14:58, Anton Vinogradov wrote:
> >> Folks,
> >> First, I've created PR [1] with ducktests improvements. The PR contains the following changes:
> >> - PME-free switch proof-benchmark (2.7.6 vs master)
> >> - Ability to check (compare with) previous releases (e.g. 2.7.6 & 2.8)
> >> - Global refactoring
> >> -- benchmark Java code simplification
> >> -- service Python and Java class code deduplication
> >> -- fail-fast checks for Java and Python (e.g. an application should explicitly report that it finished with success)
> >> -- simple results extraction from tests and benchmarks
> >> -- Java code is now configurable from tests/benchmarks
> >> -- proper SIGTERM handling in the Java code (e.g. it may finish the last operation and log results)
> >> -- the Docker volume is now marked as delegated to increase execution speed for Mac & Windows users
> >> -- the Ignite cluster now starts in parallel (start speed-up)
> >> -- Ignite can be configured from the test/benchmark
> >> - full and module assembly scripts added
> >
> > Great job! But let me recall one of the Apache Ignite principles: a week of thinking saves months of development.
> >
> >> Second, I'd like to propose to accept ducktests [2] (ducktape integration) as the target "PoC check & real topology benchmarking tool".
> >> Ducktape pros:
> >> - Developed for distributed systems by distributed-system developers.
> > So was Tiden.
> >
> >> - Developed since 2014, stable.
> > Tiden is also pretty stable, and the development start date is not a good argument; for example, pytest has existed since 2004 and pytest-xdist (a plugin for distributed testing) since 2010, but we don't see it as an alternative at all.
> >
> >> - Proven usability by usage at Kafka.
> > Tiden is proven usable by its usage in GridGain and Sberbank deployments.
> > The core, storage, SQL, and TX teams use benchmark results provided by Tiden on a daily basis.
> >
> >> - Dozens of dozens of tests and benchmarks at Kafka as a great example pack.
> > We'll donate some of our suites to Ignite, as I mentioned in a previous letter.
> >
> >> - Built-in Docker support for rapid development and checks.
> > False, there's no specific 'docker support' in ducktape itself; you just wrap it in Docker yourself, because ducktape lacks deployment abilities.
> >
> >> - Great for CI automation.
> > False, there are no specific CI-enabled features in ducktape. Tiden, on the other hand, provides a generic xUnit reporting format, which is supported by both TeamCity and Jenkins. Also, instead of using private keys, Tiden can use an SSH agent, which is also great for CI, because both TeamCity and Jenkins store keys in secret storage available only to ssh-agent and only for the duration of the test.
> >
> >> As an additional motivation, at least 3 teams:
> >> - the IEP-45 team (to check crash-recovery speed-up (discovery and Zabbix speed-up))
> >> - the Ignite SE Plugins team (to check that a plugin's features do not slow down or break AI features)
> >> - the Ignite SE QA team (to append already-developed smoke/load/failover tests to the AI codebase)
> >
> > Please, before recommending your tests to other teams, provide proof that your tests are reproducible in a real environment.
> >
> >> now wait for the ducktests merge to start checking the cases they are working on in the AI way.
> >> Thoughts?
> > Let us review both solutions together: we'll try to run your tests in our lab, and you'll try to at least check out tiden and see whether the same tests can be implemented with it?
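A side note on the fail-fast point above: in ducktape this kind of check is naturally expressed by waiting for an explicit success marker in the application log. A rough sketch follows; wait_until is real ducktape API, while the log path and marker string are made-up placeholders:

# Rough sketch of a fail-fast check: the application must explicitly log
# a success marker, and the test fails quickly if it never appears.
from ducktape.utils.util import wait_until


def await_explicit_success(node, log_file="/mnt/app/app.log",
                           marker="APPLICATION_FINISHED_OK"):
    def marker_present():
        # grep exits non-zero when nothing matches, hence the `|| true`.
        out = node.account.ssh_output(
            "grep -c '%s' %s || true" % (marker, log_file))
        return out.strip() not in (b"", b"0")

    wait_until(marker_present, timeout_sec=60, backoff_sec=1,
               err_msg="Application did not explicitly report success")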
> >
> >> [1] https://github.com/apache/ignite/pull/7967
> >> [2] https://github.com/apache/ignite/tree/ignite-ducktape
> >>
> >> On Tue, Jun 16, 2020 at 12:22 PM Nikolay Izhikov <nizhi...@apache.org> wrote:
> >> Hello, Maxim.
> >> Thank you for such a detailed explanation.
> >> Can we put the content of this discussion somewhere on the wiki, so it doesn't get lost?
> >> I've divided the answer into several parts, from the requirements to the implementation.
> >> So, if we agree on the requirements, we can proceed with the discussion of the implementation.
> >>
> >> 1. Requirements:
> >> The main goal I want to achieve is *reproducibility* of the tests.
> >> I'm sick and tired of the zillions of flaky, rarely failing, and almost-never-failing tests in the Ignite codebase.
> >> We should start with the simplest scenarios that will be as reliable as steel :)
> >> I want to know for sure:
> >> - Does this PR make rebalance quicker or not?
> >> - Does this PR make PME quicker or not?
> >> So, your description of the complex test scenario looks like a next step to me.
> >> Anyway, it's cool we already have one.
> >> The second goal is to have a strict test lifecycle, as we have in JUnit and similar frameworks.
> >>
> >> > It covers production-like deployment and running scenarios over a single database instance.
> >> Do you mean «single cluster» or «single host»?
> >>
> >> 2. Existing tests:
> >> > A Combinator suite allows running a set of operations concurrently over a given database instance.
> >> > A Consumption suite allows running a set of production-like actions over a given set of Ignite/GridGain versions and comparing test metrics across versions
> >> > A Yardstick suite
> >> > A Stress suite that simulates hardware environment degradation
> >> > Ultimate, DR and Compatibility suites that perform functional regression testing
> >> > Regression
> >> Great news that we already have so many choices for testing!
> >> A mature test base is a big +1 for Tiden.
> >>
> >> 3. Comparison:
> >> > Criteria: Test configuration
> >> > Ducktape: a single JSON string for all tests
> >> > Tiden: any number of YAML config files, command line options for fine-grained test configuration, the ability to select/modify test behavior based on the Ignite version.
> >> 1. Many YAML files can be hard to maintain.
> >> 2. In ducktape, you can set parameters via the «--parameters» option. Please take a look at the doc [1].
> >>
> >> > Criteria: Cluster control
> >> > Tiden: can additionally address the cluster as a whole and execute remote commands in parallel.
> >> It seems we have implemented this ability in the PoC already.
> >>
> >> > Criteria: Test assertions
> >> > Tiden: simple asserts, plus a few customized assertion helpers.
> >> > Ducktape: simple asserts.
> >> Can you please be more specific? What helpers do you have in mind?
> >> Ducktape has asserts that wait for log-file messages or for some process to finish.
> >>
> >> > Criteria: Test reporting
> >> > Ducktape: limited to its own text/HTML format
> >> Ducktape has:
> >> 1. A text reporter
> >> 2. A customizable HTML reporter
> >> 3. A JSON reporter.
> >> We can render the JSON with any template or tool.
> >>
> >> > Criteria: Provisioning and deployment
> >> > Ducktape: can provision a subset of hosts from the cluster for test needs. However, that means a test can't be scaled without test code changes. Does not do any deployment; relies on external means, e.g. pre-packaged in a docker image, as in the PoC.
> >> This is not true.
> >> 1. We can set explicit test parameters (node number) via parameters. We can increase the client count or cluster size without test code changes.
> >> 2. We have many choices for the test environment. These choices are tested and used in other projects:
> >> * docker
> >> * vagrant
> >> * private cloud (ssh access)
> >> * ec2
> >> Please, take a look at the Kafka documentation [2].
> >>
> >> > I can continue more on this, but it should be enough for now:
> >> We need to go deeper! :)
> >>
> >> [1] https://ducktape-docs.readthedocs.io/en/latest/run_tests.html#options
> >> [2] https://github.com/apache/kafka/tree/trunk/tests#ec2-quickstart
> >>
> >> > On 9 June 2020, at 17:25, Max A. Shonichev <mshon...@yandex.ru> wrote:
> >> >
> >> > Greetings, Nikolay,
> >> >
> >> > First of all, thank you for your great effort preparing a PoC of integration testing for the Ignite community.
> >> >
> >> > It's a shame Ignite did not have at least some such tests yet; however, GridGain, as a major contributor to Apache Ignite, has had a profound collection of in-house tools to perform integration and performance testing for years already, and while we slowly consider sharing our expertise with the community, your initiative makes us drive that process a bit faster, thanks a lot!
> >> >
> >> > I reviewed your PoC and want to share a little about what we do on our part, why, and how, hoping it will help the community take the proper course.
> >> >
> >> > First I'll do a brief overview of what decisions we made and what we have in our private code base, next I'll describe what we have already donated to the public and what we plan to publish next, then I'll compare both approaches, highlighting deficiencies, in order to spur public discussion on the matter.
> >> >
> >> > It might seem strange to use Python to run Bash to run Java applications, because that introduces the IT industry's 'best of breed' – the Python dependency hell – to the Java application code base. The only stranger decision one could make is to use Maven to run Docker to run Bash to run Python to run Bash to run Java, but desperate times call for desperate measures, I guess.
> >> >
> >> > There are Java-based solutions for integration testing, e.g. Testcontainers [1], Arquillian [2], etc., and they might suit Ignite community CI pipelines by themselves. But we also wanted to run performance tests and benchmarks, like the dreaded PME benchmark, and this is solved by a totally different set of tools in the Java world, e.g. JMeter [3], OpenJDK JMH [4], Gatling [5], etc.
> >> >
> >> > Speaking specifically about benchmarking, the Apache Ignite community already has Yardstick [6], and there's nothing wrong with writing a PME benchmark using Yardstick, but we also wanted to be able to run scenarios like this:
> >> > - put an X load on an Ignite database;
> >> > - perform a Y set of operations to check how Ignite copes with operations under load.
> >> >
> >> > And yes, we also wanted the applications under test to be deployed 'like in production', e.g. distributed over a set of hosts. This raises questions about provisioning and node affinity, which I'll cover in detail later.
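The general shape of such an "X load / Y operations" scenario, as a framework-agnostic Python sketch; start_load, stop_load, and check_operation are placeholders, not an API of Tiden or ducktape:

# Sketch: keep a background load running while the checks execute.
from concurrent.futures import ThreadPoolExecutor


def operations_under_load(cluster, start_load, stop_load,
                          check_operation, operations):
    with ThreadPoolExecutor(max_workers=1) as pool:
        load = pool.submit(start_load, cluster)   # X: background load
        try:
            for op in operations:                 # Y: operations under load
                check_operation(cluster, op)
        finally:
            stop_load(cluster)                    # ask the load to finish
            load.result(timeout=120)              # surface load failures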
> >> >
> >> > So we decided to put in a little effort to build a simple tool to cover different integration and performance scenarios, and our QA lab's first attempt was PoC-Tester [7], currently open source except for the reporting web UI. It's a quite simple-to-use, 95% Java-based tool targeted at the pre-release QA stage.
> >> >
> >> > It covers production-like deployment and running scenarios over a single database instance. PoC-Tester scenarios consist of a sequence of tasks running sequentially or in parallel. After all tasks complete, or at any time during a test, the user can run a log-collection task; logs are checked for exceptions, and a summary of found issues and task ops/latency statistics is generated at the end of the scenario. One of the main PoC-Tester features is its fire-and-forget approach to task management. That is, you can deploy a grid and leave it running for weeks, periodically firing some tasks onto it.
> >> >
> >> > During the earliest stages of PoC-Tester development it became quite clear that Java application development is a tedious process, and the architecture decisions you take during development are slow and hard to change.
> >> > For example, scenarios like this:
> >> > - deploy two instances of GridGain with master-slave data replication configured;
> >> > - put a load on the master;
> >> > - perform checks on the slave,
> >> > or like this:
> >> > - preload 1Tb of data using your favorite tool of choice into an Apache Ignite of version X;
> >> > - run a set of functional tests running Apache Ignite version Y over the preloaded data,
> >> > do not fit well into the PoC-Tester workflow.
> >> >
> >> > So, this is why we decided to use Python as the generic scripting language of choice.
> >> >
> >> > Pros:
> >> > - quicker prototyping and development cycles
> >> > - easier to find a DevOps/QA engineer with Python skills than one with Java skills
> >> > - used extensively all over the world for DevOps/CI pipelines, and thus has a rich set of libraries for all possible integration use cases.
> >> >
> >> > Cons:
> >> > - Nightmare with dependencies. Better to stick to specific language/library versions.
> >> >
> >> > Comparing alternatives for a Python-based testing framework, we considered the following requirements, somewhat similar to what you previously mentioned for Confluent [8]:
> >> > - should be able to run locally or distributed (bare metal or in the cloud)
> >> > - should have built-in deployment facilities for applications under test
> >> > - should separate test configuration and test code
> >> > -- be able to easily reconfigure tests by simple configuration changes
> >> > -- be able to easily scale the test environment by simple configuration changes
> >> > -- be able to perform regression testing by simply switching artifacts under test via configuration
> >> > -- be able to run tests with different JDK versions by simple configuration changes
> >> > - should have human-readable reports and/or reporting-tools integration
> >> > - should allow simple test progress monitoring; one does not want to run a 6-hour test to find out that the application actually crashed during the first hour.
> >> > - should allow parallel execution of test actions
> >> > - should have a clean API for test writers
> >> > -- a clean API for distributed remote command execution
> >> > -- a clean API for deployed application start / stop and other operations
> >> > -- a clean API for performing checks on results
> >> > - should be open source, or at least the source code should allow easy change or extension
> >> >
> >> > Back at that time we found no better alternative than to write our own framework, and here goes Tiden [9], GridGain's framework of choice for functional integration and performance testing.
> >> >
> >> > Pros:
> >> > - solves all the requirements above
> >> > Cons (for Ignite):
> >> > - (currently) closed GridGain source
> >> >
> >> > On top of Tiden we've built a set of test suites, some of which you might have heard of already.
> >> >
> >> > A Combinator suite allows running a set of operations concurrently over a given database instance. Proven to find at least 30+ race conditions and NPE issues.
> >> >
> >> > A Consumption suite allows running a set of production-like actions over a given set of Ignite/GridGain versions and comparing test metrics across versions, like heap/disk/CPU consumption and the time to perform actions, such as client PME, server PME, rebalancing time, data replication time, etc.
> >> >
> >> > A Yardstick suite is a thin layer of Python glue code to run the Apache Ignite pre-release benchmark set. Yardstick itself has mediocre deployment capabilities; Tiden solves this easily.
> >> >
> >> > A Stress suite simulates hardware environment degradation during testing.
> >> >
> >> > Ultimate, DR and Compatibility suites perform functional regression testing of GridGain Ultimate Edition features like snapshots, security, data replication, rolling upgrades, etc.
> >> >
> >> > A Regression suite and some IEP testing suites, like IEP-14, IEP-15, etc.
> >> >
> >> > Most of the suites above use another in-house developed Java tool – PiClient – to perform the actual loading and miscellaneous operations on the Ignite under test. We use the py4j Python-Java gateway library to control PiClient instances from the tests.
> >> >
> >> > When we considered CI, we put TeamCity out of scope, because distributed integration and performance tests tend to run for hours, and TeamCity agents are a scarce and costly resource. So, bundled with Tiden there are jenkins-job-builder [10] based CI pipelines and Jenkins xUnit reporting. Also, a rich web UI tool, Ward, aggregates test run reports across versions and has built-in visualization support for the Combinator suite.
> >> >
> >> > All of the above is currently closed source, but we plan to make it public for the community, and publishing the Tiden core [9] is the first step on that way. You can review some examples of using Tiden for tests at my repository [11], for a start.
> >> >
> >> > Now, let's compare the Ducktape PoC and Tiden.
> >> >
> >> > Criteria: Language
> >> > Tiden: Python 3.7
> >> > Ducktape: Python; presents itself as Python 2.7, 3.6, 3.7 compatible, but actually can't work with Python 3.7 due to a broken ZMQ dependency.
> >> > Comment: Python 3.7 has much better support for async-style code, which might be crucial for distributed application testing.
> >> > Score: Tiden: 1, Ducktape: 0
> >> >
> >> > Criteria: Test writers API
> >> > The supported integration test framework concepts are basically the same:
> >> > - a test controller (test runner)
> >> > - a cluster
> >> > - a node
> >> > - an application (a service in Ducktape terms)
> >> > - a test
> >> > Score: Tiden: 5, Ducktape: 5
> >> >
> >> > Criteria: Test selection and run
> >> > Ducktape: suite-package-class-method level selection; an internal scheduler allows running the tests in a suite in parallel.
> >> > Tiden: also suite-package-class-method level selection; additionally allows selecting a subset of tests by attribute; parallel runs are not built in, but merging test reports from different runs is supported.
> >> > Score: Tiden: 2, Ducktape: 2
> >> >
> >> > Criteria: Test configuration
> >> > Ducktape: a single JSON string for all tests
> >> > Tiden: any number of YAML config files, command line options for fine-grained test configuration, the ability to select/modify test behavior based on the Ignite version.
> >> > Score: Tiden: 3, Ducktape: 1
> >> >
> >> > Criteria: Cluster control
> >> > Ducktape: allows executing remote commands at node granularity
> >> > Tiden: can additionally address the cluster as a whole and execute remote commands in parallel.
> >> > Score: Tiden: 2, Ducktape: 1
> >> >
> >> > Criteria: Logs control
> >> > Both frameworks have similar built-in support for remote log collection and grepping. Tiden has a built-in plugin that can zip and collect arbitrary log files from arbitrary locations at test/module/suite granularity, and unzip them if needed, plus an application API to search / wait for messages in logs. Ducktape allows each service to declare its log file locations (seemingly does not support log rollover) and has a single entry point to collect service logs.
> >> > Score: Tiden: 1, Ducktape: 1
> >> >
> >> > Criteria: Test assertions
> >> > Tiden: simple asserts, plus a few customized assertion helpers.
> >> > Ducktape: simple asserts.
> >> > Score: Tiden: 2, Ducktape: 1
> >> >
> >> > Criteria: Test reporting
> >> > Ducktape: limited to its own text/HTML format
> >> > Tiden: provides a text report, a YAML report for reporting tools integration, and an XML xUnit report for integration with Jenkins/TeamCity.
> >> > Score: Tiden: 3, Ducktape: 1
> >> >
> >> > Criteria: Provisioning and deployment
> >> > Ducktape: can provision a subset of hosts from the cluster for test needs. However, that means a test can't be scaled without test code changes. Does not do any deployment; relies on external means, e.g. pre-packaged in a docker image, as in the PoC.
> >> > Tiden: given a set of hosts, Tiden uses all of them for the test. Provisioning should be done by external means. However, it provides conventional automated deployment routines.
> >> > Score: Tiden: 1, Ducktape: 1
> >> >
> >> > Criteria: Documentation and Extensibility
> >> > Tiden: the current API documentation is limited; this should change as we go open source. Tiden is easily extensible via hooks and plugins; see the example Maven plugin and Gatling application at [11].
> >> > Ducktape: basic documentation at readthedocs.io. The codebase is rigid; the framework core is tightly coupled and hard to change. The only possible extension mechanism is fork-and-rewrite.
> >> > Score: Tiden: 2, Ducktape: 1
> >> >
> >> > I can continue more on this, but it should be enough for now:
> >> > Overall score: Tiden: 22, Ducktape: 14.
> >> >
> >> > Time for discussion!
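For the "Cluster control" criterion, "address the cluster as a whole and execute remote commands in parallel" can be pictured with a stdlib-only sketch like the one below. It shells out to plain ssh via subprocess; this is an illustration, not Tiden's real API:

# Illustrative sketch of cluster-wide parallel remote execution.
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh(host, command):
    result = subprocess.run(["ssh", host, command],
                            capture_output=True, text=True, check=True)
    return result.stdout


def cluster_exec(hosts, command):
    # One ssh session per host, all running concurrently;
    # returns a host -> output mapping.
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        outputs = pool.map(lambda host: ssh(host, command), hosts)
        return dict(zip(hosts, outputs))

# Example usage: collect JVM process lists from every host at once.
# processes = cluster_exec(["host-1", "host-2"], "jps -lm")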
> >> >
> >> > ---
> >> > [1] - https://www.testcontainers.org/
> >> > [2] - http://arquillian.org/guides/getting_started/
> >> > [3] - https://jmeter.apache.org/index.html
> >> > [4] - https://openjdk.java.net/projects/code-tools/jmh/
> >> > [5] - https://gatling.io/docs/current/
> >> > [6] - https://github.com/gridgain/yardstick
> >> > [7] - https://github.com/gridgain/poc-tester
> >> > [8] - https://cwiki.apache.org/confluence/display/KAFKA/System+Test+Improvements
> >> > [9] - https://github.com/gridgain/tiden
> >> > [10] - https://pypi.org/project/jenkins-job-builder/
> >> > [11] - https://github.com/mshonichev/tiden_examples
> >> >
> >> > On 25.05.2020 11:09, Nikolay Izhikov wrote:
> >> >> Hello,
> >> >>
> >> >> A branch with ducktape has been created - https://github.com/apache/ignite/tree/ignite-ducktape
> >> >>
> >> >> Anyone willing to contribute to the PoC is welcome.
> >> >>
> >> >>> On 21 May 2020, at 22:33, Nikolay Izhikov <nizhikov....@gmail.com> wrote:
> >> >>>
> >> >>> Hello, Denis.
> >> >>>
> >> >>> There is no rush with these improvements.
> >> >>> We can wait for Maxim's proposal and compare the two solutions :)
> >> >>>
> >> >>>> On 21 May 2020, at 22:24, Denis Magda <dma...@apache.org> wrote:
> >> >>>>
> >> >>>> Hi Nikolay,
> >> >>>>
> >> >>>> Thanks for kicking off this conversation and sharing your findings with the results. That's the right initiative. I do agree that Ignite needs to have an integration testing framework with the capabilities you listed.
> >> >>>>
> >> >>>> As we discussed privately, I would only check whether, instead of Confluent's Ducktape library, we can use the integration testing framework developed by GridGain for testing Ignite/GridGain clusters. That framework has been battle-tested and might be more convenient for Ignite-specific workloads. Let's wait for @Maksim Shonichev <mshonic...@gridgain.com>, who promised to join this thread once he finishes preparing the usage examples of the framework. To my knowledge, Max has already been working on that for several days.
> >> >>>>
> >> >>>> -
> >> >>>> Denis
> >> >>>>
> >> >>>>
> >> >>>> On Thu, May 21, 2020 at 12:27 AM Nikolay Izhikov <nizhi...@apache.org> wrote:
> >> >>>>
> >> >>>>> Hello, Igniters.
> >> >>>>>
> >> >>>>> I created a PoC [1] for the integration tests of Ignite.
> >> >>>>>
> >> >>>>> Let me briefly explain the gap I want to cover:
> >> >>>>>
> >> >>>>> 1. For now, we don't have a solution for automated testing of Ignite on a «real cluster».
> >> >>>>> By «real cluster» I mean a cluster «like production»:
> >> >>>>> * client and server nodes deployed on different hosts,
> >> >>>>> * thin clients performing queries from some other hosts,
> >> >>>>> * etc.
> >> >>>>>
> >> >>>>> 2. We don't have a solution for automated benchmarks of some internal Ignite processes:
> >> >>>>> * PME
> >> >>>>> * rebalance.
> >> >>>>> This means we don't know: do we perform rebalance (or PME) in 2.7.0 faster or slower than in 2.8.0 for the same cluster?
> >> >>>>>
> >> >>>>> 3. We don't have a solution for automated testing of Ignite integrations in a real-world environment:
> >> >>>>> the Ignite-Spark integration can be taken as an example.
> >> >>>>> I think some ML solutions should also be tested in real-world deployments.
> >> >>>>>
> >> >>>>> Solution:
> >> >>>>>
> >> >>>>> I propose to use the ducktape library from Confluent (Apache 2.0 license).
> >> >>>>> I tested it both on a real cluster (Yandex Cloud) and in a local environment (Docker), and it works just fine.
> >> >>>>>
> >> >>>>> The PoC contains the following services:
> >> >>>>>
> >> >>>>> * A simple rebalance test:
> >> >>>>> Start 2 server nodes,
> >> >>>>> Create some data with an Ignite client,
> >> >>>>> Start one more server node,
> >> >>>>> Wait for the rebalance to finish.
> >> >>>>> * A simple Ignite-Spark integration test:
> >> >>>>> Start 1 Spark master, start 1 Spark worker,
> >> >>>>> Start 1 Ignite server node,
> >> >>>>> Create some data with an Ignite client,
> >> >>>>> Check the data in an application that queries it from Spark.
> >> >>>>>
> >> >>>>> All tests are fully automated.
> >> >>>>> Log collection works just fine.
> >> >>>>> You can see an example of the test report at [4].
> >> >>>>>
> >> >>>>> Pros:
> >> >>>>>
> >> >>>>> * The ability to test local changes (no need to publish changes to some remote repository or similar).
> >> >>>>> * The ability to parametrize the test environment (run the same tests on different JDKs, JVM params, configs, etc.)
> >> >>>>> * Isolation by default, so system tests are as reliable as possible.
> >> >>>>> * Utilities for pulling up and tearing down services easily in clusters in different environments (e.g. local, custom cluster, Vagrant, K8s, Mesos, Docker, cloud providers, etc.)
> >> >>>>> * Easy to write unit tests for distributed systems.
> >> >>>>> * Adopted and successfully used by another distributed open source project - Apache Kafka.
> >> >>>>> * Collects results (e.g. logs, console output).
> >> >>>>> * Reports results (e.g. expected conditions met, performance results, etc.)
> >> >>>>>
> >> >>>>> WDYT?
> >> >>>>>
> >> >>>>> [1] https://github.com/nizhikov/ignite/pull/15
> >> >>>>> [2] https://github.com/confluentinc/ducktape
> >> >>>>> [3] https://ducktape-docs.readthedocs.io/en/latest/run_tests.html
> >> >>>>> [4] https://yadi.sk/d/JC8ciJZjrkdndg
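For reference, the simple rebalance test described above could look roughly like this in ducktape. This is a hedged sketch: Test, cluster, and wait_until are real ducktape APIs, while IgniteService, DataLoader, add_node, the log path, and the rebalance log message are hypothetical placeholders, not the actual PoC code:

# Hedged sketch of the "simple rebalance test" described above.
from ducktape.mark.resource import cluster
from ducktape.tests.test import Test
from ducktape.utils.util import wait_until


class RebalanceTest(Test):
    @cluster(num_nodes=4)
    def test_simple_rebalance(self):
        ignite = IgniteService(self.test_context, num_nodes=2)  # hypothetical
        ignite.start()                      # start 2 server nodes

        loader = DataLoader(self.test_context, ignite)          # hypothetical
        loader.run()                        # create some data

        joined = ignite.add_node()          # hypothetical: one more node
        wait_until(                         # wait for the rebalance to finish
            lambda: joined.account.ssh_output(
                "grep -c 'Completed rebalance' /mnt/ignite/ignite.log"
                " || true").strip() not in (b"", b"0"),
            timeout_sec=120,
            err_msg="Rebalance did not finish in time")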