Armin Balalaie created KAFKA-8192:
-------------------------------------
Summary: Using Failify for e2e testing of Kafka
Key: KAFKA-8192
URL: https://issues.apache.org/jira/browse/KAFKA-8192
Project: Kafka
Issue Type: Test
Reporter: Armin Balalaie
Hi,
I am the author of Failify, a test framework for end-to-end testing of
distributed systems. Failify can be used to deterministically inject failures
during normal test case execution. Currently, node failure, network
partition, network delay, network packet loss, and clock drift are supported.
For a few supported languages (currently Java and Scala), it is also possible
to enforce a specific order of events across nodes to reproduce a
time-sensitive scenario, and to inject failures before or after a specific
method is called when a specific stack trace is present. You can find more
information at [https://failify.io|https://failify.io/].
I think Failify would be useful to Kafka for the following reasons:
* It is Docker-based and therefore less messy, and you can run the test cases
on a single node and in parallel (there are plans to support deploying the
same test case on a K8S or Swarm cluster).
* Because it is Docker-based, you can easily have test cases that run on
different OSes. You can also define the services you depend on (e.g. ZK) as
additional nodes in your deployment definition.
* The supported failure kinds are a superset of what Trogdor currently
supports (in particular, network delay and loss, clock drift, and more
sophisticated network partitioning).
* You get more control over when a failure is introduced in a test case.
* You can write your test cases in Java, Scala, or any other language that
runs on the JVM and can use Java libraries.
* It can be easily integrated into your build pipeline, because the test cases
are regular JUnit test cases.
* The API is compact and intuitive, and the tool is well documented.
Please let me know if you want to give it a try.