Proposal: deploy ChaosMesh on APISIX, to simulate more faults

Shuyang Wu Sun, 15 Nov 2020 20:45:03 -0800

Hi Community,

Currently, we have unit tests, integration tests, and e2e tests to ensure
the fault tolerance of APISIX. But some problems, like network delay and
CPU stress, are not covered by these tests. Thus, it would be a good idea
to introduce chaos engineering to simulate different types of faults and
test how APISIX behaves under these circumstances.


To introduce chaos engineering, ChaosMesh[1] could be a good choice for us.
It has several advantages over other chaos engineering tools:

   1. ChaosMesh is a CNCF sandbox project with quite an active community,
   so the project is likely to keep improving and we could get help when
   needed.
   2. ChaosMesh supports GitHub Actions, so once we set up the workflow for
   this integration, it would be easy to run the tests as part of our daily
   work.
   3. ChaosMesh already supports many types of chaos and is adding more.
   Although we might not need that many for now, it is a plus when we
   decide to test more with it.
   BTW, the chaos types ChaosMesh supports[2] for now (Nov. 16, 2020)
   include pod chaos, network chaos, stress chaos, io chaos, time chaos,
   kernel chaos, HTTP chaos, and DNS chaos.

Following the principles of chaos engineering, there are two main parts we
need to care about: 1. what we should test and 2. how to prove correctness
after chaos injection.

As for what we have so far, the problems we currently encounter and need to
simulate are:

   1. the connection with etcd is unstable
   2. etcd failure
   3. problems when CPU/memory/disk is under stress
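
To make the first item more concrete, here is a minimal sketch (not a
tested setup) of a NetworkChaos manifest that delays traffic from APISIX
pods to etcd pods, built as a Python dict and printed as JSON so it could be
passed to kubectl. The namespace "apisix", the "app" labels, the object name
and the 200ms latency are all hypothetical, and the spec fields follow my
understanding of the chaos-mesh.org/v1alpha1 CRD, so they should be checked
against the ChaosMesh docs for the version we deploy.

```python
import json


def etcd_delay_manifest(namespace="apisix", latency="200ms"):
    """Build a NetworkChaos manifest that delays APISIX -> etcd traffic.

    Assumptions (hypothetical, adjust to the real deployment):
    - APISIX pods carry the label app=apisix, etcd pods app=etcd
    - both run in the same namespace
    No duration is set, so the delay stays until the object is deleted.
    """
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "NetworkChaos",
        "metadata": {"name": "apisix-etcd-delay", "namespace": namespace},
        "spec": {
            "action": "delay",
            "mode": "all",  # apply to every pod matched by the selector
            "selector": {
                "namespaces": [namespace],
                "labelSelectors": {"app": "apisix"},
            },
            "delay": {"latency": latency},
            "direction": "to",  # only traffic going to the target pods
            "target": {
                "mode": "all",
                "selector": {
                    "namespaces": [namespace],
                    "labelSelectors": {"app": "etcd"},
                },
            },
        },
    }


if __name__ == "__main__":
    # JSON is accepted by kubectl, e.g. piped into `kubectl apply -f -`
    print(json.dumps(etcd_delay_manifest(), indent=2))
```

An etcd failure could probably be sketched the same way with a pod chaos
object, but that is left out here.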

And the methods to test correctness include:

   1. error logs of Nginx and APISIX
   2. whether the CPU/memory usage of APISIX is abnormally high
   3. whether wrk benchmarking fails
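
The three checks above could be automated roughly as follows. This is only
a sketch under some assumptions: the function names are hypothetical, the
log check counts lines at Nginx level "error" or higher, the wrk check looks
for the "Socket errors" and "Non-2xx or 3xx responses" lines in wrk's
summary output, and "abnormally high" is simplified to "more than twice a
baseline", which is an arbitrary threshold we would need to tune.

```python
import re

# Nginx log levels we treat as a failure signal
ERROR_LEVELS = ("error", "crit", "alert", "emerg")
LEVEL_RE = re.compile(r"\[(\w+)\]")
SOCKET_RE = re.compile(
    r"Socket errors: connect (\d+), read (\d+), write (\d+), timeout (\d+)"
)
STATUS_RE = re.compile(r"Non-2xx or 3xx responses:\s*(\d+)")


def count_error_lines(log_text):
    """Count log lines whose first [level] tag is error or more severe."""
    count = 0
    for line in log_text.splitlines():
        match = LEVEL_RE.search(line)
        if match and match.group(1) in ERROR_LEVELS:
            count += 1
    return count


def usage_abnormal(samples, baseline, factor=2.0):
    """Return True if any CPU/memory sample exceeds baseline * factor."""
    return max(samples) > baseline * factor


def wrk_failed(wrk_output):
    """Return True if wrk reported socket errors or non-2xx/3xx responses."""
    sock = SOCKET_RE.search(wrk_output)
    if sock and sum(int(n) for n in sock.groups()) > 0:
        return True
    status = STATUS_RE.search(wrk_output)
    return bool(status and int(status.group(1)) > 0)
```

A chaos test would then collect the error log, resource samples and wrk
output while the fault is injected, and fail if any of the three functions
reports a problem.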

You are welcome to suggest other problems or correctness checks that you
might find useful for this~


[1] https://chaos-mesh.org/

[2] https://chaos-mesh.org/docs/chaos_experiments


Thanks,

Shuyang Wu
