
Recently there was a bit of discussion on how to write tests for operators
that will result in good coverage and high confidence in the results of the
CI. Experience from past releases shows that operators with good
coverage are less likely to break for users due to subsequent
changes, while those that don't have coverage in the CI (think contrib) are
likely to break even due to trivial changes that would otherwise be
easily caught.

IMO writing good tests is as important as the operator main code (and
documentation and examples). It was also part of the maturity framework
that Ashwin proposed a while ago (Ashwin, maybe you can also share a few
points). I suggest we expand the contribution guidelines to reflect an
agreed set of expectations that contributors can follow when submitting PRs
or even come up with a checklist for submitting PRs:


Here are a few recurring problems and suggestions in no particular order:

   - Unit tests are for testing small pieces of code in isolation ("unit").
   Running a DAG in embedded mode is not a unit test, it is an integration
   test.
   - When writing an operator or making changes to fix bugs etc., it is
   recommended to write or modify the granular test that exercises this change
   and as little as possible around it. This happens before writing or running
   an application and can be done in fast iterations inside the IDE without
   extensive test data setup or application assembly.
   - When an operator consists of multiple other components, then testing
   for those should also be broken down into units. For example, managed state
   is not tested by testing the dedup or join operators (which are special use
   cases), but through separate tests that exercise the full spectrum (or at
   least close to it) of managed state.
   - So what about serialization, don't I need to create a DAG to test it?
   You only need Kryo to test serialization of an operator. Use the existing
   utilities or contribute to utilities that are shared between tests.
   - Don't I need to run a DAG to test the lifecycle of an operator? No,
   the sequence of calls to an operator's lifecycle methods is documented (or
   how else would I implement an operator to start with). There are quite a
   few tests that "execute" the operator directly. They have access to the
   state and can assert that with a certain process invocation the expected
   changes occur. That is much more difficult when running a DAG.
   - I have to write a lot of code to do such testing and possibly I will
   forget some calls? Not when following test driven development. IMO that
   mostly happens when tests are written as an afterthought, and that's a
   waste of time. I would suggest, though, developing a single operator test
   driver that ensures all methods are called as a basic sanity check.
   - Integration tests: with proper unit test coverage, the integration
   test is more like an example of how to use an operator. Nice for users,
   because they can use it as a starting point for writing their own app,
   including the configuration.
   - I wrote a nice integration test app with configuration. It runs for
   exactly <n> seconds (localmode.run(n)), returns, and all looks green. It
   even prints some nice stuff in the console. What's wrong? You have not
   tested anything! An operator may fail in setup and the test still passes.
   Travis CI is not reading the console (instead, it will complain that tests
   are filling up 4MB too fast, and really important logs get buried).
   Instead, assert in your test code that the DAG execution produces the
   expected results. Instead of waiting for <n> seconds, wait until the
   expected results are in and cap it with a timeout. This is yet another
   area where a few utilities for recurring test code will come in handy.
   - Tests sometimes fail, but they work on my local machine? Every
   environment is different and good tests don't depend on environment
   specific factors (timing dependency, excessive resource utilization etc.).
   It is important that tests pass in the CI consistently and that issues
   found there are investigated and fixed. Isn't it nice to see the green
   check mark in the PR instead of having to close/reopen several times so
   that the unrelated flaky test does not fail? If we collectively track and
   fix such failures, life will be better for everyone.
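
To make the "wait until expected results are in and cap it with a timeout"
suggestion concrete, here is a minimal sketch of the kind of shared test
utility I have in mind. It is plain Java and does not use any Apex API; the
class name AwaitResults, the method until() and the in-memory list standing
in for an application's output are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BooleanSupplier;

// Hypothetical test helper (not part of Apex): poll for a condition, capped by a timeout.
final class AwaitResults {
  private AwaitResults() {}

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition was met, false on timeout or interruption.
   */
  static boolean until(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
    final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out: the caller should fail the test
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a sink that the application under test writes to (assumption).
    List<String> results = new CopyOnWriteArrayList<>();
    Thread app = new Thread(() -> {
      results.add("a");
      results.add("b");
    });
    app.start();

    // Wait for the expected output instead of sleeping for a fixed <n> seconds.
    boolean done = until(() -> results.size() >= 2, 5_000, 10);
    app.join();

    // Assert on the actual results; a console printout alone proves nothing.
    if (!done || !results.equals(Arrays.asList("a", "b"))) {
      throw new AssertionError("unexpected results: " + results);
    }
    System.out.println("results arrived: " + results);
  }
}
```

In a real integration test the condition would check whatever the
application writes to (files, a collector, a test sink), and the test would
fail when until() returns false. The test then finishes as soon as the
results are in and fails quickly when they never arrive.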

Looking forward to feedback, additions and most importantly volunteers who
will help make the Apex CI better.

