A good start would be to revise the archetype to include as many illustrative tests as reasonably possible -- people seem more willing to follow examples than to follow instructions.

Ram

On Sep 12, 2016 5:26 PM, "Thomas Weise" <t...@apache.org> wrote:

Hi,

Recently there was a bit of discussion on how to write tests for operators that will result in good coverage and high confidence in the results of the CI. Experience from past releases shows that operators with good coverage are less likely to break down (with a user) due to subsequent changes, while those that don't have coverage in the CI (think contrib) are likely to suffer breakdown even due to trivial changes that are otherwise easily caught.

IMO writing good tests is as important as the operator main code (and documentation and examples). It was also part of the maturity framework that Ashwin proposed a while ago (Ashwin, maybe you can also share a few points).

I suggest we expand the contribution guidelines to reflect an agreed set of expectations that contributors can follow when submitting PRs, or even come up with a checklist for submitting PRs:

http://apex.apache.org/malhar-contributing.html

Here are a few recurring problems and suggestions in no particular order:

- Unit tests are for testing small pieces of code in isolation ("unit"). Running a DAG in embedded mode is not a unit test, it is an integration test.

- When writing an operator or making changes to fix bugs etc., it is recommended to write or modify the granular test that exercises this change and as little as possible around it. This happens before writing or running an application and can be done in fast iterations inside the IDE without extensive test data setup or application assembly.

- When an operator consists of multiple other components, then testing for those should also be broken down into units. For example, managed state is not tested by testing the dedup or join operator (which are special use cases), but through separate tests that exercise the full spectrum (or at least close to it) of managed state.

- So what about serialization, don't I need to create a DAG to test it?
  You only need Kryo to test serialization of an operator. Use the existing utilities or contribute to utilities that are shared between tests.

- Don't I need to run a DAG to test the lifecycle of an operator? No, the sequence of calls to an operator's lifecycle methods is documented (or how else would I implement an operator to start with). There are quite a few tests that "execute" the operator directly. They have access to the state and can assert that with a certain process invocation the expected changes occur. That is much more difficult when running a DAG.

- I have to write a lot of code to do such testing and possibly I will forget some calls? Not when following test driven development. IMO that mostly happens when tests are written as an afterthought, and that's a waste of time. I would suggest though to develop a single operator test driver that ensures all methods are called, as a basic sanity check.

- Integration tests: with proper unit test coverage, the integration test is more like an example of how to use an operator. Nice for users, because they can use it as a starting point for writing their own app, including the configuration.

- I wrote a nice integration test app with configuration. It runs for exactly <n> seconds (localmode.run(n)), returns, and all looks green. It even prints some nice stuff in the console. What's wrong? You have not tested anything! An operator may fail in setup and the test still passes. Travis CI is not reading the console (instead, it will complain that tests are filling up 4MB too fast, and really important logs go under). Instead, assert in your test code that the DAG execution produces the expected results. Instead of waiting for <n> seconds, wait until the expected results are in and cap it with a timeout. This is yet another area where a few utilities for recurring test code will come in handy.

- Tests sometimes fail, but they work on my local machine?
  Every environment is different, and good tests don't depend on environment specific factors (timing dependency, excessive resource utilization etc.). It is important that tests pass in the CI consistently and that issues found there are investigated and fixed. Isn't it nice to see the green check mark in the PR instead of having to close/reopen several times so that the unrelated flaky test does not fail? If we collectively track and fix such failures, life will be better for everyone.

Looking forward to feedback, additions and, most importantly, volunteers that will help make the Apex CI better.

Thanks,
Thomas
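
The suggestion above to wait until the expected results are in, capped with a timeout, rather than running for a fixed <n> seconds, could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name AwaitResults, the helper await and the list used as a result sink are hypothetical and are not existing Apex or Malhar utilities, and a plain thread stands in for the asynchronously running application.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class AwaitResults {

    /**
     * Hypothetical test helper (not part of Apex or Malhar): polls the
     * condition until it holds, and fails with a TimeoutException if it
     * does not hold before the timeout elapses.
     */
    public static void await(BooleanSupplier condition, long timeoutMillis, long pollMillis)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("condition not met within " + timeoutMillis + " ms");
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the sink that a test application writes its output to.
        List<String> results = new CopyOnWriteArrayList<>();

        // Stand-in for an application that runs asynchronously to the test thread.
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                try {
                    Thread.sleep(20); // simulate processing latency
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                results.add("tuple-" + i);
            }
        });
        producer.start();

        // Return as soon as the results are in; the timeout is only an upper bound.
        await(() -> results.size() >= 3, 5_000, 10);
        producer.join();

        // Assert on the actual output instead of relying on console prints.
        if (!results.equals(Arrays.asList("tuple-0", "tuple-1", "tuple-2"))) {
            throw new AssertionError("unexpected results: " + results);
        }
        System.out.println("received " + results);
    }
}
```

In a real integration test the producer thread would presumably be replaced by the application started in local mode without blocking the test thread, with the application shut down once await returns or throws. A test written this way finishes as soon as the data arrives and fails loudly when an operator never produces output.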