On 12/05/2014 11:51 PM, Nathaniel McCallum wrote:
So I've been working this week on porting the OTP tests that were
manually coded to the new pytest framework. Now, I'm the first to want
better tooling to make our job easier. But, pytest? Meh. I'm having to
write just as much code (or more) to get my tests working under pytest,
and the result is riddled with subtle bugs.

Hi,
Yeah, IPA runs into specific situations that are very different from how tests usually work in other projects. And you took one of the harder cases. IPA's existing tests also don't really follow best practices – they pretty much assume you run the whole test suite, or at least the whole module, so selecting individual tests is currently not reliable. On the other hand, this assumption made some things easy to code – like complicated ordering and setup/teardown.

One thing we want to do during the move to pytest is to have the tests be more independent of the current state of the database, and also of each other.

If certain assertions only make sense in a specific order, either put them in the same test*, or write additional code that explicitly handles dependencies and ordering. (Sorry, that's the price to pay for isolated tests – I don't see a way around it.)

If on the other hand the order doesn't matter – any order of any subset of individual tests should pass, which is our goal – but pytest's default ordering makes the suite slow, there is a hook that allows you to override the order: pytest_collection_modifyitems [0]. Stick it in the affected module (or conftest.py file for multiple modules, or a plugin for the whole suite), and sort the items explicitly.
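
For example, a minimal sketch of such a hook in conftest.py – the "order" marker and its numbering are hypothetical, not a built-in pytest feature:

def pytest_collection_modifyitems(config, items):
    # Sort collected tests by a hypothetical @pytest.mark.order(n)
    # marker; unmarked tests sort last and keep their relative
    # order, since list.sort() is stable.
    def sort_key(item):
        marker = item.get_marker("order")
        return marker.args[0] if marker is not None else float("inf")
    items.sort(key=sort_key)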

If your test depends on something not existing or being disabled, you should try removing/disabling it before the test, or write a fixture to do that.
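
For example, a sketch of such a fixture – token_exists and delete_token are hypothetical helpers standing in for whatever the test library provides:

@pytest.fixture
def no_leftover_token(request):
    # Make sure the token this test assumes absent really is absent.
    if token_exists('testtoken'):
        delete_token('testtoken')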


* which might well make sense, at least for starters
[0] http://pytest.org/latest/plugins.html#_pytest.hookspec.pytest_collection_modifyitems


Most painful is pytest.fixture(). It is a cool idea, but the scoping
and lifecycle are horribly broken. Here is an example: I want to create
a user, enable OTP in two different ways, and test with two different
types of tokens under two different authentication scenarios. This is a
total of eight tests.

import pytest

@pytest.fixture(scope="function")  # This scope is the default.
def user(request):
    user = create_user('tuser1')
    request.addfinalizer(lambda: delete_user('tuser1'))
    return user

@pytest.fixture(params=["per-user", "global"])
def enablement(request, user):
    output = enable(user, request.param)
    request.addfinalizer(lambda: disable(user))
    return output

@pytest.fixture(params=[{'type': 'HOTP'}, {'disabled': True}])
def token(request, user, enablement):
    output = create_token(request.param)
    request.addfinalizer(lambda: delete_token(output.id))
    return output

@pytest.mark.parametrize("opts", [foo, bar])
def test_authentication(user, token, opts):
    assert auth(user.uid, token.code(), opts)

Because the default scope is "function", a new user is created for every
single test. This is *way* more work than necessary, and leads to slow
runtimes. Fortunately, we can fix this by changing the scope of the user
fixture to "module". In that case, only one user is created per module.
So far so good.
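
Concretely, only the scope argument changes:

@pytest.fixture(scope="module")
def user(request):
    user = create_user('tuser1')
    request.addfinalizer(lambda: delete_user('tuser1'))
    return user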

The enablement scope is still "function", so it will still be set up
eight times (#tokens * #enablements * #opts). Now, this is a problem
because (let's say) switching enablement modes takes a long time. Here
is problem #1 with pytest: it presumes that every variant of a fixture
should be set up and torn down around each combination of test
parameters. This is a bad assumption. It may make sense in some cases,
but there is no way to change it.

Now, we can work around this problem just like with the user fixture:
change the scope to "module". Unfortunately, this introduces a second
bug: enablement() will now only be run twice, but its finalizer will
not be called until all the tests in the module have run. This means
that "per-user" enablement will still be active when "global"
enablement runs; both will be torn down simultaneously.
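
In code, the workaround – and the teardown bug it introduces – looks like this:

@pytest.fixture(scope="module", params=["per-user", "global"])
def enablement(request, user):
    output = enable(user, request.param)
    # With module scope, this finalizer does not run between the
    # "per-user" and "global" variants; both run at the end of the
    # module, so the first enablement is still active during the second.
    request.addfinalizer(lambda: disable(user))
    return output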

This same problem arises for the token fixture: eight tokens will be
created and destroyed when only four are necessary. You guessed it: we
can fix this by changing the scope. But by doing this, all four tokens
will be deleted simultaneously, which means we can never test the state
where only a disabled token exists.

Any attempt to use a fixture to create long-lived state (precisely what
fixtures are designed for) with mutually exclusive parameters is
extremely error-prone, as it is not clear which items are alive in the
LDAP DB at any given point. This leads to both long runtimes *and*
extremely subtle bugs in the tests, multiplied by every layer of
fixtures. Call me nonplussed.

Now, let's look at how this same test can be implemented without pytest:

user = create_user('tuser1')
try:
    for enablement in ('per-user', 'global'):
        enable(user, enablement)
        try:
            for params in [{'type': 'HOTP'}, {'disabled': True}]:
                token = create_token(params)
                try:
                    for opts in [foo, bar]:
                        assert auth(user.uid, token.code(), opts)
                finally:
                    delete_token(token.id)
        finally:
            disable(user)
finally:
    delete_user('tuser1')

Comparing the non-pytest implementation to the pytest one, we can see a
few things. First, the pytest code has some nice separation of roles;
this is the big benefit of pytest. Second, the non-pytest implementation
is less code. Third, the non-pytest implementation has no subtle scoping
bugs: we know exactly when each database record is created and
destroyed.

This "non-pytest" implementation is a *single test*, since it hard-codes dependencies and ordering. If you write it as as a single test function in pytest, it should look much the same, and get the ordering/dependencies right (except the case of the test user already existing).

I *want* to like pytest. Honestly, I do. But this scoping idiocy has
already cost me roughly two days tracking down subtle bugs in the tests
(while not finding any bugs in the OTP code itself). And I can't see how
OTP tests written using pytest will be maintainable; not when such small
changes can cause such hard-to-find bugs.

Now, it may be that I have just missed something obvious. If I have,
forgive me this rant. But I've spent a lot of time reading the docs,
and I haven't seen any obvious solutions to these problems.

Nathaniel

--
Petr³
