I really like Ansible and have built a large infrastructure around it,
but I'm finding it untrustworthy to the point of being unusable.
In the last 9 months, I've reported 4 variable precedence bugs:
https://github.com/ansible/ansible/issues?utf8=%E2%9C%93&q=is%3Aissue+author%3Adstillman+
The first two were marked as P1 and fixed, and the third was confirmed
as P2 in December but remains open. The last one, which I reported
today, occurs in 1.9.3 but is fixed on devel for 2.0 — and yet devel
appears to break one of the P1 bugs (#9498) again, despite my including
a test case with the original report (as I've done for all of them). The
other P1 bug also disappeared and reappeared a couple of times during 1.8
development as other variable bugs were fixed, which seems to be the
general pattern for these bugs.
If it's not clear, these are incredibly dangerous bugs in production
environments, because they can cause services to silently be rolled out
in the wrong location or with the wrong configuration. (I noticed this
because a service had been deployed to a directory with the name of
another service, resulting in two copies of the service trying to run —
though fortunately this was on a dev machine.) The safest solution I've
found is to configure different roles on the systems separately using
tags, but that somewhat defeats the purpose of a central configuration
management tool (and actually doesn't even avoid the P1 bug that's
broken again on devel, so I guess I should say the safest solution is
not to use variables at all).
It's possible I'm using variables somewhat differently than most people
using Ansible — the bugs I've reported all depend on include_vars within
a role, which I use extensively — but there seem to be quite a few
reports of variable bugs, and none of the issues I've reported have been
marked as invalid.
I don't want to abandon Ansible, but I can't keep using it if I can't
trust it to deploy services correctly. I also shouldn't have to keep my
own set of tests that I run whenever I try a new version just to make
sure dangerous bugs that I've reported previously — with those same
tests — haven't regressed.
If the current variable precedence system is salvageable (and I'm not
convinced it is or should be), it seems like many more integration test
cases are needed, all run in separate processes and — needless to say —
with new ones added whenever variable bugs are found.
(I think a contributing factor here may actually be the layout of the
integration test suite. Most of the test cases I've submitted require
multiple roles, but adding those to the current suite would get messy
quickly, since there's just a single root directory and single roles
directory for all integration tests. I think it'd be much cleaner to use
a subdirectory for each integration test, with a top-level playbook in
each, to keep all test files grouped together and avoid accidental
interactions with other files. That would also make it much simpler to
add people's test contributions.)
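
As a rough sketch of the layout described above, assuming each integration
test lives in its own subdirectory with a top-level playbook and its own
roles directory, a small runner could find the tests and run each one in a
separate ansible-playbook process. The playbook name (test.yml), the
function names, and the localhost inventory are illustrative assumptions,
not part of the existing Ansible test suite.

```python
import subprocess
from pathlib import Path

# Hypothetical convention: every test directory holds a playbook with this name.
PLAYBOOK_NAME = "test.yml"


def discover_tests(root):
    """Return, sorted by name, the subdirectories of root that contain a playbook."""
    return sorted(
        d for d in Path(root).iterdir()
        if d.is_dir() and (d / PLAYBOOK_NAME).is_file()
    )


def build_command(test_dir):
    """Build the ansible-playbook command for one test.

    The command is meant to run with test_dir as the working directory, so
    the playbook is given by name and a roles/ directory next to it is the
    one Ansible picks up. The inline "localhost," inventory and the local
    connection are assumptions made to keep the sketch self-contained.
    """
    return ["ansible-playbook", "-i", "localhost,", "-c", "local", PLAYBOOK_NAME]


def run_all(root):
    """Run every test in its own process and return the names of those that failed."""
    failures = []
    for test_dir in discover_tests(root):
        # A fresh process per test, so no variable state carries over between tests.
        result = subprocess.run(build_command(test_dir), cwd=test_dir)
        if result.returncode != 0:
            failures.append(test_dir.name)
    return failures
```

With something like this, a contributed test case is one new directory, and
run_all("path/to/tests") returns the names of any tests whose playbook
exited with a non-zero status.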
Anyway, I hope something can be done. As it stands now, I'm nervous
every time Ansible runs.