2014-10-31 11:10 GMT+01:00 David Lang <[email protected]>:

> On Fri, 31 Oct 2014, Rainer Gerhards wrote:
>
>> Let me get straight to the beef (no pun intended): this all sounds really
>> great. Is anyone willing to do the initial work of creating a couple of
>> tests and integrating them into the rsyslog process?
>
> My point in trying to document the types of bugs we are running into was
> to try to define what sorts of tests would be useful to add. Your comments
> confirming what is already in place are also helpful.
>
> My hope is that by getting this information out, someone will be willing
> to help write tests (as singh has been doing with the code he's been
> adding).

Yeah, in general it would be great if we got more tests. It's just the "one
person" issue... I think tests could be added in many areas, and I will
probably experiment a little more with the unit tests myself. But I think
that is hard to do (at least it looked that way the last time I tried)
because of the interdependencies inside the runtime.
So it would be great to have someone else work on that. That would be
especially valuable because I usually test what I can think of -- and since
I thought of it, it of course works. I guess that is the prime reason why
tester and developer should be different people ;)

Rainer

> David Lang
>
>> Rainer
>>
>> 2014-10-31 7:35 GMT+01:00 singh.janmejay <[email protected]>:
>>
>>> I'm trying to see these problems in the light of CI.
>>>
>>> On Fri, Oct 31, 2014 at 10:50 AM, David Lang <[email protected]> wrote:
>>>
>>>> We already have make check being run daily. So it's not on every
>>>> commit, but given the time it takes to run and the pace of merging new
>>>> features into the master tree, daily should be just fine.
>>>>
>>>> The problems that we run into are not easy for the system to find. As
>>>> I understand it, we have a few noticeable pain points.
>>>>
>>>> 1. Distro-specific problems.
>>>>
>>>>    The CentOS boot problem, for example.
>>>>    The ongoing SELinux/etc. issues that we run into are another.
>>>>    Systemd is going to be more common in the future (and it's a moving
>>>>    target).
>>>
>>> My team had to solve a similar problem at one of my ex-employers. Our
>>> CI setup was composed of machines running various environments. We had
>>> machines running CentOS, Windows, Solaris, etc. All of these machines
>>> were running a CI agent/worker, and the build was configured to do
>>> exactly the same thing on each worker. So we basically found issues
>>> before releasing GA versions, provided we had all relevant versions of
>>> the environments running. This kind of infrastructure would require
>>> several machines/VMs, each one running one supported environment. But
>>> it also requires some budget to finance the setup.
>>>
>>> The only other option seems to be what -devel releases are supposed to
>>> facilitate, but since we have had limited success with that, it can't
>>> be counted as a real option.
>>>
>>>> 2.
>>>> Difficulty in setting up a repeatable test environment.
>>>>
>>>>    Output to Oracle or ElasticSearch are two examples. The difficulty
>>>>    is setting up the external data store in such a way that it can be
>>>>    populated, tested, and then reset back to a known good state.
>>>>
>>>>    Other examples are connections to the various MQ systems.
>>>
>>> Common/recommended practice here is to make each test stateless. Most
>>> language/platform-specific testing frameworks provide a notion of setup
>>> and teardown. Large end-to-end integration tests usually have elaborate
>>> setup/teardown steps which involve truncating database tables, deleting
>>> Elasticsearch indices, draining/purging relevant external queues, etc.
>>>
>>> But most value is derived from having two levels of tests.
>>>
>>> The make-check tests that we have are end-to-end integration tests.
>>> Such tests generally suffer from external-dependency flakiness until
>>> the setup/teardown is done very well (e.g. aggressively retrying failed
>>> operations).
>>>
>>> A lower level of test would be unit tests, which test individual
>>> functions, or sets of functions, rather than testing the end-to-end
>>> behaviour of rsyslog. This is where tools such as google-test, cppunit,
>>> etc. come in. These tests are usually quicker to write, as they don't
>>> involve elaborate orchestration. For instance,
>>> regex-property-extractor could have dedicated tests to see if it
>>> handles groups well, handles char-classes well, etc. With 5-10 such
>>> tests guarding different behaviours of regex-property-extractor, we
>>> should have enough confidence that it works standalone. The only thing
>>> an integration test then needs to check is whether it works well when
>>> used end-to-end, for which one test with simple input and output
>>> exercising the match and submatch features would suffice.
>>> Trouble starts when one tries to test all combinations of features
>>> through integration tests, which is usually prohibitively expensive
>>> (not just in terms of running time, but writing/maintaining overhead
>>> too).
>>>
>>>> 3. (extending #2) Contributed modules tend to be especially bad here.
>>>>    They tend to be written for specific cases and have heavy
>>>>    dependencies on external components. Without assistance from
>>>>    someone who knows those external components well, setting up a good
>>>>    test is _hard_.
>>>
>>> Yes, writing integration tests involving multiple systems is painful.
>>> The solution is to rely a lot more on unit tests for functional
>>> correctness and only write a few end-to-end tests to validate that the
>>> integration points work well with each other. Such a test would fail
>>> when a new version of Elasticsearch changes the semantics of a certain
>>> API, while all unit tests would keep passing.
>>>
>>> Also, a large amount of functional testing can be done without
>>> involving external systems by using mock endpoints. For instance, think
>>> of having an endpoint which receives an Elasticsearch bulk-API payload
>>> and lets you validate that it was well-formed and semantically correct.
>>> This test won't fail if Elasticsearch changes its API, but it will fail
>>> if anything in the implementation misbehaves, allowing one to catch a
>>> large set of bugs easily. End-to-end integration tests are then only
>>> required for the last mile.
>>>
>>>> 4. (extending #2) Configuring modules can sometimes be hard.
>>>>
>>>>    The GnuTLS problems are a good example of this sort of thing. Even
>>>>    when tests exist, that just means that those particular
>>>>    configurations work. People trying to use different features of the
>>>>    module (or in this case, the underlying library and certificates)
>>>>    run into new problems.
>>>
>>> Again, unit tests allow one to test one feature standalone very easily.
>>> So generally I'd write tests for each feature working alone with a
>>> minimal set of other features involved, and then cover bad combinations
>>> as they surface (or as we expect them to surface).
>>>
>>>> #1 needs more people running the tests, and just trying to use the
>>>> system in a variety of situations.
>>>>
>>>> #4 needs documentation and more tests written.
>>>>
>>>> Rainer does a pretty good job of testing things before they hit the
>>>> master branch. While I was working at Intuit building their logging
>>>> systems out, I was very comfortable running custom builds from git in
>>>> production with just a short amount of 'smoke test' time. I wish all
>>>> projects had as good a track record.
>>>>
>>>> We really don't have that many real bugs reported, and the vast
>>>> majority of the problems that we do run into tend to be race
>>>> conditions and other things that are fairly hard to trigger. That
>>>> doesn't mean that we don't have occasional problems. The segfault bug
>>>> that a handful of people have reported recently is an example of that,
>>>> but the fact that it's so hard to duplicate means that automated
>>>> testing isn't likely to help a lot.
>>>
>>> I have had some amount of success in guarding against regressions of
>>> race conditions in the past. It generally involves one big ugly test
>>> per race condition, which uses semaphores and wait/notify mechanisms to
>>> make the code run in a certain way that is known to reproduce the
>>> problem. However, this won't help in discovering potential/new race
>>> conditions. In the TDD (test-driven-development) style of working,
>>> people write tests for new features that involve complex concurrency
>>> concerns before/while writing the actual code. The idea is to enumerate
>>> as many race conditions as possible and cover them with tests. This
>>> approach is not bullet-proof, but it works in most cases.
>>>> David Lang
>>>>
>>>> On Fri, 31 Oct 2014, singh.janmejay wrote:
>>>>
>>>>> I'm unsure if I have understood you correctly, but you seem to be
>>>>> thinking of CI as a way of writing tests (like a testing framework,
>>>>> e.g. google-test, make-check, etc.).
>>>>>
>>>>> But actually CI is a process (not a tool/framework). I guess this is
>>>>> the best source of information about it:
>>>>> http://en.wikipedia.org/wiki/Continuous_integration
>>>>>
>>>>> So having a CI process in simple terms means that the integration
>>>>> branch (which happens to be master in our case) and all the other
>>>>> long-lived branches, such as feature branches that live for a few
>>>>> days/weeks, should be monitored by an automated build process.
>>>>>
>>>>> This build process would trigger 'make check' for every new commit
>>>>> (people often use a git-push based post-update hook to notify the
>>>>> build server). The build would be reported on a CI dashboard, which
>>>>> tells people whether the build was green or red.
>>>>>
>>>>> CI servers also go one step further and support output-artifact
>>>>> integration, so developers can see which tests failed, why they
>>>>> failed, etc.
>>>>>
>>>>> So no changes are required to the way the rsyslog codebase is right
>>>>> now. We have a directory full of tests, we have a command that
>>>>> returns a non-zero exit code for failing tests, and this setup is
>>>>> sufficient for us to set up an automated build process.
>>>>>
>>>>> We should check if Travis can run 'make check' for us.
>>>>>
>>>>> On Thu, Oct 30, 2014 at 8:38 PM, Rainer Gerhards
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> 2014-10-30 13:35 GMT+01:00 singh.janmejay <[email protected]>:
>>>>>>
>>>>>>> +1 for testsuite and CI.
>>>>>> I tried to get a quick glimpse at CI, but apparently not only the
>>>>>> rsyslog doc is full of acronyms and references that speak only to
>>>>>> those in the field ;)
>>>>>>
>>>>>> Can someone tell me in a few words what a CI test to do the
>>>>>> following looks like:
>>>>>>
>>>>>> - spin up a rsyslogd listener instance "LST"
>>>>>> - spin up a rsyslogd sender instance "SND"
>>>>>> - make "SND" send a couple of thousand messages to LST
>>>>>> - stop both instances *when work is finished*
>>>>>> - count whether LST received all messages and did so in the correct
>>>>>>   format
>>>>>>
>>>>>> This is one of the basic tests in the current rsyslog testbench.
>>>>>>
>>>>>> I would appreciate it if a control file for CI could be provided, so
>>>>>> that I can judge the effort to be made if I were to convert to that
>>>>>> system (it sounds interesting). Is there some volunteer who would
>>>>>> migrate the current testbench and educate me ... in the way James
>>>>>> has done this for the rsyslog-doc project?
>>>>>>
>>>>>> Thanks,
>>>>>> Rainer
>>>>>>
>>>>>>> +1 for time-based releases.
>>>>>>>
>>>>>>> It'll be valuable to have ad-hoc minor/micro releases. Release
>>>>>>> feature by feature, or a few features/fixes at a time, type of
>>>>>>> thing.
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Janmejay
>>>>>>>
>>>>>>> PS: Please blame the typos in this mail on my phone's uncivilized
>>>>>>> soft keyboard sporting its not-so-smart-assist technology.
>>>>>>>
>>>>>>> On Oct 30, 2014 6:01 PM, "Thomas D." <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> +1 for an always-working "master" branch.
>>>>>>>>
>>>>>>>> Do your work in feature branches. Only merge when you think the
>>>>>>>> changes are ready. Don't merge when you think "I am ready, but
>>>>>>>> this needs testing".
>>>>>>>> Regarding the testing problem in general: I would stop adding new
>>>>>>>> features for a while and spend more time improving code quality.
>>>>>>>> Try to find/create a working test suite. There will always be the
>>>>>>>> problem that nobody will test your stuff, so you need a way to
>>>>>>>> write tests. Yes, creating a test suite will take some time. But
>>>>>>>> in the end it will improve the software quality and boost your
>>>>>>>> development.
>>>>>>>>
>>>>>>>> Remember that you can use CI for free with GitHub. Every pull
>>>>>>>> request could be automatically tested for you...
>>>>>>>>
>>>>>>>> +1 for a time-based release approach.
>>>>>>>>
>>>>>>>> PS: Debian Jessie will freeze at 23:59 UTC on the 5th of November
>>>>>>>> 2014. Just a reminder if you want to see some of the recent
>>>>>>>> changes in Jessie...
>>>>>>>>
>>>>>>>> -Thomas
>>>>>>>> _______________________________________________
>>>>>>>> rsyslog mailing list
>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>>>> http://www.rsyslog.com/professional-services/
>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>>>> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
>>>>>>>> POST if you DON'T LIKE THAT.
>>> --
>>> Regards,
>>> Janmejay
>>> http://codehunk.wordpress.com
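Regarding singh's suggestion above to check whether Travis can run 'make
check', and Rainer's request for a CI control file: a minimal .travis.yml
could look like the sketch below. This is an untested sketch; the package
list and configure flags are assumptions, not a verified configuration.

```yaml
# Untested sketch of a Travis CI control file for running rsyslog's
# "make check". Package names and configure flags are assumptions.
language: c
install:
  - sudo apt-get update -qq
  - sudo apt-get install -qq libestr-dev libjson0-dev uuid-dev zlib1g-dev
script:
  - autoreconf -fvi
  - ./configure --enable-testbench
  - make
  - make check
```

With something like this committed to the repository, Travis would run the
existing testbench on every push and pull request, reporting green/red on
its dashboard exactly as described above.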

