On 26 June 2016 at 09:40, D. Hugh Redelmeier <[email protected]> wrote:
> | From: Paul Wouters <[email protected]>
>
> | > On Jun 26, 2016, at 03:49, D. Hugh Redelmeier <[email protected]> wrote:
> | >
> | > I'm quite unhappy about uncaptured core files.  At the moment, I'm
> | > not willing to say what a good solution would be.
> |
> | That's not quite was it is.
>
> I'm not sure what you mean by that sentence.

My "recollection" of the issue on IRC was wrong.  This e-mail has the
correct details.

> | Not all tests perform a shutdown and so some
> | errors in a shutdown that could cause a core would theoretically not be
> | detected". Any core files made are always detected.
>
> I don't know what you are claiming.
>
> When I ran all the tests Friday morning (using kvmrunner.py), several
> assertion failures showed up but no core files were preserved.  All
> failures were of the same assertion (which we know about).
>
> I'm a dumb user.  I don't know why this happened.
>
> But I'll make a sequence of guesses:
> - the assertion failed during shutdown of Pluto

The entire system was shutting down; pluto didn't quite make it and
dumped core.

> - maybe shutdown was performed by final.sh (I don't know)

You shouldn't need to know.  However, to be specific, final.sh scripts
sometimes do the following:

- shut down pluto
- check for core files and save them

but some skip one or both steps.  Regardless, it shouldn't matter.

> - the pluto log file captured the ASSERT failure

By luck we've this bread crumb.  If pluto were to dump core for some
other reason, we'd have nothing.

> - the core file doesn't get captured in this case.

Right, per above.  Capturing core files is an optional part of
final.sh, and that check can run before pluto exits.

> | As antony said, it is VERY useful to run a single test and then ssh in
> | while the tunnel is still up.
>
> Yes.  So maybe we need to cleanly separate shutdown into a separate step
> and have the script-runner capable of stopping at any designated step.
> For an example of this kind of control, look at rpmbuild's -b parameter.

Yea, an explicit option like --stop-before <script>.  However, I view
that as a "nice to have".  To me the "must have" is consistent default
behaviour whether running an individual test or a group of tests.
For instance:

  ./testing/utils/kvmrunner.py testing/pluto
  ./testing/utils/kvmrunner.py testing/pluto/basic-pluto-01
  ./testing/utils/kvmrunner.py testing/pluto/basic-pluto-*

should all run the tests the same way.

> | We have a bunch of tests running shutdown
> | but don't for the majority of tests. I think that's fine.
>
> I don't.  The reason is that each test tests different paths through the
> system and each might cause different problems that linger undetected
> until shutdown.
>
> I come to that conclusion honestly: I don't have a core dump for any of
> these particular assertion failures.  But I might be mis-diagnosing.
>
> There is a chance that kvmrunner.py needs some added code to make me
> happy.  I know that it isn't an advertised component of our test system.
> Do you see core dumps for these assertion failures?

So far the best solution I've seen involves always shutting down pluto
_before_ shutting down the entire system (if systemd is causing pluto
to crash we've another problem).  Perhaps final.sh should be required
to run a new script "swan-destroy", or perhaps that should be run
outside of the *.sh scripts.
_______________________________________________
Swan-dev mailing list
[email protected]
https://lists.libreswan.org/mailman/listinfo/swan-dev
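To make the "always shut down pluto first, then look for cores" ordering
concrete, here is a rough sketch of what a swan-destroy style step might
look like.  None of this is existing kvmrunner.py code: the names
swan_destroy and collect_cores, the "core*" file pattern, and the
directory layout are all assumptions made up for illustration, and the
actual command that stops pluto is deliberately left to the caller.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def collect_cores(search_dir: Path, output_dir: Path,
                  pattern: str = "core*") -> List[Path]:
    """Copy files matching `pattern` (an assumed naming convention)
    from search_dir into output_dir and return the saved copies."""
    output_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for core in sorted(search_dir.glob(pattern)):
        if core.is_file():
            saved.append(Path(shutil.copy2(core, output_dir / core.name)))
    return saved


def swan_destroy(stop_pluto: Callable[[], None],
                 search_dir: Path, output_dir: Path) -> List[Path]:
    """Hypothetical post-test step, run for every test by default.

    stop_pluto is whatever the harness uses to stop the daemon; it is
    called first so that a core dumped during shutdown already exists
    when the directory is scanned.
    """
    stop_pluto()
    return collect_cores(search_dir, output_dir)
```

The point of taking stop_pluto as a parameter is only that the ordering
is enforced in one place, so it no longer depends on what each test's
final.sh happens to do.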
