On 21/05/15 03:31, Ryota Ozaki wrote:
As it happens, last night I needed to get the dmesg from a rump kernel
running on bare metal, so I made a trivial adjustment to make the
kern.msgbuf sysctl node available.  Now you can get the log slightly more
easily than with gdb by using sysctl -r kern.msgbuf (I assume hijacking
dmesg(1) would also work on NetBSD).

rump.sysctl -r kern.msgbuf works (though some NUL bytes appear before the actual
kernel logs),

That's expected: it's how kern.msgbuf works (before the ring buffer fills, after which you encounter another weird issue).
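
For reading such a dump, the padding can simply be filtered out. A minimal sketch, assuming a POSIX shell and tr(1); the helper name msgbuf_text is made up for illustration and is not part of rump or NetBSD:

```shell
# msgbuf_text (hypothetical helper): copy stdin to stdout with all NUL
# bytes removed, so that a raw kern.msgbuf dump reads as plain text.
msgbuf_text() {
    tr -d '\000'
}
```

Usage would be along the lines of: rump.sysctl -r kern.msgbuf | msgbuf_text. This only hides the NUL padding; it does nothing about the issue that shows up once the ring buffer has filled.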

however, dmesg with hijacking doesn't: dmesg still shows
the host's kernel logs.

Are you hijacking the sysctl system call, i.e. RUMPHIJACK=sysctl dmesg?
(Unlike with path- or fd-based system calls, hijacking sysctl is all-or-nothing. It might be possible to add some pseudo-MIB path handling to the sysctl hijacker so that only rump.foo.bar gets hijacked, but I don't really see the point of spending a few days on that.)

Can you sketch a bit how you'd integrate the feature with ATF?  For example,
do you plan to always include the log of all rump kernels started by ATF in
the test output (how?) and leave parsing to a human reading the logs, or
will the logcat be executed only by a failure handler, or something
different?

I thought I would use kernel logs for debugging a kernel with ATF tests, so I
wouldn't output kernel logs by default in ATF (though it may be useful).

I imagined the following scenario:
- Modify the code of the kernel and test it with ATF tests
- Find a regression via an ATF test
- Want to debug the kernel with the ATF test
- Try printf debugging of the kernel (add printf to the code)
- Let rump_server(s) dump kernel logs at the cleanup phase (or somewhere)
   (this may be done by just enabling a debug flag of the test if it supports
    one, or by adding some code to the test if not)
- Run the ATF test again, see the output and debug it
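
The cleanup-phase step in the scenario above could look roughly like this. It is a sketch under assumptions, not a definitive implementation: the helper name dump_kernel_log, the DEBUG convention, and the LOGCAT override are all hypothetical, and the default command is the rump.sysctl invocation mentioned earlier in this thread (which expects RUMP_SERVER to point at the rump kernel in question).

```shell
# dump_kernel_log (hypothetical helper): print the rump kernel message
# buffer, but only when DEBUG is set to something other than 0 or empty.
# LOGCAT (assumed override) replaces the default dump command; the default
# relies on RUMP_SERVER pointing at the rump kernel to inspect.
dump_kernel_log() {
    case "${DEBUG:-0}" in
    0) return 0 ;;    # debugging disabled: stay quiet
    esac
    # Unquoted on purpose so the command string splits into words.
    ${LOGCAT:-rump.sysctl -r kern.msgbuf} | tr -d '\000'
}
```

In an sh-based test this would be called from the test case's cleanup function while the rump_server is still running, so that a rerun with the flag enabled shows the added printfs.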

If printf debugging is not sufficient, move on to gdb, though I have
no idea how to use gdb with ATF tests easily.

Using gdb with ATF is very easy. I added support to ATF so that it internally uses gdb to print a stack trace in case a C test program creates a coredump. So every time a test program crashes, you are using gdb with ATF ;)

More seriously though, IMO the biggest usability problem with ATF is that while it works great when the tests work or fail in expected ways, it's difficult to debug the tests (or system under test) when that's not the case. It used to be completely impossible to use gdb with ATF tests; at least now it's somewhat possible.

So, yes, I completely agree that being able to iterate with printf debugging would solve >50% of "debugging the test" problems, at least for tests which use rump kernels and when the problem is in a kernel component.

Support of kernel log output in ATF would be like this:
http://www.netbsd.org/~ozaki-r/atf-dmesg.diff

Isn't it very inconvenient to do that dance individually for every test? Also, I can imagine logcat support going out of date in tests if the test is normally run without DEBUG. Furthermore, it would be desirable to be able to enable dmesg output in any [conforming] test without having to start modifying the test.

So, I am thinking that maybe there should be some higher-level construct for running rump_server in ATF tests, something like atf_rump_server. And I'm thinking that once that higher level construct has been specified after short experimentation, we might notice that -L is not really what we wanted (or we might notice that it is).

Now, understandably, defining such testing abstractions may not be what you want to spend time on now, though I think it would quickly start saving a lot of time, especially if you want to introduce the log capability to a large number of tests. So, if neither you nor anyone else has any ideas on what the higher-level construct should be, I can add support for -L. ... anyone?

I think -L is fine, but I'd like to see at least one concrete example -- and
preferably more than one -- on how you plan to use the feature to make sure
-L is really the best possible way instead of just fine.

Well, another example would be to use kernel logs in the test itself,
e.g., atf_check -s exit:0 -o match:'something' dmesg. I'm not sure
such tests are proper.

I think they not only are proper, they also are desirable. Sometimes there is no good way to retrieve information from the kernel. For example, I think dev/scsipi/t_cd.c would benefit from that capability.

The counter-argument one hears is that then you can't change kernel printfs, but who really changes printfs that often, and even if they are changed, the test can be quickly fixed to conform.
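
The kind of check discussed above, matching a pattern against the kernel log, can be sketched outside ATF like this. Assumptions: the helper name log_matches and the message used in the usage line are invented for illustration, and the log is taken from stdin so that any source (dmesg, rump.sysctl -r kern.msgbuf) can feed it.

```shell
# log_matches (hypothetical helper): succeed if some line of the log read
# from stdin matches the extended regular expression given as $1.
# NUL bytes are dropped first so that the padding in a raw kern.msgbuf
# dump does not get in the way of the match.
log_matches() {
    tr -d '\000' | grep -E -q -e "$1"
}
```

Usage would be along the lines of: rump.sysctl -r kern.msgbuf | log_matches 'some expected message' || echo 'message not found'. Keeping the expected pattern in one place per test is what makes the fix quick if a kernel printf is ever reworded.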

  - antti
