On 05/08/07 16:38, Garrett D'Amore wrote:
[snip]
This sounds like an "edge" case to me. I.e. using kstats to attempt to locate a kernel bug. (Kstats are unlikely to explain the "why" for a failure-to-detach, or at least, they are for a driver that is detaching, since the kstats of interest are probably still in the kernel at detach time.) (Generally clobbering/freeing kstats is one of the last things a driver does on the way out. Note that in the nemo case this isn't quite precisely true... but there may be some adjustment we can do there.)

The second question I have is, where is this most useful? If it is only for DEBUG kernels, I can imagine that we could create a DEBUG behavior where kstat_delete() doesn't really delete the stat, but puts it into some kind of historical archive. (Keeping the most recent, or N most recent, stats from each driver.) This could facilitate debug, without confusing _administrative_ use, and without interfering with normal driver use. Further, this functionality could (should?) be made independent of KSTAT_FLAG_PERSISTENT (or any other flag). And it doesn't have to be DEBUG kernels only, it could be tunable via an /etc/system value (historical_kstats = 1 in /etc/system?)

I'm thinking of production kernels, not DEBUG ones.  That is, I am considering
service etc. having to root-cause a failure based on whatever evidence can be
found, and not thinking of a developer or similar trying to locate
a kernel bug.  Admittedly it would be nicer to have some more well-defined
history trail, but you take what you can get in post-mortem debugging.
Kstats quite often include error count info, reset info, etc., which may
be relevant or at least interesting in debugging failed DRs etc.
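
To make the "N most recent" archive idea from the quoted text concrete, here
is a rough userland sketch in C of a fixed-depth ring of snapshots.  It is not
kernel code and not a proposed implementation: hist_kstat_t, hist_archive_t,
hist_archive_put(), hist_archive_get() and HIST_DEPTH are all made-up names,
and the historical_kstats variable only stands in for the /etc/system tunable
suggested above.  A real version would also have to copy the kstat data and
deal with locking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	HIST_DEPTH	4	/* hypothetical: N most recent snapshots kept */
#define	HIST_NAMELEN	32

/* Hypothetical stand-in for "historical_kstats = 1" in /etc/system. */
static int historical_kstats = 1;

/* Hypothetical snapshot of a deleted stat; real data would be copied too. */
typedef struct hist_kstat {
	char		name[HIST_NAMELEN];
	int		instance;
	unsigned long	errors;		/* e.g. an error counter worth keeping */
} hist_kstat_t;

typedef struct hist_archive {
	hist_kstat_t	slot[HIST_DEPTH];
	size_t		next;		/* slot to overwrite next */
	size_t		count;		/* valid slots, at most HIST_DEPTH */
} hist_archive_t;

/*
 * Archive a snapshot at the point where the stat would otherwise be
 * deleted, evicting the oldest entry once the ring is full.
 * Returns 1 if archived, 0 if the tunable is off.
 */
static int
hist_archive_put(hist_archive_t *a, const char *name, int instance,
    unsigned long errors)
{
	hist_kstat_t *s;

	assert(a != NULL && name != NULL);
	if (!historical_kstats)
		return (0);

	s = &a->slot[a->next];
	strncpy(s->name, name, HIST_NAMELEN - 1);
	s->name[HIST_NAMELEN - 1] = '\0';
	s->instance = instance;
	s->errors = errors;

	a->next = (a->next + 1) % HIST_DEPTH;
	if (a->count < HIST_DEPTH)
		a->count++;
	return (1);
}

/* Return the i-th most recent snapshot (0 = newest), or NULL if none. */
static const hist_kstat_t *
hist_archive_get(const hist_archive_t *a, size_t i)
{
	assert(a != NULL);
	if (i >= a->count)
		return (NULL);
	return (&a->slot[(a->next + HIST_DEPTH - 1 - i) % HIST_DEPTH]);
}
```

With a zeroed hist_archive_t per driver, a post-mortem tool could walk
hist_archive_get(&a, 0), hist_archive_get(&a, 1), ... until it returns NULL
to see what the last few deleted stats looked like.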

Gavin
_______________________________________________
opensolaris-code mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/opensolaris-code
