On Nov 9, 2012, at 08:01, Chuck Silvers <c...@chuq.com> wrote:

> On Wed, Nov 07, 2012 at 02:22:49PM +0100, Edgar Fu wrote:
>>> Try to get a sparse dump via machdep.sparse_dump=1
>> How long is that supposed to take?
>> It said "dump", paused for a few seconds, then counted from 44 down to 38 and
>> then nothing happened for minutes. Until I hit the virtual reset button.
>
> I tried triggering a sparse dump (with "reboot -qd") on amd64
> and after a number of tries I did see the hang during the dump.
> but even when it doesn't hang, the resulting sparse dump is not valid:
>
> savecore: kvm_read: invalid translation (invalid level 4 PDE)
>
> sparse dumps appear to be a bit too sparse.
>
> after I fixed that (and the problem that causes the kernel to spew
> "pmap_kenter_pa: mapping already present"), the next problem was that savecore
> generates a useless kernel image file, so you need to ignore the one
> from savecore and use the kernel image you actually booted. this isn't
> specific to sparse dumps, it happens with both normal and sparse dumps.
>
> but once I get past all that, sparse dumps work for me on amd64.
>
> ... I later tried triggering a dump from ddb with "reboot 0x104"
> to make sure that my fix for the "mapping already present" thing
> would work in this context as well (since the last attempt to fix that
> resulted in a different hang), and I found that rebooting from ddb
> currently always hangs. I traced it as far as cpu_shutdown(),
> and it's not surprising that the xcalls from that also cause problems.
> I'm inclined to have pmf_system_shutdown() return without doing anything
> if panicstr is set, since the context in which this is called could cause
> a hang for any driver shutdown hook. does anyone have any other ideas
> on what to do about this?
>
> the attached patch fixes the amd64 kernel problems with sparse dumps for me,
> could you give that a try?

I have tested your patches for NetBSD-current on VMware Fusion (under Mac OS X). Breaking into ddb and entering "reboot 0x104" results in a good core dump. As you note, the kernel image written by savecore is invalid.

Thanks for the patches! I cannot remember the last time I was able to get a workable core dump on amd64.

Regards,
Sverre

PS "vmstat -M netbsd.0.core -N /netbsd" results in

vmstat: can't dereference kptr 0x7f7fffffd780
vmstat: invalid translation (invalid level 4 PDE)

Adding specific options, e.g., -e, works fine.

PPS Is it now safe to enable core dumps on systems where the dump partition is a sub-partition of a RAIDframe RAID 1 partition? This used to be warned against in the old RAIDframe documentation, but the warnings are gone in recent versions.