> On Mar 7, 2017, at 7:53 AM, Pratyush Anand <[email protected]> wrote:
>
> Hi Philip,
>
> On Sunday 05 March 2017 04:56 AM, Philip Prindeville wrote:
>
> [...]
>
>>
>> In the case of having a single system kernel binary, then you’d have to
>> install this kernel and its modules, and add this kernel to the boot loader
>> configuration files, wouldn’t you? What do my grub arguments look like?
>
> Not necessarily all the modules. Kdump kernel will use only minimal modules.
> You can build your initramfs with a minimum needed module, so that you can
> boot and copy vmcore.
>
>>
>> Do I always load my system kernel with “crashkernel=64M@16M” per the
>> “CONFIG_PHYSICAL_START” and here:
>
> In the first kernel you need to pass "crashkernel=". Only size (64M) should
> also work. Kernel should find the appropriate start address of crash kernel
> location.
>
>>
>>> 2) Boot the system kernel with the boot parameter "crashkernel=Y@X",
>>> where Y specifies how much memory to reserve for the dump-capture kernel
>>> and X specifies the beginning of this reserved memory. For example,
>>> "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
>>> starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
>>
>> Okay, we have a 2.6MB /vmlinuz in our /boot partition, so it’s relocatable
>> and this part applies:
>>
>>> If you are using a compressed bzImage/vmlinuz, then use following command
>>> to load dump-capture kernel.
>>>
>>> kexec -p <dump-capture-kernel-bzImage> \
>>> --initrd=<initrd-for-dump-capture-kernel> \
>>> --append="root=<root-dev> <arch-specific-options>"
>>
>> Not sure I understand this part. So if we have a relocatable kernel with
>> crashdump built-in to our system kernel, do we need to load two kernels,
>> just with different <arch-specific-options> and everything else being the
>> same?
>
> You are in primary kernel and you need to load crash kernel.
>
> `kexec -p /boot/vmlinuz --initrd=/boot/kdump-initrd --reuse-cmdline
> --append="irqpoll maxcpus=1 reset_devices"` should work.

Tried something like that:

root@PowercodeBMU:/# kexec -p /boot/vmlinuz --reuse-cmdline --append="irqpoll maxcpus=1 reset_devices 1"
Cannot get kernel page_offset_base symbol address
Cannot load /boot/vmlinuz
root@PowercodeBMU:/#

Not sure why I’m seeing this.

>
> You need to prepare kdump-initrd, OR you can use current initrd, but that
> will load all your modules of 1st kernel and 64M might not be sufficient
> space then.

It’s an embedded system so it’s pretty skinny. Everything needed to boot is “baked in”. Everything else gets loaded as a module into the booting kernel via init.d scripts …

>
>>
>> Would the <arch-specific-options> be:
>>
>> crashkernel=64M@16M 1 irqpoll maxcpus=1 reset_devices
>
> "crashkernel=" *must* *not* be passed to crash kernel. It is only for the
> primary kernel.

Okay. And --reuse-cmdline takes care of stripping that out for you, it looks like. That option isn’t discussed in Documentation/kdump/ but it might be handy to add something about it.

>
>>
>> in that case?
>>
>> On a normally running system, using an overlay root, our cmdline looks like:
>>
>> BOOT_IMAGE=/boot/vmlinuz block2mtd.block2mtd=/dev/sda2,65536,rootfs,5
>> root=/dev/mtdblock0 rootfstype=squashfs rootwait console=tty0
>> console=ttyS0,115200n8r noinitrd
>
> So, it should also have crashkernel=64M.

Well, right. I was talking about a nominal system before I started trying to get it to be crash-dump capable.

>
>>
>> so I guess we’d just mash on those extra arguments.
>> On a running system,
>> our mount points are:
>>
>> /dev/root on /rom type squashfs (ro,relatime)
>> proc on /proc type proc (rw,nosuid,nodev,noexec,noatime)
>> sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime)
>> tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noatime)
>> tmpfs on /tmp/root type tmpfs (rw,noatime,mode=755)
>> tmpfs on /dev type tmpfs (rw,nosuid,relatime,size=512k,mode=755)
>> devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,mode=600)
>> debugfs on /sys/kernel/debug type debugfs (rw,noatime)
>> /dev/mtdblock1 on /overlay type jffs2 (rw,noatime)
>> overlayfs:/overlay on / type overlay
>> (rw,noatime,lowerdir=/,upperdir=/overlay/upper,workdir=/overlay/work)
>>
>> but it doesn’t sound like any of that would change (except perhaps mounting
>> a USB thumb-drive if we wanted to copy our crashdump to that device instead).

Ah, actually, that’s not quite right. /boot has been unmounted early on but we’ll need to keep it mounted (even if we remount it as ‘ro’).

>>
>> So if I’ve understood, when the first loaded kernel (the system kernel)
>> crashes, kexec will then try the next kernel it sees… which will be
>> something like:
>>
>> kexec -p /boot/vmlinuz \
>> --append="$(cat /proc/cmdline) irqpoll maxcpus=1 reset_devices 1"
>>
>> (we don’t use an initrd as you can see above) and that’s described here:
>
> OK..so you can exclude --initrd argument to kexec.

Yes.

>
>>
>>> Kernel Panic
>>> ============
>>>
>>> After successfully loading the dump-capture kernel as previously
>>> described, the system will reboot into the dump-capture kernel if a
>>> system crash is triggered. [snip]
>>
>> assuming the system isn’t so badly hosed that a WDT expires causing a BIOS
>> reset, etc.
>>
>> Do both kernels use the same “crashdump=” value, or do they need different
>> base addresses?
>
> Again, only 1st kernel need "crashkernel=".

Okay, got it.

>
>>
>> And assuming that you’re using the same kernel, etc.
>> how does the init.d
>> scripting on the crashdump (2nd instance of the kernel) know that it’s not
>> the nominal kernel? Do we use /sys/kernel/kexec_loaded for this purpose?
>> Or do we just look for the existence of /proc/vmcore?
>
> Yep, you can find /proc/vmcore in 2nd kernel but not in 1st kernel.
> /sys/kernel/kexec_crash_loaded should have 1 in 1st kernel while 0 in crash
> kernel.

So far I’m seeing the opposite:

root@PowercodeBMU:/# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz block2mtd.block2mtd=/dev/sda2,65536,rootfs,5 root=/dev/mtdblock0 rootfstype=squashfs rootwait console=tty0 console=ttyS0,115200n8r noinitrd crashkernel=64M
root@PowercodeBMU:/# cat /sys/kernel/kexec_crash_loaded
0
root@PowercodeBMU:/#

Maybe it’s the other way around?

>
>>
>> And then have something in my init.d scripts like:
>>
>> kexec_loaded=$(< /sys/kernel/kexec_loaded)
>
> /sys/kernel/kexec_crash_loaded

Right.

>
>>
>> if [ "$kexec_loaded" = 0 ]; then
>>     kexec -p /boot/vmlinuz \
>>         --append="$(cat /proc/cmdline) irqpoll maxcpus=1 reset_devices 1"
>> else
>>     echo "*** HANDLING CRASH DUMP COLLECTION"
>>     mkdir -p /mnt/crashdrive
>>     mount LABEL=crashdrive /mnt/crashdrive
>>     # might do something clever here with "df --output=avail -m /mnt/crashdrive"
>>     # to make sure I have enough space for the copy, perhaps deleting older
>>     # dumps until I do…
>>     cp /proc/vmcore /mnt/crashdrive
>>     sync
>>     umount /mnt/crashdrive
>>     echo "*** NOW REBOOTING"
>>     reboot -f
>> fi
>>
>
> Above should work.

Question… will crashkernel being 64M mean that /sys/kernel/kexec_crash_size is also 64M (67108864) and that would also be the size of /proc/vmcore?

>
> There can be many ways. You can have a look at the Fedora kexec-tools code.
> http://pkgs.fedoraproject.org/cgit/rpms/kexec-tools.git/
>
>
>> Do I need to reboot in a particular way to avoid looping? The “Kernel
>> Panic” section seems to state that normal reboots won’t be affected.
>
> When you execute reboot, it will reboot to the 1st kernel through grub (boot
> loader).

Okay.

Thanks,

-Philip

>
>>
>> I appreciate the documentation you’ve written, but it’s a little unclear (to
>> me at least) how to handle the degenerate case of using the same kernel as
>> the system kernel and the crashdump kernel…
>>
>> I want to make sure that I don’t inadvertently set it up to do looping
>> infinitely nested kernels, etc.
>>
>> I’m probably overthinking this, but… we’re having crashes in the field and
>> the customers are a little riled up right now so I don’t want to spend a lot
>> of time saying “here try this image”. They want their smoking gun and they
>> want it soon.
>>
>
>
> ~Pratyush

_______________________________________________
kexec mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/kexec
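
For reference, a minimal sketch of the two checks discussed in this thread, under stated assumptions: the presence of /proc/vmcore is taken as the sign of running in the dump-capture kernel (as suggested above), and "crashkernel=" is stripped by hand from the reused command line because it is meant only for the first kernel. The helper names kdump_role and strip_crashkernel are hypothetical and not part of kexec-tools. The kexec call itself is left as a comment so the helpers can be sourced without side effects.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative assumptions, not kexec-tools API.

# Print "capture" when running in the dump-capture (2nd) kernel, which is
# assumed to expose /proc/vmcore; print "system" otherwise. The path can be
# overridden by the first argument, which is handy for dry runs.
kdump_role() {
    vmcore="${1:-/proc/vmcore}"
    if [ -e "$vmcore" ]; then
        echo capture
    else
        echo system
    fi
}

# Print the given kernel command line with any crashkernel=... token removed.
# Relies on shell word splitting, so it assumes the arguments contain no
# embedded whitespace or glob characters.
strip_crashkernel() {
    out=""
    for arg in $1; do
        case "$arg" in
            crashkernel=*) ;;                  # drop: first kernel only
            *) out="${out:+$out }$arg" ;;      # keep, space-separated
        esac
    done
    echo "$out"
}

# Possible use from an init.d script (commented out on purpose):
#   if [ "$(kdump_role)" = system ]; then
#       kexec -p /boot/vmlinuz \
#           --append="$(strip_crashkernel "$(cat /proc/cmdline)") irqpoll maxcpus=1 reset_devices"
#   fi
```

Keying the branch on /proc/vmcore rather than on a sysfs flag keeps the test independent of whether a crash kernel has been loaded yet in the first kernel.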
