Re: Difficulty configuring kdump
Hello,

After you configure kdump and reboot, and before you trigger the test crash dump, check whether your crash kernel was loaded correctly:

    grep -i crash /proc/iomem

2013/7/29 Andrew Walsh awa...@permabit.com

Hi all,

I've got an issue trying to configure kdump on my Debian systems, and I was hoping someone could shed some light on it and help me get it working. I've done extensive reading about how to configure it, but nothing has pointed me in the right direction.

The systems I am trying to get this working on are running Debian squeeze with the 64-bit 3.2 backported kernel from squeeze-backports.

Here's the scenario:

I configure kdump in /etc/default/kdump-tools as I would like (I've moved /var/crash around in case partitioning or location had anything to do with it, with no success):

    USE_KDUMP=1
    KDUMP_COREDIR=/var/crash
    DEBUG_KERNEL=/usr/lib/debug/boot/vmlinux-3.2.0-0.bpo.3-amd64
    KDUMP_CMDLINE_APPEND="irqpoll maxcpus=1"

And then I update the grub config:

For grub1 (for Xen hosts): append crashkernel=64M@192M to the kernel line of the default boot entry.

For grub2: edit /etc/default/grub and append crashkernel=64M to GRUB_CMDLINE_LINUX_DEFAULT. (I've noticed that if I keep "quiet" in there, kdump-tools fails to load as well.)

Run update-grub. (If there is a double space before crashkernel in the resulting grub.cfg or menu.lst, I noticed that I have to remove it manually.)

On one of the systems (the same one showing the output of kdump-config), here is the resulting kernel line in my grub.cfg:

    linux /boot/vmlinuz-3.2.0-0.bpo.3-amd64 root=UUID=8135cc05-9b88-4aa1-be74-9c4d687bf956 ro crashkernel=64M

I then reboot the host and run kdump-config status, which reports that it is ready to kdump:

    # kdump-config status
    current state : ready to kdump

This is the output of kdump-config show:

    # kdump-config show
    USE_KDUMP: 1
    KDUMP_SYSCTL: kernel.panic_on_oops=1
    KDUMP_COREDIR: /var/crash
    crashkernel addr: 0x330
    current state: ready to kdump
    kernel link: /usr/lib/debug/boot/vmlinux-3.2.0-0.bpo.3-amd64
    kexec command: /sbin/kexec -p --command-line=BOOT_IMAGE=/boot/vmlinuz-3.2.0-0.bpo.3-amd64 root=UUID=8135cc05-9b88-4aa1-be74-9c4d687bf956 ro irqpoll maxcpus=1 --initrd=/boot/initrd.img-3.2.0-0.bpo.3-amd64 /boot/vmlinuz-3.2.0-0.bpo.3-amd64

I make sure that sysrq is enabled by setting it to 1:

    echo 1 > /proc/sys/kernel/sysrq

(This is usually already set to 1, but I still do it to make sure.)

Then I simulate a crash:

    echo c > /proc/sysrq-trigger

On a Squeeze host, the panic occurs, but nothing else happens. I have to manually reset the machine to bring it back. On a RHEL6 host (with a slightly different configuration), the kernel dumps the core as expected and reboots.

One other thing to note: with this config in place, a reboot request does not fully reboot the machine. It simply reloads the running kernel, so it does appear that things are half-working.

I have tried this configuration on several machines, and they all react the same way. I've asked the package maintainer where best to post questions about kdump-tools, but I haven't gotten a reply, so this list was my best guess.

I would greatly appreciate any help or insight into where I might find some assistance with this issue.

Thanks.

*Andrew Walsh*

--
this is my life and I live it for as long as God wills
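The checks described in the message above (crashkernel= on the kernel command line, crash kernel loaded) can be scripted before triggering a test crash. The following is a minimal read-only sketch, not part of kdump-tools: the function names are hypothetical, and it assumes the kernel exposes /sys/kernel/kexec_crash_loaded. The file arguments exist only so the checks can be tried against sample files.

```shell
#!/bin/sh
# Sketch only: read-only pre-flight checks before triggering a test crash.
# Function names are hypothetical. File arguments default to the real
# /proc and /sys paths and can be overridden with sample files.

# Succeeds if the kernel command line file contains a crashkernel= option.
has_crashkernel_param() {
    cmdline_file=${1:-/proc/cmdline}
    grep -Eq '(^|[[:space:]])crashkernel=' "$cmdline_file"
}

# Succeeds if the kernel reports that a crash (kdump) kernel is loaded.
crash_kernel_loaded() {
    loaded_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$loaded_file" ] && [ "$(cat "$loaded_file")" = "1" ]
}

# Prints one status line per check; it never triggers a crash itself.
kdump_preflight() {
    if has_crashkernel_param "$1"; then
        echo "crashkernel parameter: present"
    else
        echo "crashkernel parameter: MISSING"
    fi
    if crash_kernel_loaded "$2"; then
        echo "crash kernel loaded: yes"
    else
        echo "crash kernel loaded: no"
    fi
}
```

Calling kdump_preflight with no arguments inspects the running system. Passing two file names checks those files instead.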
RE: Difficulty configuring kdump
Yes, it appears that it was loaded as expected:

    # grep -i crash /proc/iomem
    3300-36ff : Crash kernel

Thanks for the response.

*Andrew Walsh*
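Beyond confirming that a "Crash kernel" line exists, the reserved range can be compared against the crashkernel= size that was requested. The sketch below assumes the usual "start-end : name" layout of /proc/iomem with hexadecimal addresses. The function name is hypothetical, and the sample range in the usage note is an illustrative value, not the one from this system. On newer kernels, unprivileged reads of /proc/iomem may show zeroed addresses, so this is best run as root.

```shell
#!/bin/sh
# Sketch only: print the size in MiB of the "Crash kernel" region found in
# an iomem-style file. The function name is hypothetical.
crash_region_mib() {
    iomem_file=${1:-/proc/iomem}
    # Keep only the "start-end" part of the first matching line.
    range=$(grep -i 'crash kernel' "$iomem_file" | head -n 1 |
        sed 's/^[[:space:]]*//; s/[[:space:]]*:.*$//')
    [ -n "$range" ] || return 1
    start=${range%-*}
    end=${range#*-}
    # Both bounds are inclusive hexadecimal addresses.
    echo $(( (0x$end - 0x$start + 1) / 1048576 ))
}
```

For a hypothetical line such as "33000000-36ffffff : Crash kernel" the function prints 64, which is what a crashkernel=64M reservation should produce.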
Re: Difficulty configuring kdump
Can you show us what appears on the screen when the crash happens, after you run the following?

    echo c > /proc/sysrq-trigger
RE: Difficulty configuring kdump
Here is the output once I trigger the crash:

[3518.067263] SysRq : Trigger a crash
[3518.070526] BUG: unable to handle kernel NULL pointer dereference at (null)
[3518.071338] IP: [812446fe] sysrq_handle_crash+0xd/0x16
[3518.072163] PGD 7c781067 PUD 7c780067 PMD 0
[3518.072973] Oops: 0002 [#1] SMP
[3518.073772] CPU 0
[3518.073782] Modules linked in: autofs4 fuse nfs lockd fscache auth_rpcgss nfs_acl sunrpc dm_crypt snd_pcm snd_timer snd parport_pc tpm_tis tpm tpm_bios soundcore snd_page_alloc psmouse i2c_piix4 i2c_core parport evdev vmw_balloon shpchp pcspkr coretemp serio_raw processor ac container button power_supply thermal_sys ext3 jbd mbcache dm_mod raid10 raid456 async_raid6_recov asynx_pq raid6_pq async_xor xor asyn_memcpy async_tx raid1 raid0 multipath linear md_mod nbd sg sr_mod cdrom sd_mod ata_generic crc_t10dif ata_piix libata floppy e1000 crc32c_intel mptspi mptscih mptbase scsi_transport_spi scsi_mod [last unloaded: scsi_wait_scan]
[3518.082803]
[3518.083718] Pid: 2108, comm: bash Not tainted: 3.2.0-0.bpo.3-amd64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[3518.084706] RIP: 0010:[f812446fe] [f812446fe] sysrq_handle_crash+0xd/0x16
[3518.085667] RSP: 0018880037abbe80 EFLAGS: 00010092
[3518.086610] RAX: 0010 RBX: 8164b660 RCX: 09e069e06
[3518.087580] RDX: RSI: 0046 RDI: 00063
[3518.088532] RBP: 0063 R08: R09: 0
[3518.089475] R10: 88007ab874d0 R11: 81433470 R12: 1
[3518.094532] R13: 0246 R14: 7fff09a49200 R15: 1
[3518.096509] FS: 7f4a16be5700() GS:88007fc0() knlGS:
[3518.097494] CS: 0010 DS: ES: CR0: 80050033
[3518.098470] CR2: CR3: 7cba2000 CR4: 006f0
[3518.100774] DR0: DR1: DR2: 0
[3518.104353] DR3: DR6: 0ff0 DR7: 00400
[3518.105309] Process bash (pid: 2108, threadinfo 880037aba000, task 8800379430e0)
[3518.106266] Stack:
[3518.107217] 81244c90 0002 880037abbf58
[3518.108205] 0002 7fff09a4922c 81244d5d fff4
[3518.109197] 88007abf5480 88007c909ec0 8114df96 0002
[3518.110208] Call Trace:
[3518.113574] [81244c90] ? __handle_sysrq+0xa9/0x141
[3518.115998] [81244d5d] ? write_sysrq_trigger+0x35/0x3d
[3518.117025] [8114df96] ? proc_reg_write+0x7a/0x93
[3518.118032] [811065cd] ? vfs_write+0xa4/0xff
[3518.119023] [811066de] ? sys_write+0x45/0x6e
[3518.120019] [8136b292] ? system_call_fastpath+0x16/0x1b
[3518.121003] Code: 00 01 8a 81 13 2f 80 81 19 d2 83 e0 8f f7 d2 83 e2 03 c1 e2 04 09 d0 88 81 13 2f 80 81 c3 c7 05 a1 d4 52 00 01 00 00 00 0f ae f8 c6 04 25 00 00 00 00 01 c3 8d 47 d0 83 f8 09 76 0d 8d 57 9f 31
[3518.125802] RIP [812446fe] sysrq_handle_crash+0xd/0x16
[3518.128071] RSP 880037abbe80
[3518.129085] DR2:

*Andrew Walsh*
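When reading a trace like the one above, the first thing to pull out is usually the function named on the "IP:" line, since that shows where the fault occurred. The sketch below is one way to extract it from a saved console log. It assumes the "IP: [address] symbol+offset/size" layout seen in this 3.2-era output (other kernel versions print this differently), and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the symbol named on the first "IP:" line of an oops.
# Reads the named log files, or standard input when none are given.
# The function name is hypothetical.
oops_fault_symbol() {
    # Match "IP: [address] symbol+offset/size" and keep only the symbol.
    # Lines starting "RIP: 0010:[" or "RIP [" are not matched.
    sed -n 's/.*IP: \[[^]]*\] \([^+ ]*\)+.*/\1/p' "$@" | head -n 1
}
```

Fed the trace above, this prints sysrq_handle_crash, which is the handler that runs when 'c' is written to /proc/sysrq-trigger.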
Re: Difficulty configuring kdump
Hello,

As you can see in the kernel trace, this may be a kernel bug. I don't know whether there is an open bug report for it, but it could be useful to try the same thing with another kernel.
Re: Difficulty configuring kdump
Thanks for the reply. However, I believe that is expected behavior when simulating a crash. See the following doc, which states:

    'c' - Will perform a system crash by a NULL pointer dereference. A crashdump will be taken if configured.

https://www.kernel.org/doc/Documentation/sysrq.txt

*Andrew Walsh*
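Since the test depends on sysrq, it can help to see how /proc/sys/kernel/sysrq is currently set before writing 'c' to the trigger. The sketch below only reads the value and describes it; the function name is hypothetical. According to the same sysrq document, this setting governs keyboard invocation, while writes to /proc/sysrq-trigger by a privileged user are always allowed.

```shell
#!/bin/sh
# Sketch only: describe the current sysrq setting without changing it.
# The function name is hypothetical; the file argument defaults to the
# real sysctl path and can be overridden with a sample file.
describe_sysrq() {
    sysrq_file=${1:-/proc/sys/kernel/sysrq}
    value=$(cat "$sysrq_file") || return 1
    case "$value" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "bitmask $value" ;;
    esac
}
```

Run with no arguments, it reports the setting of the running system.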