Re: Difficulty configuring kdump

2013-07-29 Thread emmanuel segura
Hello

After you configure kdump and reboot, but before you trigger the test
crash dump, check that your crash kernel was loaded correctly with
grep -i crash /proc/iomem
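
A minimal sketch of that check as a reusable function, in case it helps. The function name check_crash_kernel and its two optional path arguments are made up for illustration; /sys/kernel/kexec_crash_loaded is the kernel's own flag that reads 1 once a crash kernel has been loaded with kexec -p. A reserved region in /proc/iomem only shows that memory was set aside, not that a kernel was loaded into it, so the sketch reports both.

```shell
#!/bin/sh
# Sketch only: report whether memory is reserved for a crash kernel and
# whether a crash kernel is currently loaded. Both paths are parameters
# (hypothetical, for illustration) so the function can be tried on sample files.
check_crash_kernel() {
    iomem="${1:-/proc/iomem}"
    loaded="${2:-/sys/kernel/kexec_crash_loaded}"

    # A "Crash kernel" entry means the crashkernel= reservation took effect.
    if grep -qi 'crash kernel' "$iomem"; then
        echo "reserved: yes"
    else
        echo "reserved: no"
    fi

    # The flag file contains 1 when a crash kernel has been loaded.
    if [ -r "$loaded" ] && [ "$(cat "$loaded")" = "1" ]; then
        echo "loaded: yes"
    else
        echo "loaded: no"
    fi
}
```

Calling check_crash_kernel with no arguments reads the live system files; if it prints "reserved: yes" but "loaded: no", the reservation worked and the problem is more likely in the kexec load step.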




2013/7/29 Andrew Walsh awa...@permabit.com

 Hi all,



 I’m having trouble configuring kdump on my Debian systems, and I was
 hoping someone could shed some light on the issue and help me get it
 working.  I’ve done extensive reading about how to configure it, but
 nothing has pointed me in the right direction.  The systems I am trying
 to get this working on are running Debian squeeze with the 64-bit 3.2
 backported kernel from squeeze-backports.



 Here's the scenario:

 I configure kdump in /etc/default/kdump-tools as I would like (I've moved
 /var/crash around in case partitioning or location had anything to do
 with it, with no success):

   USE_KDUMP=1
   KDUMP_COREDIR=/var/crash
   DEBUG_KERNEL=/usr/lib/debug/boot/vmlinux-3.2.0-0.bpo.3-amd64
   KDUMP_CMDLINE_APPEND="irqpoll maxcpus=1"


 And then update the grub config

   For grub1 (for xen-hosts): Append crashkernel=64M@192M to kernel line
 on default boot
   For grub2: edit /etc/default/grub and append crashkernel=64M to
 GRUB_CMDLINE_LINUX_DEFAULT
 (I’ve noticed that if I keep "quiet" in there, kdump-tools fails to load
 as well)
   run update-grub
   (If there is a double space before crashkernel in the resulting
 grub.cfg or menu.lst, I noticed that I have to remove it manually)

 On one of the systems (the same one showing the output of kdump-config),
 here is the resulting kernel param in my grub.cfg:
   linux   /boot/vmlinuz-3.2.0-0.bpo.3-amd64
 root=UUID=8135cc05-9b88-4aa1-be74-9c4d687bf956 ro crashkernel=64M
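
As a hedged aside, the grub.cfg entry only shows what the next boot will use; the running kernel's actual parameters are in /proc/cmdline. The sketch below pulls out the crashkernel= value from there. The function name crashkernel_param and its optional file argument are hypothetical, added only so the function can be tried against a sample file.

```shell
#!/bin/sh
# Sketch only: print the value of crashkernel= from a kernel command line.
# Reads /proc/cmdline by default; prints nothing if the parameter is absent.
crashkernel_param() {
    cmdline="${1:-/proc/cmdline}"
    # Put each parameter on its own line, then keep only the crashkernel value.
    tr ' ' '\n' < "$cmdline" | sed -n 's/^crashkernel=//p'
}
```

Running crashkernel_param after the reboot should print the same value that was put in the grub configuration (64M in the setup described here); empty output would mean the parameter never reached the kernel.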



 I then reboot the host and run kdump-config status, which returns "ready
 to kdump":

 # kdump-config status
 current state   : ready to kdump


 This is the output of kdump-config show:

 # kdump-config show
 USE_KDUMP:1
 KDUMP_SYSCTL: kernel.panic_on_oops=1
 KDUMP_COREDIR:/var/crash
 crashkernel addr: 0x330
 current state:ready to kdump

 kernel link:
   /usr/lib/debug/boot/vmlinux-3.2.0-0.bpo.3-amd64

 kexec command:
   /sbin/kexec -p
 --command-line=BOOT_IMAGE=/boot/vmlinuz-3.2.0-0.bpo.3-amd64
 root=UUID=8135cc05-9b88-4aa1-be74-9c4d687bf956 ro  irqpoll maxcpus=1
 --initrd=/boot/initrd.img-3.2.0-0.bpo.3-amd64
 /boot/vmlinuz-3.2.0-0.bpo.3-amd64


 I make sure that SysRq is enabled by setting it to 1:

 echo 1 > /proc/sys/kernel/sysrq

 (This is usually already set to 1, but I still do it to make sure)

 Then I simulate a crash:

 echo c > /proc/sysrq-trigger


 On a Squeeze host, the panic occurs, but nothing else. I have to manually
 reset the machine to bring it back.
 On a RHEL6 host (slightly varied configuration), the kernel dumps the core
 as expected and reboots.

 One thing also worth noting: with this config in place, a reboot request
 behaves as expected. It doesn’t fully reboot the machine; it simply
 reloads the running kernel. So it does appear that things are
 half-working.


 I have tried this configuration on several machines, and they all react
 the same way. I've reached out to the package maintainer for the best place
 to ask this question for kdump-tools, but I haven't gotten a reply, so this
 was my best guess.



 I would greatly appreciate any help or insight into where I might find
 some assistance with this issue.

 Thanks.

 *Andrew Walsh*




-- 
this is my life and I live it for as long as God wills


RE: Difficulty configuring kdump

2013-07-29 Thread Andrew Walsh
Yes, it appears that it was loaded as expected:

# grep -i crash /proc/iomem

  3300-36ff : Crash kernel



Thanks for the response.

*Andrew Walsh*


Re: Difficulty configuring kdump

2013-07-29 Thread emmanuel segura
Can you show us the screen output when the crash happens, after you run
echo c > /proc/sysrq-trigger?




RE: Difficulty configuring kdump

2013-07-29 Thread Andrew Walsh
Here is the output once I trigger the crash:



[3518.067263] SysRq : Trigger a crash

[3518.070526] BUG: unable to handle kernel NULL pointer dereference at
(null)

[3518.071338] IP: [812446fe] sysrq_handle_crash+0xd/0x16

[3518.072163] PGD 7c781067 PUD 7c780067 PMD 0

[3518.072973] Oops: 0002 [#1] SMP

[3518.073772] CPU 0

[3518.073782] Modules linked in: autofs4 fuse nfs lockd fscache auth_rpcgss
nfs_acl sunrpc dm_crypt snd_pcm snd_timer snd parport_pc tpm_tis tpm
tpm_bios soundcore snd_page_alloc psmouse i2c_piix4 i2c_core parport evdev
vmw_balloon shpchp pcspkr coretemp serio_raw processor ac container button
power_supply thermal_sys ext3 jbd mbcache dm_mod raid10 raid456
async_raid6_recov asynx_pq raid6_pq async_xor xor asyn_memcpy async_tx
raid1 raid0 multipath linear md_mod nbd sg sr_mod cdrom sd_mod ata_generic
crc_t10dif ata_piix libata floppy e1000 crc32c_intel mptspi mptscih mptbase
scsi_transport_spi scsi_mod [last unloaded: scsi_wait_scan]

[3518.082803]

[3518.083718] Pid: 2108, comm: bash Not tainted: 3.2.0-0.bpo.3-amd64 #1
VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform

[3518.084706] RIP: 0010:[f812446fe]  [f812446fe]
sysrq_handle_crash+0xd/0x16

[3518.085667] RSP: 0018880037abbe80  EFLAGS: 00010092

[3518.086610] RAX: 0010 RBX: 8164b660 RCX:
09e069e06

[3518.087580] RDX:  RSI: 0046 RDI:
00063

[3518.088532] RBP: 0063 R08:  R09:
0

[3518.089475] R10: 88007ab874d0 R11: 81433470 R12:
1

[3518.094532] R13: 0246 R14: 7fff09a49200 R15:
1

[3518.096509] FS:  7f4a16be5700() GS:88007fc0()
knlGS:

[3518.097494] CS:  0010 DS:  ES:  CR0: 80050033

[3518.098470] CR2:  CR3: 7cba2000 CR4:
006f0

[3518.100774] DR0:  DR1:  DR2:
0

[3518.104353] DR3:  DR6: 0ff0 DR7:
00400

[3518.105309] Process bash (pid: 2108, threadinfo 880037aba000, task
8800379430e0)

[3518.106266] Stack:

[3518.107217]  81244c90 0002 
880037abbf58

[3518.108205]  0002 7fff09a4922c 81244d5d
fff4

[3518.109197]  88007abf5480 88007c909ec0 8114df96
0002

[3518.110208] Call Trace:

[3518.113574]  [81244c90] ? __handle_sysrq+0xa9/0x141

[3518.115998]  [81244d5d] ? write_sysrq_trigger+0x35/0x3d

[3518.117025]  [8114df96] ? proc_reg_write+0x7a/0x93

[3518.118032]  [811065cd] ? vfs_write+0xa4/0xff

[3518.119023]  [811066de] ? sys_write+0x45/0x6e

[3518.120019]  [8136b292] ? system_call_fastpath+0x16/0x1b

[3518.121003] Code: 00 01 8a 81 13 2f 80 81 19 d2 83 e0 8f f7 d2 83 e2 03
c1 e2 04 09 d0 88 81 13 2f 80 81 c3 c7 05 a1 d4 52 00 01 00 00 00 0f ae f8
c6 04 25 00 00 00 00 01 c3 8d 47 d0 83 f8 09 76 0d 8d 57 9f 31

[3518.125802] RIP  [812446fe] sysrq_handle_crash+0xd/0x16

[3518.128071]  RSP 880037abbe80

[3518.129085] DR2: 





*Andrew Walsh*


Re: Difficulty configuring kdump

2013-07-29 Thread emmanuel segura
Hello

As you can see in the kernel trace, this may be a kernel bug. I don't know
whether there is an open bug report for it, but it would be useful to try
the same thing with another kernel.




Re: Difficulty configuring kdump

2013-07-29 Thread Andrew Walsh
Thanks for the reply.

However, I believe that's expected behavior when simulating a crash.

See the following doc, which states: "'c' - Will perform a system crash by
a NULL pointer dereference. A crashdump will be taken if configured."

https://www.kernel.org/doc/Documentation/sysrq.txt

*Andrew Walsh*