** Changed in: makedumpfile (Ubuntu)
Status: New => Incomplete
https://bugs.launchpad.net/bugs/1681909
Title:
[FEAT 18.10] dump is not captured in remote host when kdump over ssh
is configured on firestone.
Status in The Ubuntu-power-systems project:
Incomplete
Status in makedumpfile package in Ubuntu:
Incomplete
Bug description:
== Comment: #0 - PAVITHRA R. PRAKASH <[email protected]> - 2017-03-07
05:00:29 ==
---Problem Description---
Ubuntu 17.04: dump is not captured in remote host when kdump over ssh
is configured on firestone.
---Steps to Reproduce---
1. Configure kdump.
2. Check whether kdump is operational using "# kdump-config show".
3. Install "kernel-debuginfo" and "kernel-debuginfo-common" rpms.
4. Set up a passwordless ssh connection; generate an RSA key.
# ssh-keygen -t rsa
5. Verify id_rsa and id_rsa.pub are created under /root/.ssh/
6. Edit /etc/default/kdump-tools and add the entries below.
SSH="[email protected]"
SSH_KEY=/root/.ssh/id_rsa
7. Propagate RSA key.
# kdump-config propagate
8. Restart kdump service.
# kdump-config load
9. Trigger a crash using the commands below.
# echo "1" > /proc/sys/kernel/sysrq
# echo "c" > /proc/sysrq-trigger
10. Verify the dump is available on the remote server in the configured path.
Machine details
===========
$ ipmitool -I lanplus -H 9.47.70.3 -U ADMIN -P admin sol activate
$ ssh [email protected]
PW: shriya101
Attaching logs
== Comment: #1 - PAVITHRA R. PRAKASH <[email protected]> -
2017-03-07 05:01:42 ==
== Comment: #5 - PAVITHRA R. PRAKASH <[email protected]> - 2017-03-07
23:19:46 ==
Hi,
Attaching the logs.
Network info:
root@ltc-firep3:~# hwinfo --network
36: None 00.0: 10700 Loopback
[Created at net.126]
Unique ID: ZsBS.GQNx7L4uPNA
SysFS ID: /class/net/lo
Hardware Class: network interface
Model: "Loopback network interface"
Device File: lo
Link detected: yes
Config Status: cfg=new, avail=yes, need=no, active=unknown
37: None 00.0: 10701 Ethernet
[Created at net.126]
Unique ID: 2lHw.ndpeucax6V1
Parent ID: mIXc.aXC4wIvegH8
SysFS ID: /class/net/enP33p3s0f2
SysFS Device Link:
/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:01.0/0021:03:00.2
Hardware Class: network interface
Model: "Ethernet network interface"
Driver: "tg3"
Driver Modules: "tg3"
Device File: enP33p3s0f2
HW Address: 98:be:94:03:18:4a
Permanent HW Address: 98:be:94:03:18:4a
Link detected: no
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #15 (Ethernet controller)
38: None 00.0: 10701 Ethernet
[Created at net.126]
Unique ID: 7Onn.ndpeucax6V1
Parent ID: sx0U.aXC4wIvegH8
SysFS ID: /class/net/enP33p3s0f0
SysFS Device Link:
/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:01.0/0021:03:00.0
Hardware Class: network interface
Model: "Ethernet network interface"
Driver: "tg3"
Driver Modules: "tg3"
Device File: enP33p3s0f0
HW Address: 98:be:94:03:18:48
Permanent HW Address: 98:be:94:03:18:48
Link detected: yes
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #16 (Ethernet controller)
39: None 00.0: 10701 Ethernet
[Created at net.126]
Unique ID: VwX_.ndpeucax6V1
Parent ID: DUng.aXC4wIvegH8
SysFS ID: /class/net/enP33p3s0f3
SysFS Device Link:
/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:01.0/0021:03:00.3
Hardware Class: network interface
Model: "Ethernet network interface"
Driver: "tg3"
Driver Modules: "tg3"
Device File: enP33p3s0f3
HW Address: 98:be:94:03:18:4b
Permanent HW Address: 98:be:94:03:18:4b
Link detected: no
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #25 (Ethernet controller)
40: None 00.0: 10701 Ethernet
[Created at net.126]
Unique ID: bZ1s.ndpeucax6V1
Parent ID: J7HY.aXC4wIvegH8
SysFS ID: /class/net/enP33p3s0f1
SysFS Device Link:
/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:01.0/0021:03:00.1
Hardware Class: network interface
Model: "Ethernet network interface"
Driver: "tg3"
Driver Modules: "tg3"
Device File: enP33p3s0f1
HW Address: 98:be:94:03:18:49
Permanent HW Address: 98:be:94:03:18:49
Link detected: no
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #4 (Ethernet controller)
root@ltc-firep3:~#
Thanks,
Pavithra
== Comment: #6 - PAVITHRA R. PRAKASH <[email protected]> -
2017-03-07 23:20:47 ==
== Comment: #7 - PAVITHRA R. PRAKASH <[email protected]> - 2017-03-07
23:21:27 ==
== Comment: #8 - Urvashi Jawere <[email protected]> - 2017-03-08 02:48:15 ==
I am able to see some errors in syslog:
auxiliary
Mar 7 04:57:44 ltc-firep3 systemd-resolved[3486]: DNSSEC validation failed
for question 114.15.239:/home/ubuntu/test IN SOA: failed-auxiliary
Mar 7 04:57:44 ltc-firep3 systemd-resolved[3486]: DNSSEC validation failed
for question 9.114.15.239:/home/ubuntu/test IN DS: failed-auxiliary
Mar 7 04:57:44 ltc-firep3 systemd-resolved[3486]: DNSSEC validation failed
for question 9.114.15.239:/home/ubuntu/test IN SOA: failed-auxiliary
Mar 7 04:57:44 ltc-firep3 systemd-resolved[3486]: DNSSEC validation failed
for question 9.114.15.239:/home/ubuntu/test IN A: failed-auxiliary
Mar 7 04:57:44 ltc-firep3 systemd-resolved[3486]: Server 9.12.16.2 does not
support DNSSEC, downgrading to non-DNSSEC mode.
Mar 7 04:57:44 ltc-firep3 kdump-config: /root/.ssh/id_rsa failed to be sent
to [email protected]:/home/ubuntu/test
Mar 7 04:58:04 ltc-firep3 systemd[1]: Reloading.
Mar 7 04:59:15 ltc-firep3 systemd[1]: Reloading.
Mar 7 04:59:16 ltc-firep3 kdump-config: propagated ssh key /root/.ssh/id_rsa
to server [email protected]
.
.
.
Mar 7 05:06:55 ltc-firep3 systemd[1]: Started Accounts Service.
Mar 7 05:06:56 ltc-firep3 kdump-tools[3498]: Starting kdump-tools: Modified
cmdline:root=UUID=1e76cfd5-988c-46f4-bdc4-39fe1ed01152 ro quiet splash irqpoll
nr_cpus=1 nousb systemd.unit=kdump-tools.service ata_piix.prefer_ms_hyperv=0
elfcorehdr=155136K
Mar 7 05:06:57 ltc-firep3 kdump-tools[3498]: * loaded kdump kernel
Mar 7 05:06:57 ltc-firep3 kdump-tools: /sbin/kexec -p
--command-line="root=UUID=1e76cfd5-988c-46f4-bdc4-39fe1ed01152 ro quiet splash
irqpoll nr_cpus=1 nousb systemd.unit=kdump-tools.service
ata_piix.prefer_ms_hyperv=0" --initrd=/var/lib/kdump/initrd.img
/var/lib/kdump/vmlinuz
Mar 7 05:06:57 ltc-firep3 kdump-tools: loaded kdump kernel
Mar 7 05:06:57 ltc-firep3 systemd[1]: Started Kernel crash dump capture
service.
Mar 7 05:06:57 ltc-firep3 apport[3584]: ERROR: Cannot create report: [Errno
17] File exists: '/var/crash/linux-image-4.10.0-9-generic-201703060521.crash'
Mar 7 05:06:57 ltc-firep3 apport[3584]: ...done.
== Comment: #18 - Hari Krishna Bathini <[email protected]> - 2017-03-28
06:55:20 ==
Looks like the tg3 module was not needed after all. The interesting thing,
though, is that even after enP34p1s0f0 is up (ifup) and network-online.target
is reached, the network was not really active. It took about 30 seconds after
reaching network-online.target for the network to become active, even on a
normal boot. Adding this wait time in the kdump script, before saving the
dump, ensured that the vmcore is captured successfully. Attaching the log for
the same.
Not sure why enP34p1s0f0 is taking that long to configure/initialize. Even so,
this delay should be part of ifup/network-online.target if it is inevitable,
so that the network is pingable after network-online.target.
Thanks
Hari
== Comment: #19 - Hari Krishna Bathini <[email protected]> - 2017-03-28
07:01:52 ==
The workaround snippet adding a delay in the kdump script:
--- kdump-config.orig 2017-03-28 03:35:17.753542107 -0500
+++ kdump-config 2017-03-28 06:59:22.887576623 -0500
@@ -761,6 +761,7 @@
KDUMP_DMESGFILE="$KDUMP_STAMPDIR/dmesg.$KDUMP_STAMP"
ERROR=0

+ sleep 30
ssh -i $KDUMP_SSH_KEY $KDUMP_REMOTE_HOST mkdir -p $KDUMP_STAMPDIR
ERROR=$?
# If remote connections fails, no need to continue
---
Thanks
Hari
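
A fixed "sleep 30" matches the delay observed on this machine, but a bounded
retry loop might adapt better to hosts where the network comes up sooner or
later. The sketch below is not part of kdump-config: wait_for_host is a
hypothetical helper name, and the ssh options shown in the usage comment
(BatchMode, ConnectTimeout) are standard OpenSSH options picked here as an
assumption about what a probe could look like.

```shell
#!/bin/sh
# wait_for_host: hypothetical helper, NOT part of kdump-config.
# Usage: wait_for_host TIMEOUT_SECS PROBE_COMMAND [ARGS...]
# Runs the probe once per second until it succeeds or TIMEOUT_SECS
# seconds of waiting have elapsed. Returns 0 on success, 1 on timeout.
wait_for_host() {
    timeout_secs=$1
    shift
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_secs" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Possible use in place of the fixed delay (variable names as in the patch):
#   wait_for_host 60 ssh -i "$KDUMP_SSH_KEY" -o BatchMode=yes \
#       -o ConnectTimeout=5 "$KDUMP_REMOTE_HOST" true
```

With this shape the dump starts as soon as the remote host answers, and the
timeout only bounds the worst case.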
== Comment: #20 - PAVITHRA R. PRAKASH <[email protected]> - 2017-03-30
01:33:56 ==
(In reply to comment #19)
> The workaround snippet adding a delay in the kdump script:
>
>
> --- kdump-config.orig 2017-03-28 03:35:17.753542107 -0500
> +++ kdump-config 2017-03-28 06:59:22.887576623 -0500
> @@ -761,6 +761,7 @@
> KDUMP_DMESGFILE="$KDUMP_STAMPDIR/dmesg.$KDUMP_STAMP"
> ERROR=0
>
> + sleep 30
> ssh -i $KDUMP_SSH_KEY $KDUMP_REMOTE_HOST mkdir -p $KDUMP_STAMPDIR
> ERROR=$?
> # If remote connections fails, no need to continue
>
> ---
>
> Thanks
> Hari
With the above workaround, the dump was captured successfully on the remote host.
Thanks,
Pavithra
== Comment: #22 - Hari Krishna Bathini <[email protected]> - 2017-04-10
22:14:27 ==
(In reply to comment #18)
> Created attachment 117088 [details]
> Console log of successful dump capture after adding a time delay of 'sleep
> 30'
>
> Looks like the tg3 module was not needed after all. The interesting thing,
> though, is that even after enP34p1s0f0 is up (ifup) and
> network-online.target is reached, the network was not really active. It
> took about 30 seconds after reaching network-online.target for the network
> to become active, even on a normal boot. Adding this wait time in the kdump
> script, before saving the dump, ensured that the vmcore is captured
> successfully. Attaching the log for the same.
>
> Not sure why enP34p1s0f0 is taking that long to configure/initialize. Even
> so, this delay should be part of ifup/network-online.target if it is
> inevitable, so that the network is pingable after network-online.target.
Hi Canonical,
Since this falls outside the realm of kdump, should we add a NET_WAIT_TIME
field in the /etc/default/kdump-tools file that defaults to 0 but can be
changed when the user sees timing troubles?
Thanks
Hari
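
As a rough illustration of the proposal above, the setting could be read with
a default of 0 so that existing configurations see no delay. NET_WAIT_TIME is
only the name suggested in this comment, and net_wait_seconds is a
hypothetical helper; neither is taken from the kdump-tools sources.

```shell
#!/bin/sh
# Hypothetical sketch: NET_WAIT_TIME is the field proposed in this comment,
# assumed to be set (or left unset) in /etc/default/kdump-tools.
# net_wait_seconds prints the number of seconds to wait before saving the
# dump. Unset, empty, or non-numeric values fall back to 0 (no delay).
net_wait_seconds() {
    value=${NET_WAIT_TIME:-0}
    case "$value" in
        *[!0-9]*) echo 0 ;;
        *) echo "$value" ;;
    esac
}

# Possible use before the first ssh call in the dump path:
#   sleep "$(net_wait_seconds)"
```

Falling back to 0 on malformed input keeps a typo in the config file from
blocking the dump; whether that or a hard error is preferable is a design
choice left open here.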