Public bug reported:
Multiple Supermicro-based servers running 14.04 have been experiencing
continual kernel errors since upgrading to 3.13.0-155, which quickly
lead to the system becoming unresponsive. The errors start immediately
after boot.
Small excerpt from the attached kern.log:
Aug 15 09:54:15 server kernel: [ 0.000000] CPU0 microcode updated early to
revision 0x713, date = 2018-01-26
...
Aug 15 09:54:17 server kernel: [ 14.381553] ipmi device interface
Aug 15 09:54:17 server kernel: [ 14.610493] NFS: Registering the id_resolver
key type
Aug 15 09:54:17 server kernel: [ 14.610504] Key type id_resolver registered
Aug 15 09:54:17 server kernel: [ 14.610505] Key type id_legacy registered
Aug 15 09:54:26 server kernel: [ 23.412042] BUG: Bad page map in process
plymouthd pte:8000000860a3d966 pmd:465c17067
Aug 15 09:54:26 server kernel: [ 23.442867] addr:00007fb8cc137000
vm_flags:08100073 anon_vma:ffff880866695ab0 mapping:ffff88086543a870 index:7
Aug 15 09:54:26 server kernel: [ 23.472375] vma->vm_ops->fault:
filemap_fault+0x0/0x400
Aug 15 09:54:26 server kernel: [ 23.484454] vma->vm_file->f_op->mmap:
ext4_file_mmap+0x0/0x60
Aug 15 09:54:26 server kernel: [ 23.496669] CPU: 4 PID: 523 Comm: plymouthd
Tainted: G B 3.13.0-155-generic #205-Ubuntu
Aug 15 09:54:26 server kernel: [ 23.496670] Hardware name: Supermicro
X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
Aug 15 09:54:26 server kernel: [ 23.496671] 0000000000000000
ffff880465f37d00 ffffffff8173983f 00007fb8cc137000
Aug 15 09:54:26 server kernel: [ 23.496675] ffff8808652c63c0
ffff880465f37d50 ffffffff8117e374 8000000860a3d966
Aug 15 09:54:26 server kernel: [ 23.496678] 0000000465c17067
0000000000000007 ffff880465c179b8 8000000860a3d966
Aug 15 09:54:26 server kernel: [ 23.496681] Call Trace:
Aug 15 09:54:26 server kernel: [ 23.496684] [<ffffffff8173983f>]
dump_stack+0x64/0x80
Aug 15 09:54:26 server kernel: [ 23.496687] [<ffffffff8117e374>]
print_bad_pte+0x1a4/0x250
Aug 15 09:54:26 server kernel: [ 23.496690] [<ffffffff8117f6ae>]
vm_normal_page+0x6e/0x80
Aug 15 09:54:26 server kernel: [ 23.496701] [<ffffffff8118ae5f>]
change_protection_range+0x55f/0x720
Aug 15 09:54:26 server kernel: [ 23.496706] [<ffffffff8118b085>]
change_protection+0x65/0xb0
Aug 15 09:54:26 server kernel: [ 23.496709] [<ffffffff811a164b>]
change_prot_numa+0x1b/0x40
Aug 15 09:54:26 server kernel: [ 23.496712] [<ffffffff810a60c2>]
task_numa_work+0x1d2/0x300
Aug 15 09:54:26 server kernel: [ 23.496714] [<ffffffff8108ef8f>]
task_work_run+0xaf/0xd0
Aug 15 09:54:26 server kernel: [ 23.496717] [<ffffffff81014ed7>]
do_notify_resume+0x97/0xb0
Aug 15 09:54:26 server kernel: [ 23.496720] [<ffffffff8174ad70>]
int_signal+0x12/0x17
...
Aug 15 09:54:54 server kernel: [ 50.902769] BUG: Bad rss-counter state
mm:ffff880466869880 idx:1 val:338
Aug 15 09:55:25 server kernel: [ 82.513872] BUG: Bad rss-counter state
mm:ffff880866d6b800 idx:1 val:249
...
Aug 15 09:56:30 server kernel: [ 144.954186] CPU: 18 PID: 4139 Comm: php-cgi
Tainted: G B 3.13.0-155-generic #205-Ubuntu
Aug 15 09:56:30 server kernel: [ 144.954189] 0000000000000000
ffff880467cefd00 ffffffff8173983f 0000000002d2e000
Aug 15 09:56:30 server kernel: [ 144.954193] 0000000468950067
0000000000002d2e ffff880468950970 800000042a764966
Aug 15 09:56:30 server kernel: [ 144.954195] [<ffffffff8173983f>]
dump_stack+0x64/0x80
Aug 15 09:56:30 server kernel: [ 144.954199] [<ffffffff8117f6ae>]
vm_normal_page+0x6e/0x80
Aug 15 09:56:30 server kernel: [ 144.954203] [<ffffffff8118b085>]
change_protection+0x65/0xb0
Aug 15 09:56:30 server kernel: [ 144.954207] [<ffffffff811a164b>]
change_prot_numa+0x1b/0x40
Aug 15 09:56:30 server kernel: [ 144.954211] [<ffffffff8108ef8f>]
task_work_run+0xaf/0xd0
Aug 15 09:56:30 server kernel: [ 144.954214] [<ffffffff817421b2>]
retint_signal+0x48/0x86
Aug 15 09:56:30 server kernel: [ 144.954216] addr:0000000002d2f000
vm_flags:08100073 anon_vma:ffff880467651c18 mapping: (null) index:2d2f
Aug 15 09:56:30 server kernel: [ 144.954218] CPU: 18 PID: 4139 Comm: php-cgi
Tainted: G B 3.13.0-155-generic #205-Ubuntu
Aug 15 09:56:30 server kernel: [ 144.954218] Hardware name: Supermicro
X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
Aug 15 09:56:30 server kernel: [ 144.954221] 0000000000000000
ffff880467cefd00 ffffffff8173983f 0000000002d2f000
Aug 15 09:56:30 server kernel: [ 144.954223] ffff880868111800
ffff880467cefd50 ffffffff8117e374 800000042a765966
Aug 15 09:56:30 server kernel: [ 144.954225] 0000000468950067
0000000000002d2f ffff880468950978 800000042a765966
Aug 15 09:56:30 server kernel: [ 144.954225] Call Trace:
Aug 15 09:56:30 server kernel: [ 144.954227] [<ffffffff8173983f>]
dump_stack+0x64/0x80
Aug 15 09:56:30 server kernel: [ 144.954229] [<ffffffff8117e374>]
print_bad_pte+0x1a4/0x250
Aug 15 09:56:30 server kernel: [ 144.954231] [<ffffffff8117f6ae>]
vm_normal_page+0x6e/0x80
Aug 15 09:56:30 server kernel: [ 144.954233] [<ffffffff8118ae5f>]
change_protection_range+0x55f/0x720
Aug 15 09:56:30 server kernel: [ 144.954235] [<ffffffff8118b085>]
change_protection+0x65/0xb0
Aug 15 09:56:30 server kernel: [ 144.954237] [<ffffffff81742655>] ?
error_entry+0x115/0x179
Aug 15 09:56:30 server kernel: [ 144.954239] [<ffffffff811a164b>]
change_prot_numa+0x1b/0x40
Aug 15 09:56:30 server kernel: [ 144.954241] [<ffffffff810a60c2>]
task_numa_work+0x1d2/0x300
Aug 15 09:56:30 server kernel: [ 144.954242] [<ffffffff8108ef8f>]
task_work_run+0xaf/0xd0
Aug 15 09:56:30 server kernel: [ 144.954244] [<ffffffff81014ed7>]
do_notify_resume+0x97/0xb0
Aug 15 09:56:30 server kernel: [ 144.954246] [<ffffffff817421b2>]
retint_signal+0x48/0x86
Aug 15 09:56:30 server kernel: [ 147.373769] addr:0000000000ec8000
vm_flags:08100073 anon_vma:ffff8808674497e0 mapping: (null) index:ec8
Aug 15 09:56:30 server kernel: [ 147.404530] Hardware name: Supermicro
X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
Aug 15 09:56:30 server kernel: [ 147.404545] ffff88086062c780
ffff880462bcfd50 ffffffff8117e374 80000004292ce966
Aug 15 09:56:30 server kernel: [ 147.404552] Call Trace:
Aug 15 09:56:30 server kernel: [ 147.404569] [<ffffffff8117e374>]
print_bad_pte+0x1a4/0x250
Aug 15 09:56:30 server kernel: [ 147.404594] [<ffffffff8118ae5f>]
change_protection_range+0x55f/0x720
Aug 15 09:56:30 server kernel: [ 147.404597] [<ffffffff8118b085>]
change_protection+0x65/0xb0
Aug 15 09:56:30 server kernel: [ 147.404607] [<ffffffff811a164b>]
change_prot_numa+0x1b/0x40
Aug 15 09:56:30 server kernel: [ 147.404618] [<ffffffff8108ef8f>]
task_work_run+0xaf/0xd0
Aug 15 09:56:30 server kernel: [ 147.404627] [<ffffffff817421b2>]
retint_signal+0x48/0x86
Aug 15 09:56:36 server kernel: [ 152.965390] BUG: Bad rss-counter state
mm:ffff8808651b4980 idx:1 val:272
Aug 15 09:56:38 server kernel: [ 155.091856] BUG: Bad rss-counter state
mm:ffff880866d6aa00 idx:0 val:23
Aug 15 09:56:38 server kernel: [ 155.459289] BUG: Bad rss-counter state
mm:ffff8808651b2300 idx:0 val:23
Aug 15 09:56:38 server kernel: [ 155.472278] BUG: Bad rss-counter state
mm:ffff8808651b2300 idx:1 val:793
Aug 15 09:56:38 server kernel: [ 155.613023] BUG: Bad rss-counter state
mm:ffff8804669e0700 idx:1 val:657
Aug 15 09:56:42 server kernel: [ 159.472398] BUG: Bad rss-counter state
mm:ffff880867d3e580 idx:0 val:2
Aug 15 09:56:42 server kernel: [ 159.483401] BUG: Bad rss-counter state
mm:ffff880867d3e580 idx:1 val:1740
Aug 15 09:56:44 server kernel: [ 161.445747] BUG: Bad rss-counter state
mm:ffff8804669e1180 idx:1 val:8655
Aug 15 09:56:54 server kernel: [ 171.619129] BUG: Bad rss-counter state
mm:ffff880466baa680 idx:1 val:7075
Aug 15 09:56:57 server kernel: [ 174.185697] BUG: Bad rss-counter state
mm:ffff880865bc7a80 idx:1 val:508
Aug 15 09:56:58 server kernel: [ 175.442721] BUG: Bad rss-counter state
mm:ffff880866d69f80 idx:0 val:23
Aug 15 09:56:58 server kernel: [ 175.450734] BUG: Bad rss-counter state
mm:ffff880866d69f80 idx:1 val:511
...
The system becomes unresponsive at this point.
The 'Bad page map' error is logged many times for some processes.
The issue is not present when reverting to 3.13.0-153.
Unable to provide output from `ubuntu-bug linux` due to system
instability.
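To gauge how often each process is hit, the attached kern.log can be tallied with a short script. The following is only a rough sketch: it assumes the log has one message per line (as in the attachment, not as wrapped in this mail), and the function name `summarize` and the two regular expressions are illustrative, derived from the message formats in the excerpt above.

```python
import re
import sys
from collections import Counter

# Patterns mirror the two message formats seen in the excerpt (assumed stable).
BAD_PAGE_MAP = re.compile(r"BUG: Bad page map in process\s+(\S+)")
BAD_RSS = re.compile(r"BUG: Bad rss-counter state\s+mm:(\S+)")


def summarize(lines):
    """Count 'Bad page map' reports per process name and
    'Bad rss-counter state' reports per mm address."""
    page_map = Counter()
    rss = Counter()
    for line in lines:
        m = BAD_PAGE_MAP.search(line)
        if m:
            page_map[m.group(1)] += 1
            continue
        m = BAD_RSS.search(line)
        if m:
            rss[m.group(1)] += 1
    return page_map, rss


# Only runs when a log path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as fh:
        page_map, rss = summarize(fh)
    for proc, count in page_map.most_common():
        print(f"{count:6d}  Bad page map     {proc}")
    for mm, count in rss.most_common():
        print(f"{count:6d}  Bad rss-counter  mm:{mm}")
```

Saved under a hypothetical name such as tally.py, it would be run as `python3 tally.py kern.log`, listing the most frequently affected processes first.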
# lsb_release -rd
Description: Ubuntu 14.04.5 LTS
Release: 14.04
# apt-cache policy linux-image-3.13.0-155-generic
linux-image-3.13.0-155-generic:
  Installed: 3.13.0-155.205
  Candidate: 3.13.0-155.205
  Version table:
 *** 3.13.0-155.205 0
        500 http://gb.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64 Packages
        100 /var/lib/dpkg/status
** Affects: linux (Ubuntu)
Importance: Undecided
Status: Confirmed
** Tags: trusty
** Attachment added: "kern.log"
https://bugs.launchpad.net/bugs/1787191/+attachment/5175683/+files/kern.log
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1787191
Title:
Crash due to BUG: Bad page map in process X & BUG: Bad rss-counter
state X