[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2015-06-13 Thread Bianco Veigel
I'm also affected by this bug on Trusty with the latest kernel
(3.13.0-53-generic):

[500345.624596] BUG: soft lockup - CPU#0 stuck for 23s! [khugepaged:54]
[500345.630989] Modules linked in: ipt_REJECT btrfs ufs qnx4 hfsplus
[500345.636604] BUG: soft lockup - CPU#1 stuck for 23s! [kthreadd:2]
[500345.636605] Modules linked in: ipt_REJECT btrfs ufs qnx4 hfsplus hfs minix 
ntfs msdos jfs xt_multiport iptable_filter ip_tables x_tables kvm_amd edac_core 
kvm k10temp edac_mce_amd i2c_piix4 serio_raw shpchp mac_hid nfsd auth_rpcgss 
nfs_acl nfs lp binfmt_misc lockd sunrpc fscache parport xfs libcrc32c xts 
gf128mul dm_crypt raid10 raid0 multipath linear raid456 async_memcpy 
async_raid6_recov async_pq async_xor async_tx xor raid6_pq raid1 pata_acpi 
hid_generic usbhid hid radeon psmouse i2c_algo_bit ahci ttm libahci 
drm_kms_helper pata_atiixp mpt2sas drm r8169 raid_class scsi_transport_sas mii 
wmi
[500345.636627] CPU: 1 PID: 2 Comm: kthreadd Tainted: G  D   
3.13.0-53-generic #89-Ubuntu
[500345.636628] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./M3A785GMH/128M, BIOS P1.40 11/26/2009
[500345.636628] task: 880214bd9800 ti: 880214458000 task.ti: 
880214458000
[500345.636629] RIP: 0010:[8172ae05]  [8172ae05] 
_raw_spin_lock+0x35/0x50
[500345.636631] RSP: 0018:8802144599f8  EFLAGS: 0202
[500345.636632] RAX: 79f7 RBX: 8127f5bc RCX: 
87ea
[500345.636632] RDX: 87ec RSI: 87ec RDI: 
8800cdef5c80
[500345.636633] RBP: 8802144599f8 R08: 9038 R09: 
0107e5b3481c
[500345.636633] R10: feda1a8eb150d207 R11: 03eb R12: 
0046
[500345.636634] R13: 8115e67e R14: 880214459a20 R15: 

[500345.636635] FS:  7fa8ed61e740() GS:88021fc4() 
knlGS:
[500345.636636] CS:  0010 DS:  ES:  CR0: 8005003b
[500345.636636] CR2: 7f8263077000 CR3: 000210041000 CR4: 
07e0
[500345.636637] Stack:
[500345.636637]  880214459a18 8127ffb9 880107e5b2a8 
880107e5b2a8
[500345.636639]  880214459a30 81260091 880107e5b3f8 
880214459a50
[500345.636641]  8124614a 880107e5b2a8 880107e5b3b0 
880214459a78
[500345.636642] Call Trace:
[500345.636643]  [8127ffb9] ext4_es_lru_del+0x29/0x70
[500345.636644]  [81260091] ext4_clear_inode+0x41/0x90
[500345.636646]  [8124614a] ext4_evict_inode+0x8a/0x4d0
[500345.636647]  [811d9490] evict+0xb0/0x1b0
[500345.636649]  [811d95c9] dispose_list+0x39/0x50
[500345.636651]  [811da4f7] prune_icache_sb+0x47/0x60
[500345.636652]  [811c1875] super_cache_scan+0x105/0x170
[500345.636654]  [81161357] shrink_slab+0x1c7/0x370
[500345.636655]  [8116448d] do_try_to_free_pages+0x3ed/0x540
[500345.636657]  [811646cc] try_to_free_pages+0xec/0x180
[500345.636658]  [81159385] __alloc_pages_nodemask+0x7d5/0xb80
[500345.636660]  [810a7400] ? load_balance+0x120/0x890
[500345.636662]  [810652e3] copy_process.part.26+0x143/0x16b0
[500345.636663]  [8109d665] ? sched_clock_cpu+0xb5/0x100
[500345.636664]  [8108b5e0] ? kthread_create_on_node+0x1c0/0x1c0
[500345.63]  [81066a25] do_fork+0xd5/0x340
[500345.636667]  [81066cb6] kernel_thread+0x26/0x30
[500345.636668]  [8108c02a] kthreadd+0x15a/0x1c0
[500345.636670]  [8108bed0] ? kthread_create_on_cpu+0x60/0x60
[500345.636671]  [81733868] ret_from_fork+0x58/0x90
[500345.636673]  [8108bed0] ? kthread_create_on_cpu+0x60/0x60
[500345.636674] Code: 00 00 02 00 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d 
c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 eb 0c 0f 1f 44 00 00 f3 90 83 e8 01 74 0a 
0f b7 0f 66 39 ca 75 f1 5d c3 66 66 66 90 66 66 90 eb da
[500345.648614] BUG: soft lockup - CPU#2 stuck for 23s! [aria2c:11547]
[500345.648614] Modules linked in: ipt_REJECT btrfs ufs qnx4 hfsplus hfs minix 
ntfs msdos jfs xt_multiport iptable_filter ip_tables x_tables kvm_amd edac_core 
kvm k10temp edac_mce_amd i2c_piix4 serio_raw shpchp mac_hid nfsd auth_rpcgss 
nfs_acl nfs lp binfmt_misc lockd sunrpc fscache parport xfs libcrc32c xts 
gf128mul dm_crypt raid10 raid0 multipath linear raid456 async_memcpy 
async_raid6_recov async_pq async_xor async_tx xor raid6_pq raid1 pata_acpi 
hid_generic usbhid hid radeon psmouse i2c_algo_bit ahci ttm libahci 
drm_kms_helper pata_atiixp mpt2sas drm r8169 raid_class scsi_transport_sas mii 
wmi
[500345.648637] CPU: 2 PID: 11547 Comm: aria2c Tainted: G  D   
3.13.0-53-generic #89-Ubuntu
[500345.648637] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./M3A785GMH/128M, BIOS P1.40 11/26/2009
[500345.648638] task: 8800cfb48000 ti: 88021455 task.ti: 
88021455
[500345.648639] RIP: 0010:[8172ae02]  [8172ae02] 
_raw_spin_lock+0x32/0x50
[500345.648640] RSP: 

[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2015-06-13 Thread Bianco Veigel
I've hit this bug for the second time within one week on a server that
runs 24/7. Is there any workaround for this? I'm currently thinking about
switching from ext4 to ext3.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1389787

Title:
  3.11 memory consumption leads to HANG (not sure if 3.13 suffers from
  this).

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  It was brought to my attention the following stack traces (occurring
  several times on different machines):

  crash bt 
  PID: 2877 TASK: 881009b42ee0 CPU: 1 COMMAND: git 
  #0 [880c0c6bb9d0] machine_kexec at 8104b141 
  #1 [880c0c6bba40] crash_kexec at 810d5a58 
  #2 [880c0c6bbb10] oops_end at 81748b38 
  #3 [880c0c6bbb40] no_context at 8172dd02 
  #4 [880c0c6bbb90] __bad_area_nosemaphore at 8172dee4 
  #5 [880c0c6bbbf0] bad_area at 8172df5d 
  #6 [880c0c6bbc20] __do_page_fault at 8174bda8 
  #7 [880c0c6bbd30] do_page_fault at 8174bde7 
  #8 [880c0c6bbd60] page_fault at 81747e98 
  [exception RIP: kmem_cache_alloc_trace+106] 
  RIP: 8119b22a RSP: 880c0c6bbe18 RFLAGS: 00010206 
  RAX:  RBX: 88008407e0c0 RCX: 031b6bc9 
  RDX: 031b6bc8 RSI: 80d0 RDI: 000173c0 
  RBP: 880c0c6bbe68 R8: 88081fa373c0 R9:  
  R10:  R11: 0202 R12: 88081f403800 
  R13: 00629050 R14: 811bcfe4 R15: 80d0 
  ORIG_RAX:  CS: 0010 SS: 0018 
  #9 [880c0c6bbe70] alloc_pipe_info at 811bcfe4 
  #10 [880c0c6bbe90] get_pipe_inode at 811bd0aa 
  #11 [880c0c6bbeb0] create_pipe_files at 811bd648 
  #12 [880c0c6bbef0] __do_pipe_flags at 811bd7d2 
  #13 [880c0c6bbf30] sys_pipe2 at 811bd8f0 
  #14 [880c0c6bbf70] sys_pipe at 811bd980 
  #15 [880c0c6bbf80] system_call_fastpath at 8175099d 
  RIP: 7f2d245da7e7 RSP: 7fff6a4cfb18 RFLAGS: 00010283 
  RAX: 0016 RBX: 8175099d RCX: 001c 
  RDX: 7f2d248aeac0 RSI:  RDI: 7fff6a4cfad0 
  RBP: 0002 R8:  R9:  
  R10:  R11: 0202 R12: 811bd980 
  R13: 880c0c6bbf78 R14:  R15: 0232efc0 
  ORIG_RAX: 0016 CS: 0033 SS: 002b

  AND

  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152703] SysRq : Show backtrace of 
all active CPUs 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152708] sending NMI to all CPUs: 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152714] NMI backtrace for cpu 0 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152720] CPU: 0 PID: 13579 Comm: 
python Tainted: GF I 3.11.0-15-generic #25~precise1-Ubuntu 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152721] Hardware name: HP ProLiant 
BL460c G7, BIOS I27 05/05/2011 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152724] task: 880085f5 ti: 
880a64afa000 task.ti: 880a64afa000 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152725] RIP: 
0010:[81050ff5] [81050ff5] __ticket_spin_lock+0x25/0x30 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152736] RSP: 0018:880a64afb748 
EFLAGS: 0293 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152737] RAX: db25 RBX: 
880be8576480 RCX: 00018066002a 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152739] RDX: db2e RSI: 
0001 RDI: 880be8576480 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152740] RBP: 880a64afb748 R08: 
 R09: ea002407db00 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152741] R10: 8128d8ed R11: 
0001 R12: 8807572480b0 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152743] R13: 8807572481b8 R14: 
 R15: 8805e9c77908 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152745] FS: 7ff00d4db700() 
GS:880c0ba0() knlGS: 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152746] CS: 0010 DS:  ES:  
CR0: 80050033 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152748] CR2: 7fb66906d8e0 CR3: 
00053b3dd000 CR4: 07f0 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152749] Stack: 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152750] 880a64afb758 
8174759e 880a64afb778 8128eb02 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152756] 8807572480b0 
880757248200 880a64afb798 81270595 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152759] 880a64afb798 
8807572480b0 880a64afb7c8 81257621 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152763] Call Trace: 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152770] [8174759e] 
_raw_spin_lock+0xe/0x20 
  

[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2015-06-13 Thread Rafael David Tinoco
Bianco,

based on my comment #1:

...
Finally we arrive at ext4_es_lru_del(struct inode *inode):

1023 spin_lock(&sbi->s_es_lru_lock);

The task is hung and someone is holding this lock. I would need the core
dump to know who and why.
...

Could you provide me with a kdump?

If you are not sure how to set it up, I suggest this guide:

http://www.inaddy.org/mini-howtos/dumps/using-ubuntu-crash-dump-with-kdump

If the kdump is too big, you can share it through Dropbox or a similar
service.

Thank you

Rafael


** Summary changed:

- 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).
+ 3.11 and 3.13 EXT4 race condition - kernel panic - ext4_es_lru_del

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)

[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2015-06-13 Thread Christopher M. Penalver
Bianco Veigel, it would help immensely if you filed a new report from a terminal:
ubuntu-bug linux

Please feel free to subscribe me to it.

** Changed in: linux (Ubuntu)
   Status: Confirmed => Incomplete

[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2015-01-04 Thread Launchpad Bug Tracker
[Expired for linux (Ubuntu) because there has been no activity for 60
days.]

** Changed in: linux (Ubuntu)
   Status: Incomplete => Expired

[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2014-11-05 Thread Rafael David Tinoco
For the other back trace (the one for which I have the core and can use the
crash tool):
* I'm still analysing this...

PID: 4453 TASK: 8800362dddc0 CPU: 16 COMMAND: git 
#0 [880092791a60] machine_kexec at 8104b141 
#1 [880092791ad0] crash_kexec at 810d5a58 
#2 [880092791ba0] oops_end at 81748b38 
#3 [880092791bd0] no_context at 8172dd02 
#4 [880092791c20] __bad_area_nosemaphore at 8172dee4 
#5 [880092791c80] bad_area_nosemaphore at 8172df16 
#6 [880092791c90] __do_page_fault at 8174bc12 
#7 [880092791da0] do_page_fault at 8174bde7 
#8 [880092791dd0] page_fault at 81747e98 
[exception RIP: kmem_cache_alloc+102] 
RIP: 8119c316 RSP: 880092791e88 RFLAGS: 00010286 
RAX:  RBX: ffea RCX: 26be 
RDX: 26bd RSI: 00d0 RDI: 000173c0 
RBP: 880092791ed8 R8: 880c0bb173c0 R9: 1165 
R10: 7f2e4ac88dcc R11: 0246 R12: 880c0b403800 
R13: 880ce78ccc00 R14: 8108f636 R15: 00d0 
ORIG_RAX:  CS: 0010 SS: 0018 
#9 [880092791ee0] prepare_creds at 8108f636 
#10 [880092791f00] sys_faccessat at 811b22b4 
#11 [880092791f70] sys_access at 811b24b8 
#12 [880092791f80] system_call_fastpath at 8175099d 
RIP: 7f2e4d883097 RSP: 7f2e4ac87c88 RFLAGS: 0206 
RAX: 0015 RBX: 8175099d RCX:  
RDX: 007bde07 RSI:  RDI: 007bdd80 
RBP: 0001 R8: 0066 R9: 1165 
R10: 7f2e4ac88dcc R11: 0246 R12: 811b24b8 
R13: 880092791f78 R14: 0002 R15:  
ORIG_RAX: 0015 CS: 0033 SS: 002b 

We can see that the access system call (which calls sys_faccessat) was
called from user mode (CS: 0033) to check whether this process has
permission to access the given file:

FD FILE DENTRY INODE TYPE PATH 
3 880be925be00 88076dd34b40 88082a8434b0 REG 
/home001/jeongku.choi/8994_recent/.repo/project-objects/LG_apps/android/vendor/lge/apps/LGSettings.git/objects/pack/tmp_p
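
The user-mode / kernel-mode distinction drawn from the CS values above can be
sketched as follows. This is only an illustration, assuming the standard x86
rule that the low two bits of a segment selector hold the privilege level;
the function names are hypothetical and not part of any tool used here.

```python
# Sketch: read the privilege level out of an x86 segment selector.
# Assumption: the low two bits of the selector are the privilege level
# (0 = kernel, 3 = user). Function names are illustrative only.

def privilege_level(selector: int) -> int:
    """Return the ring number encoded in the low two bits."""
    return selector & 0x3


def describe(selector: int) -> str:
    ring = privilege_level(selector)
    if ring == 0:
        return "kernel mode"
    if ring == 3:
        return "user mode"
    return "ring %d" % ring


# CS values as printed in the two register dumps of the back trace.
for cs in (0x0033, 0x0010):
    print("CS %04x -> %s" % (cs, describe(cs)))
```

With the values from the dump, CS 0033 (the frame at system_call_fastpath)
decodes to user mode and CS 0010 (the exception frame) to kernel mode.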
 

You can check that the file pointer 880be925be00 (the FILE column for FD 3
above) is present in frame #10 (because it is an argument to the system call
and was pushed onto the stack):

#10 [880092791f00] sys_faccessat at 811b22b4 
880092791f08: 0001 7f2e30001050 
880092791f18: 880be925be00  
880092791f28: 880092791f78 811b46b6 
880092791f38: 7f2e4ac87d00 7f2e4cce9590 
880092791f48: 7f2e4cce9590 0457 
880092791f58:  0002 
880092791f68: 880092791f78 811b24b8 

To check this file's permission... continuing on the backtrace... (next
comment)

Title:
  3.11 memory consumption leads to HANG (not sure if 3.13 suffers from
  this).

Status in “linux” package in Ubuntu:
  New

[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2014-11-05 Thread Rafael David Tinoco
continuing on the back trace:

#8 [880092791dd0] page_fault at 81747e98 
[exception RIP: kmem_cache_alloc+102] 
RIP: 8119c316 RSP: 880092791e88 RFLAGS: 00010286 
RAX:  RBX: ffea RCX: 26be 
RDX: 26bd RSI: 00d0 RDI: 000173c0 
RBP: 880092791ed8 R8: 880c0bb173c0 R9: 1165 
R10: 7f2e4ac88dcc R11: 0246 R12: 880c0b403800 
R13: 880ce78ccc00 R14: 8108f636 R15: 00d0 
ORIG_RAX:  CS: 0010 SS: 0018 
#9 [880092791ee0] prepare_creds at 8108f636 
#10 [880092791f00] sys_faccessat at 811b22b4 
#11 [880092791f70] sys_access at 811b24b8 

While executing prepare_creds we called kmem_cache_alloc, which
triggered a page fault that was handled by the same task (so we had a
page fault in kernel context). The rest of the back trace shows:

#3 [880092791bd0] no_context at 8172dd02 
#4 [880092791c20] __bad_area_nosemaphore at 8172dee4 
#5 [880092791c80] bad_area_nosemaphore at 8172df16 
#6 [880092791c90] __do_page_fault at 8174bc12 
#7 [880092791da0] do_page_fault at 8174bde7 

But the kernel couldn't handle this page fault. Things that might have
happened:

0) the error code can be PF_RSVD, PF_USER or PF_PROT (to be checked)
if not:
0.1) the address of the page fault is not inside the vmalloc area (return -1),
since it is a page fault for a virtual address
1) this wasn't a spurious fault (for sure) (caused by the CPU walking the page
table by itself)
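
The error-code check in item 0 can be sketched like this. It is a minimal
illustration, assuming the x86 page fault error code bit layout (PF_PROT,
PF_WRITE, PF_USER, PF_RSVD, PF_INSTR in bits 0 to 4). The actual error code of
this oops is not shown in the back trace, so the values used below are
hypothetical.

```python
# Sketch: decode an x86 page fault error code into its flag names.
# Assumption: bit layout as used by the x86 fault handler.
PF_FLAGS = {
    "PF_PROT": 1 << 0,   # 0: page not present, 1: protection violation
    "PF_WRITE": 1 << 1,  # 0: read access, 1: write access
    "PF_USER": 1 << 2,   # 0: kernel-mode access, 1: user-mode access
    "PF_RSVD": 1 << 3,   # reserved bit was set in a page table entry
    "PF_INSTR": 1 << 4,  # fault was caused by an instruction fetch
}


def decode_error_code(code: int) -> list:
    """Return the names of all flags set in the error code."""
    return [name for name, bit in PF_FLAGS.items() if code & bit]


# Hypothetical values, NOT taken from the dump:
print(decode_error_code(0x0))  # kernel read of a non-present page
print(decode_error_code(0x5))  # user-mode protection violation
```

A code with none of PF_RSVD, PF_USER or PF_PROT set would be a kernel-mode
access to a non-present page, which is the case the remaining items consider.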

The kernel oopses with no_context...

I'm still working on this and will come back here with comments soon.

[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2014-11-05 Thread Rafael David Tinoco
For this specific back trace...

Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152776] [8128eb02] 
ext4_es_lru_del+0x32/0x80 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152781] [81270595] 
ext4_clear_inode+0x45/0x90 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152786] [81257621] 
ext4_evict_inode+0x81/0x510 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152792] [811ce490] 
evict+0xc0/0x1d0 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152795] [811ce5e1] 
dispose_list+0x41/0x50 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152797] [8174756f] ? 
_raw_spin_trylock+0xf/0x30 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152800] [811cf5d5] 
prune_icache_sb+0x185/0x340 
... 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152842] [81173ff0] 
handle_mm_fault+0x2a0/0x3e0 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152847] [8174b9ff] 
__do_page_fault+0x1af/0x560 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152850] [811b8de7] ? 
cp_new_stat+0x107/0x120 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152856] [810ca06c] ? 
do_futex+0x7c/0x1b0 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152858] [811b91b5] ? 
SYSC_newstat+0x25/0x30 
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152861] [8174bde7] 
do_page_fault+0x37/0x70 

I do NOT have a dump for this one (and I am not sure whether any old
dump exists for this backtrace).

For this backtrace...

Memory pressure seems to be high at the time of the page fault. This is
signalled by the page allocation going through the slow path
(__alloc_pages_nodemask - __alloc_pages_slowpath -
__alloc_pages_direct_reclaim - __perform_reclaim - try_to_free_pages):
direct reclaim is a synchronous, blocking way of satisfying the
allocation behind the page fault. This would explain why dropping caches
seems to help here, since the page allocation takes a different path
when free pages are already available.
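
To make the fast path / direct reclaim distinction concrete, here is a
minimal toy model. It is only a sketch and not kernel code: ToyAllocator,
its caches dictionary and drop_caches() are hypothetical names invented
for illustration. The point it shows is that when no free page exists,
the allocating task itself has to shrink a cache before it can continue.

```python
class ToyAllocator:
    """Toy model: a free-page counter plus caches that can be shrunk."""

    def __init__(self, free_pages, caches):
        self.free_pages = free_pages
        self.caches = caches      # hypothetical: cache name -> reclaimable pages
        self.paths = []           # which path each allocation took

    def drop_caches(self):
        # Rough analogue of dropping caches ahead of time: every
        # reclaimable page becomes a free page.
        for name in self.caches:
            self.free_pages += self.caches[name]
            self.caches[name] = 0

    def alloc_page(self):
        if self.free_pages > 0:
            self.paths.append("fast")
        else:
            # Direct reclaim: the allocating task does the shrinking
            # itself and cannot proceed until a page has been freed.
            self.paths.append("direct reclaim")
            for name in self.caches:
                if self.caches[name] > 0:
                    self.caches[name] -= 1
                    self.free_pages += 1
                    break
            else:
                return False      # nothing left to reclaim
        self.free_pages -= 1
        return True


allocator = ToyAllocator(free_pages=0, caches={"inode": 2})
allocator.alloc_page()            # no free page: goes through direct reclaim
allocator.drop_caches()
allocator.alloc_page()            # free page available: fast path
print(allocator.paths)
```

In the model, the second allocation avoids reclaim only because the
caches were dropped beforehand, which matches the observation that
dropping caches seems to help.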

Moving on a bit...

Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152812] [8115af04] shrink_slab+0x154/0x300
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152816] [8115dbf8] do_try_to_free_pages+0x218/0x290
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152819] [8115dfa4] try_to_free_pages+0xe4/0x1a0

The kernel is trying to shrink the slab caches to free some memory, so
that a frame can be allocated for the page fault that just happened.

Moving further: aged cache pages are scanned (LRU) to check whether they
can be freed, and we probably hit a file-backed page (page cache / demand
paging) whose shrink function was called (to flush data, free cache,
update the filesystem superblock, etc.):

# update filesystem super block
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152800] [811cf5d5] prune_icache_sb+0x185/0x340
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152806] [811b7703] prune_super+0x193/0x1b0

# calling inode evict function (to free inode page)
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152781] [81270595] ext4_clear_inode+0x45/0x90
Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152786] [81257621] ext4_evict_inode+0x81/0x510

Finally we arrive at ext4_es_lru_del(struct inode *inode):

1023 spin_lock(&sbi->s_es_lru_lock);

The task is hung here because someone else is holding this lock. I would
need the core dump to find out who holds it and why.

I will continue the preliminary analysis of the other backtraces / dumps
in another comment...
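
The NMI backtrace quoted in the bug description shows the CPU inside
__ticket_spin_lock. As a rough illustration of why a waiter on such a
lock makes no progress while the holder keeps it, here is a
single-threaded toy model. It is a sketch only: TicketLock, head, tail
and max_spins are names made up for the example, and the real lock uses
atomic instructions and spins without any bound.

```python
class TicketLock:
    """Single-threaded toy model of a ticket spinlock (illustrative names)."""

    def __init__(self):
        self.head = 0   # ticket currently allowed to hold the lock
        self.tail = 0   # next ticket to hand out

    def take_ticket(self):
        ticket = self.tail
        self.tail += 1
        return ticket

    def spin(self, ticket, max_spins):
        # The real lock spins until served; max_spins exists only so
        # that this model terminates.
        for _ in range(max_spins):
            if self.head == ticket:
                return True
        return False

    def unlock(self):
        self.head += 1


lock = TicketLock()
holder = lock.take_ticket()       # whoever holds the lock (unknown here)
waiter = lock.take_ticket()       # stands in for the task in the backtrace
assert lock.spin(holder, max_spins=1)

# The holder never unlocks: the waiter burns all its spins and gets nowhere.
stuck = not lock.spin(waiter, max_spins=1000)

# Only once the holder unlocks is the waiter's ticket served.
lock.unlock()
resumed = lock.spin(waiter, max_spins=1)
```

The model says nothing about who the holder is, which is exactly the
part that needs the core dump.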

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1389787

Title:
  3.11 memory consumption leads to HANG (not sure if 3.13 suffers from
  this).

Status in “linux” package in Ubuntu:
  New

Bug description:
  It was brought to my attention the following stack traces (occurring
  several times on different machines):

  crash> bt 
  PID: 2877 TASK: 881009b42ee0 CPU: 1 COMMAND: git 
  #0 [880c0c6bb9d0] machine_kexec at 8104b141 
  #1 [880c0c6bba40] crash_kexec at 810d5a58 
  #2 [880c0c6bbb10] oops_end at 81748b38 
  #3 [880c0c6bbb40] no_context at 8172dd02 
  #4 [880c0c6bbb90] __bad_area_nosemaphore at 8172dee4 
  #5 [880c0c6bbbf0] bad_area at 8172df5d 
  #6 [880c0c6bbc20] __do_page_fault at 8174bda8 
  #7 [880c0c6bbd30] do_page_fault at 8174bde7 
  #8 [880c0c6bbd60] page_fault at 81747e98 
  [exception RIP: kmem_cache_alloc_trace+106] 
  RIP: 8119b22a RSP: 880c0c6bbe18 RFLAGS: 00010206 
  RAX:  RBX: 88008407e0c0 RCX: 031b6bc9 
  RDX: 031b6bc8 RSI: 80d0 RDI: 000173c0 
  RBP: 880c0c6bbe68 R8: 88081fa373c0 R9:  
  R10:  R11: 0202 R12: 88081f403800 
  R13: 00629050 R14: 811bcfe4 R15: 80d0 
  ORIG_RAX:  CS: 0010 SS: 0018 
  #9 [880c0c6bbe70] alloc_pipe_info at 811bcfe4 
  #10 

[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2014-11-05 Thread Rafael David Tinoco
Small comment:

We might be dealing with two different situations/bugs behind the two
different stack traces we got: a possible race condition / deadlock in
the ext4 page flush under high memory pressure, AND kmem_cache_alloc
trying to access wrong addresses.

As I said, I am investigating both and checking whether they are
correlated.


[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2014-11-05 Thread Rafael David Tinoco
For the kmem_cache_alloc problem I found a really promising fix. Its
description says that the commit:


commit ba5bb147330a8737b6b5a812cc774c79c070704b 
Author: Al Viro v...@zeniv.linux.org.uk 
Date: Thu Mar 21 02:21:19 2013 -0400 

pipe: take allocation and freeing of pipe_inode_info out of ->i_mutex 


(present in tags from v3.10 up to today)

caused a use-after-free problem on VERY rare occasions (perhaps rare
until a workload like yours came along). That problem looks very much
like the one we are having here, and it is fixed by the following commit:


commit b0d8d2292160bb63de1972361ebed100c64b5b37 
Author: Linus Torvalds torva...@linux-foundation.org 
Date: Mon Dec 2 09:44:51 2013 -0800 

vfs: fix subtle use-after-free of pipe_inode_info

The pipe code was trying (and failing) to be very careful about freeing 
the pipe info only after the last access, with a pattern like: 

        spin_lock(&inode->i_lock); 
        if (!--pipe->files) { 
                inode->i_pipe = NULL; 
                kill = 1; 
        } 
        spin_unlock(&inode->i_lock); 
        __pipe_unlock(pipe); 
        if (kill) 
                free_pipe_info(pipe); 

where the final freeing is done last.

HOWEVER. The above is actually broken, because while the freeing is 
done at the end, if we have two racing processes releasing the pipe 
inode info, the one that *doesn't* free it will decrement the ->files 
count, and unlock the inode i_lock, but then still use the 
pipe_inode_info afterwards when it does the __pipe_unlock(pipe). 

This is *very* hard to trigger in practice, since the race window is 
very small, and adding debug options seems to just hide it by slowing 
things down. 

Simon originally reported this way back in July as an Oops in 
kmem_cache_allocate due to a single bit corruption (due to the final 
spin_unlock(&pipe->mutex.wait_lock) incrementing a field in a different 
allocation that had re-used the free'd pipe-info), it's taken this long 
to figure out. 

Since the 'pipe->files' accesses aren't even protected by the pipe lock 
(we very much use the inode lock for that), the simple solution is to 
just drop the pipe lock early. And since there were two users of this 
pattern, create a helper function for it. 

Introduced commit ba5bb147330a (pipe: take allocation and freeing of 
pipe_inode_info out of ->i_mutex). 

Reported-by: Simon Kirby s...@hostway.ca 
Reported-by: Ian Applegate i...@cloudflare.com 
Acked-by: Al Viro v...@zeniv.linux.org.uk 
Cc: sta...@kernel.org # v3.10+ 
Signed-off-by: Linus Torvalds torva...@linux-foundation.org 
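
To illustrate the ordering problem described in that commit message,
here is a small deterministic model of two releasers racing on one pipe.
It is a sketch under heavy simplification, not the kernel code: the
Pipe class, the freed flag and the hand-picked interleaving point (the
yield) are assumptions made for the example, and the locks themselves
are not modelled at all.

```python
class UseAfterFree(Exception):
    pass


class Pipe:
    """Stand-in for pipe_inode_info with a reference count."""

    def __init__(self, files):
        self.files = files
        self.freed = False

    def pipe_unlock(self):
        # Stand-in for __pipe_unlock(pipe): it touches the pipe object.
        if self.freed:
            raise UseAfterFree("__pipe_unlock() touched a freed pipe")


def release_broken(pipe):
    """Old order: drop the count first, unlock the pipe afterwards."""
    pipe.files -= 1              # done under inode->i_lock in the real code
    kill = pipe.files == 0
    yield                        # i_lock dropped: the other releaser may run
    pipe.pipe_unlock()
    if kill:
        pipe.freed = True        # stand-in for free_pipe_info(pipe)


def release_fixed(pipe):
    """Fixed order: drop the pipe lock early, then drop the count."""
    pipe.pipe_unlock()
    yield
    pipe.files -= 1
    if pipe.files == 0:
        pipe.freed = True


def race_hits_freed_pipe(release):
    pipe = Pipe(files=2)
    a, b = release(pipe), release(pipe)
    try:
        next(a)                  # A runs up to the interleaving point
        for _ in b:              # B runs to completion and may free the pipe
            pass
        for _ in a:              # A resumes
            pass
    except UseAfterFree:
        return True
    return False


print(race_hits_freed_pipe(release_broken))
print(race_hits_freed_pipe(release_fixed))
```

With the old ordering, the releaser that did not free the pipe still
calls the unlock on an object the other releaser has already freed; with
the early unlock, nothing touches the pipe after the final decrement.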


The fix is contained in the following tags:

inaddy@inerddy:/bugs/kernel/upstream$ git tag --contains b0d8d2292160bb63de1972361ebed100c64b5b37 
v3.13 
v3.13-rc3 
v3.13-rc4 
v3.13-rc5 
v3.13-rc6 
v3.13-rc7 
v3.13-rc8 
v3.14 
v3.14-rc1 
v3.14-rc2 

So v3.13 might not suffer from this particular issue.
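
As a small aid for checking other machines, the tag list above can be
compared against a running kernel's release string. This is only a
sketch: the helper names are hypothetical, and it assumes the first two
numbers of the release string name the mainline base version. It looks
at mainline tags only, so a distribution kernel that carries the fix as
a backport would not be detected.

```python
import re

# Tags taken from the "git tag --contains" output above.
TAGS_WITH_FIX = {
    "v3.13", "v3.13-rc3", "v3.13-rc4", "v3.13-rc5", "v3.13-rc6",
    "v3.13-rc7", "v3.13-rc8", "v3.14", "v3.14-rc1", "v3.14-rc2",
}


def mainline_tag(release):
    """Map a release string such as '3.11.0-15-generic' to 'v3.11'.

    Assumption: the first two numeric components are the mainline base.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised release string: %r" % release)
    return "v%s.%s" % (match.group(1), match.group(2))


def listed_as_fixed(release):
    """True if the mainline base of this release is in the tag list."""
    return mainline_tag(release) in TAGS_WITH_FIX


for release in ("3.11.0-15-generic", "3.13.0-53-generic"):
    print(release, mainline_tag(release), listed_as_fixed(release))
```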


[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2014-11-05 Thread Joseph Salisbury
I'm going to mark this bug as invalid for now, since it is reported
against an EOL Saucy kernel.  Please mark the bug as confirmed if the
issue still persists with Trusty/3.13.

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1389787

Title:
  3.11 memory consumption leads to HANG (not sure if 3.13 suffers from
  this).

Status in “linux” package in Ubuntu:
  Incomplete

Bug description:
  It was brought to my attention the following stack traces (occurring
  several times on different machines):

  crash> bt 
  PID: 2877 TASK: 881009b42ee0 CPU: 1 COMMAND: git 
  #0 [880c0c6bb9d0] machine_kexec at 8104b141 
  #1 [880c0c6bba40] crash_kexec at 810d5a58 
  #2 [880c0c6bbb10] oops_end at 81748b38 
  #3 [880c0c6bbb40] no_context at 8172dd02 
  #4 [880c0c6bbb90] __bad_area_nosemaphore at 8172dee4 
  #5 [880c0c6bbbf0] bad_area at 8172df5d 
  #6 [880c0c6bbc20] __do_page_fault at 8174bda8 
  #7 [880c0c6bbd30] do_page_fault at 8174bde7 
  #8 [880c0c6bbd60] page_fault at 81747e98 
  [exception RIP: kmem_cache_alloc_trace+106] 
  RIP: 8119b22a RSP: 880c0c6bbe18 RFLAGS: 00010206 
  RAX:  RBX: 88008407e0c0 RCX: 031b6bc9 
  RDX: 031b6bc8 RSI: 80d0 RDI: 000173c0 
  RBP: 880c0c6bbe68 R8: 88081fa373c0 R9:  
  R10:  R11: 0202 R12: 88081f403800 
  R13: 00629050 R14: 811bcfe4 R15: 80d0 
  ORIG_RAX:  CS: 0010 SS: 0018 
  #9 [880c0c6bbe70] alloc_pipe_info at 811bcfe4 
  #10 [880c0c6bbe90] get_pipe_inode at 811bd0aa 
  #11 [880c0c6bbeb0] create_pipe_files at 811bd648 
  #12 [880c0c6bbef0] __do_pipe_flags at 811bd7d2 
  #13 [880c0c6bbf30] sys_pipe2 at 811bd8f0 
  #14 [880c0c6bbf70] sys_pipe at 811bd980 
  #15 [880c0c6bbf80] system_call_fastpath at 8175099d 
  RIP: 7f2d245da7e7 RSP: 7fff6a4cfb18 RFLAGS: 00010283 
  RAX: 0016 RBX: 8175099d RCX: 001c 
  RDX: 7f2d248aeac0 RSI:  RDI: 7fff6a4cfad0 
  RBP: 0002 R8:  R9:  
  R10:  R11: 0202 R12: 811bd980 
  R13: 880c0c6bbf78 R14:  R15: 0232efc0 
  ORIG_RAX: 0016 CS: 0033 SS: 002b

  AND

  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152703] SysRq : Show backtrace of 
all active CPUs 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152708] sending NMI to all CPUs: 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152714] NMI backtrace for cpu 0 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152720] CPU: 0 PID: 13579 Comm: 
python Tainted: GF I 3.11.0-15-generic #25~precise1-Ubuntu 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152721] Hardware name: HP ProLiant 
BL460c G7, BIOS I27 05/05/2011 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152724] task: 880085f5 ti: 
880a64afa000 task.ti: 880a64afa000 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152725] RIP: 
0010:[81050ff5] [81050ff5] __ticket_spin_lock+0x25/0x30 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152736] RSP: 0018:880a64afb748 
EFLAGS: 0293 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152737] RAX: db25 RBX: 
880be8576480 RCX: 00018066002a 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152739] RDX: db2e RSI: 
0001 RDI: 880be8576480 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152740] RBP: 880a64afb748 R08: 
 R09: ea002407db00 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152741] R10: 8128d8ed R11: 
0001 R12: 8807572480b0 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152743] R13: 8807572481b8 R14: 
 R15: 8805e9c77908 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152745] FS: 7ff00d4db700() 
GS:880c0ba0() knlGS: 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152746] CS: 0010 DS:  ES:  
CR0: 80050033 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152748] CR2: 7fb66906d8e0 CR3: 
00053b3dd000 CR4: 07f0 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152749] Stack: 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152750] 880a64afb758 
8174759e 880a64afb778 8128eb02 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152756] 8807572480b0 
880757248200 880a64afb798 81270595 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152759] 880a64afb798 
8807572480b0 880a64afb7c8 81257621 
  Oct 15 14:19:08 LGEARND8B5 kernel: [796493.152763] Call Trace: 
  Oct 15 14:19:08 

[Kernel-packages] [Bug 1389787] Re: 3.11 memory consumption leads to HANG (not sure if 3.13 suffers from this).

2014-11-05 Thread Christopher M. Penalver
Rafael David Tinoco, thank you for reporting this and helping make
Ubuntu better. The Saucy enablement kernel has been EoL since August
2014 as per https://wiki.ubuntu.com/Kernel/LTSEnablementStack .

While it appears you have a thorough analysis of the root cause, your
best bet would be to upgrade these machines to the Trusty enablement
kernel.

Would that work for you?

** Changed in: linux (Ubuntu)
   Importance: Medium => Low
