Re: kernel OOPS in MM(?)
Hello,

On 2016-03-10 12:31, Evgenii Lepikhin wrote:
> We need help to understand the source of the problem and may be to create a
> bugreport. Here is crash report:
>
> Mar 10 04:03:51 l28 kernel: [2075560.434445] BUG: unable to handle kernel paging request at 40008021
> Mar 10 04:03:51 l28 kernel: [2075560.434669] IP: [] __kmalloc+0x69/0x100
> Mar 10 04:03:51 l28 kernel: [2075560.434800] PGD b7e462067 PUD 0
> Mar 10 04:03:51 l28 kernel: [2075560.434913] Oops: [#1] SMP
> Mar 10 04:03:51 l28 kernel: [2075560.435044] Modules linked in: tcm_loop iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ipt_NETFLOW(O) configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse [last unloaded: ipfw_mod]
> Mar 10 04:03:51 l28 kernel: [2075560.435539] CPU: 4 PID: 27141 Comm: rm Tainted: G O 3.12.51-jl-2015-12-25 #1
> Mar 10 04:03:51 l28 kernel: [2075560.435734] Hardware name: Intel Corporation S2600IP ../S2600IP, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
> Mar 10 04:03:51 l28 kernel: [2075560.435939] task: 880e622ccba0 ti: 880eeb008000 task.ti: 880eeb008000
> Mar 10 04:03:51 l28 kernel: [2075560.436131] RIP: 0010:[] [] __kmalloc+0x69/0x100
> Mar 10 04:03:51 l28 kernel: [2075560.436333] RSP: 0018:880eeb009b38 EFLAGS: 00010282
> Mar 10 04:03:51 l28 kernel: [2075560.436439] RAX: RBX: RCX: a8a73dc2
> Mar 10 04:03:51 l28 kernel: [2075560.436632] RDX: a8a73dc1 RSI: RDI: 00013500
> Mar 10 04:03:51 l28 kernel: [2075560.438248] RBP: 880eeb009b58 R08: 88103fc13500 R09: 811a0267
> Mar 10 04:03:51 l28 kernel: [2075560.438446] R10: 880eeb009d84 R11: R12: 88081f803a00
> Mar 10 04:03:51 l28 kernel: [2075560.438656] R13: 40008021 R14: 0250 R15: 880250e833b0
> Mar 10 04:03:51 l28 kernel: [2075560.438851] FS: 7fe2316dd700() GS:88103fc0() knlGS:
> Mar 10 04:03:51 l28 kernel: [2075560.439045] CS: 0010 DS: ES: CR0: 80050033
> Mar 10 04:03:51 l28 kernel: [2075560.439152] CR2: 40008021 CR3: 000a20736000 CR4: 000407e0
> Mar 10 04:03:51 l28 kernel: [2075560.439343] Stack:
> Mar 10 04:03:51 l28 kernel: [2075560.439439] 0250 0060
> Mar 10 04:03:51 l28 kernel: [2075560.439663] 880eeb009b88 811a0267 881015fb7fe0 0060
> Mar 10 04:03:51 l28 kernel: [2075560.439898] 880250e83490 880eeb009ba8 811a02f8
> Mar 10 04:03:51 l28 kernel: [2075560.440153] Call Trace:
> Mar 10 04:03:51 l28 kernel: [2075560.440257] [] kmem_alloc+0x67/0xe0
> Mar 10 04:03:51 l28 kernel: [2075560.440365] [] kmem_zalloc+0x18/0x40
> Mar 10 04:03:51 l28 kernel: [2075560.440473] [] xfs_log_commit_cil+0x373/0x4c0
> Mar 10 04:03:51 l28 kernel: [2075560.440585] [] ? xfs_bmap_search_multi_extents+0xe0/0x110
> Mar 10 04:03:51 l28 kernel: [2075560.440783] [] xfs_trans_commit+0x6c/0x250
> Mar 10 04:03:51 l28 kernel: [2075560.440899] [] xfs_bmap_finish+0xb7/0x1a0

Another issue on the same server, same instruction pointer:

Mar 16 04:53:54 l28 kernel: [521052.387878] BUG: unable to handle kernel paging request at 40008021
Mar 16 04:53:54 l28 kernel: [521052.388022] IP: [] __kmalloc+0x69/0x100
Mar 16 04:53:54 l28 kernel: [521052.388171] PGD 0
Mar 16 04:53:54 l28 kernel: [521052.388289] Oops: [#1] SMP
Mar 16 04:53:54 l28 kernel: [521052.388410] Modules linked in: tcm_loop iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ipt_NETFLOW(O) configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse
Mar 16 04:53:54 l28 kernel: [521052.388913] CPU: 6 PID: 5947 Comm: iscsi_trx Tainted: G O 3.12.51-jl-2015-12-25 #1
Mar 16 04:53:54 l28 kernel: [521052.389125] Hardware name: Intel Corporation S2600IP ../S2600IP, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
Mar 16 04:53:54 l28 kernel: [521052.389351] task: 88081a3a6720 ti: 8808162de000 task.ti: 8808162de000
Mar 16 04:53:54 l28 kernel: [521052.389566] RIP: 0010:[] [] __kmalloc+0x69/0x100
Mar 16 04:53:54 l28 kernel: [521052.389782] RSP: 0018:8808162dfd18 EFLAGS: 00010286
Mar 16 04:53:54 l28 kernel: [521052.389899] RAX: RBX: 880819a51800 RCX: 03b305d3
Mar 16 04:53:54 l28 kernel: [521052.390112] RDX: 03b305d2 RSI: RDI: 00013500
Mar 16 04:53:54 l28 kernel: [521052.390309] RBP: 8808162dfd38 R08: 88103fd13500 R09: a00e7072
Mar 16 04:53:54 l28 kernel: [521052.390503] R
kernel OOPS in MM(?)
Hi,

We need help understanding the source of this problem and possibly filing a bug report. Here is the crash report:

Mar 10 04:03:51 l28 kernel: [2075560.434445] BUG: unable to handle kernel paging request at 40008021
Mar 10 04:03:51 l28 kernel: [2075560.434669] IP: [] __kmalloc+0x69/0x100
Mar 10 04:03:51 l28 kernel: [2075560.434800] PGD b7e462067 PUD 0
Mar 10 04:03:51 l28 kernel: [2075560.434913] Oops: [#1] SMP
Mar 10 04:03:51 l28 kernel: [2075560.435044] Modules linked in: tcm_loop iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ipt_NETFLOW(O) configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse [last unloaded: ipfw_mod]
Mar 10 04:03:51 l28 kernel: [2075560.435539] CPU: 4 PID: 27141 Comm: rm Tainted: G O 3.12.51-jl-2015-12-25 #1
Mar 10 04:03:51 l28 kernel: [2075560.435734] Hardware name: Intel Corporation S2600IP ../S2600IP, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
Mar 10 04:03:51 l28 kernel: [2075560.435939] task: 880e622ccba0 ti: 880eeb008000 task.ti: 880eeb008000
Mar 10 04:03:51 l28 kernel: [2075560.436131] RIP: 0010:[] [] __kmalloc+0x69/0x100
Mar 10 04:03:51 l28 kernel: [2075560.436333] RSP: 0018:880eeb009b38 EFLAGS: 00010282
Mar 10 04:03:51 l28 kernel: [2075560.436439] RAX: RBX: RCX: a8a73dc2
Mar 10 04:03:51 l28 kernel: [2075560.436632] RDX: a8a73dc1 RSI: RDI: 00013500
Mar 10 04:03:51 l28 kernel: [2075560.438248] RBP: 880eeb009b58 R08: 88103fc13500 R09: 811a0267
Mar 10 04:03:51 l28 kernel: [2075560.438446] R10: 880eeb009d84 R11: R12: 88081f803a00
Mar 10 04:03:51 l28 kernel: [2075560.438656] R13: 40008021 R14: 0250 R15: 880250e833b0
Mar 10 04:03:51 l28 kernel: [2075560.438851] FS: 7fe2316dd700() GS:88103fc0() knlGS:
Mar 10 04:03:51 l28 kernel: [2075560.439045] CS: 0010 DS: ES: CR0: 80050033
Mar 10 04:03:51 l28 kernel: [2075560.439152] CR2: 40008021 CR3: 000a20736000 CR4: 000407e0
Mar 10 04:03:51 l28 kernel: [2075560.439343] Stack:
Mar 10 04:03:51 l28 kernel: [2075560.439439] 0250 0060
Mar 10 04:03:51 l28 kernel: [2075560.439663] 880eeb009b88 811a0267 881015fb7fe0 0060
Mar 10 04:03:51 l28 kernel: [2075560.439898] 880250e83490 880eeb009ba8 811a02f8
Mar 10 04:03:51 l28 kernel: [2075560.440153] Call Trace:
Mar 10 04:03:51 l28 kernel: [2075560.440257] [] kmem_alloc+0x67/0xe0
Mar 10 04:03:51 l28 kernel: [2075560.440365] [] kmem_zalloc+0x18/0x40
Mar 10 04:03:51 l28 kernel: [2075560.440473] [] xfs_log_commit_cil+0x373/0x4c0
Mar 10 04:03:51 l28 kernel: [2075560.440585] [] ? xfs_bmap_search_multi_extents+0xe0/0x110
Mar 10 04:03:51 l28 kernel: [2075560.440783] [] xfs_trans_commit+0x6c/0x250
Mar 10 04:03:51 l28 kernel: [2075560.440899] [] xfs_bmap_finish+0xb7/0x1a0
Mar 10 04:03:51 l28 kernel: [2075560.441017] [] xfs_itruncate_extents+0xe3/0x200
Mar 10 04:03:51 l28 kernel: [2075560.441131] [] xfs_inactive+0x27c/0x3a0
Mar 10 04:03:51 l28 kernel: [2075560.441275] [] ? wake_atomic_t_function+0x40/0x40
Mar 10 04:03:51 l28 kernel: [2075560.441386] [] xfs_fs_evict_inode+0x73/0x80
Mar 10 04:03:51 l28 kernel: [2075560.441498] [] evict+0xaa/0x1b0
Mar 10 04:03:51 l28 kernel: [2075560.441604] [] iput+0x103/0x1a0
Mar 10 04:03:51 l28 kernel: [2075560.441713] [] do_unlinkat+0x1cf/0x240
Mar 10 04:03:51 l28 kernel: [2075560.441823] [] ? SyS_newfstatat+0x25/0x30
Mar 10 04:03:51 l28 kernel: [2075560.441932] [] SyS_unlinkat+0x1d/0x40
Mar 10 04:03:51 l28 kernel: [2075560.442044] [] system_call_fastpath+0x16/0x1b
Mar 10 04:03:51 l28 kernel: [2075560.442155] Code: 65 4c 03 04 25 48 bc 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 74 6b 48 85 c0 74 66 49 63 44 24 20 48 8d 4a 01 49 8b 3c 24 <49> 8b 5c 05 00 4c 89 e8 65 48 0f c7 0f 0f 94 c0 84 c0 74 bd 49
Mar 10 04:03:51 l28 kernel: [2075560.442904] RIP [] __kmalloc+0x69/0x100
Mar 10 04:03:51 l28 kernel: [2075560.443020] RSP
Mar 10 04:03:51 l28 kernel: [2075560.443122] CR2: 40008021
Mar 10 04:03:51 l28 kernel: [2075560.443809] ---[ end trace 92cd2d4bad1896f4 ]---

Kernel 3.12.51.

Gdb listing:

(gdb) list *(__kmalloc+0x69)
0x810ee519 is in __kmalloc (mm/slub.c:260).
[...]
258 static inline void *get_freepointer(struct kmem_cache *s, void *object)
259 {
260         return *(void **)(object + s->offset);
261 }

What would be the next step?

Thank you.

-- 
UNIX/Ocaml engineer at 1Gb.ru. Telegram: johnlepikhin
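
A note on reading the oops above: the bytes marked with angle brackets in the Code: line (49 8b 5c 05 00) are the faulting instruction. The Python sketch below decodes only that one encoding shape (it is not a general disassembler) and recomputes the address the instruction touched. The register values are copied from the dump. RAX is printed blank there, so the sketch assumes it was zero, which would correspond to s->offset == 0; the log itself does not confirm this, although it is consistent with R13 and CR2 being identical.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_load(code: bytes) -> dict:
    """Decode 'REX.W 8B /r' with a SIB byte and an 8-bit displacement.

    Only the encoding shape seen in this oops is handled.
    """
    rex, opcode, modrm, sib, disp8 = code[:5]
    assert rex & 0xF0 == 0x40 and rex & 0x08, "expected a REX.W prefix"
    assert opcode == 0x8B, "expected MOV r64, r/m64"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 1 and rm == 4, "expected SIB + disp8 addressing"
    scale, index, base = 1 << (sib >> 6), (sib >> 3) & 7, sib & 7
    # Index field 100 without REX.X would mean "no index register".
    assert index != 4 or (rex & 0x02), "no-index form is not handled here"
    return {
        "dest": REG64[reg | ((rex & 0x04) << 1)],    # REX.R extends reg
        "base": REG64[base | ((rex & 0x01) << 3)],   # REX.B extends base
        "index": REG64[index | ((rex & 0x02) << 2)], # REX.X extends index
        "scale": scale,
        "disp": disp8 - 256 if disp8 > 127 else disp8,
    }


# Faulting instruction bytes from the "Code:" line of the oops.
insn = decode_mov_load(bytes.fromhex("498b5c0500"))

# Register values from the oops. rax is ASSUMED to be 0 (printed blank).
regs = {"r13": 0x40008021, "rax": 0x0}
CR2 = 0x40008021  # faulting address reported in the oops

fault = regs[insn["base"]] + regs[insn["index"]] * insn["scale"] + insn["disp"]

# x86-64 kernel virtual addresses sit in the upper canonical half.
object_is_kernel_pointer = regs["r13"] >= 0xFFFF800000000000

print(insn)
print(hex(fault), fault == CR2, object_is_kernel_pointer)
```

If this reading is right, the instruction is mov rbx, [r13 + rax*1 + 0], which matches the get_freepointer() line in the gdb listing with object in R13. The object pointer was then already 0x40008021, which is not a kernel address, so the slab freelist was probably corrupted before __kmalloc ran rather than by it. The second oops in the follow-up reports the same faulting address, which fits that picture.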
iSCSI target stops sending responses to login requests
Hello,

We have several NAS servers running kernels 3.4.xx through 3.13.xx. After 1 to 4 months of uptime, the target servers stop responding to login requests. I captured a tcpdump on the NAS while the problem was occurring.

1. The TCP session is established and the server receives the login request several times (in the iSCSI protocol header: immediate marker bit = 1, opcode = 0x03, text data payload submitted).
2. The target sends a response: a TCP packet with an incorrect checksum(!).
3. The handshake starts again, also without success.
4. We reboot the target. The problem disappears for another 1 to 4 months.

The dump:

17:34:56.671179 IP (tos 0x0, ttl 128, id 14261, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.1.44.1212 > 10.0.2.22.3260: Flags [S], cksum 0xedb0 (correct), seq 3581240112, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
        0x: 4500 0034 37b5 4000 8006 abcd 0a00 012c E..47.@,
        0x0010: 0a00 0216 04bc 0cbc d575 6330 .uc0
        0x0020: 8002 2000 edb0 0204 05b4 0103 0308
        0x0030: 0101 0402
17:34:56.671195 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.2.22.3260 > 10.0.1.44.1212: Flags [S.], cksum 0x1768 (incorrect -> 0x534b), seq 418867286, ack 3581240113, win 14600, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
        0x: 4500 0034 4000 4006 2383 0a00 0216 E..4..@.@.#.
        0x0010: 0a00 012c 0cbc 04bc 18f7 6856 d575 6331 ...,..hV.uc1
        0x0020: 8012 3908 1768 0204 05b4 0101 0402 ..9..h..
        0x0030: 0103 0307
17:34:56.671345 IP (tos 0x0, ttl 128, id 14262, offset 0, flags [DF], proto TCP (6), length 40)
    10.0.1.44.1212 > 10.0.2.22.3260: Flags [.], cksum 0xcc25 (correct), ack 1, win 256, length 0
        0x: 4500 0028 37b6 4000 8006 abd8 0a00 012c E..(7.@,
        0x0010: 0a00 0216 04bc 0cbc d575 6331 18f7 6857 .uc1..hW
        0x0020: 5010 0100 cc25 P%
17:34:56.671538 IP (tos 0x0, ttl 128, id 14263, offset 0, flags [DF], proto TCP (6), length 232)
    10.0.1.44.1212 > 10.0.2.22.3260: Flags [P.], cksum 0xcc18 (correct), seq 1:193, ack 1, win 256, length 192
        0x: 4500 00e8 37b7 4000 8006 ab17 0a00 012c E...7.@,
        0x0010: 0a00 0216 04bc 0cbc d575 6331 18f7 6857 .uc1..hW
        0x0020: 5018 0100 cc18 4300 0090 P...C...
        0x0030: 4000 0137 0001 0019 0001 @..7
        0x0040: 0001 0001
        0x0050: 496e 6974 6961 746f Initiato
        0x0060: 724e 616d 653d 6971 6e2e 3139 3931 2d30 rName=iqn.1991-0
        0x0070: 352e 636f 6d2e 6d69 6372 6f73 6f66 743a 5.com.microsoft:
        0x0080: 7334 3400 5365 7373 696f 6e54 7970 653d s44.SessionType=
        0x0090: 4e6f 726d 616c 0054 6172 6765 744e 616d Normal.TargetNam
        0x00a0: 653d 6971 6e2e 3230 3033 2d30 312e 6f72 e=iqn.2003-01.or
        0x00b0: 672e 6c69 6e75 782d 6973 6373 692e 6c32 g.linux-iscsi.l2
        0x00c0: 322e 7838 3636 343a 736e 2e39 3665 6239 2.x8664:sn.96eb9
        0x00d0: 3035 6630 6461 6200 4175 7468 4d65 7468 05f0dab.AuthMeth
        0x00e0: 6f64 3d43 4841 5000 od=CHAP.
17:34:56.981941 IP (tos 0x0, ttl 128, id 14264, offset 0, flags [DF], proto TCP (6), length 232)
    10.0.1.44.1212 > 10.0.2.22.3260: Flags [P.], cksum 0xcc18 (correct), seq 1:193, ack 1, win 256, length 192
        0x: 4500 00e8 37b8 4000 8006 ab16 0a00 012c E...7.@,
        0x0010: 0a00 0216 04bc 0cbc d575 6331 18f7 6857 .uc1..hW
        0x0020: 5018 0100 cc18 4300 0090 P...C...
        0x0030: 4000 0137 0001 0019 0001 @..7
        0x0040: 0001 0001
        0x0050: 496e 6974 6961 746f Initiato
        0x0060: 724e 616d 653d 6971 6e2e 3139 3931 2d30 rName=iqn.1991-0
        0x0070: 352e 636f 6d2e 6d69 6372 6f73 6f66 743a 5.com.microsoft:
        0x0080: 7334 3400 5365 7373 696f 6e54 7970 653d s44.SessionType=
        0x0090: 4e6f 726d 616c 0054 6172 6765 744e 616d Normal.TargetNam
        0x00a0: 653d 6971 6e2e 3230 3033 2d30 312e 6f72 e=iqn.2003-01.or
        0x00b0: 672e 6c69 6e75 782d 6973 6373 692e 6c32 g.linux-iscsi.l2
        0x00c0: 322e 7838 3636 343a 736e 2e39 3665 6239 2.x8664:sn.96eb9
        0x00d0: 3035 6630 6461 6200 4175 7468 4d65 7468 05f0dab.AuthMeth
        0x00e0: 6f64 3d43 4841 5000 od=CHAP.
17:34:57.590302 IP (tos 0x0, ttl 128, id 14307, offset 0, flags [DF], proto TCP (6), length 232)
    10.0.1.44.1212 > 10.0.2.22.3260: Flags [P.], cksum 0xcc18 (correct), seq 1:193, ack 1, win 256, length 192
        0x: 4500 00e8 37e3 4000 8006 aaeb
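
On step 2: when tcpdump runs on the sending host, it often reports "incorrect" checksums if the NIC computes them in hardware (TX checksum offload), because the capture sees the packet before the NIC fills the field in. One way to test for this is to check whether the value in the packet equals the bare sum of the TCP pseudo-header, which is what the Linux stack leaves in the field for the hardware to complete. Below is a sketch in Python with the two handshake segments retyped from the hex dump above. The all-zero 16-bit words that appear to be missing from the archived dump have been put back by hand, which is an assumption. All names in the sketch are illustrative.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry, as used by TCP/IP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: bytes, dst: bytes, tcp_len: int) -> bytes:
    """IPv4 TCP pseudo-header: source, destination, zero, protocol 6, length."""
    return src + dst + struct.pack("!BBH", 0, 6, tcp_len)


def tcp_checksum(src: bytes, dst: bytes, segment: bytes) -> int:
    """Checksum over pseudo-header + segment with the checksum field zeroed."""
    zeroed = segment[:16] + b"\x00\x00" + segment[18:]
    total = ones_complement_sum(pseudo_header(src, dst, len(segment)) + zeroed)
    return ~total & 0xFFFF


CLIENT = bytes([10, 0, 1, 44])   # initiator, 10.0.1.44
TARGET = bytes([10, 0, 2, 22])   # iSCSI target, 10.0.2.22

# TCP segments from the dump; zero words restored by hand (ASSUMPTION).
SYN = bytes.fromhex(
    "04bc0cbc d5756330 00000000 80022000 edb00000 020405b4 01030308 01010402")
SYN_ACK = bytes.fromhex(
    "0cbc04bc 18f76856 d5756331 80123908 17680000 020405b4 01010402 01030307")

syn_expected = tcp_checksum(CLIENT, TARGET, SYN)
syn_ack_expected = tcp_checksum(TARGET, CLIENT, SYN_ACK)
syn_ack_in_packet = struct.unpack("!H", SYN_ACK[16:18])[0]
pseudo_only = ones_complement_sum(pseudo_header(TARGET, CLIENT, len(SYN_ACK)))

print(hex(syn_expected), hex(syn_ack_expected))
print(hex(syn_ack_in_packet), hex(pseudo_only))
```

Both recomputed values match what tcpdump printed (0xedb0 for the SYN and 0x534b for the SYN-ACK), and the 0x1768 found in the SYN-ACK is exactly the pseudo-header sum. That pattern is consistent with checksum offload on the target's NIC, so the "incorrect" checksum may be a capture artifact rather than the cause of the failed logins. The initiator's ACK in the third packet also suggests the SYN-ACK arrived intact. A capture taken on the initiator or on a mirror port would confirm or rule this out.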