Bug#1010073: Bug 1010073: kernel 4.19: nvme read overhead sometimes, system hangs

2022-06-29 Thread Андрій Василишин
On Wed, 29 Jun 2022 at 16:32, Ben Hutchings wrote:

> On Thu, 9 Jun 2022 15:34:17 Андрій Василишин wrote:
> > Because it is the latest kernel which supports aufs.
> > Problem gone when I change to default  parameters NIC Mellanox Technologies
> > MT28908 Family [ConnectX-6]
> > ethtool -C enp161s0f0np0 rx-usecs 8 rx-frames 128 tx-usecs 8 tx-frames 128
> [...]
>
> So this seems to be a problem with the out-of-tree network driver you
> are using.  You should ask Mellanox for support, as there's nothing we
> can do about that.
>
> Ben.
>
> --
> Ben Hutchings
> Reality is just a crutch for people who can't handle science fiction.
>

Yes and no.
The problem reappeared. Disabling sendfile in nginx helped.
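
For reference, a minimal sketch of that change, assuming sendfile is configured at the http level of nginx.conf (the actual configuration from this server was not posted, so the placement is an assumption):

```nginx
http {
    # Serve files through ordinary read()/write() calls instead of sendfile().
    sendfile off;
}
```

The directive is also accepted in server and location blocks, so it can be limited to the affected virtual host.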


Bug#1010073: Bug 1010073: kernel 4.19: nvme read overhead sometimes, system hangs

2022-06-29 Thread Ben Hutchings
On Thu, 9 Jun 2022 15:34:17 Андрій Василишин wrote:
> Because it is the latest kernel which supports aufs.
> Problem gone when I change to default  parameters NIC Mellanox Technologies
> MT28908 Family [ConnectX-6]
> ethtool -C enp161s0f0np0 rx-usecs 8 rx-frames 128 tx-usecs 8 tx-frames 128
[...]

So this seems to be a problem with the out-of-tree network driver you
are using.  You should ask Mellanox for support, as there's nothing we
can do about that.

Ben.

-- 
Ben Hutchings
Reality is just a crutch for people who can't handle science fiction.




Bug#1010073: Bug 1010073: kernel 4.19: nvme read overhead sometimes, system hangs

2022-06-18 Thread Андрій Василишин
The problem repeats:
Jun 17 23:28:06 nl100 kernel: [89832.101712] Modules linked in: binfmt_misc msr amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul efi_pstore crc32_pclmul ghash_clmulni_intel efivars pcspkr ipmi_ssif nls_ascii nls_cp437 vfat fat ast ttm joydev drm_kms_helper drm ccp i2c_algo_bit evdev rng_core sp5100_tco ipmi_si ipmi_devintf ipmi_msghandler pcc_cpufreq acpi_cpufreq button tcp_bbr sch_fq aufs(OE) efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb hid_generic usbhid hid crc32c_intel aesni_intel aes_x86_64 crypto_simd mlx5_core(OE) cryptd glue_helper mlxfw(OE) psample ahci mlxdevm(OE) auxiliary(OE) libahci xhci_pci xhci_hcd mlx_compat(OE) libata nvme usbcore devlink scsi_mod nvme_core i2c_piix4 usb_common
Jun 17 23:28:06 nl100 kernel: [89832.101756] CPU: 51 PID: 96472 Comm: nginx Tainted: GW  OEL 4.19.0-20-amd64 #1 Debian 4.19.235-1
Jun 17 23:28:06 nl100 kernel: [89832.101757] Hardware name: Supermicro AS -1124US-TNRP/H12DSU-iN, BIOS 2.3a 03/03/2022
Jun 17 23:28:06 nl100 kernel: [89832.101764] RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20
Jun 17 23:28:06 nl100 kernel: [89832.101767] Code: d8 48 3d 90 d0 03 00 76 cc 80 4d 00 08 eb 98 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 07
Jun 17 23:28:06 nl100 kernel: [89832.101767] RSP: 0018:91564d8c3e88 EFLAGS: 0282 ORIG_RAX: ff13
Jun 17 23:28:06 nl100 kernel: [89832.101769] RAX: 0066 RBX: d27afe4c4020 RCX: 8040003b
Jun 17 23:28:06 nl100 kernel: [89832.101769] RDX: 8040003c RSI: 0282 RDI: 0282
Jun 17 23:28:06 nl100 kernel: [89832.101770] RBP: 005b R08:  R09: b3ef6000
Jun 17 23:28:06 nl100 kernel: [89832.101770] R10: 910d717a7c00 R11: 0001 R12: 915637c5f858
Jun 17 23:28:06 nl100 kernel: [89832.101771] R13: 0282 R14: 915637c5f140 R15: d27afe4c6028
Jun 17 23:28:06 nl100 kernel: [89832.101772] FS:  77783b80() GS:91564d8c() knlGS:
Jun 17 23:28:06 nl100 kernel: [89832.101772] CS:  0010 DS:  ES:  CR0: 80050033
Jun 17 23:28:06 nl100 kernel: [89832.101773] CR2: 77f8f8c0 CR3: 0176fe4a8000 CR4: 00340ee0
Jun 17 23:28:06 nl100 kernel: [89832.101773] Call Trace:
Jun 17 23:28:06 nl100 kernel: [89832.101776]  <IRQ>
Jun 17 23:28:06 nl100 kernel: [89832.101781]  fq_flush_timeout+0x6a/0x90
Jun 17 23:28:06 nl100 kernel: [89832.101784]  ? fq_ring_free+0xd0/0xd0
Jun 17 23:28:06 nl100 kernel: [89832.101788]  call_timer_fn+0x2b/0x130
Jun 17 23:28:06 nl100 kernel: [89832.101790]  run_timer_softirq+0x1c7/0x3e0
Jun 17 23:28:06 nl100 kernel: [89832.101794]  ? recalibrate_cpu_khz+0x10/0x10
Jun 17 23:28:06 nl100 kernel: [89832.101795]  ? ktime_get+0x3a/0xa0
Jun 17 23:28:06 nl100 kernel: [89832.101797]  __do_softirq+0xde/0x2d8
Jun 17 23:28:06 nl100 kernel: [89832.101800]  irq_exit+0xba/0xc0
Jun 17 23:28:06 nl100 kernel: [89832.101802]  smp_apic_timer_interrupt+0x74/0x140
Jun 17 23:28:06 nl100 kernel: [89832.101804]  apic_timer_interrupt+0xf/0x20
Jun 17 23:28:06 nl100 kernel: [89832.101805]  </IRQ>
Jun 17 23:28:06 nl100 kernel: [89832.101806] RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20
Jun 17 23:28:06 nl100 kernel: [89832.101807] Code: d8 48 3d 90 d0 03 00 76 cc 80 4d 00 08 eb 98 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 07
Jun 17 23:28:06 nl100 kernel: [89832.101807] RSP: 0018:b27b1e963638 EFLAGS: 0293 ORIG_RAX: ff13
Jun 17 23:28:06 nl100 kernel: [89832.101808] RAX: 9155ca566880 RBX: 915637c5f140 RCX: 9155ca566880
Jun 17 23:28:06 nl100 kernel: [89832.101809] RDX: 9155d0a7ce40 RSI: 0293 RDI: 0293
Jun 17 23:28:06 nl100 kernel: [89832.101809] RBP: 0078 R08: 0158 R09: 9155dbe9de80
Jun 17 23:28:06 nl100 kernel: [89832.101810] R10:  R11: 914aa5ee6000 R12: 9155dbe9de80
Jun 17 23:28:06 nl100 kernel: [89832.101810] R13: 0080 R14: ff80 R15: ff80
Jun 17 23:28:06 nl100 kernel: [89832.101812]  alloc_iova+0x11f/0x140
Jun 17 23:28:06 nl100 kernel: [89832.101813]  alloc_iova_fast+0x56/0x250
Jun 17 23:28:06 nl100 kernel: [89832.101817]  ? __kmalloc+0x180/0x220
Jun 17 23:28:06 nl100 kernel: [89832.101820]  ? mempool_alloc+0x67/0x190
Jun 17 23:28:06 nl100 kernel: [89832.101821]  dma_ops_alloc_iova.isra.28+0x4b/0x70
Jun 17 23:28:06 nl100 kernel: [89832.101822]  map_sg+0x73/0x1f0
Jun 17 23:28:06 nl100 kernel: [89832.101827]  nvme_queue_rq+0x1e7/0x9e0 [nvme]
Jun 17 23:28:06 nl100 kernel: [89832.101831]  ? __sbitmap_queue_get+0x24/0x90
Jun 17 23:28:06 nl100 kernel: [89832.101834]  ? blk_mq_get_tag+0x236/0x260
Jun 17 23:28:06 nl100 kernel: [89832.101835]  ? 

Bug#1010073: Bug 1010073: kernel 4.19: nvme read overhead sometimes, system hangs

2022-06-09 Thread Андрій Василишин
Because it is the latest kernel that supports aufs.
The problem went away when I changed the NIC (Mellanox Technologies
MT28908 Family [ConnectX-6]) to default parameters.
ethtool -C enp161s0f0np0 rx-usecs 8 rx-frames 128 tx-usecs 8 tx-frames 128


On Tue, 7 Jun 2022 at 18:35, Diederik de Haas wrote:

> Control: reassign -1 src:linux 4.19.235-1
> Control: tag -1 moreinfo
>
> On 23 Apr 2022 21:59:32 +0300 Андрій Василишин wrote:
> > Package: linux-image-4.19.0-20-amd64
> > Version: 4.19.235-1
> >
> > ...
> >
> > Hardware name: Supermicro AS-1124US-TNRP/H12DSU-iN, BIOS 2.3a 03/03/2022
>
> https://www.supermicro.com/en/Aplus/system/1U/1124/AS-1124US-TNRP.cfm
> specifications (and BIOS date) indicate this is quite a new board.
> Yet you're running it with a 4.19 kernel from *OldStable* !
>
> Why?
>
> Can you reproduce this issue with at least the 5.10 kernel from OldStable
> backports, but preferably with a recent kernel from Testing/Unstable.



-- 
WBR, Andrey Vasilishin


Bug#1010073: Bug 1010073: kernel 4.19: nvme read overhead sometimes, system hangs

2022-06-07 Thread Diederik de Haas
Control: reassign -1 src:linux 4.19.235-1
Control: tag -1 moreinfo

On 23 Apr 2022 21:59:32 +0300 Андрій Василишин  wrote:
> Package: linux-image-4.19.0-20-amd64
> Version: 4.19.235-1
> 
> ...
> 
> Hardware name: Supermicro AS-1124US-TNRP/H12DSU-iN, BIOS 2.3a 03/03/2022

https://www.supermicro.com/en/Aplus/system/1U/1124/AS-1124US-TNRP.cfm 
specifications (and BIOS date) indicate this is quite a new board.
Yet you're running it with a 4.19 kernel from *OldStable* !

Why?

Can you reproduce this issue with at least the 5.10 kernel from OldStable
backports, but preferably with a recent kernel from Testing/Unstable?
