From: Alan Brady <[email protected]>
Date: Mon, 29 Jan 2024 16:59:16 -0800

> The motivation for this series has two primary goals. We want to enable
> support of multiple simultaneous messages and make the channel more
> robust. The way it works right now, the driver can only send and receive
> a single message at a time and if something goes really wrong, it can
> lead to data corruption and strange bugs.

Have you tested v3?
I have this on my system (net-next + your series), no other patches applied:

> [alobakin@GK3153-KR1-CYP-38282-U39-ETH1 ~]$ sudo modprobe idpf
> [   89.785966] idpf 0000:83:00.0: Device HW Reset initiated
> [alobakin@GK3153-KR1-CYP-38282-U39-ETH1 ~]$ [   90.241658] BUG: unable to 
> handle page fault for address: ff8e1df482000000
> [   90.241704] #PF: supervisor write access in kernel mode
> [   90.241728] #PF: error_code(0x0002) - not-present page
> [   90.241751] PGD 107ffc8067 P4D 107ffc7067 PUD 207fdc8067 PMD 0 
> [   90.241782] Oops: 0002 [#1] PREEMPT SMP NOPTI
> [   90.241805] CPU: 32 PID: 847 Comm: kworker/32:1 Kdump: loaded Not tainted 
> 6.8.0-rc1-libeth+ #1
> [   90.241841] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, 
> BIOS SE5C620.86B.01.01.0008.2305172341 05/17/2023
> [   90.241879] Workqueue: idpf-0000:83:00.0-vc_ev idpf_vc_event_task [idpf]
> [   90.241932] RIP: 0010:__free_pages_ok+0x338/0x4f0
> [   90.241962] Code: e6 06 45 31 e4 41 bd 40 00 00 00 45 85 ff 74 13 4b 8d 34 
> 28 4c 89 c7 e8 36 97 00 00 8b 74 24 04 41 01 c4 66 90 4c 8b 44 24 08 <43> 81 
> 24 28 00 00 80 c0 49 83 c5 40 4d 39 ee 75 d0 e9 7c fd ff ff
> [   90.242027] RSP: 0018:ff3f281b098d7b78 EFLAGS: 00010246
> [   90.242053] RAX: 0000000000100000 RBX: ff8e1df400000000 RCX: 
> 0000000000000034
> [   90.242084] RDX: 0000000000000d80 RSI: 0000000000000034 RDI: 
> ff8e1df481d980c0
> [   90.242115] RBP: 0000000000000000 R08: ff8e1df481d980c0 R09: 
> 0000000000000000
> [   90.242145] R10: ff25062537f9fe00 R11: 0000000000000020 R12: 
> 0000000000000000
> [   90.242174] R13: 0000000000267f40 R14: 0000000004000000 R15: 
> 0000000000000000
> [   90.242206] FS:  0000000000000000(0000) GS:ff2506253ec00000(0000) 
> knlGS:0000000000000000
> [   90.242240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   90.242266] CR2: ff8e1df482000000 CR3: 000000207d420006 CR4: 
> 0000000000771ef0
> [   90.242297] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> [   90.242327] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
> 0000000000000400
> [   90.242357] PKRU: 55555554
> [   90.242378] Call Trace:
> [   90.242393]  <TASK>
> [   90.242410]  ? __die_body+0x68/0xb0
> [   90.242437]  ? page_fault_oops+0x3a6/0x400
> [   90.242467]  ? exc_page_fault+0xb2/0x1b0
> [   90.242496]  ? asm_exc_page_fault+0x26/0x30
> [   90.242527]  ? __free_pages_ok+0x338/0x4f0
> [   90.242554]  idpf_mb_clean+0xc1/0x110 [idpf]
> [   90.242600]  idpf_send_mb_msg+0x50/0x1b0 [idpf]
> [   90.242643]  idpf_vc_xn_exec+0x189/0x350 [idpf]
> [   90.242688]  idpf_vc_core_init+0x32c/0x6d0 [idpf]
> [   90.242735]  idpf_vc_event_task+0x2da/0x390 [idpf]
> [   90.242779]  process_scheduled_works+0x251/0x460
> [   90.242807]  worker_thread+0x21c/0x2d0
> [   90.242830]  ? __pfx_worker_thread+0x10/0x10
> [   90.242855]  kthread+0xe8/0x110
> [   90.242878]  ? __pfx_kthread+0x10/0x10
> [   90.242902]  ret_from_fork+0x37/0x50
> [   90.242925]  ? __pfx_kthread+0x10/0x10
> [   90.243802]  ret_from_fork_asm+0x1b/0x30
> [   90.244598]  </TASK>
> [   90.245361] Modules linked in: idpf libeth rpcrdma rdma_cm ib_cm iw_cm 
> ib_core qrtr rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency 
> intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal 
> intel_powerclamp coretemp kvm_intel binfmt_misc vfat fat kvm irqbypass rapl 
> ipmi_ssif iTCO_wdt intel_cstate intel_pmc_bxt iTCO_vendor_support dax_hmem 
> cxl_acpi nfsd ioatdma intel_uncore mei_me isst_if_mmio cxl_core i2c_i801 
> isst_if_mbox_pci mei acpi_ipmi pcspkr intel_vsec isst_if_common i2c_smbus 
> joydev dca ipmi_si intel_pch_thermal ipmi_devintf auth_rpcgss ipmi_msghandler 
> acpi_power_meter acpi_pad nfs_acl lockd sunrpc loop grace zram xfs 
> crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic 
> ghash_clmulni_intel nvme bnxt_en sha512_ssse3 ast nvme_core sha256_ssse3 
> sha1_ssse3 i2c_algo_bit wmi scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables 
> ip_tables dm_multipath fuse
> [   90.252381] CR2: ff8e1df482000000
> [   90.253263] ---[ end trace 0000000000000000 ]---
> [   90.314201] pstore: backend (erst) writing error (-28)
> [   90.314686] RIP: 0010:__free_pages_ok+0x338/0x4f0
> [   90.314970] Code: e6 06 45 31 e4 41 bd 40 00 00 00 45 85 ff 74 13 4b 8d 34 
> 28 4c 89 c7 e8 36 97 00 00 8b 74 24 04 41 01 c4 66 90 4c 8b 44 24 08 <43> 81 
> 24 28 00 00 80 c0 49 83 c5 40 4d 39 ee 75 d0 e9 7c fd ff ff
> [   90.315511] RSP: 0018:ff3f281b098d7b78 EFLAGS: 00010246
> [   90.315778] RAX: 0000000000100000 RBX: ff8e1df400000000 RCX: 
> 0000000000000034
> [   90.316043] RDX: 0000000000000d80 RSI: 0000000000000034 RDI: 
> ff8e1df481d980c0
> [   90.316308] RBP: 0000000000000000 R08: ff8e1df481d980c0 R09: 
> 0000000000000000
> [   90.316573] R10: ff25062537f9fe00 R11: 0000000000000020 R12: 
> 0000000000000000
> [   90.316838] R13: 0000000000267f40 R14: 0000000004000000 R15: 
> 0000000000000000
> [   90.317104] FS:  0000000000000000(0000) GS:ff2506253ec00000(0000) 
> knlGS:0000000000000000
> [   90.317368] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   90.317631] CR2: ff8e1df482000000 CR3: 000000207d420006 CR4: 
> 0000000000771ef0
> [   90.317897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> [   90.318163] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
> 0000000000000400
> [   90.318427] PKRU: 55555554
> [   90.318687] note: kworker/32:1[847] exited with irqs disabled
> [   90.319202] BUG: kernel NULL pointer dereference, address: 0000000000000008

[...]

Thanks,
Olek

Reply via email to