Hello,
I'm triggering a reproducible panic using a DWC3 in gadget mode
(Intel Edison board). The kernel is a 4.14 build with all patches from
Andy Shevchenko's tree for the Edison (including the ones that make dwc3
skip EP1 & 8). It is available here:
https://github.com/vpelletier/linux/tree/eds_4.14
The setup is a rather simple test implementation for bidirectional
piping:
Bus 001 Device 105: ID 1d6b:0104 Linux Foundation Multifunction Composite Gadget
Couldn't open device, some information will be missing
Device Descriptor:
  bLength 18
  bDescriptorType 1
  bcdUSB 2.10
  bDeviceClass 0 (Defined at Interface level)
  bDeviceSubClass 0
  bDeviceProtocol 0
  bMaxPacketSize0 64
  idVendor 0x1d6b Linux Foundation
  idProduct 0x0104 Multifunction Composite Gadget
  bcdDevice 4.14
  iManufacturer 1
  iProduct 2
  iSerial 3
  bNumConfigurations 1
  Configuration Descriptor:
    bLength 9
    bDescriptorType 2
    wTotalLength 32
    bNumInterfaces 1
    bConfigurationValue 1
    iConfiguration 4
    bmAttributes 0x80
      (Bus Powered)
    MaxPower 500mA
    Interface Descriptor:
      bLength 9
      bDescriptorType 4
      bInterfaceNumber 0
      bAlternateSetting 0
      bNumEndpoints 2
      bInterfaceClass 255 Vendor Specific Class
      bInterfaceSubClass 0
      bInterfaceProtocol 0
      iInterface 5
      Endpoint Descriptor:
        bLength 7
        bDescriptorType 5
        bEndpointAddress 0x82 EP 2 IN
        bmAttributes 2
          Transfer Type Bulk
          Synch Type None
          Usage Type Data
        wMaxPacketSize 0x0200 1x 512 bytes
        bInterval 0
      Endpoint Descriptor:
        bLength 7
        bDescriptorType 5
        bEndpointAddress 0x02 EP 2 OUT
        bmAttributes 2
          Transfer Type Bulk
          Synch Type None
          Usage Type Data
        wMaxPacketSize 0x0200 1x 512 bytes
        bInterval 0
The test program is here (the highlighted line is the blocking call described below):
https://github.com/vpelletier/python-functionfs/blob/da45cb2435b26e65d8385b67a7395be04a65461c/examples/usbcat/device.py#L134
The relevant pieces seem to be:
- at least one AIO transfer submitted for reading from EP2OUT
- upon receiving data from stdin, a synchronous write happens on EP2IN,
which blocks if the host did not submit a transfer (normal)
- SIGQUIT to interrupt the write while it is blocking
A very short while after SIGQUIT, the kernel emits a BUG whose content
and severity vary (dropping to the kdb command line, or sometimes
continuing seemingly normally), but which always seems to originate in
dwc3_gadget_ep_dequeue (2 traces below: one where the kernel survived and
one from another boot where it panicked).
Writing alone, without any submitted AIO transfers, does not cause a
panic on interruption.
What should I do to debug further?
Case where the kernel survived:
[ 382.200896] BUG: scheduling while atomic: screen/1808/0x00000100
[ 382.207124] 4 locks held by screen/1808:
[ 382.211266] #0: (rcu_callback){....}, at: [<c10b4ff0>]
rcu_process_callbacks+0x260/0x440
[ 382.219949] #1: (rcu_read_lock_sched){....}, at: [<c1358ba0>]
percpu_ref_switch_to_atomic_rcu+0xb0/0x130
[ 382.230034] #2: (&(&ctx->ctx_lock)->rlock){....}, at: [<c11f0c73>]
free_ioctx_users+0x23/0xd0
[ 382.230096] #3: (&(&ffs->eps_lock)->rlock){....}, at: [<f81e7710>]
ffs_aio_cancel+0x20/0x60 [usb_f_fs]
[ 382.230160] Modules linked in: usb_f_fs libcomposite configfs bnep btsdio
bluetooth ecdh_generic brcmfmac brcmutil intel_powerclamp coretemp dwc3
kvm_intel ulpi udc_core kvm irqbypass crc32_pclmul crc32c_intel pcbc dwc3_pci
aesni_intel aes_i586 crypto_simd cryptd ehci_pci ehci_hcd gpio_keys usbcore
basincove_gpadc industrialio usb_common
[ 382.230407] CPU: 1 PID: 1808 Comm: screen Not tainted 4.14.0-edison+ #117
[ 382.230416] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542
2015.01.21:18.19.48
[ 382.230425] Call Trace:
[ 382.230438] <SOFTIRQ>
[ 382.230466] dump_stack+0x47/0x62
[ 382.230498] __schedule_bug+0x61/0x80
[ 382.230522] __schedule+0x43/0x7a0
[ 382.230587] schedule+0x5f/0x70
[ 382.230625] dwc3_gadget_ep_dequeue+0x14c/0x270 [dwc3]
[ 382.230669] ? do_wait_intr_irq+0x70/0x70
[ 382.230724] usb_ep_dequeue+0x19/0x90 [udc_core]
[ 382.230770] ffs_aio_cancel+0x37/0x60 [usb_f_fs]
[ 382.230798] kiocb_cancel+0x31/0x40
[ 382.230822] free_ioctx_users+0x4d/0xd0
[ 382.230858] percpu_ref_switch_to_atomic_rcu+0x10a/0x130
[ 382.230881] ? percpu_ref_exit+0x40/0x40
[ 382.230904] rcu_process_callbacks+0x2b3/0x440
[ 382.230965] __do_softirq+0xf8/0x26b
[ 382.231011] ? __softirqentry_text_start+0x8/0x8
[ 382.231033] do_softirq_own_stack+0x22/0x30
[ 382.231042] </SOFTIRQ>
[ 382.231071] irq_exit+0x45/0xc0
[ 382.231089] smp_apic_timer_interrupt+0x13c/0x150
[ 382.231118] apic_timer_interrupt+0x35/0x3c
[ 382.231132] EIP: __copy_user_ll+0xe2/0xf0
[ 382.231142] EFLAGS: 00210293 CPU: 1
[ 382.231154] EAX: bfd4508c EBX: 00000004 ECX: 00000003 EDX: f3d8fe50
[ 382.231165] ESI: f3d8fe51 EDI: bfd4508d EBP: f3d8fe14 ESP: f3d8fe08
[ 382.231176] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 382.231265] core_sys_select+0x25f/0x320
[ 382.231346] ? __wake_up_common_lock+0x62/0x80
[ 382.231399] ? tty_ldisc_deref+0x13/0x20
[ 382.231438] ? ldsem_up_read+0x1b/0x40
[ 382.231459] ? tty_ldisc_deref+0x13/0x20
[ 382.231479] ? tty_write+0x29f/0x2e0
[ 382.231514] ? n_tty_ioctl+0xe0/0xe0
[ 382.231541] ? tty_write_unlock+0x30/0x30
[ 382.231566] ? __vfs_write+0x22/0x110
[ 382.231604] ? security_file_permission+0x2f/0xd0
[ 382.231635] ? rw_verify_area+0xac/0x120
[ 382.231677] ? vfs_write+0x103/0x180
[ 382.231711] SyS_select+0x87/0xc0
[ 382.231739] ? SyS_write+0x42/0x90
[ 382.231781] do_fast_syscall_32+0xd6/0x1a0
[ 382.231836] entry_SYSENTER_32+0x47/0x71
[ 382.231848] EIP: 0xb7f75b05
[ 382.231857] EFLAGS: 00000246 CPU: 1
[ 382.231868] EAX: ffffffda EBX: 00000400 ECX: bfd4508c EDX: bfd4510c
[ 382.231878] ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: bfd45020
[ 382.231889] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 382.232281] softirq: huh, entered softirq 9 RCU c10b4d90 with preempt_count
00000100, exited with 00000000?
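As a reading aid for the "scheduling while atomic" lines: the last field (0x00000100 in both traces) is the preempt_count value at the time of the BUG. Below is a minimal sketch of decoding it, assuming the bit layout documented in include/linux/preempt.h for 4.14 (8 preempt bits, 8 softirq bits, 4 hardirq bits, 1 NMI bit). The helper names decode_preempt_count and parse_bug_line are made up for this illustration and are not part of any kernel tooling.

```python
# Bit layout ASSUMED from include/linux/preempt.h (4.14); verify it against
# the tree actually being debugged.
PREEMPT_MASK = 0x000000FF
SOFTIRQ_MASK = 0x0000FF00
HARDIRQ_MASK = 0x000F0000
NMI_MASK = 0x00100000


def decode_preempt_count(value):
    """Split a raw preempt_count into its per-context counters."""
    return {
        "preempt": value & PREEMPT_MASK,
        "softirq": (value & SOFTIRQ_MASK) >> 8,
        "hardirq": (value & HARDIRQ_MASK) >> 16,
        "nmi": (value & NMI_MASK) >> 20,
    }


def parse_bug_line(line):
    """Parse 'BUG: scheduling while atomic: <comm>/<pid>/0x<count>'.

    The task name may itself contain '/', as in 'swapper/0', so the
    payload is split from the right.
    """
    payload = line.split("scheduling while atomic: ", 1)[1].strip()
    comm, pid, count = payload.rsplit("/", 2)
    return comm, int(pid), decode_preempt_count(int(count, 16))


if __name__ == "__main__":
    line = "[ 382.200896] BUG: scheduling while atomic: screen/1808/0x00000100"
    print(parse_bug_line(line))
```

For both traces this gives a softirq field of 1 with every other field at 0, which I read as "currently running a softirq handler". That would be consistent with the <SOFTIRQ> frames showing dwc3_gadget_ep_dequeue being reached from rcu_process_callbacks and then calling schedule().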
Case where I ended up on kdb:
[46045.741643] BUG: scheduling while atomic: swapper/0/0/0x00000100
[46045.747881] 4 locks held by swapper/0/0:
[46045.752071] #0: (rcu_callback){....}, at: [<c10b4ff0>]
rcu_process_callbacks+0x260/0x440
[46045.760748] #1: (rcu_read_lock_sched){....}, at: [<c1358ba0>]
percpu_ref_switch_to_atomic_rcu+0xb0/0x130
[46045.770810] #2: (&(&ctx->ctx_lock)->rlock){....}, at: [<c11f0c73>]
free_ioctx_users+0x23/0xd0
[46045.779903] #3: (&(&ffs->eps_lock)->rlock){....}, at: [<f8269710>]
ffs_aio_cancel+0x20/0x60 [usb_f_fs]
[46045.789793] Modules linked in: usb_f_fs libcomposite configfs bnep btsdio
bluetooth ecdh_generic brcmfmac brcmutil dwc3 ulpi intel_powerclamp coretemp
udc_core kvm_intel kvm irqbypass crc32_pclmul crc32c_intel pcbc dwc3_pci
ehci_pci ehci_hcd aesni_intel aes_i586 crypto_simd cryptd usbcore
basincove_gpadc gpio_keys industrialio usb_common
[46045.820989] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-edison+ #117
[46045.827950] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542
2015.01.21:18.19.48
[46045.836950] Call Trace:
[46045.839474] <SOFTIRQ>
[46045.841929] dump_stack+0x47/0x62
[46045.845359] __schedule_bug+0x61/0x80
[46045.849140] __schedule+0x43/0x7a0
[46045.852693] schedule+0x5f/0x70
[46045.855951] dwc3_gadget_ep_dequeue+0x14c/0x270 [dwc3]
[46045.861260] ? do_wait_intr_irq+0x70/0x70
[46045.865422] usb_ep_dequeue+0x19/0x90 [udc_core]
[46045.870198] ffs_aio_cancel+0x37/0x60 [usb_f_fs]
[46045.874953] kiocb_cancel+0x31/0x40
[46045.878552] free_ioctx_users+0x4d/0xd0
[46045.882517] percpu_ref_switch_to_atomic_rcu+0x10a/0x130
[46045.887980] ? percpu_ref_exit+0x40/0x40
[46045.892019] rcu_process_callbacks+0x2b3/0x440
[46045.896633] __do_softirq+0xf8/0x26b
[46045.900342] ? __softirqentry_text_start+0x8/0x8
[46045.905092] do_softirq_own_stack+0x22/0x30
[46045.909386] </SOFTIRQ>
[46045.911924] irq_exit+0x45/0xc0
[46045.915164] smp_apic_timer_interrupt+0x13c/0x150
[46045.920008] apic_timer_interrupt+0x35/0x3c
[46045.924315] EIP: cpuidle_enter_state+0x26b/0x330
[46045.929055] EFLAGS: 00000202 CPU: 0
[46045.932641] EAX: 00000000 EBX: c1bfd680 ECX: 00000002 EDX: 00000000
[46045.939068] ESI: 00077454 EDI: 000029e0 EBP: c1b7df4c ESP: c1b7df18
[46045.945496] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[46045.951148] cpuidle_enter+0x14/0x20
[46045.954835] call_cpuidle+0x35/0x40
[46045.958436] do_idle+0x114/0x180
[46045.961771] cpu_startup_entry+0x25/0x30
[46045.965814] rest_init+0xfa/0x100
[46045.969243] start_kernel+0x37c/0x381
[46045.973025] i386_start_kernel+0xa3/0xa7
[46045.977071] startup_32_smp+0x164/0x166
[46046.008333] BUG: unable to handle kernel NULL pointer dereference at 00000001
[46046.015673] IP: 0x1
[46046.017830] *pde = 00000000
[46046.020799] Oops: 0010 [#1] SMP
Entering kdb (current=0xc1b8eb40, pid 0) on processor 0 Oops: (null)
due to oops @ 0x1
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.14.0-edison+ #117
Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542
2015.01.21:18.19.48
task: c1b8eb40 task.stack: c1b7c000
EIP: 0x1
EFLAGS: 00010046 CPU: 0
EAX: f69a4d80 EBX: 00000000 ECX: f6dcd000 EDX: 00000033
ESI: f692f0e0 EDI: f6fa3f58 EBP: f6833f40 ESP: f6833ec0
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 80050033 CR2: 00000001 CR3: 34c85000 CR4: 001006d0
Call Trace:
<SOFTIRQ>
? __lock_acquire.isra.25+0x69a/0x840
? rcu_cbs_completed+0x24/0x50
? cpu_needs_another_gp+0x5c/0x70
? rcu_process_callbacks+0x1c5/0x440
? __softirqentry_text_start+0x8/0x8
? do_softirq_own_stack+0x22/0x30
</SOFTIRQ>
Code: Bad EIP value.
Regards,
--
Vincent Pelletier