On 6/24/20 2:11 AM, Greg KH wrote:
On Tue, Jun 23, 2020 at 11:04:30PM -0400, Andrey Grodzovsky wrote:
On 6/23/20 2:05 AM, Greg KH wrote:
On Tue, Jun 23, 2020 at 12:51:00AM -0400, Andrey Grodzovsky wrote:
On 6/22/20 12:45 PM, Greg KH wrote:
On Mon, Jun 22, 2020 at 12:07:25PM -0400, Andrey Grodzovsky wrote:
On 6/22/20 7:21 AM, Greg KH wrote:
On Mon, Jun 22, 2020 at 11:51:24AM +0200, Daniel Vetter wrote:
On Sun, Jun 21, 2020 at 02:03:05AM -0400, Andrey Grodzovsky wrote:
Track sysfs files in a list so they can all be removed during PCI remove;
otherwise removing them afterwards causes a crash, because the parent
folder was already removed during PCI remove.
Huh?  That should not happen, do you have a backtrace of that crash?
2 examples in the attached trace.
Odd, how did you trigger these?
By manually triggering PCI remove from sysfs

cd /sys/bus/pci/devices/0000\:05\:00.0 && echo 1 > remove
For some reason, I didn't think that video/drm devices could handle
hot-remove like this.  The "old" PCI hotplug specification explicitly
said that video devices were not supported, has that changed?

And this whole issue is probably tied to the larger issue that Daniel
was asking me about, when it came to device lifetimes and the drm layer,
so odds are we need to fix that up first before worrying about trying to
support this crazy request, right?  :)

[  925.738225 <    0.188086>] BUG: kernel NULL pointer dereference, address: 0000000000000090
[  925.738232 <    0.000007>] #PF: supervisor read access in kernel mode
[  925.738236 <    0.000004>] #PF: error_code(0x0000) - not-present page
[  925.738240 <    0.000004>] PGD 0 P4D 0
[  925.738245 <    0.000005>] Oops: 0000 [#1] SMP PTI
[  925.738249 <    0.000004>] CPU: 7 PID: 2547 Comm: amdgpu_test Tainted: G        W  OE     5.5.0-rc7-dev-kfd+ #50
[  925.738256 <    0.000007>] Hardware name: System manufacturer System Product Name/RAMPAGE IV FORMULA, BIOS 4804 12/30/2013
[  925.738266 <    0.000010>] RIP: 0010:kernfs_find_ns+0x18/0x110
[  925.738270 <    0.000004>] Code: a6 cf ff 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 41 57 41 56 49 89 f6 41 55 41 54 49 89 fd 55 53 49 89 d4 <0f> b7 af 90 00 00 00 8b 05 8f ee 6b 01 48 8b 5f 68 66 83 e5 20 41
[  925.738282 <    0.000012>] RSP: 0018:ffffad6d0118fb00 EFLAGS: 00010246
[  925.738287 <    0.000005>] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 2098a12076864b7e
[  925.738292 <    0.000005>] RDX: 0000000000000000 RSI: ffffffffb6606b31 RDI: 0000000000000000
[  925.738297 <    0.000005>] RBP: ffffffffb6606b31 R08: ffffffffb5379d10 R09: 0000000000000000
[  925.738302 <    0.000005>] R10: ffffad6d0118fb38 R11: ffff9a75f64820a8 R12: 0000000000000000
[  925.738307 <    0.000005>] R13: 0000000000000000 R14: ffffffffb6606b31 R15: ffff9a7612b06130
[  925.738313 <    0.000006>] FS:  00007f3eca4e8700(0000) GS:ffff9a763dbc0000(0000) knlGS:0000000000000000
[  925.738319 <    0.000006>] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  925.738323 <    0.000004>] CR2: 0000000000000090 CR3: 0000000035e5a005 CR4: 00000000000606e0
[  925.738329 <    0.000006>] Call Trace:
[  925.738334 <    0.000005>]  kernfs_find_and_get_ns+0x2e/0x50
[  925.738339 <    0.000005>]  sysfs_remove_group+0x25/0x80
[  925.738344 <    0.000005>]  sysfs_remove_groups+0x29/0x40
[  925.738350 <    0.000006>]  free_msi_irqs+0xf5/0x190
[  925.738354 <    0.000004>]  pci_disable_msi+0xe9/0x120
So the PCI core is trying to clean up attributes that it had registered,
which is fine.  But we can't seem to find the attributes?  Were they
already removed somewhere else?

that's odd.
Yes, as I pointed out above, I am emulating device removal from sysfs. This
triggers the PCI device remove sequence, and as part of that my specific
device folder (05:00.0) is removed from the sysfs tree.
But why are things being removed twice?

Not sure I understand what is being removed twice. I remove each sysfs
attribute only once.
This code path shows that the kernel is trying to remove a file that is
not present, so someone removed it already...

thanks,

greg k-h


That's a mystery to me too...

Andrey
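
For readers following the thread, here is a toy Python model of the ordering
problem under discussion. It is only a sketch and not kernel code: the names
Device and TrackedDevice and their methods are hypothetical, invented for
illustration, and a plain dictionary stands in for the device's sysfs folder.
It assumes, as the patch description says, that the device folder goes away
during PCI remove and that a later attribute removal then has no parent folder
to look in. The tracked variant records every file it creates and removes them
all before the folder is dropped.

```python
class Device:
    """Toy stand-in for a device and its sysfs folder (hypothetical, not kernel code)."""

    def __init__(self, name):
        self.name = name
        self.sd = {}  # folder contents; becomes None once the folder is removed

    def create_file(self, attr):
        self.sd[attr] = True

    def remove_dir(self):
        # Models the device folder disappearing during PCI remove.
        self.sd = None

    def remove_file(self, attr):
        # The kernel trace shows a NULL dereference; here we raise instead.
        if self.sd is None:
            raise LookupError(f"{self.name}: parent folder already removed")
        self.sd.pop(attr, None)


class TrackedDevice(Device):
    """Variant that remembers its files so remove can clean them up first."""

    def __init__(self, name):
        super().__init__(name)
        self.tracked = []

    def create_file(self, attr):
        super().create_file(attr)
        self.tracked.append(attr)

    def pci_remove(self):
        # Remove every tracked file while the folder still exists,
        # then drop the folder itself.
        for attr in reversed(self.tracked):
            self.remove_file(attr)
        self.tracked.clear()
        self.remove_dir()


if __name__ == "__main__":
    dev = Device("0000:05:00.0")
    dev.create_file("example_attr")
    dev.remove_dir()
    try:
        dev.remove_file("example_attr")  # late removal, folder is gone
    except LookupError as err:
        print("late removal failed:", err)

    tracked = TrackedDevice("0000:05:00.0")
    tracked.create_file("example_attr")
    tracked.pci_remove()  # files go first, then the folder
    print("tracked removal finished cleanly")
```

The attribute name example_attr is made up. The model only illustrates why
removal order matters; it says nothing about which code path in the real
driver removes the folder first.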


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
