On 03/29/2019 12:10 PM, Jan Harkes wrote:
> I was testing Coda on the 5.1-rc2 kernel and noticed that when I run a
> binary out of /coda, the binary would never exit and the system would
> detect a soft lockup. I narrowed it down to a very simple reproducible
> case of running a statically linked executable (busybox) from /coda with
> the cwd outside of Coda, so the only Coda file reference is from the
> executable itself.
>
> I knew I definitely had never seen this problem with the stable kernel
> on Ubuntu xenial (4.4) so I bisected between v4.4 and v5.1-rc2 and ended
> up at
>
> # first bad commit: [925b9cd1b89a94b7124d128c80dfc48f78a63098]
> # locking/rwsem: Make owner store task pointer of last owning reader
>
> When I revert this particular commit on 5.1-rc2, I am not able to
> reproduce the problem anymore.
>
> The puzzling thing to me is that a lot of that particular patch touches
> codepaths that are not even enabled in the kernels that I run, because I
> do not have CONFIG_RWSEM_DEBUG enabled.
>
> $ grep RWSEM .config
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y
> CONFIG_RWSEM_SPIN_ON_OWNER=y
> # CONFIG_DEBUG_RWSEMS is not set
>
> And this patch is for rwsem, while my soft lockup is on a spinlock.
> So either I have a race in fs/coda that got somehow uncovered by this
> patch, or something else is going on here but I have not been able to
> figure it out.
>
> Jan

Without CONFIG_DEBUG_RWSEMS, the only behavioral change in this patch is
an unconditional write of the task_struct pointer into sem->owner after
the read lock is acquired in down_read(). Before this patch, down_read()
did a conditional write of 0x1 into sem->owner, and only if the value was
not already 0x1.

The only scenario I can think of that could cause the soft lockup you
see is a use-after-free of memory objects.

Cheers,
Longman
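
For illustration, the difference between the two behaviors described
above can be modelled in a few lines of userspace C. This is only a
sketch, not the kernel code: struct task, struct rwsem_model, the writes
counter and both function names are hypothetical stand-ins, and any flag
bits the kernel may fold into the owner value are ignored.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct task {
    int pid;
};

struct rwsem_model {
    uintptr_t owner;       /* models sem->owner */
    unsigned long writes;  /* counts stores to owner, for illustration only */
};

#define READER_OWNED ((uintptr_t)0x1)

/* Old behavior: store 0x1 only if owner is not already 0x1. */
static void set_reader_owned_old(struct rwsem_model *sem)
{
    if (sem->owner != READER_OWNED) {
        sem->owner = READER_OWNED;
        sem->writes++;
    }
}

/* New behavior: always store the pointer of the task taking the read lock. */
static void set_reader_owned_new(struct rwsem_model *sem, struct task *t)
{
    sem->owner = (uintptr_t)t;
    sem->writes++;
}
```

In this model the old path often touches nothing, and when it does write,
the value is always 0x1. The new path stores a full pointer on every read
acquisition. If the rwsem lived in memory that had already been freed and
reused, that store could plausibly land on an unrelated object such as a
spinlock, which would be consistent with the use-after-free theory.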

