On Tue, 2018-09-11 at 14:01 +0100, Dmitry Safonov wrote: > On Tue, 2018-09-11 at 14:02 +0200, Peter Zijlstra wrote: > > On Tue, Sep 11, 2018 at 02:48:21AM +0100, Dmitry Safonov wrote: > > > It seems like when ldsem_down_read() fails with timeout, it > > > misses > > > update for sem->wait_readers. By that reason, when writer finally > > > releases write end of the semaphore __ldsem_wake_readers() does > > > adjust > > > sem->count with wrong value: > > > sem->wait_readers * (LDSEM_ACTIVE_BIAS - LDSEM_WAIT_BIAS) > > > > > > I.e, if update comes with 1 missed wait_readers decrement, sem- > > > > count > > > > > > will be 0x100000001 which means that there is active reader and > > > it'll > > > make any further writer to fail in acquiring the semaphore. > > > > > > It looks like, this is a dead-code, because ldsem_down_read() is > > > never > > > called with timeout different than MAX_SCHEDULE_TIMEOUT, so it > > > might be > > > worth to delete timeout parameter and error path fall-back.. > > > > You might want to think about ditching that ldsem thing entirely, > > and > > use a regular rwsem ? > > Yeah, but AFAICS, regular rwsem will need to have a timeout then (for > write). So, I thought fixing this pile would be simpler than adding > timeout and probably writer-priority to generic rwsem? > > And I guess, we still will need fixes for stable for the bugs here.. > > I expect that timeouts are ABI, while the gain of adding priority may > be measured. I'll give it a shot (adding timeout/priority for linux- > next) to rwsem if you say it's acceptable.
Actually, priority looks quite simple: we can add writers in the head of wait_list and it probably may work. Timeout looks also not a rocket science. So, I can try to do that if you say it's acceptable (with the gain measures). After this can of worms that I need to fix regardless. -- Thanks, Dmitry