On Sun, Sep 17, 2017 at 09:34:01AM -0700, Linus Torvalds wrote:
> Now, I suspect most (all?) do, but that's a historical artifact rather
> than "design". In particular, the VFS layer used to do the locking for
> the filesystems, to guarantee the POSIX requirements (POSIX requires
> that writes be seen atomically).
>
> But that lock was pushed down into the filesystems, since some
> filesystems really wanted to have parallel writes (particularly for
> direct IO, where that POSIX serialization requirement doesn't exist).
>
> That's all many years ago, though. New filesystems are likely to have
> copied the pattern from old ones, but even then..
>
> Also, it's worth noting that "inode->i_rwlock" isn't even well-defined
> as a lock. You can have the question of *which* inode gets talked
> about when you have things like overlayfs etc. Normally it would be
> obvious, but sometimes you'd use "file->f_mapping->host" (which is the
> same thing in the simple cases), and sometimes it really wouldn't be
> obvious at all..
>
> So... I'm really not at all convinced that i_rwsem is sensible. It's
> one of those things that are "mostly right for the simple cases",
> but...

The thing pretty much common to all of them is that write() might need to modify permissions (suid removal), which brings ->i_rwsem in one way or another - notify_change() needs that held...
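
To make the dependency concrete, here is a rough userspace sketch of it, not kernel code. Everything prefixed toy_ is made up for illustration; the "lock" is just a flag recording whether the caller holds it, standing in for ->i_rwsem held exclusively, and the model is single-threaded. It only shows the ordering constraint: the write path has to take the lock before the suid bit can be dropped, because the attribute-change helper refuses to run without it.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_S_ISUID 04000u   /* same bit value as S_ISUID */

/* Hypothetical stand-in for struct inode; not the kernel structure. */
struct toy_inode {
    bool held;           /* models ->i_rwsem being held exclusively */
    unsigned int mode;   /* permission bits, including suid */
    long size;           /* bytes "written" so far */
};

static void toy_inode_lock(struct toy_inode *inode)
{
    assert(!inode->held);   /* single-threaded toy: no recursive locking */
    inode->held = true;
}

static void toy_inode_unlock(struct toy_inode *inode)
{
    assert(inode->held);
    inode->held = false;
}

/*
 * Models the precondition of notify_change(): attribute changes are
 * only allowed with the inode lock held. Returns 0 on success, -1 if
 * the caller did not take the lock first.
 */
static int toy_notify_change(struct toy_inode *inode, unsigned int new_mode)
{
    if (!inode->held)
        return -1;
    inode->mode = new_mode;
    return 0;
}

/*
 * Models a write path: take the lock, strip suid if it is set, then
 * do the "write". Returns the number of bytes written or -1 on error.
 */
static long toy_write(struct toy_inode *inode, long len)
{
    int err = 0;

    toy_inode_lock(inode);
    if (inode->mode & TOY_S_ISUID)
        err = toy_notify_change(inode, inode->mode & ~TOY_S_ISUID);
    if (!err)
        inode->size += len;
    toy_inode_unlock(inode);

    return err ? -1 : len;
}
```

Calling toy_write() on an inode whose mode is 04755 leaves it at 0755 with the lock released again, while calling toy_notify_change() directly without the lock fails. That is the whole point: whatever a filesystem does about serializing the data path, the suid removal drags the inode lock into write() anyway.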