On Tue, Jul 16, 2013 at 05:46:58PM +0200, Jan Kara wrote:
> and ext4_setattr() does (again under i_mutex):
> ext4_inode_block_unlocked_dio(inode);
> inode_dio_wait(inode);
> ext4_inode_resume_unlocked_dio(inode);
Ah, I missed this; thanks for pointing this out.
> So either DIO gets i_mutex first and then ext4_setattr() waits for it to
> complete, or truncate completes before unlocked DIO is able to get & drop
> i_mutex.
>
> OTOH unlocked DIO *read* might be vulnerable to a race with truncate. That
> never acquires i_mutex so if the DIO read arrives after ext4_setattr() goes
> through inode_dio_wait(), we can have the read and truncate racing and read
> possibly submitting read of a just truncated block (which can get
> reallocated in theory while the IO is running).
>
> So something like what you do in the patch is likely needed, just the
> justification is somewhat different and you should also rip out / adjust
> the other synchronizations we have in ext4_setattr(), ext4_ext_direct_IO()
> and ext4_ind_direct_IO().
Ok, I'll drop my current patch for now, and revisit this when I have a
bit more time. I agree with your analysis, but fortunately it sounds
like this race is going to be pretty hard to hit in practice ---
especially, since with the journal enabled, we won't allow the block
to get reused until the next commit boundary. The situation where we
would need to worry would be dioread_nolock combined with no journal
mode.
- Ted
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html