On Mon, 2007-02-19 at 21:31 +0000, Jörn Engel wrote:
> Looks like I really am writing the first log-structured filesystem for
> Linux.  At least I ran into a fairly arcane race that seems to be generic
> to all of them.
> 
> Writing when space is tight may involve calling the garbage collector.
> The garbage collector will iget() random inodes, either to verify if a
> block is valid or to copy the block around.  At this point, all writes
> to LogFS are serialized.
> 
> __sync_single_inode() will first lock a random inode, then call
> write_inode(), then unlock the inode.  So we can get this:
> 
> 
> __sync_single_inode()                 garbage collector
> ---------------------------------------------------------------------
> inode->i_state |= I_LOCK;             ...
> ...                                   mutex_lock(&super->s_w_mutex);
> write_inode(inode, wait);             ...
>   ...                                 iget(sb, ino);
>   mutex_lock(&super->s_w_mutex);      ...
>   ...                                   wait_on_inode(inode);
>   mutex_unlock(&super->s_w_mutex);    
>   ...                                 
> ...
> inode->i_state &= ~I_LOCK;
> 
> 
> And once in a blue moon, those two will race for the same inode.  As far
> as I can see, the race can only get fixed in two ways:
> 1. Never iget() inside the garbage collector.  That would require having
>    a private inode cache for LogFS.
> 2. Synchronize __sync_single_inode() and the garbage collector somehow.
> 
> Variant 1 would result in double caching for the same object, something
> I would like to avoid.  So does anyone have suggestions how variant 2
> could be achieved?  Essentially what I need is a way to say "don't sync
> any inodes right now, I'll be back in 5 milliseconds or so".

It'd be nice if you could drop s_w_mutex when the garbage collector
calls iget(), so that it never waits on a locked inode while holding
the mutex.

Otherwise, you may be able to call ilookup5_nowait() in the garbage
collector, and skip that inode if I_LOCK is set.
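
To make the second idea concrete, here is a rough userspace model of the
skip-if-locked pass.  It is only a sketch, not LogFS or VFS code:
model_inode, model_lookup_nowait(), model_gc_pass() and the value of
MODEL_I_LOCK are all made up for illustration.  The lookup helper stands
in for a non-waiting cache lookup such as ilookup5_nowait(), i.e. one
that hands back the inode without sleeping on I_LOCK.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value only; the real I_LOCK lives in <linux/fs.h>. */
#define MODEL_I_LOCK 0x8UL

/* Hypothetical, stripped-down inode: just a number and a state word. */
struct model_inode {
	unsigned long i_ino;
	unsigned long i_state;
};

/*
 * Hypothetical stand-in for a non-waiting cache lookup.  Returns the
 * cached inode even if it is currently locked, or NULL if not cached.
 */
static struct model_inode *model_lookup_nowait(struct model_inode *cache,
					       size_t n, unsigned long ino)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (cache[i].i_ino == ino)
			return &cache[i];
	return NULL;
}

/*
 * One garbage collection pass over the inode numbers in inos[].
 * Returns how many inodes were handled and stores in *skipped how many
 * were left alone because a sync had them locked.
 */
static size_t model_gc_pass(struct model_inode *cache, size_t n,
			    const unsigned long *inos, size_t n_inos,
			    size_t *skipped)
{
	size_t i, done = 0;

	assert(cache != NULL && inos != NULL && skipped != NULL);
	*skipped = 0;
	for (i = 0; i < n_inos; i++) {
		struct model_inode *inode =
			model_lookup_nowait(cache, n, inos[i]);

		if (!inode)
			continue;	/* not cached; reading it in is out of scope here */
		if (inode->i_state & MODEL_I_LOCK) {
			(*skipped)++;	/* being synced: never wait, try again later */
			continue;
		}
		done++;			/* safe to validate or move its blocks */
	}
	return done;
}
```

The point of the model is that the collector never sleeps on a locked
inode, so it cannot end up waiting for __sync_single_inode() while
holding s_w_mutex.  The cost is that blocks belonging to a skipped inode
stay where they are for this pass, so the collector would presumably
need to pick other blocks or retry later.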

> 
> Jörn
> 
-- 
David Kleikamp
IBM Linux Technology Center
