Hi Marcelo,

I also agree that there is no scenario in the code where __sync_one()
and iput() can hold a reference to the same inode simultaneously,
leading to the corruption described below.

Thanks a lot for your patience and time,
Rahul

-----Original Message-----
From: Marcelo Tosatti [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, August 10, 2005 4:56 PM
To: Srivastava, Rahul
Cc: Ernie Petrides; [EMAIL PROTECTED]; linux-fsdevel@vger.kernel.org
Subject: Re: FW: oops in 2.4.25 prune_icache() called from kswapd



Hi Rahul,

My previous description was incomplete; AFAICS there's a more important
thing which guarantees consistency.

On Tue, Aug 09, 2005 at 12:00:54PM -0500, Srivastava, Rahul wrote:
> Hi,
> 
> Please consider following scenario (with the patch/fix from Larry):
> 
> -> engine 0: calls iput() and lock inode_lock. iput removes the inode
> from the i_list and unlocks inode_lock
>      
> ---> engine 1: grab inode_lock and calls __sync_one()

Inodes which reach __sync_one() have been found through any of the type 
lists (dirty, in use, etc), which are walked with the inode_lock held. 

iput() deletes the inode from the i_list before proceeding, so it is 
unreachable via the type lists after inode_lock is released.

Go ahead and try to prove me wrong -- what you're doing is very welcome
(sincerely, I don't fully understand the inode cache).

> -> engine 0: calls clear_inode(), gets past the call to
> "wait_on_inode()", which checks whether I_LOCK is set.
> /* From this point onwards, clear_inode() and the remainder of iput()
> do not care about I_LOCK or inode_lock. */
> Now, with the new changes, it will wait for inode_lock before setting
> the state to I_CLEAR. So engine 0 is now waiting for inode_lock.
> 
> ---> engine 1: sets I_LOCK and releases inode_lock
>
> 
> -> engine 0: We now get the lock, set the state to I_CLEAR, release
> the lock and free the inode (though on engine 1 we have set I_LOCK and
> are thinking that no one will destroy this inode).
> 
> Though numerous kinds of corruption are possible now, I am citing one
> example here: under low-memory conditions it is possible that the inode
> from inode_cachep (the freed inode cache) will be returned to system
> memory (provided all the objects in that particular slab are freed),
> and that memory chunk (which we were just now using for the inode) is
> allocated to some other process. Suppose this new process, which just
> got this newly allocated chunk, goes and sets the field that was
> earlier i_state to zero, or some other value (other than one which
> suggests I_FREEING or I_CLEAR is set).
> 
> ---> engine 1: We get the spin lock, clear I_LOCK (even though we
> don't own this memory chunk anymore), see (as per the above example
> scenario) that I_FREEING or I_CLEAR is not set, and insert this freed
> inode into the list!! This way we will still end up with a corrupted
> list.
> 
> Please correct me if I am wrong somewhere.
