On Wed, Jan 23, 2013 at 04:58:27PM +0800, Miao Xie wrote:
> On wed, 23 Jan 2013 16:17:53 +0800, Liu Bo wrote:
> > On Wed, Jan 23, 2013 at 02:33:27PM +0800, Miao Xie wrote:
> >> On wed, 23 Jan 2013 14:06:21 +0800, Liu Bo wrote:
> >>> On Wed, Jan 23, 2013 at 12:44:49PM +0800, Miao Xie wrote:
> >>>> No, we can't. The other tasks which flush the delalloc data may remove 
> >>>> the inode
> >>>> from the delalloc list/splice list. If we release the lock, we will meet 
> >>>> the race
> >>>> between list traversing and list_del().
> >>>
> >>> OK, then please merge patch 1 and 4 so that we can backport 1 less patch
> >>> at least.
> >>
> >> I don't think we should merge these two patch because they do two 
> >> different things - one
> >> is bug fix, and the other is just a improvement, and this improvement 
> >> changes the logic
> >> of the code and might be argumentative for some developers. So 2 patches 
> >> is  better than one,
> >> I think.
> > 
> > Sorry, this is right only when patch 1 really fixes the problem Alex 
> > reported.
> > 
> > But the fact is
> > 
> > 1)  patch 1 is not enough to fix the bug, it just fixes the OOM of
> > allocating 'struct btrfs_delalloc_work' while the original OOM of allocating
> > requests remains.  We can still get the same inode over and over again
> > and then stuck in btrfs_start_delalloc_inodes() because we 'goto again' to
> > make sure we flush all inodes listed in fs_info->delalloc_inodes.
> > 
> > 2)  patch 4 fixes 1)'s problems by removing 'goto again'.
> > 
> > Am I missing something?
> 
> In fact, there are two different problems.
> One is OOM bug. it is a serious problem and also is an regression, so we 
> should fix it
> as soon as possible.
> The other one is that we may fetch the same inode again and again if someone 
> write data
> into it endlessly. This problem is not so urgent to deal with. and perhaps 
> some developers
> think it is not a problem, we should flush that inode since there are dirty 
> pages in it.
> So we need split it from the patch of the 1st problem.

All right, I'm ok with this.

But the TWO different problems are both due to fetching the same inode more
than once, and the solutions are indeed same, "fetch every inode on
the list just once", and they are in the same code.

It is definitely a bug/problem if any users complains about their box
getting stuck.  It is KERNEL that should be blamed.

thanks,
liubo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to