On Fri, Aug 07, 2009 at 05:07:32PM +0800, Yan Zheng wrote: > On 08/07/2009 03:19 PM, Jens Axboe wrote: > > On Fri, Aug 07 2009, Yan Zheng wrote: > >> On 08/07/2009 02:50 PM, Jens Axboe wrote: > >>> On Fri, Aug 07 2009, Yan Zheng wrote: > >>>> invalidate_inode_pages2_range may return -EBUSY occasionally > >>>> which results Oops. This patch fixes the issue by moving > >>>> invalidate_inode_pages2_range into a loop and keeping calling > >>>> it until the return value is not -EBUSY. > >>>> > >>>> Signed-off-by: Yan Zheng <[email protected]> > >>>> > >>>> --- > >>>> diff -urp 1/fs/btrfs/relocation.c 2/fs/btrfs/relocation.c > >>>> --- 1/fs/btrfs/relocation.c 2009-07-29 10:03:04.367858774 +0800 > >>>> +++ 2/fs/btrfs/relocation.c 2009-08-07 13:26:43.882147138 +0800 > >>>> @@ -2553,8 +2553,13 @@ int relocate_inode_pages(struct inode *i > >>>> last_index = (start + len - 1) >> PAGE_CACHE_SHIFT; > >>>> > >>>> /* make sure the dirty trick played by the caller work */ > >>>> - ret = invalidate_inode_pages2_range(inode->i_mapping, > >>>> - first_index, last_index); > >>>> + while (1) { > >>>> + ret = invalidate_inode_pages2_range(inode->i_mapping, > >>>> + first_index, > >>>> last_index); > >>>> + if (ret != -EBUSY) > >>>> + break; > >>>> + cond_resched(); > >>>> + } > >>> If it returns EBUSY, would it not make more sense to call > >>> filemap_write_and_wait_range() instead of hammering on invalidate? > >>> > >> The pages to invalidate are not dirty, they are from page read-ahead. > >> Actually I have no idea how invalidate_inode_pages2_range can return > >> -EBUSY here. (the only user of the inode is the balancer, and it does > >> not hold references to the pages) > > > > Weird, I looked it up, and it already does a writeback wait. But I guess > > that's not your issue. Patch still looks like a hack though, it would be > > far better to figure out why it returns EBUSY and fix/wait appropriately > > for that to pass. > > > > EBUSY is from the EXTENT_LOCK test in try_release_extent_state. The test > can be true is because some codes call lock_extent while corresponding > pages are not all locked. (one example is btrfs_finish_ordered_io)
Ok, please use schedule_timeout(HZ/10) instead then. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html
