Re: NFSD on XFS with RT subvolume

2008-02-07 Thread David Chinner
On Wed, Feb 06, 2008 at 04:08:58PM +0200, Rabeeh Khoury wrote: > > > > > > Exporting an XFS volume with kernel NFSD when real-time subvolume is > > > enabled hangs the kernel. > > > > > > I'm using vanilla LK 2.6.22.7; first I create the XFS volume with > two > > > partitions of 20GB each with exte

Re: [RFC] ext3 freeze feature

2008-01-25 Thread David Chinner
On Sat, Jan 26, 2008 at 04:35:26PM +1100, David Chinner wrote: > On Fri, Jan 25, 2008 at 07:59:38PM +0900, Takashi Sato wrote: > > The points of the implementation are followings. > > - Add calls of the freeze function (freeze_bdev) and > > the unfreeze function (tha

Re: [RFC] ext3 freeze feature

2008-01-25 Thread David Chinner
On Fri, Jan 25, 2008 at 07:59:38PM +0900, Takashi Sato wrote: > The points of the implementation are followings. > - Add calls of the freeze function (freeze_bdev) and > the unfreeze function (thaw_bdev) in ext3_ioctl(). > > - ext3_freeze_timeout() which calls the unfreeze function (thaw_bdev) >

Re: [RFC] ext3 freeze feature

2008-01-25 Thread David Chinner
On Fri, Jan 25, 2008 at 09:42:30PM +0900, Takashi Sato wrote: > >I am also wondering whether we should have system call(s) for these: > > > >On Jan 25, 2008 12:59 PM, Takashi Sato <[EMAIL PROTECTED]> wrote: > >>+ case EXT3_IOC_FREEZE: { > > > >>+ case EXT3_IOC_THAW: { > > > >And just co

Re: [patch 01/26] mount options: add documentation

2008-01-25 Thread David Chinner
> In message <[EMAIL PROTECTED]>, Miklos Szeredi writes: > > From: Miklos Szeredi <[EMAIL PROTECTED]> > > > > This series addresses the problem of showing mount options in > > /proc/mounts. [...] > > The following filesystems still need fixing: CIFS, NFS, XFS, Unionfs, > > Reiser4. For CIFS, NFS

Re: [RFC] Parallelize IO for e2fsck

2008-01-22 Thread David Chinner
On Tue, Jan 22, 2008 at 12:05:11AM -0700, Andreas Dilger wrote: > On Jan 22, 2008 14:38 +1100, David Chinner wrote: > > On Mon, Jan 21, 2008 at 04:00:41PM -0700, Andreas Dilger wrote: > > > I discussed this with Ted at one point also. This is a generic problem, > >

Re: [RFC] Parallelize IO for e2fsck

2008-01-21 Thread David Chinner
On Mon, Jan 21, 2008 at 04:00:41PM -0700, Andreas Dilger wrote: > On Jan 16, 2008 13:30 -0800, Valerie Henson wrote: > > I have a partial solution that sort of blindly manages the buffer > > cache. First, the user passes e2fsck a parameter saying how much > > memory is available as buffer cache.

Re: [RFC] Parallelize IO for e2fsck

2008-01-17 Thread David Chinner
On Wed, Jan 16, 2008 at 01:30:43PM -0800, Valerie Henson wrote: > Hi y'all, > > This is a request for comments on the rewrite of the e2fsck IO > parallelization patches I sent out a few months ago. The mechanism is > totally different. Previously IO was parallelized by issuing IOs from > multipl

Re: [PATCH 09/13] writeback: requeue_io() on redirtied inode

2008-01-16 Thread David Chinner
On Tue, Jan 15, 2008 at 08:36:46PM +0800, Fengguang Wu wrote: > Redirtied inodes could be seen in really fast writes. > They should really be synced as soon as possible. > > redirty_tail() could delay the inode for up to 30s. > Kill the delay by using requeue_io() instead. That's actually bad for

Re: [Patch] document ext3 requirements (was Re: [RFD] Incremental fsck)

2008-01-15 Thread David Chinner
On Tue, Jan 15, 2008 at 09:16:53PM +0100, Pavel Machek wrote: > Hi! > > > > What are ext3 expectations of disk (is there doc somewhere)? For > > > example... if disk does not lie, but powerfail during write damages > > > the sector -- is ext3 still going to work properly? > > > > Nope. However th

Re: [0/4] DST: Distributed storage.

2007-12-17 Thread David Chinner
On Mon, Dec 17, 2007 at 06:03:38PM +0300, Evgeniy Polyakov wrote: > DST passed all FS tests in LTP with XFS (modulo MAX_LOCK_DEPTH too low bug: > [ 8398.605691] BUG: MAX_LOCK_DEPTH too low! > [ 8398.609641] turning off the locking correctness validator. Evgeniy, can you please start reporting thes

Re: [patch 01/19] Define functions for page cache handling

2007-12-03 Thread David Chinner
On Mon, Dec 03, 2007 at 02:10:20PM -0800, Andrew Morton wrote: > On Fri, 30 Nov 2007 09:34:49 -0800 > Christoph Lameter <[EMAIL PROTECTED]> wrote: > > > We use the macros PAGE_CACHE_SIZE PAGE_CACHE_SHIFT PAGE_CACHE_MASK > > and PAGE_CACHE_ALIGN in various places in the kernel. Many times > > commo

Re: [patch 07/19] Use page_cache_xxx in mm/migrate.c

2007-12-03 Thread David Chinner
On Fri, Nov 30, 2007 at 09:34:55AM -0800, Christoph Lameter wrote: > Use page_cache_xxx in mm/migrate.c > > Reviewed-by: Dave Chinner <[EMAIL PROTECTED]> > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> > --- > mm/migrate.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Ind

Re: Race between generic_forget_inode() and sync_sb_inodes()?

2007-11-29 Thread David Chinner
On Fri, Nov 30, 2007 at 09:07:06AM +1100, Neil Brown wrote: > > Hi David, > > On Friday November 30, [EMAIL PROTECTED] wrote: > > > > > > I came across this because I've been making changes to XFS to avoid the > > inode hash, and I've found that I need to remove the inode from the > > dirty lis

Race between generic_forget_inode() and sync_sb_inodes()?

2007-11-29 Thread David Chinner
If we are in the process of dropping an inode and it is hashed, generic_forget_inode() will mark it I_WILL_FREE and drop the inode_lock before calling write_inode_now(). However, at this point, the inode is still on the sb->s_dirty_list so sync_sb_inodes() could see it and try to write it back. i

Re: [patch 00/19] Page cache: Replace PAGE_CACHE_xx with inline functions

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 05:10:52PM -0800, Christoph Lameter wrote: > This patchset cleans up page cache handling by replacing > open coded shifts and adds with inline function calls. > > The ultimate goal is to replace all uses of PAGE_CACHE_xxx in the > kernel through the use of these functions.

Re: [patch 14/19] Use page_cache_xxx in ext2

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 08:15:26PM -0800, Christoph Lameter wrote: > On Thu, 29 Nov 2007, David Chinner wrote: > > > I don't think that gives the same return value. The return value > > is supposed to be clamped at a maximum of page_cache_size(mapping). > > Ok.

Re: [patch 05/19] Use page_cache_xxx in mm/rmap.c

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 08:09:39PM -0800, Christoph Lameter wrote: > On Thu, 29 Nov 2007, David Chinner wrote: > > > And the other two occurrences of this in the first patch? > > Ahh... Ok they are also in rmap.c: > > > > rmap: simplify page_referenced_file

Re: [patch 18/19] Use page_cache_xxx for fs/xfs

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 08:06:30PM -0800, Christoph Lameter wrote: > Is this correct? Yup, looks good now. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL

Re: [patch 17/19] Use page_cache_xxx in fs/reiserfs

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 08:02:01PM -0800, Christoph Lameter wrote: > On Thu, 29 Nov 2007, David Chinner wrote: > > > > unsigned long start = 0; > > > unsigned long blocksize = p_s_inode->i_sb->s_blocksize; > > > - unsigned long offset = (p_

Re: [patch 16/19] Use page_cache_xxx in fs/ext4

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 07:58:45PM -0800, Christoph Lameter wrote: > On Thu, 29 Nov 2007, David Chinner wrote: > > > These three should use the pagesize variable. > > ext4: use pagesize variable instead of the inline function > > Signed-off-by: Christoph L

Re: [patch 14/19] Use page_cache_xxx in ext2

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 07:55:40PM -0800, Christoph Lameter wrote: > ext2: Simplify some functions > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> > > --- > fs/ext2/dir.c |9 ++--- > 1 file changed, 2 insertions(+), 7 deletions(-) > > Index: mm/fs/ext2/dir.c > =

Re: [patch 13/19] Use page_cache_xxx in fs/splice.c

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 07:50:16PM -0800, Christoph Lameter wrote: > On Thu, 29 Nov 2007, David Chinner wrote: > > > On Wed, Nov 28, 2007 at 05:11:05PM -0800, Christoph Lameter wrote: > > > @@ -453,7 +454,7 @@ fill_it: > > >*/ >

Re: [patch 10/19] Use page_cache_xxx in fs/buffer.c

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 07:48:08PM -0800, Christoph Lameter wrote: > On Thu, 29 Nov 2007, David Chinner wrote: > > > > - while (index > (curidx = (curpos = *bytes)>>PAGE_CACHE_SHIFT)) { > > > - zerofrom = curpos & ~PAGE_CACHE_MASK; > > > + whi

Re: [patch 05/19] Use page_cache_xxx in mm/rmap.c

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 07:30:54PM -0800, Christoph Lameter wrote: > On Thu, 29 Nov 2007, David Chinner wrote: > > > > unsigned int mapcount; > > > struct address_space *mapping = page->mapping; > > > - pgoff_t pgoff = page->index << (PAGE_CAC

Re: [patch 18/19] Use page_cache_xxx for fs/xfs

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 07:28:17PM -0800, Christoph Lameter wrote: > In other words the following patch? > Index: mm/fs/xfs/linux-2.6/xfs_aops.c > === > --- mm.orig/fs/xfs/linux-2.6/xfs_aops.c 2007-11-28 19:13:13.323382722 > -08

Re: [patch 17/19] Use page_cache_xxx in fs/reiserfs

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 05:11:09PM -0800, Christoph Lameter wrote: > @@ -2000,11 +2001,13 @@ static int grab_tail_page(struct inode * > /* we want the page with the last byte in the file, >** not the page that will hold the next byte for appending >*/ > - unsigned long ind

Re: [patch 16/19] Use page_cache_xxx in fs/ext4

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 05:11:08PM -0800, Christoph Lameter wrote: > @@ -1677,6 +1676,7 @@ static int ext4_journalled_writepage(str > handle_t *handle = NULL; > int ret = 0; > int err; > + int pagesize = page_cache_size(inode->i_mapping); > > if (ext4_journal_current_h

Re: [patch 14/19] Use page_cache_xxx in ext2

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 05:11:06PM -0800, Christoph Lameter wrote: > Use page_cache_xxx functions in fs/ext2/* > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> > --- > fs/ext2/dir.c | 40 +++- > 1 file changed, 23 insertions(+), 17 deletions(-) > > I

Re: [patch 13/19] Use page_cache_xxx in fs/splice.c

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 05:11:05PM -0800, Christoph Lameter wrote: > @@ -453,7 +454,7 @@ fill_it: >*/ > while (page_nr < nr_pages) > page_cache_release(pages[page_nr++]); > - in->f_ra.prev_pos = (loff_t)index << PAGE_CACHE_SHIFT; > + in->f_ra.prev_pos = page_cach

Re: [patch 10/19] Use page_cache_xxx in fs/buffer.c

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 05:11:02PM -0800, Christoph Lameter wrote: > @@ -914,10 +914,11 @@ struct buffer_head *alloc_page_buffers(s > { > struct buffer_head *bh, *head; > long offset; > + unsigned int page_size = page_cache_size(page->mapping); > > try_again: > head = NULL

Re: [patch 05/19] Use page_cache_xxx in mm/rmap.c

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 05:10:57PM -0800, Christoph Lameter wrote: > Use page_cache_xxx in mm/rmap.c > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> > --- > mm/rmap.c | 13 + > 1 file changed, 9 insertions(+), 4 deletions(-) > > Index: mm/mm/rmap.c > ==

Re: [patch 01/19] Define functions for page cache handling

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 05:10:53PM -0800, Christoph Lameter wrote: > +/* > + * Index of the page starting on or after the given position. > + */ > +static inline pgoff_t page_cache_next(struct address_space *a, > + loff_t pos) > +{ > + return page_cache_index(a, pos + page_cache_siz

Re: [patch 18/19] Use page_cache_xxx for fs/xfs

2007-11-28 Thread David Chinner
On Wed, Nov 28, 2007 at 05:11:10PM -0800, Christoph Lameter wrote: > Use page_cache_xxx for fs/xfs > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> > --- > fs/xfs/linux-2.6/xfs_aops.c | 55 > +++- > fs/xfs/linux-2.6/xfs_lrw.c |4 +-- > 2 fil

Re: writeout stalls in current -git

2007-11-20 Thread David Chinner
or each inode in the cluster we > > > > have to write. > > > > Works for me. The only remaining stalls are sub second and look > > > completely valid, considering the amount of files being removed. > > > > > Tested-by: Torsten Kaiser <[EMAIL

Re: [PATCH 2/2] FIEMAP ioctl for ext4

2007-11-12 Thread David Chinner
On Tue, Nov 13, 2007 at 02:30:06AM +0530, Kalpak Shah wrote: > Recently there was discussion about an "FIle Extent MAP"(FIEMAP) ioctl for > efficiently mapping the extents and holes of a file. This will be many times > more efficient than FIBMAP by cutting down the number of ioctls. > > This pat

Re: writeout stalls in current -git

2007-11-07 Thread David Chinner
On Wed, Nov 07, 2007 at 08:15:06AM +0100, Torsten Kaiser wrote: > On 11/7/07, David Chinner <[EMAIL PROTECTED]> wrote: > > Ok, so it's not synchronous writes that we are doing - we're just > > submitting bio's tagged as WRITE_SYNC to get the I/O issued quickly.

Re: [RFC] fs io with struct page instead of iovecs

2007-11-07 Thread David Chinner
On Wed, Nov 07, 2007 at 09:02:05AM -0800, Zach Brown wrote: > Badari Pulavarty wrote: > > On Tue, 2007-11-06 at 17:43 -0800, Zach Brown wrote: > >> At the FS meeting at LCE there was some talk of doing O_DIRECT writes from > >> the > >> kernel with pages instead of with iovecs. T > > > > Why ? W

Re: writeout stalls in current -git

2007-11-06 Thread David Chinner
On Wed, Nov 07, 2007 at 10:31:14AM +1100, David Chinner wrote: > On Tue, Nov 06, 2007 at 10:53:25PM +0100, Torsten Kaiser wrote: > > On 11/6/07, David Chinner <[EMAIL PROTECTED]> wrote: > > > Rather than vmstat, can you use something like iostat to show how busy > >

Re: writeout stalls in current -git

2007-11-06 Thread David Chinner
On Tue, Nov 06, 2007 at 10:53:25PM +0100, Torsten Kaiser wrote: > On 11/6/07, David Chinner <[EMAIL PROTECTED]> wrote: > > Rather than vmstat, can you use something like iostat to show how busy your > > disks are? i.e. are we seeing RMW cycles in the raid5 or some such issue.

Re: writeout stalls in current -git

2007-11-05 Thread David Chinner
On Mon, Nov 05, 2007 at 07:27:16PM +0100, Torsten Kaiser wrote: > On 11/5/07, David Chinner <[EMAIL PROTECTED]> wrote: > > Ok, so it's probably a side effect of the writeback changes. > > > > Attached are two patches (two because one was in a separate patchset

Re: writeout stalls in current -git

2007-11-04 Thread David Chinner
On Sun, Nov 04, 2007 at 12:19:19PM +0100, Torsten Kaiser wrote: > On 11/2/07, David Chinner <[EMAIL PROTECTED]> wrote: > > That's stalled waiting on the inode cluster buffer lock. That implies > > that the inode cluster is already being written out and the inode has

Re: writeout stalls in current -git

2007-11-02 Thread David Chinner
On Fri, Nov 02, 2007 at 08:22:10PM +0100, Torsten Kaiser wrote: > [ 630.00] SysRq : Emergency Sync > [ 630.12] Emergency Sync complete > [ 632.85] SysRq : Show Blocked State > [ 632.85] taskPC stack pid father > [ 632.85] pdflush D 8100

Re: Proposal to improve filesystem/block snapshot interaction

2007-10-30 Thread David Chinner
On Wed, Oct 31, 2007 at 03:01:58PM +1100, Greg Banks wrote: > On Wed, Oct 31, 2007 at 10:56:52AM +1100, David Chinner wrote: > > On Tue, Oct 30, 2007 at 03:16:06PM +1100, Neil Brown wrote: > > > On Tuesday October 30, [EMAIL PROTECTED] wrote: > > > > BIO_HINT_RELEA

Re: Proposal to improve filesystem/block snapshot interaction

2007-10-30 Thread David Chinner
On Tue, Oct 30, 2007 at 03:16:06PM +1100, Neil Brown wrote: > On Tuesday October 30, [EMAIL PROTECTED] wrote: > > BIO_HINT_RELEASE > > The bio's block extent is no longer in use by the filesystem > > and will not be read in the future. Any storage used to back > > the extent may be rel

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread David Chinner
On Mon, Oct 29, 2007 at 01:45:07PM -0600, Andreas Dilger wrote: > By request on #linuxfs, here is the FIEMAP spec that we used to implement > the FIEMAP support for ext4. There was an ext4 patch posted on August 29 > to linux-ext4 entitled "[PATCH] FIEMAP ioctl". Link: http://marc.info/?l=linux-

Re: [PATCH] VFS: new fgetattr() file operation

2007-10-25 Thread David Chinner
On Fri, Oct 26, 2007 at 01:10:14AM +0200, Miklos Szeredi wrote: > > On Wed, Oct 24, 2007 at 05:27:04PM +0200, Miklos Szeredi wrote: > > > > >> Wouldn't you be better off by attempting to implement an "open > > > > >> by ino" operation and an operation to get the generation count > > > > >> for the

Re: [PATCH] VFS: new fgetattr() file operation

2007-10-25 Thread David Chinner
On Wed, Oct 24, 2007 at 05:27:04PM +0200, Miklos Szeredi wrote: > > >> Wouldn't you be better off by attempting to implement an "open > > >> by ino" operation and an operation to get the generation count > > >> for the file and then modifying the network protocol of interest > > >> to use these as

Re: More Large blocksize benchmarks

2007-10-15 Thread David Chinner
On Mon, Oct 15, 2007 at 08:22:31PM -0400, Chris Mason wrote: > Hello everyone, > > I'm stealing the cc list and reviving an old thread because I've > finally got some numbers to go along with the Btrfs variable blocksize > feature. The basic idea is to create a read/write interface to > map a ra

Re: XFS regression?

2007-10-15 Thread David Chinner
On Mon, Oct 15, 2007 at 03:28:34PM +0530, Bhagi rathi wrote: > Thanks Dave for the response. Thinking further, why is it that xfs_iunpin has > to mark the inode dirty? Because the inode has been modified, and instead of sprinkling mark_inode_dirty_sync() all over the code, we can do it in a single s

Re: XFS regression?

2007-10-14 Thread David Chinner
On Fri, Oct 12, 2007 at 12:36:01PM +0100, Andrew Clayton wrote: > On Fri, 12 Oct 2007 10:26:13 +1000, David Chinner wrote: > > > You can breathe again. Here's a test patch (warning - may harm > > heh > > > kittens - not fully tested or verified) that solves both

Re: XFS regression?

2007-10-14 Thread David Chinner
On Sat, Oct 13, 2007 at 07:05:17PM +0530, Bhagi rathi wrote: > David, Can you let me know the use after free problem? I want to understand > how the life cycle of linux inode > and xfs inode are related to log flush. Log I/O completion: -> xfs_trans_commited -> xfs_iunpin(xfs inode)

Re: XFS regression?

2007-10-11 Thread David Chinner
On Fri, Oct 12, 2007 at 07:53:53AM +1000, David Chinner wrote: > On Thu, Oct 11, 2007 at 03:15:12PM +0100, Andrew Clayton wrote: > > On Thu, 11 Oct 2007 11:01:39 +1000, David Chinner wrote: > > > > > So it's almost certainly pointing at an elevator or driver ch

Re: XFS regression?

2007-10-11 Thread David Chinner
On Thu, Oct 11, 2007 at 03:15:12PM +0100, Andrew Clayton wrote: > On Thu, 11 Oct 2007 11:01:39 +1000, David Chinner wrote: > > > So it's almost certainly pointing at an elevator or driver change, not an > > XFS change. > > h

Re: XFS regression?

2007-10-10 Thread David Chinner
On Wed, Oct 10, 2007 at 03:27:42PM +0100, Andrew Clayton wrote: > Hi, > > (Seeing as I haven't been able to subscribe or post to the XFS mailing > list, I'll try here) > > I'll try not to flood with information on the first post. > > In trying to track down this issue here: > http://www.spinics.

Re: SLUB performance regression vs SLAB

2007-10-04 Thread David Chinner
On Thu, Oct 04, 2007 at 03:07:18PM -0700, David Miller wrote: > From: Chuck Ebbert <[EMAIL PROTECTED]> Date: Thu, 04 Oct 2007 17:47:48 > -0400 > > > On 10/04/2007 05:11 PM, David Miller wrote: > > > From: Chuck Ebbert <[EMAIL PROTECTED]> Date: Thu, 04 Oct 2007 17:02:17 > > > -0400 > > > > > >> Ho

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-19 Thread David Chinner
On Wed, Sep 19, 2007 at 04:04:30PM +0200, Andrea Arcangeli wrote: > On Wed, Sep 19, 2007 at 03:09:10PM +1000, David Chinner wrote: > > Ok, let's step back for a moment and look at a basic, fundamental > > constraint of disks - seek capacity. A decade ago, a terabyte of > &g

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-18 Thread David Chinner
On Tue, Sep 18, 2007 at 06:06:52PM -0700, Linus Torvalds wrote: > > especially as the Linux > > kernel limitations in this area are well known. There's no "16K mess" > > that SGI is trying to clean up here (and SGI have offered both IA64 and > > x86_64

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-18 Thread David Chinner
On Tue, Sep 18, 2007 at 11:00:40AM +0100, Mel Gorman wrote: > We still lack data on what sort of workloads really benefit from large > blocks (assuming there are any that cannot also be solved by improving > order-0). No we don't. All workloads benefit from larger block sizes when you've got a btr

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread David Chinner
On Fri, Sep 14, 2007 at 06:48:55AM +1000, Nick Piggin wrote: > On Thursday 13 September 2007 12:01, Nick Piggin wrote: > > On Thursday 13 September 2007 23:03, David Chinner wrote: > > > Then just do operations on directories with lots of files in them > > > (tens of

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-13 Thread David Chinner
On Thu, Sep 13, 2007 at 03:23:21AM +1000, Nick Piggin wrote: > On Thursday 13 September 2007 11:49, David Chinner wrote: > > On Wed, Sep 12, 2007 at 01:27:33AM +1000, Nick Piggin wrote: > > > > I just gave 4 things which combined might easily reduce xfs vmap overhead >

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-12 Thread David Chinner
On Wed, Sep 12, 2007 at 01:27:33AM +1000, Nick Piggin wrote: > > IOWs, we already play these vmap harm-minimisation games in the places > > where we can, but still the overhead is high and something we'd prefer > > to be able to avoid. > > I don't think you've looked nearly far enough with all thi

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread David Chinner
On Tue, Sep 11, 2007 at 04:00:17PM +1000, Nick Piggin wrote: > > > OTOH, I'm not sure how much buy-in there was from the filesystems guys. > > > Particularly Christoph H and XFS (which is strange because they already > > > do vmapping in places). > > > > I think they use vmapping because they have

Re: [PATCH 0/6] writeback time order/delay fixes take 3

2007-08-28 Thread David Chinner
On Tue, Aug 28, 2007 at 11:08:20AM -0400, Chris Mason wrote: > On Wed, 29 Aug 2007 00:55:30 +1000 > David Chinner <[EMAIL PROTECTED]> wrote: > > On Fri, Aug 24, 2007 at 09:55:04PM +0800, Fengguang Wu wrote: > > > On Thu, Aug 23, 2007 at 12:33:06PM +1000, David Chinner wr

Re: [PATCH 0/6] writeback time order/delay fixes take 3

2007-08-28 Thread David Chinner
On Fri, Aug 24, 2007 at 09:55:04PM +0800, Fengguang Wu wrote: > On Thu, Aug 23, 2007 at 12:33:06PM +1000, David Chinner wrote: > > On Wed, Aug 22, 2007 at 09:18:41AM +0800, Fengguang Wu wrote: > > > On Tue, Aug 21, 2007 at 08:23:14PM -0400, Chris Mason wrote: > > > No

Re: [PATCH 0/6] writeback time order/delay fixes take 3

2007-08-22 Thread David Chinner
On Wed, Aug 22, 2007 at 08:42:01AM -0400, Chris Mason wrote: > I think we should assume a full scan of s_dirty is impossible in the > presence of concurrent writers. We want to be able to pick a start > time (right now) and find all the inodes older than that start time. > New things will come in

Re: [PATCH 0/6] writeback time order/delay fixes take 3

2007-08-22 Thread David Chinner
On Wed, Aug 22, 2007 at 09:18:41AM +0800, Fengguang Wu wrote: > On Tue, Aug 21, 2007 at 08:23:14PM -0400, Chris Mason wrote: > Notes: > (1) I'm not sure inode number is correlated to disk location in > filesystems other than ext2/3/4. Or parent dir? They correspond to the exact location on disk

Re: [PATCH 3/6] writeback: remove pages_skipped accounting in __block_write_full_page()

2007-08-12 Thread David Chinner
On Sun, Aug 12, 2007 at 05:11:23PM +0800, Fengguang Wu wrote: > Miklos Szeredi <[EMAIL PROTECTED]> and me identified a writeback bug: > Basicly they are > - during the dd: ~16M > - after 30s: ~4M > - after 5s: ~4M > - after 5s: ~176M > > The box has 2G memory. > > Question 1: > Ho

Re: kupdate weirdness

2007-08-01 Thread David Chinner
On Wed, Aug 01, 2007 at 10:45:16PM +0200, Miklos Szeredi wrote: > The following strange behavior can be observed: > > 1. large file is written > 2. after 30 seconds, nr_dirty goes down by 1024 > 3. then for some time (< 30 sec) nothing happens (disk idle) > 4. then nr_dirty again goes down by 1024

Re: [RFC] basic delayed allocation in VFS

2007-07-29 Thread David Chinner
On Sun, Jul 29, 2007 at 04:09:20PM +0400, Alex Tomas wrote: > David Chinner wrote: > >On Fri, Jul 27, 2007 at 11:51:56AM +0400, Alex Tomas wrote: > >But this is really irrelevant - the issue at hand is what we want > >for VFS level delalloc support. IMO, that mechanism needs t

Re: [RFC] basic delayed allocation in VFS

2007-07-29 Thread David Chinner
On Fri, Jul 27, 2007 at 11:51:56AM +0400, Alex Tomas wrote: > David Chinner wrote: > >Using a new API for new functionality is a bad thing? > > if existing API can be used ... Sure, but using the existing APIs is no good if the only filesystem in the kernel that supports delalloc

Re: [RFC] basic delayed allocation in VFS

2007-07-26 Thread David Chinner
[please don't top post!] On Thu, Jul 26, 2007 at 05:33:08PM +0400, Alex Tomas wrote: > Jeff Garzik wrote: > >The XFS one is proven and the work was already completed. > > > >What were the specific technical issues that made it unsuitable for ext4? > > > >I would rather not reinvent the wheel, part

Re: fallocate() man page

2007-07-24 Thread David Chinner
and send a new draft of the page back to me. > > Thanks for going through the manpage and improving it! > > My comments are below in between ... tags. Does this Q&A really need to be encoded in nroff comments? ;) > .\" FIXME Amit: I need author and license informatio

Re: [PATCH] coda: kill file_count abuse

2007-07-19 Thread David Chinner
On Fri, Jul 20, 2007 at 04:16:31AM +0100, Al Viro wrote: > On Fri, Jul 20, 2007 at 12:36:01PM +1000, David Chinner wrote: > > To the context that dropped the last reference. It can't be > > reported to anything else > > Oh, for fsck sake... > > Send a datag

Re: [PATCH] coda: kill file_count abuse

2007-07-19 Thread David Chinner
On Fri, Jul 20, 2007 at 01:53:16AM +0100, Al Viro wrote: > On Fri, Jul 20, 2007 at 10:45:34AM +1000, David Chinner wrote: > > On Thu, Jul 19, 2007 at 06:16:00PM -0400, Jan Harkes wrote: > > > On Thu, Jul 19, 2007 at 11:45:08PM +0200, Christoph Hellwig wrote: > > > >

Re: [PATCH] coda: kill file_count abuse

2007-07-19 Thread David Chinner
On Thu, Jul 19, 2007 at 06:16:00PM -0400, Jan Harkes wrote: > On Thu, Jul 19, 2007 at 11:45:08PM +0200, Christoph Hellwig wrote: > > ->release is the proper way to detect the last close of a file, > > file_count should never be used in filesystems. > > Has been tried, the problem with that once ->

Re: [PATCH 1/5][TAKE8] manpage for fallocate

2007-07-18 Thread David Chinner
tten region requires a node split, that could result > in the allocation of new meta data which obviously could fail if the disk is > truly full. % git-log 84e1e99f112dead8f9ba036c02d24a9f5ce7f544 |head -10 commit 84e1e99f112dead8f9ba036c02d24a9f5ce7f544 Author: David Chinner <[EMAIL PRO

[PATCH] introduce fallocate support into xfs_io

2007-07-15 Thread David Chinner
FYI. Initial support for fallocate-based pre-allocation in xfs_io for testing. This currently only works on ia64 because of the hard coded syscall number and will require autoconf magic to conditionally compile in this support. This allows simple command-line based testing of fallocate based allo

[PATCH] xfs: implement fallocate V2

2007-07-15 Thread David Chinner
Initial implementation of ->fallocate for XFS. Version 2: o Make allocation and setting the file size atomic. o Drop deallocate/punch functionality o use mode field appropriately to determine if size needs changing. --- fs/xfs/linux-2.6/xfs_iops.c | 47

[PATCH] ia64 fallocate system call

2007-07-15 Thread David Chinner
sys_fallocate for ia64. This uses the empty slot originally reserved for move_pages. Signed-Off-By: Dave Chinner <[EMAIL PROTECTED]> --- arch/ia64/kernel/entry.S |2 +- include/asm-ia64/unistd.h |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) Index: 2.6.x-xfs-new/arch/ia64/kern

Re: [PATCH 1/6][TAKE7] manpage for fallocate

2007-07-13 Thread David Chinner
On Fri, Jul 13, 2007 at 06:16:01PM +0530, Amit K. Arora wrote: > Following is the modified version of the manpage originally submitted by > David Chinner. Please use `nroff -man fallocate.2 | less` to view. > > This includes changes suggested by Heikki Orsila and Barry Naujok.

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-07-12 Thread David Chinner
On Thu, Jul 12, 2007 at 12:58:13PM +0530, Suparna Bhattacharya wrote: > > Why don't we just merge the interface for preallocation (essentially > enough to satisfy posix_fallocate() and the simple XFS requirement for > space reservation without changing file size), which there is clear agreement >

[PATCH, RESEND] Teach do_mpage_readpage() about unwritten buffers

2007-07-11 Thread David Chinner
Teach do_mpage_readpage() about unwritten extents so we can always map them in get_blocks rather than treating them as holes on read. Allows setup_swap_extents() to use preallocated files on XFS filesystems for swap files without ever needing to convert them. Signed-Off-By: Dave Chinner <[EMAIL PROTEC

[PATCH 2 of 2] Make XFS use block_page_mkwrite

2007-07-11 Thread David Chinner
Implement ->page_mkwrite in XFS. Signed-Off-By: Dave Chinner <[EMAIL PROTECTED]> --- fs/xfs/linux-2.6/xfs_file.c | 16 1 file changed, 16 insertions(+) Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c === --- 2.6

[PATCH 1 of 2] block_page_mkwrite V2

2007-07-11 Thread David Chinner
Generic page_mkwrite functionality. Filesystems that make use of the VM ->page_mkwrite() callout will generally use the same core code to implement it. There are several tricky truncate-related issues that we need to deal with here as we cannot take the i_mutex as we normally would for these paths

block_page_mkwrite? (Re: fault vs invalidate race (Re: -mm merge plans for 2.6.23))

2007-07-11 Thread David Chinner
On Thu, Jul 12, 2007 at 10:54:57AM +1000, Nick Piggin wrote: > Andrew Morton wrote: > > The fault-vs-invalidate race fix. I have belatedly learned that these > > need > > more work, so their state is uncertain. > > The more work may turn out being too much for you (although it is nothing > exact

Re: vm/fs meetup details

2007-07-08 Thread David Chinner
On Sat, Jul 07, 2007 at 12:45:35PM +0200, Jörn Engel wrote: > On Fri, 6 July 2007 13:40:03 -0700, Christoph Lameter wrote: > > > > An interesting topic is certainly > > > > 1. Large buffer support > > > > 2. icache/dentry/buffer_head defragmentation. > > Oh certainly! I should dust off my dca

Re: vm/fs meetup details

2007-07-06 Thread David Chinner
On Fri, Jul 06, 2007 at 12:26:23PM +0200, Jörn Engel wrote: > On Fri, 6 July 2007 20:01:10 +1000, David Chinner wrote: > > On Fri, Jul 06, 2007 at 04:26:51AM +0200, Nick Piggin wrote: > > > > > > Keep in mind that the way to get the most out of this meeting is for

Re: vm/fs meetup details

2007-07-06 Thread David Chinner
On Fri, Jul 06, 2007 at 04:26:51AM +0200, Nick Piggin wrote: > On Thu, Jul 05, 2007 at 05:40:57PM -0400, Rik van Riel wrote: > > David Chinner wrote: > > >On Thu, Jul 05, 2007 at 01:40:08PM -0700, Zach Brown wrote: > > >>>- repair driven design, we know what it

Re: vm/fs meetup details

2007-07-05 Thread David Chinner
On Thu, Jul 05, 2007 at 01:40:08PM -0700, Zach Brown wrote: > >- repair driven design, we know what it is (Val told us), but > > how does it apply to the things we are currently working on? > > should we do more of it? > > I'm sure Chris and I could talk about the design elements in btrfs > th

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-07-01 Thread David Chinner
On Sat, Jun 30, 2007 at 11:21:11AM +0100, Christoph Hellwig wrote: > On Tue, Jun 26, 2007 at 04:02:47PM +0530, Amit K. Arora wrote: > > > Can you clarify - what is the current behaviour when ENOSPC (or some other > > > error) is hit? Does it keep the current fallocate() or does it free it? > > >

Re: [RFC] fsblock

2007-06-28 Thread David Chinner
On Thu, Jun 28, 2007 at 08:20:31AM -0400, Chris Mason wrote: > On Thu, Jun 28, 2007 at 04:44:43AM +0200, Nick Piggin wrote: > > That's true but I don't think an extent data structure means we can > > become too far divorced from the pagecache or the native block size > > -- what will end up happeni

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-28 Thread David Chinner
On Thu, Jun 28, 2007 at 11:49:13PM +0530, Amit K. Arora wrote: > On Wed, Jun 27, 2007 at 09:18:04AM +1000, David Chinner wrote: > > On Tue, Jun 26, 2007 at 11:34:13AM -0400, Andreas Dilger wrote: > > > On Jun 26, 2007 16:02 +0530, Amit K. Arora wrote: > > > > On M

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-27 Thread David Chinner
On Thu, Jun 28, 2007 at 09:28:36AM +1000, Nathan Scott wrote: > On Wed, 2007-06-27 at 23:36 +1000, David Chinner wrote: > > Allows setup_swap_extents() to use preallocated files on XFS > > filesystems for swap files without ever needing to convert them. > > Using

Re: [RFC] fsblock

2007-06-27 Thread David Chinner
On Wed, Jun 27, 2007 at 07:50:56AM -0400, Chris Mason wrote: > On Wed, Jun 27, 2007 at 07:32:45AM +0200, Nick Piggin wrote: > > On Tue, Jun 26, 2007 at 08:34:49AM -0400, Chris Mason wrote: > > > On Tue, Jun 26, 2007 at 07:23:09PM +1000, David Chinner wrote: > > > >

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-27 Thread David Chinner
On Tue, Jun 26, 2007 at 11:49:15PM -0400, Andreas Dilger wrote: > On Jun 27, 2007 09:14 +1000, David Chinner wrote: > > Someone on the XFS list had an interesting request - preallocated > > swap files. You can't use unwritten extents for this because > > of sys_sw

Re: [RFC] fsblock

2007-06-26 Thread David Chinner
On Wed, Jun 27, 2007 at 07:32:45AM +0200, Nick Piggin wrote: > I think using fsblock to drive the IO and keep the pagecache flags > uptodate and using a btree in the filesystem to manage extents of block > allocations wouldn't be a bad idea though. Do any filesystems actually > do this? Yes. XFS.

Re: [PATCH 7/7][TAKE5] ext4: support new modes

2007-06-26 Thread David Chinner
estamps when punching out data blocks or preallocating new ones. > Hmm.. I personally will call it a bug in XFS code then. :) No, I'd call it useful. :) > > > I think, modifying ctime/mtime should be dependent on the other flags. > > > E.g., if we do not zero out data bl

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-26 Thread David Chinner
On Tue, Jun 26, 2007 at 11:42:50AM -0400, Andreas Dilger wrote: > On Jun 26, 2007 16:15 +0530, Amit K. Arora wrote: > > On Mon, Jun 25, 2007 at 03:52:39PM -0600, Andreas Dilger wrote: > > > In XFS one of the (many) ALLOC modes is to zero existing data on allocate. > > > For ext4 all this would mea

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-26 Thread David Chinner
On Mon, Jun 25, 2007 at 03:52:39PM -0600, Andreas Dilger wrote: > On Jun 25, 2007 19:15 +0530, Amit K. Arora wrote: > > +#define FA_FL_DEALLOC 0x01 /* default is allocate */ > > +#define FA_FL_KEEP_SIZE0x02 /* default is extend/shrink size */ > > +#define FA_FL_DEL_DATA 0x04 /* defaul
