Re: impact of 4k sector size on the IO & FS stack

2007-03-11 Thread Andreas Dilger
le to use a copy of an ext3 filesystem with 1kB blocksize onto a 4kB sector size device - the ext3 code will detect this and refuse to mount. At that point you need to do a tar/untar (or whatever) to copy the data instead of a raw partition copy. Cheers, Andreas -- Andreas Dilger Principal Software E

Re: impact of 4k sector size on the IO & FS stack

2007-03-12 Thread Andreas Dilger
how many transfers /ended/ on an odd sector, > thus determining how many RMW cycles the tail of an average I/O requires. I'd guess a vast majority of IO will have the end similarly misaligned as the start. Very little filesystem IO is 512 bytes, possibly excluding XFS in an unusual
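
As a rough illustration of the head/tail alignment point in the snippet above (a sketch, not code from the thread): assuming 512-byte logical sectors on a device with 4096-byte physical sectors, a write needs a read-modify-write (RMW) cycle for each end that does not fall on a 4kB boundary. The name rmw_cycles and both constants are illustrative assumptions.

```python
LOGICAL_SECTOR = 512      # bytes per logical sector (assumed)
PHYSICAL_SECTOR = 4096    # bytes per physical sector (assumed)
RATIO = PHYSICAL_SECTOR // LOGICAL_SECTOR  # logical sectors per physical sector


def rmw_cycles(start_lba, nr_sectors):
    """Count the RMW cycles one write needs, given its start and length
    in 512-byte logical sectors."""
    if nr_sectors <= 0:
        raise ValueError("nr_sectors must be positive")
    end_lba = start_lba + nr_sectors          # exclusive end
    head_partial = start_lba % RATIO != 0     # write starts mid physical sector
    tail_partial = end_lba % RATIO != 0       # write ends mid physical sector
    first_phys = start_lba // RATIO
    last_phys = (end_lba - 1) // RATIO
    if first_phys == last_phys:
        # Head and tail share one physical sector: at most one RMW cycle.
        return 1 if (head_partial or tail_partial) else 0
    return int(head_partial) + int(tail_partial)
```

For example, a 4kB write starting at logical sector 63 straddles two physical sectors and is misaligned at both ends, so under these assumptions it costs two RMW cycles, while the same write starting at sector 64 costs none.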

Re: Ext3: changes to increase the speed?

2007-04-01 Thread Andreas Dilger
eed increase? Likely the mke2fs is enabling the "dir_index" feature by default now. This shows dramatic performance improvements with > 1 files per directory. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: Interface for the new fallocate() system call

2007-04-06 Thread Andreas Dilger
; mode) > { > return sys_fallocate(fd, offset, len, mode); > } Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: ext3, BKL, journal replay, multiple non-bind mounts of same device

2007-04-10 Thread Andreas Dilger
best bet is to go back through GIT and/or BK or search the mailing lists to see when and why that was added. It appears to have been 2.6.11, but I don't know why. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 5/13] ext4: use zero_user_page

2007-04-10 Thread Andreas Dilger
ernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Would have been better to CC the filesystem maintainers directly (which was one of the reasons Andrew wanted per-fs patches so they can be Ack/Nack independently). Looks good in any case, Signed-off-by: Andreas Dilger <[EMAIL PROTECTED]> > diff -

[RFC] add FIEMAP ioctl to efficiently map file allocation

2007-04-12 Thread Andreas Dilger
a callback in the extent tree iterator so it is very efficient. I believe it implements all that is needed to allow this interface to be mapped onto XFS_IOC_BMAP internally (or vice versa). Even for block-mapped filesystems, they can at least improve over the ->bmap() case by skipping holes in files t

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-04-12 Thread Andreas Dilger
On Apr 12, 2007 12:22 +0100, Anton Altaparmakov wrote: > On 12 Apr 2007, at 12:05, Andreas Dilger wrote: > >I'm interested in getting input for implementing an ioctl to > >efficiently map file extents & holes (FIEMAP) instead of looping > >over FIBMAP a billion tim

Re: Interface for the new fallocate() system call

2007-04-18 Thread Andreas Dilger
mode" can then be made part of it. We need at least mode="unallocate" or a separate funallocate() call to allow allocated-but-unwritten blocks to be unallocated without actually punching out written data. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster Fi

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-04-18 Thread Andreas Dilger
On Apr 16, 2007 18:01 +1000, Timothy Shimmin wrote: > --On 12 April 2007 5:05:50 AM -0600 Andreas Dilger <[EMAIL PROTECTED]> > wrote: > >struct fiemap_extent { > > __u64 fe_start; /* starting offset in bytes */ > > __u64 fe_len;

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-04-18 Thread Andreas Dilger
On Apr 16, 2007 21:22 +1000, David Chinner wrote: > On Thu, Apr 12, 2007 at 05:05:50AM -0600, Andreas Dilger wrote: > > struct fiemap_extent { > > __u64 fe_start; /* starting offset in bytes */ > > __u64 fe_len; /* length in bytes */

Re: ChunkFS - measuring cross-chunk references

2007-04-23 Thread Andreas Dilger
ue to inode block references, or because of e.g. directories referencing inodes in another chunk. Also, is it considered a cross-chunk reference if a directory entry is referencing an inode in another group? Should there be a continuation inode in the local group, or is the directory entry itself e

Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

2007-04-25 Thread Andreas Dilger
re only a fsck of the corrupt chunk is done would not find the cnode references. Maybe there needs to be per-chunk info which contains a list/bitmap of other chunks that have cnodes shared with each chunk? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: Ext2/3 block remapping tool

2007-04-27 Thread Andreas Dilger
1 7.1; avg. 7.22 > Start with fcache (see thread http://lkml.org/lkml/2006/5/15/46 for details > on fcache): > 11.3 11 10.3 10.8 10.6; avg. 10.8 > Start with blocks remapped with e2remapblocks: > 13.5 15 13 14.5 14.5; avg. 14.1 > (after remapping, data was stored in 20 conting

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-04-30 Thread Andreas Dilger
itself, for the non-verbose mode of filefrag, and for pre-allocating a buffer large enough to hold the file if that is important. I'm also going to add a FIEMAP_FLAG_LAST to mark the last extent in the file, so that iterators using a small buffer don't need to retry to get the last ext

Re: Ext2/3 block remapping tool

2007-04-30 Thread Andreas Dilger
On Apr 30, 2007 08:09 -0400, Theodore Tso wrote: > On Fri, Apr 27, 2007 at 12:09:42PM -0600, Andreas Dilger wrote: > > I'd prefer that such functionality be integrated with Takashi's online > > defrag tool, since it needs virtually the same functionality. For that >

Re: Ext2/3 block remapping tool

2007-05-01 Thread Andreas Dilger
On May 01, 2007 11:28 -0400, Theodore Tso wrote: > On Tue, May 01, 2007 at 12:01:42AM -0600, Andreas Dilger wrote: > > Except one other issue with online shrinking is that we need to move > > inodes on occasion and this poses a bunch of other problems over just > > rema

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-05-01 Thread Andreas Dilger
On May 01, 2007 14:22 +1000, David Chinner wrote: > On Mon, Apr 30, 2007 at 04:44:01PM -0600, Andreas Dilger wrote: > > Hmm, I'd thought "offline" would migrate to EXTENT_UNKNOWN, but I didn't > > I disagree - why would you want to indicate the state is unkno

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-05-01 Thread Andreas Dilger
, and is much better than having version numbers for the interface. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-05-03 Thread Andreas Dilger
tl(DMAPI_FORCE_READ); ioctl(FIEMAP)" if an application actually needs the data to be present instead of just returning mapping info that includes "UNMAPPED". Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 0/5] fallocate system call

2007-05-03 Thread Andreas Dilger
. I think I'd agree - it may be useful to allow preallocation beyond EOF for some kinds of applications (e.g. PVR preallocating live TV in 10 minute segments or something, but not knowing in advance how long the show will actually be recorded or the final encoded size). Cheers, Andreas -- A

Re: [PATCH 4/5] ext4: fallocate support in ext4

2007-05-07 Thread Andreas Dilger
then truncate_mutex is not needed. > > + ret = ext4_ext_get_blocks(handle, inode, block, > > + max_blocks, &map_bh, > > + EXT4_CREATE_UNINITIALIZED_EXT, 0); > > + BUG_ON(!ret); >

Re: [PATCH 4/5] ext4: fallocate support in ext4

2007-05-07 Thread Andreas Dilger
do manual zero-filling of the file in userspace. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/5] ext4: fallocate support in ext4

2007-05-07 Thread Andreas Dilger
On May 07, 2007 19:02 -0400, Jeff Garzik wrote: > Andreas Dilger wrote: > >Actually, this is a non-issue. The reason that it is handled for > >extent-only is that this is the only way to allocate space in the > >filesystem without doing the explicit zeroing. > > P

Re: [PATCH 4/5] ext4: fallocate support in ext4

2007-05-08 Thread Andreas Dilger
.g. indirect blocks). One of the design goals for sys_fallocate() was to allow FA_DELALLOC to deallocate unwritten extents in a safe manner. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 2/2] file capabilities: accomodate >32 bit capabilities

2007-05-08 Thread Andreas Dilger
n old kernel is relatively harmless if the old kernel doesn't know what they are. It's like having a key to a door that you don't know where it is. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

2007-05-09 Thread Andreas Dilger
l allocation/ >unallocation ? I would say yes. If glibc does the fallback fallocate via write() the mtime/ctime will be updated, so it makes sense to be consistent for both methods. Also, it just makes sense from the "this file was modified" point of view. Cheers, Andreas -- Andr

Re: [PATCH 2/2] file capabilities: accomodate >32 bit capabilities

2007-05-10 Thread Andreas Dilger
On May 08, 2007 16:49 -0500, Serge E. Hallyn wrote: > Quoting Andreas Dilger ([EMAIL PROTECTED]): > > One of the important use cases I can see today is the ability to > > split the heavily-overloaded e.g. CAP_SYS_ADMIN into much more fine > > grained attributes. > >

Re: [RFC][PATCH 13/14] ext3 whiteout support

2007-05-14 Thread Andreas Dilger
It isn't listed in the e2fsprogs repo. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 0/5][TAKE2] fallocate system call

2007-05-14 Thread Andreas Dilger
cation for whatever reason (interrupt, out of space, etc) like a regular write(2) call. In this case the return type needs to also be an loff_t to match @len. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Andreas Dilger
your kernel has CONFIG_LBD enabled. The kernel doesn't check if the block layer can actually write to a block device > 2TB. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC 4/5] inode reservation v0.1 (benchmark result)

2007-05-24 Thread Andreas Dilger
what the mapping turns out to be - the goal is to place inodes with a similar hash into nearby inodes, and this heuristic works relatively well for that. Once the given leaf block's inode range is full then new inodes can be allocated from a new window as it was done for the newly-created d

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-25 Thread Andreas Dilger
lock is all about in the end). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [patch 2/2] i_version update - ext4 part

2007-05-29 Thread Andreas Dilger
n then knfsd can just access it directly. I don't think there is any API to access it from userspace. One option is to add a virtual EA like user.inode_version and have the kernel fill this in from i_version. Lustre will manipulate the ei->i_fs_version directly. Cheers, Andreas -- And

Re: [RFC] obsoleting /etc/mtab

2007-05-31 Thread Andreas Dilger
On May 31, 2007 17:11 -0700, H. Peter Anvin wrote: > NFS takes a binary option block anyway. However, that's the exception, > not the rule. There was recently a patch submitted to linux-fsdevel to change NFS to use text option parsing. Cheers, Andreas -- Andreas Dilger Princip

Re: Read/write counts

2007-06-04 Thread Andreas Dilger
- some applications assume that the amount requested == amount read/written and don't even check whether that is actually the case or not. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.
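
To make the point above concrete, here is a minimal sketch (not taken from the thread) of helpers that check the byte count returned by each call instead of assuming it equals the amount requested. The helper names write_all and read_exact are hypothetical; os.write and os.read are the standard Python wrappers around write(2) and read(2).

```python
import os


def write_all(fd, data):
    """Write every byte of data to fd, retrying after short writes."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        written = os.write(fd, view[total:])  # may write fewer bytes than asked
        if written == 0:
            raise OSError("write returned 0 bytes")
        total += written
    return total


def read_exact(fd, count):
    """Read up to count bytes from fd, looping over short reads.
    Returns fewer bytes only if end-of-file is reached first."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:                   # empty result means end-of-file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A caller that needs exactly count bytes should still compare len() of the result from read_exact against count, since end-of-file can legitimately cut the read short.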

Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

2007-06-14 Thread Andreas Dilger
not. In some (primitive) implementations it might no longer be possible to distinguish between unwritten extents and zero-filled blocks, though at this point DEALLOC of zero-filled blocks might not be harmful either. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Sy

Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

2007-06-14 Thread Andreas Dilger
On Jun 14, 2007 22:04 +1000, David Chinner wrote: > On Thu, Jun 14, 2007 at 03:14:58AM -0600, Andreas Dilger wrote: > > > B FA_DEALLOCATE > > > removes the underlying disk space with the given range. The disk space > > > shall be removed regardless of it'

Re: Versioning file system

2007-06-18 Thread Andreas Dilger
he filesystem and RAID layers can move beyond "ignorance is bliss" when talking to each other would be great. Not rebuilding empty parts of the fs, limit parity resync to parts of the fs that were in the previous transaction, use fs-supplied checksums to verify on-disk data is correct,

Re: [34/37] Large blocksize support in ramfs

2007-06-20 Thread Andreas Dilger
order = simple_strtoul(options, NULL, 10); This is probably a bad name for a mount option. What about "order=10"? Otherwise you prevent any other option from being used in the future. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [36/37] Large blocksize support for ext2

2007-06-20 Thread Andreas Dilger
would break. There shouldn't be a problem with increasing EXT{2,3,4}_MAX_BLOCK_SIZE to 32kB (AFAIK), but I haven't looked into this in a while. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [36/37] Large blocksize support for ext2

2007-06-20 Thread Andreas Dilger
past 64kB blocksize in any case. > > There shouldn't be a problem with increasing EXT{2,3,4}_MAX_BLOCK_SIZE to > > 32kB (AFAIK), but I haven't looked into this in a while. > > I'd love to see such a patch. That is also useful for arches that have > PAGE_SIZE &

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-25 Thread Andreas Dilger
ot sure if this makes sense at all. > On Mon, Jun 25, 2007 at 07:15:00PM +0530, Amit K. Arora wrote: > > Implement new flags and values for mode argument. > > > > This patch implements the new flags and values for the "mode" argument > > of the fallocate syste

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-25 Thread Andreas Dilger
* default is keep existing data */ so that it doesn't imply this is only for DEALLOC. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 7/7][TAKE5] ext4: support new modes

2007-06-25 Thread Andreas Dilger
anging ctime, if that is required even though the file is not visibly changing. Maybe the ctime update should be implicit if the size or mtime are changing? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: vm/fs meetup in september?

2007-06-25 Thread Andreas Dilger
SIZE. I'll let the rest of you duke it out as long as at least one of them makes it into the kernel. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-26 Thread Andreas Dilger
On Jun 26, 2007 16:02 +0530, Amit K. Arora wrote: > On Mon, Jun 25, 2007 at 03:46:26PM -0600, Andreas Dilger wrote: > > Can you clarify - what is the current behaviour when ENOSPC (or some other > > error) is hit? Does it keep the current fallocate() or does it free it? > >

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-26 Thread Andreas Dilger
On Jun 26, 2007 16:15 +0530, Amit K. Arora wrote: > On Mon, Jun 25, 2007 at 03:52:39PM -0600, Andreas Dilger wrote: > > In XFS one of the (many) ALLOC modes is to zero existing data on allocate. > > For ext4 all this would mean is calling ext4_ext_mark_uninitialized() on > &

Re: [PATCH 7/7][TAKE5] ext4: support new modes

2007-06-26 Thread Andreas Dilger
hanism we now have - we can encode the various different behaviours in any way we want and leave it to the caller. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: Patent or not patent a new idea

2007-06-26 Thread Andreas Dilger
o be efficient. Unfortunately, Linux still has distinct DM and MD layers and there doesn't seem to be any work to combine the two into a more powerful single layer. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-26 Thread Andreas Dilger
should succeed in most near-OOM conditions). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [AppArmor 00/44] AppArmor security module overview

2007-06-27 Thread Andreas Dilger
Any chance you can remove linux-fsdevel from the CC list? I don't think this has anything to do with filesystems. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 0/6][TAKE5] fallocate system call

2007-06-28 Thread Andreas Dilger
incremental over the previous set). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-07-01 Thread Andreas Dilger
(default keep prealloc) */ The other possible flags that were proposed, to avoid confusing backup and HSM applications when preallocated space is added or removed from a file (you don't want a backup app to re-backup a file that was migrated via HSM): FA_FL_NO_MTIME 0x10 /* keep same mt

Re: [EXT4 set 4][PATCH 4/5] i_version:ext4 inode version update

2007-07-03 Thread Andreas Dilger
do we set the quota file inodes version to 1 > during write ? Hmm, I thought we had previously fixed this? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-03 Thread Andreas Dilger
t systems we only ever had a 32-bit in-memory version anyway so using only the low 32 bits of i_version in f_version is no more racy than in the past. For 64-bit systems using the full on-disk i_version is possible. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Syste

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-03 Thread Andreas Dilger
be ordered between all inodes, is set by Lustre to be a transaction number). Instead of trying to incorporate this unused code into ext4 we just turn off the ext4 version code and let Lustre control this directly. It may even be that NFSv4 will need to control the version numbers itself... Ch

Re: [EXT4 set 3][PATCH 1/1] ext4 nanosecond timestamp

2007-07-04 Thread Andreas Dilger
>Missed this one. > >Thanks. Will update ext4 patch queue tonight with this fix. > > IIRC in the conference call it was decided to not to apply this patch. > Andreas may be able to update better. I wasn't on the most recent concall, and I've forgotten the details

Re: [ANNOUNCE] util-linux-ng 2.13-rc1

2007-07-05 Thread Andreas Dilger
brary so that e2fsprogs could use it. The only issue is the increased maintenance and packaging of separate libraries. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-06 Thread Andreas Dilger
having it stored into the superblock in s_flags is probably a good idea. Kalpak, do you think you could get a patch that adds e.g. EXT4_FLAGS_NO_INODE_VERSION (like EXT4_FLAGS_SIGNED_HASH in e2fsprogs)? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 3][PATCH 1/1] ext4 nanosecond timestamp

2007-07-11 Thread Andreas Dilger
.info/?l=linux-ext4&m=115091699809181&w=2 > > Andreas or Kalpak, is changelog from the original patch is accurate to > apply here? Mostly, yes, but the name of the feature flag has changed. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 3][PATCH 1/1] ext4 nanosecond timestamp

2007-07-11 Thread Andreas Dilger
topic, but they aren't attached to the patch. s_want_extra_isize is just an override for sizeof(ext4_inode) in case the sysadmin wants to reserve more fields in new inodes. There is also s_min_extra_isize which is what the kernel and e2fsck guarantee that will be available in all in-use inodes,

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-11 Thread Andreas Dilger
internal-only changes which need ext4_mark_inode_dirty(). We had a patch to disable ext4 inode versioning by a flag in the superblock, but we dropped it at the last minute because it needed some updates and we didn't want to wait on that for submitting these changes upstream. Cheers, Andreas -- And

Re: [EXT4 set 4][PATCH 2/5] i_version: Add hi 32 bit inode version on ext4 on-disk inode

2007-07-11 Thread Andreas Dilger
On Jul 10, 2007 16:30 -0700, Andrew Morton wrote: > > Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> > > Signed-off-by: Andreas Dilger <[EMAIL PROTECTED]> > > Signed-off-by: Kalpak Shah <[EMAIL PROTECTED]> > > --- >

Re: [EXT4 set 4][PATCH 3/5] i_version:ext4 inode version read/store

2007-07-11 Thread Andreas Dilger
> > There's no comparison with EXT4_GOOD_OLD_INODE_SIZE here... Because this is the in-memory version and it is always valid (set to zero if there is extra space in the on-disk inode). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 4][PATCH 4/5] i_version:ext4 inode version update

2007-07-11 Thread Andreas Dilger
are confusing i_generation (the instance of this inode number) with i_version (whether this file has been modified)? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 4][PATCH 5/5] i_version: noversion mount option to disable inode version updates

2007-07-11 Thread Andreas Dilger
rsion updates on disk default to OFF unless NFSv4 has exported the filesystem at least once, and then it should set a persistent flag in the superblock indicating that i_version updates are needed. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode

2007-07-11 Thread Andreas Dilger
> + EXT4_I(inode)->i_file_acl); > > + error = -EIO; > > + goto cleanup; > > + } > > + base = BHDR(bh); > > + first = BFIRST(bh); > > + end = bh->b_data + bh->

Re: [EXT4 set 7][PATCH 1/1]Remove 32000 subdirs limit.

2007-07-11 Thread Andreas Dilger
> > + inode->i_nlink = 1; > > + EXT4_SET_RO_COMPAT_FEATURE(inode->i_sb, > > + EXT4_FEATURE_RO_COMPAT_DIR_NLINK); > > + } > > + } > > +} > > Why do we set EXT4_FEATURE_RO_COMPAT_DIR_NLINK if i_nlink==2? Because that

Re: [EXT4 set 8][PATCH 1/1]Add journal checksums

2007-07-11 Thread Andreas Dilger
k with a single copy for now. > > > @@ -328,6 +360,7 @@ static int do_one_pass(journal_t *journa > > > unsigned int sequence; > > > int blocktype; > > > int tag_bytes = journal_tag_bytes(journal); > >

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-11 Thread Andreas Dilger
for NFSv4 because it only uses the inequality check. Having the full 64 bits available eliminates the risk of collisions, and given that the spec mandates a 64-bit version I'm sure someone will take full advantage of it in NFS at some point. Cheers, Andreas -- Andreas Dilger Principal Softw

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-07-12 Thread Andreas Dilger
y just dropping the FALLOC_FL_DEALLOCATE and FALLOC_FL_DEL_DATA from the interface. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode

2007-07-13 Thread Andreas Dilger
the same process, since the journal handle is also held in current->journal_info so the handle does not need to be passed as an argument all over the VFS. > This seems to boot... albeit I did not push it hard. Can you please also make a patch for jbd2. Cheers, Andreas -- Andreas Dilger Pri

Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode

2007-07-13 Thread Andreas Dilger
"buffer" and "b_entry_name" are leaked in ext4_expand_extra_isize() if the while loop is run more than one time (again a relatively rare event). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode

2007-07-16 Thread Andreas Dilger
> > cleanup: > kfree(b_entry_name); I don't think you should have brelse(bh) inside the loop, since it is allocated before the loop starts. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 1/5][TAKE8] manpage for fallocate

2007-07-19 Thread Andreas Dilger
ites are guaranteed to never require > allocation of file data." ? > --Mark In the worst case, the unwritten extent could be zero-filled before the write is done, so no extent split is needed. We discussed this recently for the ext4 fallocate, but didn't consider it important e

Re: [RFC] basic delayed allocation in VFS

2007-07-29 Thread Andreas Dilger
uite large, we'd like to get this into the kernel one way or another. Can we make a decision if the ext4-specific delalloc is acceptable? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC 12/26] ext2 white-out support

2007-07-31 Thread Andreas Dilger
x-ext4. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC] System calls for online defrag

2007-09-04 Thread Andreas Dilger
How is this much different than sys_fallocate()? > int sys_get_free_blocks(const char *fs, loff_t start, loff_t end, int count, > struct alloc_extent *space) One alternate possibility is to call the proposed FIEMAP on the block device, to return lists of free/used extents? We have a versi

Re: Distributed storage. Move away from char device ioctls.

2007-09-15 Thread Andreas Dilger
exists, and I think btrfs may have laid claim to the current generation of filesystems. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: Distributed storage. Move away from char device ioctls.

2007-09-15 Thread Andreas Dilger
solaris and ZFS ports) so perhaps those performance > critical parts that remain kernel space will be easier to merge. This is also true - when that is done the only parts that will remain in the kernel are the network drivers. With some network stacks there is even direct userspace accelerati

Re: [PATCH] JBD slab cleanups

2007-09-19 Thread Andreas Dilger
f(*new_transaction), > - GFP_NOFS|__GFP_NOFAIL); > + new_transaction = kmalloc(sizeof(*new_transaction), GFP_NOFS); This should probably be a __GFP_NOFAIL if we are trying to start a new handle in truncate, as there is no way to propagate an error to the caller. Cheers, Andreas --

Re: [PATCH] JBD: use GFP_NOFS in kmalloc

2007-09-19 Thread Andreas Dilger
"it's in a filesystem, so it should be GFP_NOFS"? We are only doing journal setup during mount so there shouldn't be any problem using GFP_KERNEL. I don't think it will inject any defect into the code, but I don't think it is needed either. Cheers, Andreas -- Andreas Dilger

Re: [patch 2/5] VFS: pass open file to ->getattr()

2007-09-21 Thread Andreas Dilger
struct file *file); It's not much of an inode operation anymore if you need to pass a file to it... Since the attributes are really part of the inode and not the file, this seems like a bit of a hack. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [patch 3/5] VFS: pass open file to ->xattr()

2007-09-21 Thread Andreas Dilger
a file, nor is the xattr different. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [patch 3/5] VFS: pass open file to ->xattr()

2007-09-21 Thread Andreas Dilger
char *, void *, size_t, > + struct file *); > + ssize_t (*listxattr) (struct dentry *, char *, size_t, struct file *); > + int (*removexattr) (struct dentry *, const char *, struct file *); Likewise - these are no longer inode operations if you need a file. Cheer

Re: [patch 4/5] VFS: allow filesystems to implement atomic open+truncate

2007-09-21 Thread Andreas Dilger
gnore such truncate requests. This is actually something we've needed to do in Lustre for a while also. We called it ATTR_FROM_OPEN, but I don't really mind ATTR_OPEN either - the less patching we need to do the better. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluste

Re: [patch 2/4] ext2: fix rec_len overflow for 64KB block size

2007-09-25 Thread Andreas Dilger
gned-off-by: Christoph Lameter <[EMAIL PROTECTED]> Note that we just got a cleaner implementation of this code on the ext4 mailing list from Jan Kara yesterday. Please use that one instead; it is in the thread "Avoid rec_len overflow with 64KB block size". Cheers, Andreas -- Andreas D

Re: Upgrading datastructures between different filesystem versions

2007-09-26 Thread Andreas Dilger
ars of bug fixes in the ext2/ext3 code when adding the ext4 features. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: batching support for transactions

2007-10-03 Thread Andreas Dilger
ync transactions to decide if the initial transaction should be held to allow later ones. Alternately, it might be possible to check if a new thread is trying to start a sync handle when the previous one was also synchronous and had only a single handle in it, then automatically enable the delay

Re: batching support for transactions

2007-10-03 Thread Andreas Dilger
art a sync handle when the previous one was also synchronous and had > >only a single handle in it, then automatically enable the delay in that > >case. > > I am not sure that this avoids the problem with the current defaults at > 250HZ where each wait is sufficient to do 3 fu

Re: [patch 0/6][RFC] Cleanup FIBMAP

2007-10-29 Thread Andreas Dilger
I've just resent the spec we used in a separate email (attached to old thread) for reference. Cheers, Andreas -- Andreas Dilger Sr. Software Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
o the "filefrag" tool to use FIEMAP. FIEMAP_1.0.txt == File Mapping Interface 18 June 2007 Andreas Dilger, Kalpak Shah Introduction This document covers the user interface and internal implementation of an efficient fragmentation

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
On Oct 29, 2007 16:13 -0600, Andreas Dilger wrote: > On Oct 29, 2007 13:57 -0700, Mark Fasheh wrote: > > I'm a little bit confused by fe_offset. Is it a physical offset, or a > > logical offset? The reason I ask is that your description above says "FIEMAP > > i

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
1:45:07PM -0600, Andreas Dilger wrote: > > The FIEMAP ioctl (FIle Extent MAP) is similar to the existing FIBMAP > > block device ioctl used for mapping an individual logical block > > address in a file to a physical block address in the block device. The > > FIEMAP ioctl wi
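For readers who want to see what an extent-mapping call looks like from userspace, here is a minimal sketch. It assumes the interface as it was later merged into mainline Linux (`FS_IOC_FIEMAP` with `struct fiemap` and `struct fiemap_extent` from `<linux/fiemap.h>`), which is not necessarily identical to the RFC discussed in this thread; field names such as `fe_logical` and `fe_physical` come from the merged header, not from the draft spec. The helper names `map_file_extents` and `count_fragments` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

/*
 * Count physically discontiguous runs in an extent array that is sorted
 * by logical offset. Two extents belong to the same run when the second
 * starts exactly where the first one ends on disk.
 */
unsigned count_fragments(const struct fiemap_extent *ext, unsigned n)
{
    unsigned frags = 0;

    for (unsigned i = 0; i < n; i++) {
        if (i == 0 ||
            ext[i - 1].fe_physical + ext[i - 1].fe_length != ext[i].fe_physical)
            frags++;
    }
    return frags;
}

/*
 * Ask the filesystem for up to max_extents extents of the whole file.
 * Returns the number of extents copied into out, or -1 on error
 * (for example when the filesystem does not implement FIEMAP).
 */
int map_file_extents(int fd, struct fiemap_extent *out, unsigned max_extents)
{
    size_t size = sizeof(struct fiemap) +
                  (size_t)max_extents * sizeof(struct fiemap_extent);
    struct fiemap *fm;
    int mapped;

    assert(out != NULL);

    fm = calloc(1, size);
    if (fm == NULL)
        return -1;

    fm->fm_start = 0;                    /* map from the start of the file */
    fm->fm_length = FIEMAP_MAX_OFFSET;   /* ... through the end */
    fm->fm_flags = FIEMAP_FLAG_SYNC;     /* flush dirty data before mapping */
    fm->fm_extent_count = max_extents;   /* room available in fm_extents[] */

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        free(fm);
        return -1;
    }

    mapped = (int)fm->fm_mapped_extents;
    memcpy(out, fm->fm_extents, (size_t)mapped * sizeof(struct fiemap_extent));
    free(fm);
    return mapped;
}
```

A caller would open a file, pass the descriptor to `map_file_extents()`, and hand the result to `count_fragments()` to get a rough fragmentation figure, which is roughly the kind of information a tool like "filefrag" reports. One batched call replaces a per-block FIBMAP loop, which is the efficiency argument made in the thread.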

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
On Oct 29, 2007 17:11 -0700, Mark Fasheh wrote: > On Mon, Oct 29, 2007 at 04:13:02PM -0600, Andreas Dilger wrote: > > > Btrfs, Ocfs2, and Gfs2 pack small amounts of user data directly in inode > > > blocks. > > > > Hmm, but part of the issue would be how to req

Re: Beagle and logging inotify events

2007-11-14 Thread Andreas Dilger
s could do the same thing easily. That would allow recursive-descent filesystem traversal to be much more efficient because whole chunks of the filesystem tree can be ignored during scans. Cheers, Andreas -- Andreas Dilger Sr. Software Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

2007-11-28 Thread Andreas Dilger
"[RFC] add FIEMAP ioctl to efficiently map file allocation" in linux-fsdevel for details on this interface. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

2007-11-28 Thread Andreas Dilger
ly support this functionality if the filesystem explicitly adds it, instead of pretending that it works when there are sharp pointy things in the dark corners. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

2007-11-28 Thread Andreas Dilger
e elegant for logfs. If not, > FIEMAP could be useful. SEEK_HOLE/SEEK_DATA only provides a fraction of the useful information that FIEMAP does. It won't give users or developers any information about the on disk layout (which is quite important for knowing if allocation algorithms are good). Chee
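To make the comparison concrete, here is a small sketch of what SEEK_HOLE/SEEK_DATA can tell a program, assuming the `lseek()` semantics that Linux eventually adopted (the patch under discussion here may have differed). It walks a file and sums the bytes that lie in data regions. Note that it only ever sees logical offsets; nothing in it reveals where the data sits on disk, which is the limitation pointed out above. The function names `count_data_bytes` and `demo_dense_file` are hypothetical, and the numeric fallback definitions are the Linux values, used only if the libc headers did not expose the constants.

```c
#define _GNU_SOURCE        /* asks glibc to expose SEEK_DATA / SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3        /* Linux value, fallback only */
#define SEEK_HOLE 4        /* Linux value, fallback only */
#endif

/*
 * Sum the lengths of all data regions in the file behind fd.
 * Returns the byte count, or -1 if lseek() fails for any reason other
 * than "no more data" (ENXIO), e.g. when fd is not seekable.
 */
off_t count_data_bytes(int fd)
{
    off_t total = 0;
    off_t pos = 0;

    assert(fd >= 0);

    for (;;) {
        /* Find the next data region at or after pos. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)
                break;                  /* only a hole (or EOF) remains */
            return -1;
        }

        /* Find where that region ends; EOF counts as an implicit hole. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;

        total += hole - data;
        pos = hole;
    }
    return total;
}

/*
 * Usage example: write one fully populated 4096-byte buffer to a scratch
 * file and report how many data bytes the walk finds. A file with no
 * holes should report its full size.
 */
off_t demo_dense_file(void)
{
    char path[] = "/tmp/seekdata-XXXXXX";
    char buf[4096];
    off_t n;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                       /* file disappears once fd is closed */

    memset(buf, 'x', sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        close(fd);
        return -1;
    }

    n = count_data_bytes(fd);
    close(fd);
    return n;
}
```

On a filesystem without real hole tracking, the generic fallback treats the whole file as data, so the walk still terminates and simply reports the file size.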
