Re: [RFC] ext3 freeze feature ver 0.2

2008-02-26 Thread Andreas Dilger
On Feb 26, 2008 08:39 -0800, Eric Sandeen wrote: > Takashi Sato wrote: > > > o Elevate XFS ioctl numbers (XFS_IOC_FREEZE and XFS_IOC_THAW) to the VFS > > As Andreas Dilger and Christoph Hellwig advised me, I have elevated > > them to include/linux/fs.h as below.

Re: i_version changes

2008-02-13 Thread Andreas Dilger
ited by the Linux default 250Hz > internal clock. We've seen plenty of examples of NFS clients missing > updates on the resulting filesystem due to the fact that they occurred > within 1/250 sec of each other. The other issue which unfortunately makes ctime a non-starter is the ability
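The granularity problem described in this message can be illustrated with a minimal sketch. It assumes timestamps are simply rounded down to the clock tick (4 ms at the 250Hz default mentioned above); the helper names `tick_of` and `change_visible` are hypothetical and not taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a time in microseconds down to its clock tick. */
uint64_t tick_of(uint64_t usec, unsigned int hz)
{
	assert(hz > 0 && hz <= 1000000);
	return usec / (1000000u / hz);
}

/* Nonzero if two updates land in different ticks, i.e. their timestamps differ. */
int change_visible(uint64_t usec_a, uint64_t usec_b, unsigned int hz)
{
	return tick_of(usec_a, hz) != tick_of(usec_b, hz);
}
```

With hz = 250, updates at 1 ms and 3 ms fall into the same tick, so a client that compares timestamps alone cannot tell that the second update happened.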

Re: i_version changes

2008-02-13 Thread Andreas Dilger
pdates of the inode after every write. On ext3/ext4 this is expensive, as the ext3_dirty_inode() packs the inode from memory into the buffer each time, so that it can be journaled. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [sample] mem_notify v6: usage example

2008-02-11 Thread Andreas Dilger
o complex, but hiding the details of /dev/mem_notify from applications is desirable. A simple wrapper (possibly part of glibc) to return the poll fd, or set up the signal is enough. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC] ext3 freeze feature

2008-02-08 Thread Andreas Dilger
> fd: file descriptor of mountpoint > FITHAW: Request code for unfreeze You may as well make the common ioctl the same as the XFS version, both by number and parameters, so that applications which already understand the XFS ioctl will work on other filesystems. Cheers, Andreas -- Andreas D

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-02-06 Thread Andreas Dilger
Feb 2008 19:15:25 +0900. [PATCH] ext3,4:fdatasync should skip metadata writeout when overwriting It may be that we already have a solution in that patch for database workloads where the pages are already allocated by avoiding the need for ordered mode journal flushing in that case.

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-30 Thread Andreas Dilger
s already, and while I'm not sure what kernel it is for the JBD code rarely changes much Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-25 Thread Andreas Dilger
the journal on a separate disk and make it big enough that you don't block on it to flush the data to the filesystem (but not so big that it is consuming all of your RAM). That keeps your data guarantees without hurting performance. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer,

Re: [RFC] Parallelize IO for e2fsck

2008-01-25 Thread Andreas Dilger
any kind of process (and not just those that are event loop driven) can register a callback at some arbitrary point in the code and be notified. I don't object to the poll() interface, but it would be good to have a signal mechanism also. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer,

Re: [patch 12/26] mount options: fix ext4

2008-01-25 Thread Andreas Dilger
On Jan 24, 2008 20:33 +0100, Miklos Szeredi wrote: > Add stripe= option to /proc/mounts for ext4 filesystems. > > Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]> Acked-by: Andreas Dilger <[EMAIL PROTECTED]> > Inde

Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Andreas Dilger
again (or kill the real offender) if the memory usage again becomes an issue. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC] Parallelize IO for e2fsck

2008-01-21 Thread Andreas Dilger
On Jan 22, 2008 14:38 +1100, David Chinner wrote: > On Mon, Jan 21, 2008 at 04:00:41PM -0700, Andreas Dilger wrote: > > I discussed this with Ted at one point also. This is a generic problem, > > not just for readahead, because "fsck" can run multiple e2fsck in paralle

Re: [RFC] Parallelize IO for e2fsck

2008-01-21 Thread Andreas Dilger
of other memory- hungry applications. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [RFC] Parallelize IO for e2fsck

2008-01-21 Thread Andreas Dilger
"--maximum-memory MMM /dev/XXX" so each knows how much cache it can allocate. This parameter can also be specified by the user if running e2fsck directly. I haven't looked through your patch yet, but I hope to get to it soon. Cheers, Andreas -- Andreas Dilger Sr. Staff Engi

Re: [Patch] document ext3 requirements (was Re: [RFD] Incremental fsck)

2008-01-16 Thread Andreas Dilger
blocks, incorrect bitmaps). Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFD] Incremental fsck

2008-01-09 Thread Andreas Dilger
here is a copy of this script at: http://osdir.com/ml/linux.lvm.devel/2003-04/msg1.html Note that it might need some tweaks to run with DM/LVM2 commands/output, but is mostly what is needed. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada,

Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

2007-11-28 Thread Andreas Dilger
e elegant for logfs. If not, > FIEMAP could be useful. SEEK_HOLE/SEEK_DATA only provides a fraction of the useful information that FIEMAP does. It won't give users or developers any information about the on disk layout (which is quite important for knowing if allocation algorithms are good). Chee

Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

2007-11-28 Thread Andreas Dilger
ly support this functionality if the filesystem explicitly adds it, instead of pretending that it works when there are sharp pointy things in the dark corners. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

2007-11-28 Thread Andreas Dilger
"[RFC] add FIEMAP ioctl to efficiently map file allocation" in linux-fsdevel for details on this interface. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. - To unsubscribe from this list: send the line "unsubscrib

Re: Beagle and logging inotify events

2007-11-14 Thread Andreas Dilger
s could do the same thing easily. That would allow recursive-descent filesystem traversal to be much more efficient because whole chunks of the filesystem tree can be ignored during scans. Cheers, Andreas -- Andreas Dilger Sr. Software Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
On Oct 29, 2007 17:11 -0700, Mark Fasheh wrote: > On Mon, Oct 29, 2007 at 04:13:02PM -0600, Andreas Dilger wrote: > > > Btrfs, Ocfs2, and Gfs2 pack small amounts of user data directly in inode > > > blocks. > > > > Hmm, but part of the issue would be how to req

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
On Oct 29, 2007 16:13 -0600, Andreas Dilger wrote: > On Oct 29, 2007 13:57 -0700, Mark Fasheh wrote: > > I'm a little bit confused by fe_offset. Is it a physical offset, or a > > logical offset? The reason I ask is that your description above says "FIEMAP > > i

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
1:45:07PM -0600, Andreas Dilger wrote: > > The FIEMAP ioctl (FIle Extent MAP) is similar to the existing FIBMAP > > ioctl block device ioctl used for mapping an individual logical block > > address in a file to a physical block address in the block device. The > > FIEMAP ioctl wi

Re: [patch 0/6][RFC] Cleanup FIBMAP

2007-10-29 Thread Andreas Dilger
I've just resent the spec we used in a separate email (attached to old thread) for reference. Cheers, Andreas -- Andreas Dilger Sr. Software Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
o the "filefrag" tool to use FIEMAP. FIEMAP_1.0.txt == File Mapping Interface 18 June 2007 Andreas Dilger, Kalpak Shah Introduction This document covers the user interface and internal implementation of an efficient fragmentation

Re: batching support for transactions

2007-10-03 Thread Andreas Dilger
art a sync handle when the previous one was also synchronous and had > >only a single handle in it, then automatically enable the delay in that > >case. > > I am not sure that this avoids the problem with the current defaults at > 250HZ where each wait is sufficient to do 3 fu

Re: batching support for transactions

2007-10-03 Thread Andreas Dilger
ync transactions to decide if the initial transaction should be held to allow later ones. Alternately, it might be possible to check if a new thread is trying to start a sync handle when the previous one was also synchronous and had only a single handle in it, then automatically enable the delay

Re: Upgrading datastructures between different filesystem versions

2007-09-26 Thread Andreas Dilger
ars of bug fixes in the ext2/ext3 code when adding the ext4 features. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [patch 2/4] ext2: fix rec_len overflow for 64KB block size

2007-09-25 Thread Andreas Dilger
gned-off-by: Christoph Lameter <[EMAIL PROTECTED]> Note that we just got a cleaner implementation of this code on the ext4 mailing list from Jan Kara yesterday. Please use that one instead, in the thread "Avoid rec_len overflow with 64KB block size". Cheers, Andreas -- Andreas D

Re: [patch 4/5] VFS: allow filesystems to implement atomic open+truncate

2007-09-21 Thread Andreas Dilger
gnore such truncate requests. This is actually something we've needed to do in Lustre for a while also. We called it ATTR_FROM_OPEN, but I don't really mind ATTR_OPEN either - the less patching we need to do the better. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluste

Re: [patch 3/5] VFS: pass open file to ->xattr()

2007-09-21 Thread Andreas Dilger
a file, nor is the xattr different. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [patch 3/5] VFS: pass open file to ->xattr()

2007-09-21 Thread Andreas Dilger
char *, void *, size_t, > + struct file *); > + ssize_t (*listxattr) (struct dentry *, char *, size_t, struct file *); > + int (*removexattr) (struct dentry *, const char *, struct file *); Likewise - these are no longer inode operations if you need a file. Cheer

Re: [patch 2/5] VFS: pass open file to ->getattr()

2007-09-21 Thread Andreas Dilger
struct file *file); It's not much of an inode operation anymore if you need to pass a file to it... Since the attributes are really part of the inode and not the file, this seems like a bit of a hack. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH] JBD: use GFP_NOFS in kmalloc

2007-09-19 Thread Andreas Dilger
"it's in a filesystem, so it should be GFP_NOFS"? We are only doing journal setup during mount so there shouldn't be any problem using GFP_KERNEL. I don't think it will inject any defect into the code, but I don't think it is needed either. Cheers, Andreas -- Andreas Dilger

Re: [PATCH] JBD slab cleanups

2007-09-19 Thread Andreas Dilger
f(*new_transaction), > - GFP_NOFS|__GFP_NOFAIL); > + new_transaction = kmalloc(sizeof(*new_transaction), GFP_NOFS); This should probably be a __GFP_NOFAIL if we are trying to start a new handle in truncate, as there is no way to propagate an error to the caller. Cheers, Andreas --

Re: Distributed storage. Move away from char device ioctls.

2007-09-15 Thread Andreas Dilger
solaris and ZFS ports) so perhaps those performance > critical parts that remain kernel space will be easier to merge. This is also true - when that is done the only parts that will remain in the kernel are the network drivers. With some network stacks there is even direct userspace accelerati

Re: Distributed storage. Move away from char device ioctls.

2007-09-15 Thread Andreas Dilger
exists, and I think btrfs may have laid claim to the current generation of filesystems. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC] System calls for online defrag

2007-09-04 Thread Andreas Dilger
How is this much different than sys_fallocate()? > int sys_get_free_blocks(const char *fs, loff_t start, loff_t end, int count, > struct alloc_extent *space) One alternate possibility is to call the proposed FIEMAP on the block device, to return lists of free/used extents? We have a versi

Re: [RFC 12/26] ext2 white-out support

2007-07-31 Thread Andreas Dilger
x-ext4. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC] basic delayed allocation in VFS

2007-07-29 Thread Andreas Dilger
uite large, we'd like to get this into the kernel one way or another. Can we make a decision if the ext4-specific delalloc is acceptable? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 1/5][TAKE8] manpage for fallocate

2007-07-19 Thread Andreas Dilger
ites are guaranteed to never require > allocation of file data." ? > --Mark In the worst case, the unwritten extent could be zero-filled before the write is done, so no extent split is needed. We discussed this recently for the ext4 fallocate, but didn't consider it important e

Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode

2007-07-16 Thread Andreas Dilger
> > cleanup: > kfree(b_entry_name); I don't think you should have brelse(bh) inside the loop, since it is allocated before the loop starts. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode

2007-07-13 Thread Andreas Dilger
"buffer" and "b_entry_name" are leaked in ext4_expand_extra_isize() if the while loop is run more than one time (again a relatively rare event). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode

2007-07-13 Thread Andreas Dilger
the same process, since the journal handle is also held in current->journal_info so the handle does not need to be passed as an argument all over the VFS. > This seems to boot... albeit I did not push it hard. Can you please also make a patch for jbd2. Cheers, Andreas -- Andreas Dilger Pri

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-07-12 Thread Andreas Dilger
y just dropping the FALLOC_FL_DEALLOCATE and FALLOC_FL_DEL_DATA from the interface. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-11 Thread Andreas Dilger
for NFSv4 because it only uses the inequality check. Having the full 64 bits available eliminates the risk of collisions, and given that the spec mandates a 64-bit version I'm sure someone will take full advantage of it in NFS at some point. Cheers, Andreas -- Andreas Dilger Principal Softw

Re: [EXT4 set 8][PATCH 1/1]Add journal checksums

2007-07-11 Thread Andreas Dilger
k with a single copy for now. > > > @@ -328,6 +360,7 @@ static int do_one_pass(journal_t *journa > > > unsigned int sequence; > > > int blocktype; > > > int tag_bytes = journal_tag_bytes(journal); > >

Re: [EXT4 set 7][PATCH 1/1]Remove 32000 subdirs limit.

2007-07-11 Thread Andreas Dilger
> > + inode->i_nlink = 1; > > + EXT4_SET_RO_COMPAT_FEATURE(inode->i_sb, > > + EXT4_FEATURE_RO_COMPAT_DIR_NLINK); > > + } > > + } > > +} > > Why do we set EXT4_FEATURE_RO_COMPAT_DIR_NLINK if i_nlink==2? Because that

Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode

2007-07-11 Thread Andreas Dilger
> + EXT4_I(inode)->i_file_acl); > > + error = -EIO; > > + goto cleanup; > > + } > > + base = BHDR(bh); > > + first = BFIRST(bh); > > + end = bh->b_data + bh->

Re: [EXT4 set 4][PATCH 5/5] i_version: noversion mount option to disable inode version updates

2007-07-11 Thread Andreas Dilger
rsion updates on disk default to OFF unless NFSv4 has exported the filesystem at least once, and then it should set a persistent flag in the superblock indicating that i_version updates are needed. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 4][PATCH 4/5] i_version:ext4 inode version update

2007-07-11 Thread Andreas Dilger
are confusing i_generation (the instance of this inode number) with i_version (whether this file has been modified)? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 4][PATCH 3/5] i_version:ext4 inode version read/store

2007-07-11 Thread Andreas Dilger
> > There's no comparison with EXT4_GOOD_OLD_INODE_SIZE here... Because this is the in-memory version and it is always valid (set to zero if there is extra space in the on-disk inode). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 4][PATCH 2/5] i_version: Add hi 32 bit inode version on ext4 on-disk inode

2007-07-11 Thread Andreas Dilger
On Jul 10, 2007 16:30 -0700, Andrew Morton wrote: > > Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> > > Signed-off-by: Andreas Dilger <[EMAIL PROTECTED]> > > Signed-off-by: Kalpak Shah <[EMAIL PROTECTED]> > > --- >

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-11 Thread Andreas Dilger
internal-only changes which need ext4_mark_inode_dirty(). We had a patch to disable ext4 inode versioning by a flag in the superblock, but we dropped it at the last minute because it needed some updates and we didn't want to wait on that for submitting these changes upstream. Cheers, Andreas -- And

Re: [EXT4 set 3][PATCH 1/1] ext4 nanosecond timestamp

2007-07-11 Thread Andreas Dilger
topic, but they aren't attached to the patch. s_want_extra_isize is just an override for sizeof(ext4_inode) in case the sysadmin wants to reserve more fields in new inodes. There is also s_min_extra_isize which is what the kernel and e2fsck guarantee will be available in all in-use inodes,

Re: [EXT4 set 3][PATCH 1/1] ext4 nanosecond timestamp

2007-07-11 Thread Andreas Dilger
.info/?l=linux-ext4&m=115091699809181&w=2 > > Andreas or Kalpak, is changelog from the original patch is accurate to > apply here? Mostly, yes, but the name of the feature flag has changed. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-06 Thread Andreas Dilger
having it stored into the superblock in s_flags is probably a good idea. Kalpak, do you think you could get a patch that adds e.g. EXT4_FLAGS_NO_INODE_VERSION (like EXT4_FLAGS_SIGNED_HASH in e2fsprogs). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [ANNOUNCE] util-linux-ng 2.13-rc1

2007-07-05 Thread Andreas Dilger
brary so that e2fsprogs could use it. The only issue is the increased maintenance and packaging of separate libraries. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [EXT4 set 3][PATCH 1/1] ext4 nanosecond timestamp

2007-07-04 Thread Andreas Dilger
>Missed this one. > >Thanks. Will update ext4 patch queue tonight with this fix. > > IIRC in the conference call it was decided to not to apply this patch. > Andreas may be able to update better. I wasn't on the most recent concall, and I've forgotten the details

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-03 Thread Andreas Dilger
be ordered between all inodes, is set by Lustre to be a transaction number). Instead of trying to incorporate this unused code into ext4 we just turn off the ext4 version code and let Lustre control this directly. It may even be that NFSv4 will need to control the version numbers itself... Ch

Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version

2007-07-03 Thread Andreas Dilger
t systems we only ever had a 32-bit in-memory version anyway so using only the low 32 bits of i_version in f_version is no more racy than in the past. For 64-bit systems using the full on-disk i_version is possible. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Syste
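As a rough illustration of the point about the low 32 bits, the sketch below models a 32-bit cached copy of a 64-bit version counter and the inequality check used to detect change. The function names are hypothetical stand-ins, not the kernel's f_version handling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for keeping only the low 32 bits of a 64-bit version. */
uint32_t low32(uint64_t i_version)
{
	return (uint32_t)(i_version & 0xffffffffu);
}

/* A cached 32-bit copy looks changed if it differs from the current low bits. */
int looks_changed32(uint32_t cached, uint64_t i_version)
{
	return cached != low32(i_version);
}

/* Small examples of the behaviour, including the wrap-around collision. */
void low32_examples(void)
{
	assert(looks_changed32(low32(5), 6));
	assert(!looks_changed32(low32(5), 5));
	/* exactly 2^32 updates later the low bits repeat and the change is missed */
	assert(!looks_changed32(low32(5), 5 + ((uint64_t)1 << 32)));
}
```

The last case is the only way the 32-bit check goes wrong; comparing the full 64-bit value removes it.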

Re: [EXT4 set 4][PATCH 4/5] i_version:ext4 inode version update

2007-07-03 Thread Andreas Dilger
do we set the quota file inodes version to 1 > during write ? Hmm, I thought we had previously fixed this? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-07-01 Thread Andreas Dilger
(default keep prealloc) */ The other possible flags that were proposed, to avoid confusing backup and HSM applications when preallocated space is added or removed from a file (you don't want a backup app to re-backup a file that was migrated via HSM): FA_FL_NO_MTIME 0x10 /* keep same mt

Re: [PATCH 0/6][TAKE5] fallocate system call

2007-06-28 Thread Andreas Dilger
incremental over the previous set). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [AppArmor 00/44] AppArmor security module overview

2007-06-27 Thread Andreas Dilger
Any chance you can remove linux-fsdevel from the CC list? I don't think this has anything to do with filesystems. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-26 Thread Andreas Dilger
should succeed in most near-OOM conditions). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: Patent or not patent a new idea

2007-06-26 Thread Andreas Dilger
o be efficient. Unfortunately, Linux still has distinct DM and MD layers and there doesn't seem to be any work to combine the two into a more powerful single layer. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 7/7][TAKE5] ext4: support new modes

2007-06-26 Thread Andreas Dilger
hanism we now have - we can encode the various different behaviours in any way we want and leave it to the caller. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-26 Thread Andreas Dilger
On Jun 26, 2007 16:15 +0530, Amit K. Arora wrote: > On Mon, Jun 25, 2007 at 03:52:39PM -0600, Andreas Dilger wrote: > > In XFS one of the (many) ALLOC modes is to zero existing data on allocate. > > For ext4 all this would mean is calling ext4_ext_mark_uninitialized() on > &

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-26 Thread Andreas Dilger
On Jun 26, 2007 16:02 +0530, Amit K. Arora wrote: > On Mon, Jun 25, 2007 at 03:46:26PM -0600, Andreas Dilger wrote: > > Can you clarify - what is the current behaviour when ENOSPC (or some other > > error) is hit? Does it keep the current fallocate() or does it free it? > >

Re: vm/fs meetup in september?

2007-06-25 Thread Andreas Dilger
SIZE. I'll let the rest of you duke it out as long as at least one of them makes it into the kernel. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 7/7][TAKE5] ext4: support new modes

2007-06-25 Thread Andreas Dilger
anging ctime, if that is required even though the file is not visibly changing. Maybe the ctime update should be implicit if the size or mtime are changing? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-25 Thread Andreas Dilger
* default is keep existing data */ so that it doesn't imply this is only for DEALLOC. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

2007-06-25 Thread Andreas Dilger
ot sure if this makes sense at all. > On Mon, Jun 25, 2007 at 07:15:00PM +0530, Amit K. Arora wrote: > > Implement new flags and values for mode argument. > > > > This patch implements the new flags and values for the "mode" argument > > of the fallocate syste

Re: [36/37] Large blocksize support for ext2

2007-06-20 Thread Andreas Dilger
past 64kB blocksize in any case. > > There shouldn't be a problem with increasing EXT{2,3,4}_MAX_BLOCK_SIZE to > > 32kB (AFAIK), but I haven't looked into this in a while. > > I'd love to see such a patch. That is also useful for arches that have > PAGE_SIZE &

Re: [36/37] Large blocksize support for ext2

2007-06-20 Thread Andreas Dilger
would break. There shouldn't be a problem with increasing EXT{2,3,4}_MAX_BLOCK_SIZE to 32kB (AFAIK), but I haven't looked into this in a while. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [34/37] Large blocksize support in ramfs

2007-06-20 Thread Andreas Dilger
order = simple_strtoul(options, NULL, 10); This is probably a bad name for a mount option. What about "order=10"? Otherwise you prevent any other option from being used in the future. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.
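The suggestion to use a named option such as "order=10" can be sketched in userspace C as below. This is only an illustration of key=value parsing over a comma-separated option string; `parse_order` is a hypothetical helper and is not the ramfs patch's code or the kernel's own option parser.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find "order=<n>" in a comma-separated option string.
 * Returns 0 and stores the value on success, -1 if the option is absent
 * or malformed.  Other options are skipped rather than rejected.
 */
int parse_order(const char *options, unsigned long *order)
{
	const char *p = options;

	assert(options != NULL && order != NULL);
	while (*p != '\0') {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len > 6 && strncmp(p, "order=", 6) == 0) {
			char *stop;
			unsigned long val;

			if (p[6] < '0' || p[6] > '9')
				return -1;	/* reject signs and non-numbers */
			val = strtoul(p + 6, &stop, 10);
			if (stop != p + len)
				return -1;	/* trailing junk after the number */
			*order = val;
			return 0;
		}
		p += len;		/* move past this option ... */
		if (*p == ',')
			p++;		/* ... and its separator */
	}
	return -1;
}
```

Because the value is attached to a name, other options can share the same string, which a bare number passed straight to simple_strtoul() would not allow.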

Re: Versioning file system

2007-06-18 Thread Andreas Dilger
he filesystem and RAID layers can move beyond "ignorance is bliss" when talking to each other would be great. Not rebuilding empty parts of the fs, limit parity resync to parts of the fs that were in the previous transaction, use fs-supplied checksums to verify on-disk data is correct,

Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

2007-06-14 Thread Andreas Dilger
On Jun 14, 2007 22:04 +1000, David Chinner wrote: > On Thu, Jun 14, 2007 at 03:14:58AM -0600, Andreas Dilger wrote: > > > B FA_DEALLOCATE > > > removes the underlying disk space with the given range. The disk space > > > shall be removed regardless of it'

Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

2007-06-14 Thread Andreas Dilger
not. In some (primitive) implementations it might no longer be possible to distinguish between unwritten extents and zero-filled blocks, though at this point DEALLOC of zero-filled blocks might not be harmful either. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Sy

Re: Read/write counts

2007-06-04 Thread Andreas Dilger
- some applications assume that the amount requested == amount read/written and don't even check whether that is actually the case or not. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.
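For reference, the check that such applications skip can be sketched as a loop around write(2) that retries until the whole buffer has been written. `write_all` is a hypothetical helper name; only the standard POSIX write() call is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write(2) until all of buf is written.
 * Returns 0 on success, -1 on error with errno as set by write(2).
 */
int write_all(int fd, const void *buf, size_t count)
{
	const char *p = buf;

	assert(buf != NULL || count == 0);
	while (count > 0) {
		ssize_t n = write(fd, p, count);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before any data moved */
			return -1;
		}
		p += n;			/* a short write is not an error: advance and retry */
		count -= (size_t)n;
	}
	return 0;
}
```

The same pattern applies to read(2), with the extra case that a return of 0 means end of file.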

Re: [RFC] obsoleting /etc/mtab

2007-05-31 Thread Andreas Dilger
On May 31, 2007 17:11 -0700, H. Peter Anvin wrote: > NFS takes a binary option block anyway. However, that's the exception, > not the rule. There was recently a patch submitted to linux-fsdevel to change NFS to use text option parsing. Cheers, Andreas -- Andreas Dilger Princip

Re: [patch 2/2] i_version update - ext4 part

2007-05-29 Thread Andreas Dilger
n then knfsd can just access it directly. I don't think there is any API to access it from userspace. One option is to add a virtual EA like user.inode_version and have the kernel fill this in from i_version. Lustre will manipulate the ei->i_fs_version directly. Cheers, Andreas -- And

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-25 Thread Andreas Dilger
lock is all about in the end). Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC 4/5] inode reservation v0.1 (benchmark result)

2007-05-24 Thread Andreas Dilger
what the mapping turns out to be - the goal is to place inodes with a similar hash into nearby inodes, and this heuristic works relatively well for that. Once the given leaf block's inode range is full then new inodes can be allocated from a new window as it was done for the newly-created d

Re: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Andreas Dilger
your kernel has CONFIG_LBD enabled. The kernel doesn't check if the block layer can actually write to a block device > 2TB. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 0/5][TAKE2] fallocate system call

2007-05-14 Thread Andreas Dilger
cation for whatever reason (interrupt, out of space, etc) like a regular write(2) call. In this case the return type needs to also be an loff_t to match @len. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC][PATCH 13/14] ext3 whiteout support

2007-05-14 Thread Andreas Dilger
It isn't listed in the e2fsprogs repo. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 2/2] file capabilities: accomodate >32 bit capabilities

2007-05-10 Thread Andreas Dilger
On May 08, 2007 16:49 -0500, Serge E. Hallyn wrote: > Quoting Andreas Dilger ([EMAIL PROTECTED]): > > One of the important use cases I can see today is the ability to > > split the heavily-overloaded e.g. CAP_SYS_ADMIN into much more fine > > grained attributes. > >

Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

2007-05-09 Thread Andreas Dilger
l allocation/ >unallocation ? I would say yes. If glibc does the fallback fallocate via write() the mtime/ctime will be updated, so it makes sense to be consistent for both methods. Also, it just makes sense from the "this file was modified" point of view. Cheers, Andreas -- Andr
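To make the consistency argument concrete, here is a rough sketch of what a userspace fallback through the write path could look like. It is NOT glibc's actual implementation: `fallback_allocate` and `demo_fallback_size` are hypothetical names, and the sketch deliberately only zero-fills the part of the range that lies beyond the current end of file so that it can never overwrite existing data. Because it goes through pwrite(), the file's mtime/ctime get updated just as for any other write.

```c
#define _XOPEN_SOURCE 700      /* for pwrite() and mkstemp() */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical fallback: zero-fill the part of [offset, offset + len)
 * that lies past the current EOF. Returns 0 on success, -1 on error.
 */
int fallback_allocate(int fd, off_t offset, off_t len)
{
    static const char zeros[4096];      /* zero-initialized buffer */
    struct stat st;
    off_t pos, end;

    assert(offset >= 0 && len > 0);

    if (fstat(fd, &st) < 0)
        return -1;

    end = offset + len;
    /* Never touch bytes that already exist in the file. */
    pos = st.st_size > offset ? st.st_size : offset;

    while (pos < end) {
        size_t chunk = sizeof(zeros);
        ssize_t n;

        if ((off_t)chunk > end - pos)
            chunk = (size_t)(end - pos);

        n = pwrite(fd, zeros, chunk, pos);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* retry interrupted writes */
            return -1;
        }
        pos += n;                       /* handles short writes too */
    }
    return 0;
}

/* Usage sketch: grow an empty scratch file; returns its new size or -1. */
off_t demo_fallback_size(void)
{
    char path[] = "/tmp/fallbackXXXXXX";
    struct stat st;
    off_t size = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                       /* file goes away on close */

    if (fallback_allocate(fd, 0, 10000) == 0 && fstat(fd, &st) == 0)
        size = st.st_size;
    close(fd);
    return size;
}
```

Unlike a native preallocation, this fallback changes the visible file size and leaves any holes inside the existing file untouched.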

Re: [PATCH 2/2] file capabilities: accomodate >32 bit capabilities

2007-05-08 Thread Andreas Dilger
n old kernel is relatively harmless if the old kernel doesn't know what they are. It's like having a key to a door that you don't know where it is. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/5] ext4: fallocate support in ext4

2007-05-08 Thread Andreas Dilger
.g. indirect blocks). One of the design goals for sys_fallocate() was to allow FA_DELALLOC to deallocate unwritten extents in a safe manner. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/5] ext4: fallocate support in ext4

2007-05-07 Thread Andreas Dilger
On May 07, 2007 19:02 -0400, Jeff Garzik wrote: > Andreas Dilger wrote: > >Actually, this is a non-issue. The reason that it is handled for > >extent-only is that this is the only way to allocate space in the > >filesystem without doing the explicit zeroing. > > P

Re: [PATCH 4/5] ext4: fallocate support in ext4

2007-05-07 Thread Andreas Dilger
do manual zero-filling of the file in userspace. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [PATCH 4/5] ext4: fallocate support in ext4

2007-05-07 Thread Andreas Dilger
then truncate_mutex is not needed. > > + ret = ext4_ext_get_blocks(handle, inode, block, > > + max_blocks, &map_bh, > > + EXT4_CREATE_UNINITIALIZED_EXT, 0); > > + BUG_ON(!ret); >

Re: [PATCH 0/5] fallocate system call

2007-05-03 Thread Andreas Dilger
. I think I'd agree - it may be useful to allow preallocation beyond EOF for some kinds of applications (e.g. PVR preallocating live TV in 10 minute segments or something, but not knowing in advance how long the show will actually be recorded or the final encoded size). Cheers, Andreas -- A
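As an illustration of the PVR case, the sketch below reserves space past the current end of file without changing the visible file size. It uses fallocate(2) with FALLOC_FL_KEEP_SIZE as found in current Linux and glibc, which is not necessarily the interface that was under discussion in this thread. `reserve_beyond_eof` and `demo_reserve_size` are hypothetical helper names, and the 1 MiB segment size is an arbitrary assumption.

```c
#define _GNU_SOURCE            /* for fallocate() and FALLOC_FL_KEEP_SIZE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reserve `len` bytes starting at the current EOF while leaving
 * st_size unchanged. Returns 0 on success, -1 on error (for example
 * EOPNOTSUPP if the filesystem cannot preallocate).
 */
int reserve_beyond_eof(int fd, off_t len)
{
    struct stat st;

    assert(len > 0);

    if (fstat(fd, &st) < 0)
        return -1;

    return fallocate(fd, FALLOC_FL_KEEP_SIZE, st.st_size, len);
}

/*
 * Usage sketch: write 10 bytes, reserve one more segment, and return
 * the visible file size afterwards (or -1 on unexpected failure).
 * Filesystems without preallocation support are tolerated here.
 */
off_t demo_reserve_size(void)
{
    char path[] = "/tmp/reserveXXXXXX";
    struct stat st;
    off_t size = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                       /* file goes away on close */

    if (write(fd, "0123456789", 10) == 10 &&
        (reserve_beyond_eof(fd, 1 << 20) == 0 ||
         errno == EOPNOTSUPP || errno == ENOSYS) &&
        fstat(fd, &st) == 0)
        size = st.st_size;              /* still 10: EOF did not move */
    close(fd);
    return size;
}
```

A recorder could call `reserve_beyond_eof()` each time it gets close to the end of the reserved region, and truncate the file to its real length once the recording stops.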

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-05-03 Thread Andreas Dilger
tl(DMAPI_FORCE_READ); ioctl(FIEMAP)" if an application actually needs the data to be present instead of just returning mapping info that includes "UNMAPPED". Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-05-01 Thread Andreas Dilger
, and is much better than having version numbers for the interface. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-05-01 Thread Andreas Dilger
On May 01, 2007 14:22 +1000, David Chinner wrote: > On Mon, Apr 30, 2007 at 04:44:01PM -0600, Andreas Dilger wrote: > > Hmm, I'd thought "offline" would migrate to EXTENT_UNKNOWN, but I didn't > > I disagree - why would you want to indicate the state is unkno

Re: Ext2/3 block remapping tool

2007-05-01 Thread Andreas Dilger
On May 01, 2007 11:28 -0400, Theodore Tso wrote: > On Tue, May 01, 2007 at 12:01:42AM -0600, Andreas Dilger wrote: > > Except one other issue with online shrinking is that we need to move > > inodes on occasion and this poses a bunch of other problems over just > > rema
