Re: [RFC] ext3 freeze feature ver 0.2
Takashi Sato wrote: o Elevate XFS ioctl numbers (XFS_IOC_FREEZE and XFS_IOC_THAW) to the VFS As Andreas Dilger and Christoph Hellwig advised me, I have elevated them to include/linux/fs.h as below. #define FIFREEZE_IOWR('X', 119, int) #define FITHAW _IOWR('X', 120, int) The ioctl numbers used by XFS applications don't need to be changed. But my following ioctl for the freeze needs the parameter as the timeout period. So if XFS applications don't want the timeout feature as the current implementation, the parameter needs to be changed 1 (level?) into 0. So, existing xfs applications calling the xfs ioctl now will behave differently, right? We can only keep the same ioctl number if the calling semantics are the same. Keeping the same number but changing the semantics is harmful, IMHO -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in e2fsprogs.git (topics)
Theodore Tso wrote: On Sun, Feb 24, 2008 at 09:20:50PM -0700, Andreas Dilger wrote: On Feb 22, 2008 19:15 -0500, Theodore Ts'o wrote: So before the recent patch were we actually creating long symlinks in extents format? Or were we just setting the flag but still treating them as a block number? If it was the latter, I guess we can put in code into e2fsck to detect that case, and convert it back to a singleton block number. Eric informed me that the long symlinks were actually stored in extent mapped blocks. That is not harmful, because it can only be a single block and it will always fit into the inode. The other thing to note is that extent mapping is REQUIRED for 32-bit blocknumbers, so we may as well fix e2fsprogs to allow these symlinks to be handled normally. Well, at least some kernel versions (as of sometime just before 2.6.25, iirc) were storing the long symlink as a single block in i_block[0], despite EXTENTS_FL being set. Valerie noticed this, and I confirmed it, as it caused the mainline e2fsck extents support to core dump. Basically, what this means is that e2fsprogs can't trust EXTENTS_FL for long symlinks. Are you sure? This was her patch comment, from [PATCH] ext4: Don't set EXTENTS_FL flag for fast symlinks: For fast symbolic links, the file content is stored in the i_block[] array, which is not compatible with the new file extents format. e2fsck reports error on such files because EXTENTS_FL is set. Don't set the EXTENTS_FL flag when creating fast symlinks. so this was for *fast* symlinks, stored internally on top of the i_block array, right? In that case trying to read it as extents would certainly cause problems. But, for *long* symlinks I think they truly were stored in extent format, and as you say... But you do raise a good point that we need to support using the extents format in order to support blocks 2**32, so we can't just arbitrary convert all symlinks to the old-style direct block maps. ... so I think we really *should* be unconditionally storing *long* symlinks in extent format, on ext4... right? -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] e2fsprogs: error checking in blkid/devname.c
Theodore Tso wrote: This looks good, but I assume that the bug was caused by some race condition where if you try to call dm_task_get_info() while some other process is creating or removing a snapshot, dm_task_get_info() is returning some kind of EAGAIN, or some other Try again; we're busy error, right? If that is the case, can you try to find out what error is being returned? It may be the right thing to do is to check to see if we are getting a resource is locked; try again in a sec error message, and retry the dm_task_get_info(), instead of just returning a failure. well, dm_task_get_info just returns either 0 or 1; unless there is some other contextual piece of information to use, I don't know if we can differentiate between error types. I'll ask agk... Thanks!! - Ted - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] e2fsprogs: error checking in blkid/devname.c
Theodore Tso wrote: On Fri, Feb 22, 2008 at 09:02:56AM -0600, Eric Sandeen wrote: Theodore Tso wrote: This looks good, but I assume that the bug was caused by some race condition where if you try to call dm_task_get_info() while some other process is creating or removing a snapshot, dm_task_get_info() is returning some kind of EAGAIN, or some other Try again; we're busy error, right? If that is the case, can you try to find out what error is being returned? It may be the right thing to do is to check to see if we are getting a resource is locked; try again in a sec error message, and retry the dm_task_get_info(), instead of just returning a failure. well, dm_task_get_info just returns either 0 or 1; unless there is some other contextual piece of information to use, I don't know if we can differentiate between error types. I'll ask agk... Maybe the right thing is to try 3 times before giving up, maybe with a nanosleep in between, or some such? Hopefully agk can give us some hints about what's the right way to handle errors from all of the dm_task* calls. From a quick chat with agk, it sounds like outright failure is appropriate. Sounds like most of the calls fail for reasons like ENOMEM (but it might be nice if it returned that, eh?) -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] e2fsprogs: error checking in blkid/devname.c
Theodore Tso wrote: On Fri, Feb 22, 2008 at 10:16:53AM -0600, Eric Sandeen wrote: From a quick chat with agk, it sounds like outright failure is appropriate. Sounds like most of the calls fail for reasons like ENOMEM (but it might be nice if it returned that, eh?) So the question then is why is it that Phillip was able to seeing failures when he was creating and deleting snapshots? I don't mind having blkid return a failure, but it may not fix Phillip's scenario which he listed in BZ #433857; yeah, he won't have a core dump, which is good, but it might mean that some or all of the dm volumes disappear from the blkid results. Maybe a device-mapper bug is in order :) -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: e2fsck dies with error this should never happen!!!
Bas van Schaik wrote: In a few hours I will hit the road towards France for a holiday, from which I will return on the 3rd of March. I would be _really_ grateful if you could assist me solving this problem. Providing the compressed e2image would let ted or others look into the problem. Thanks, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: e2fsck dies with error this should never happen!!!
Bas van Schaik wrote: Bas van Schaik wrote: In a few hours I will hit the road towards France for a holiday, from which I will return on the 3rd of March. I would be _really_ grateful if you could assist me solving this problem. Providing the compressed e2image would let ted or others look into the problem. Should that be a raw image or a 'normal' one? Raw, I think; that way you can compress it as well. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in e2fsprogs.git (topics)
Theodore Tso wrote: On Wed, Feb 20, 2008 at 12:46:57PM -0600, Eric Sandeen wrote: Theodore Tso wrote: The big news here is that extents branch is sufficiently finished that I've promoted it to the next branch. Hey, that's great news. :) Out of curiosity; what is the plan for behavior when finding symlinks with the extents flag set? Last I checked e2fsprogs-interim, they got clobbered. Is that on the TODO before extents support goes live? Right now on e2fpsrogs 'next' the extents flag being set on symlinks, block/char devices (in general inodes for which ext2fs_has_valid_blocks returns NULL) are ignored. I need to make sure this does the right thing with long symlinks --- and I'd argue that given that long symlinks can only *ever* be one block long, it's pointless to use the extents format, since it's just too complicated, and I *think* that's what the kernel code is doing, but I need to check to be sure. After the patches I sent a couple days ago, which Aneesh then collapsed into the ext4_new_inode logic, that should be the case (for both types of symlinks) Eventually I'll add code to mainline to clear EXTENTS_FL from inodes that shouldn't have it, but the kernel patches need to hit mainline first. Ok, but my concern is what happens to those long symlinks when they really truly are in extents format. One option is to say hey it was ext4DEV, deal with it and nuke the symlink, but if possible, converting back to the proper format would be nice. ... So one question which Eric you'll want to consider is at what point you want to switch the FC users from e2fsprogs-interim to the bleeding-edge e2fsprogs mainline code, since eventually this is the branch that will have blk64_t support. Well, honestly I haven't even put -interim into rawhide yet, although I'm fairly close to that (or -next, whichever makes the most sense) In my testing a few days ago, e2fsck from -interim blew away the long symlinks, and I didn't want to foist that upon the folks using ext4dev today... One related question is whether we want to try to get support for full 64-bit physical block numbers into ext4. I think there were some rough draft patches floating about, but IIRC they didn't simultaneously support the 48-bit extent format. The e2fsprogs mainline implementation of extents makes it fairly easy to support new extents formats --- it's minimal code changes in a single file, lib/ext2fs/extents.c, and made sure all of the infrastructure was present to make this easy --- but I believe that trying to support multiple formats in the kernel given the current ext4 code would be fairly invasive. Given the timeline, I'm assuming that our current path is that we aren't planning on pushing full 64-bit physical block support into the extents format for what we hope will make it into the next round of enterprise kernels, but I thought I should throw out the question so we make the decision one way or the other, explicitly. I too had assumed that 48 bits would be it for now; it should be sufficient for a good while. I guess what I'd like to see if a usable ext4 out there in the near future, with stuff added on later as necessary; delalloc, flex_bg (if that doesn't make 2.6.25...) etc. Oh, speaking of all this - what do you think the criteria are for dropping the dev from ext4dev? How do we decide that it's cooked enough? :) Thanks, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] e2fsprogs: error checking in blkid/devname.c
This is for RH Bugzilla #433857: rpc.mountd segfaults due to uninitialized value in e2fsprogs devname.c https://bugzilla.redhat.com/show_bug.cgi?id=433857 which did some very helpful analysis provided a patch. This patch is based on that, but checks all the devicemapper calls, and does some goto error handling / unwrapping, in the same style as the device-mapper lib code itself. Compile-tested only, but seems fine to me. Thanks, -Eric Index: e2fsprogs-1.40.5/lib/blkid/devname.c === --- e2fsprogs-1.40.5.orig/lib/blkid/devname.c +++ e2fsprogs-1.40.5/lib/blkid/devname.c @@ -171,37 +171,42 @@ static int dm_device_has_dep(const dev_t struct dm_deps *deps; struct dm_info info; unsigned int i; + int ret = 0; task = dm_task_create(DM_DEVICE_DEPS); if (!task) - return 0; + goto out; - dm_task_set_name(task, name); - dm_task_run(task); - dm_task_get_info(task, info); + if (!dm_task_set_name(task, name)) + goto out; - if (!info.exists) { - dm_task_destroy(task); - return 0; - } + if (!dm_task_run(task)) + goto out; + + if (!dm_task_get_info(task, info)) + goto out; + + if (!info.exists) + goto out; deps = dm_task_get_deps(task); - if (!deps || deps-count == 0) { - dm_task_destroy(task); - return 0; - } + if (!deps || deps-count == 0) + goto out; for (i = 0; i deps-count; i++) { dev_t dep_dev = deps-device[i]; if (dev == dep_dev) { - dm_task_destroy(task); - return 1; + ret = 1; + goto out; } } - dm_task_destroy(task); - return 0; +out: + if (task) + dm_task_destroy(task); + + return ret; } static int dm_device_is_leaf(const dev_t dev) @@ -214,15 +219,16 @@ static int dm_device_is_leaf(const dev_t dm_log_init(dm_quiet_log); task = dm_task_create(DM_DEVICE_LIST); if (!task) - return 1; + goto out; + dm_log_init(0); - dm_task_run(task); + if (!dm_task_run(task)) + goto out; + names = dm_task_get_names(task); - if (!names || !names-dev) { - dm_task_destroy(task); - return 1; - } + if (!names || !names-dev) + goto out; n = 0; do { @@ -234,7 +240,9 @@ static int dm_device_is_leaf(const dev_t next = names-next; } while (next); - dm_task_destroy(task); +out: + if (task) + dm_task_destroy(task); return ret; } @@ -247,20 +255,25 @@ static dev_t dm_get_devno(const char *na task = dm_task_create(DM_DEVICE_INFO); if (!task) - return ret; + goto out; - dm_task_set_name(task, name); - dm_task_run(task); - dm_task_get_info(task, info); + if (!dm_task_set_name(task, name)) + goto out; - if (!info.exists) { - dm_task_destroy(task); - return ret; - } + if (!dm_task_run(task)) + goto out; + + if (!dm_task_get_info(task, info)) + goto out; + + if (!info.exists) + goto out; ret = makedev(info.major, info.minor); - dm_task_destroy(task); +out: + if (task) + dm_task_destroy(task); return ret; } @@ -275,15 +288,15 @@ static void dm_probe_all(blkid_cache cac dm_log_init(dm_quiet_log); task = dm_task_create(DM_DEVICE_LIST); if (!task) - return; + goto out; dm_log_init(0); - dm_task_run(task); + if (!dm_task_run(task)) + goto out; + names = dm_task_get_names(task); - if (!names || !names-dev) { - dm_task_destroy(task); - return; - } + if (!names || !names-dev) + goto out; n = 0; do { @@ -311,7 +324,9 @@ try_next: next = names-next; } while (next); - dm_task_destroy(task); +out: + if (task) + dm_task_destroy(task); } #endif /* HAVE_DEVMAPPER */ - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ext4: ext4_find_next_zero_bit needs an aligned address on some arch
Aneesh Kumar K.V wrote: ext4_find_next_zero_bit and ext4_find_next_bit needs a long aligned address on x8_64. Add mb_find_next_zero_bit and mb_find_next_bit and use them in the mballoc. Fix: https://bugzilla.redhat.com/show_bug.cgi?id=433286 Eric Sandeen debugged the problem and suggested the fix. Also, Ted Mingming: we probably should get this into 2.6.25; at least w/ the way the Fedora kernel is configured, ext4 is pretty much DOA w/o this change. I'm not sure why it started showing up now, but it is, in a big way. :) Thanks, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ext4: ext4_find_next_zero_bit needs an aligned address on some arch
Aneesh Kumar K.V wrote: ext4_find_next_zero_bit and ext4_find_next_bit needs a long aligned address on x8_64. Add mb_find_next_zero_bit and mb_find_next_bit and use them in the mballoc. Fix: https://bugzilla.redhat.com/show_bug.cgi?id=433286 Eric Sandeen debugged the problem and suggested the fix. Thanks for fixing this up, Aneesh. Thanks for getting rid of the magic looks-like-a-function-but-isn't #define, too. :) The testcase I reproduced this with (basically just rebuilding a kernel src.rpm on ext4) seems to pass with change. As we had it, the direct call of find_next_zero_bit seemed to almost do the right thing, except it scanned more bits than we asked for, up to 64 bits' worth, and so a) wandered off our page and b) returned an answer that was the size we asked it to scan in some cases. (It's unfortunate that the alignment code for this got taken out in the first place, only to put it back now, but I guess I didn't speak up then, either...) Really, I think the core bitmap functions could use some fixing, or at least some warnings if it's given an address it can't cope with properly. But for now Acked-by: Eric Sandeen [EMAIL PROTECTED] Thanks, -Eric Signed-off-by: Aneesh Kumar K.V [EMAIL PROTECTED] CC:Eric Sandeen [EMAIL PROTECTED] --- fs/ext4/mballoc.c | 62 ++-- 1 files changed, 40 insertions(+), 22 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 89772b9..ccddd21 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -627,21 +627,19 @@ static ext4_fsblk_t ext4_grp_offs_to_block(struct super_block *sb, return block; } +static inline void *mb_correct_addr_and_bit(int *bit, void *addr) +{ #if BITS_PER_LONG == 64 -#define mb_correct_addr_and_bit(bit, addr) \ -{\ - bit += ((unsigned long) addr 7UL) 3; \ - addr = (void *) ((unsigned long) addr ~7UL); \ -} + *bit += ((unsigned long) addr 7UL) 3; + addr = (void *) ((unsigned long) addr ~7UL); #elif BITS_PER_LONG == 32 -#define mb_correct_addr_and_bit(bit, addr) \ -{\ - bit += ((unsigned long) addr 3UL) 3; \ - addr = (void *) ((unsigned long) addr ~3UL); \ -} + *bit += ((unsigned long) addr 3UL) 3; + addr = (void *) ((unsigned long) addr ~3UL); #else #error how many bits you are?! #endif + return addr; +} static inline int mb_test_bit(int bit, void *addr) { @@ -649,34 +647,54 @@ static inline int mb_test_bit(int bit, void *addr) * ext4_test_bit on architecture like powerpc * needs unsigned long aligned address */ - mb_correct_addr_and_bit(bit, addr); + addr = mb_correct_addr_and_bit(bit, addr); return ext4_test_bit(bit, addr); } static inline void mb_set_bit(int bit, void *addr) { - mb_correct_addr_and_bit(bit, addr); + addr = mb_correct_addr_and_bit(bit, addr); ext4_set_bit(bit, addr); } static inline void mb_set_bit_atomic(spinlock_t *lock, int bit, void *addr) { - mb_correct_addr_and_bit(bit, addr); + addr = mb_correct_addr_and_bit(bit, addr); ext4_set_bit_atomic(lock, bit, addr); } static inline void mb_clear_bit(int bit, void *addr) { - mb_correct_addr_and_bit(bit, addr); + addr = mb_correct_addr_and_bit(bit, addr); ext4_clear_bit(bit, addr); } static inline void mb_clear_bit_atomic(spinlock_t *lock, int bit, void *addr) { - mb_correct_addr_and_bit(bit, addr); + addr = mb_correct_addr_and_bit(bit, addr); ext4_clear_bit_atomic(lock, bit, addr); } +static inline int mb_find_next_zero_bit(void *addr, int max, int start) +{ + int fix = 0; + addr = mb_correct_addr_and_bit(fix, addr); + max += fix; + start += fix; + + return ext4_find_next_zero_bit(addr, max, start) - fix; +} + +static inline int mb_find_next_bit(void *addr, int max, int start) +{ + int fix = 0; + addr = mb_correct_addr_and_bit(fix, addr); + max += fix; + start += fix; + + return ext4_find_next_bit(addr, max, start) - fix; +} + static void *mb_find_buddy(struct ext4_buddy *e4b, int order, int *max) { char *bb; @@ -946,12 +964,12 @@ static void ext4_mb_generate_buddy(struct super_block *sb, /* initialize buddy from bitmap which is aggregation * of on-disk bitmap and preallocations */ - i = ext4_find_next_zero_bit(bitmap, max, 0); + i = mb_find_next_zero_bit(bitmap, max, 0); grp-bb_first_free = i; while (i max) { fragments++; first = i; - i = ext4_find_next_bit(bitmap, max, i); + i = mb_find_next_bit(bitmap, max, i); len = i - first; free += len; if (len 1) @@ -959,7 +977,7 @@ static void
Re: What's cooking in e2fsprogs.git (topics)
Theodore Tso wrote: The big news here is that extents branch is sufficiently finished that I've promoted it to the next branch. Hey, that's great news. :) Out of curiosity; what is the plan for behavior when finding symlinks with the extents flag set? Last I checked e2fsprogs-interim, they got clobbered. Is that on the TODO before extents support goes live? Thanks, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [E2FSPROGS, RFC] New mke2fs types parsing
Theodore Ts'o wrote: The following patch is a work in progress, but I'm sending it out so folks can take a look at it and comment on the general approach. What this does is change how mke2fs -T works so it can take a comma separated list, so you can do things like this: mke2fs -T ext4,small,news Is there some hierarchy of what these options are and how they fit together? i.e. small news might go together (in bizarro world...) but small,large wouldn't make sense - nor would -T ext3,ext4. Or, if somebody stores mail news on the same fs, nad they say -T mail,news but the mail news types have conflicting directives... how will you define what an acceptable composite of subtypes will be? Thanks, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [E2FSPROGS, RFC] New mke2fs types parsing
Theodore Tso wrote: On Wed, Feb 20, 2008 at 12:51:21PM -0600, Eric Sandeen wrote: Theodore Ts'o wrote: The following patch is a work in progress, but I'm sending it out so folks can take a look at it and comment on the general approach. What this does is change how mke2fs -T works so it can take a comma separated list, so you can do things like this: mke2fs -T ext4,small,news Is there some hierarchy of what these options are and how they fit together? i.e. small news might go together (in bizarro world...) but small,large wouldn't make sense - nor would -T ext3,ext4. Or, if somebody stores mail news on the same fs, nad they say -T mail,news but the mail news types have conflicting directives... how will you define what an acceptable composite of subtypes will be? There are only three things which mke2fs will do, in my design: I apologize if I was too quick to ask a dumb question rather than atually read through the patch, but thank you for the text explanation :) -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH e2fsprogs] - detect LVM2 PVs in libblkid
Theodore Tso wrote: The patch works for me, but it would seem the right thing would be print the UUID without the hyphens, i.e.: guOQGdcOE3IafCm0190XkPZTy5fCEanQ instead of guOQGd-cOE3-IafC-m019-0XkP-ZTy5-fCEanQ But it already does print it without the hyphens. Or did you mean to say print the UUID *with* the hyphens? -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] clear extents flag on inodes created in ext4_mknod
Aneesh Kumar K.V wrote: On Tue, Feb 19, 2008 at 10:39:52AM -0600, Eric Sandeen wrote: Eric Sandeen wrote: e2fsck doesn't expect to find char, block, fifo, or socket files with the extent flag set, so clear that in ext4_mknod. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24/fs/ext4/namei.c === --- linux-2.6.24.orig/fs/ext4/namei.c +++ linux-2.6.24/fs/ext4/namei.c @@ -1766,6 +1766,7 @@ retry: #ifdef CONFIG_EXT4DEV_FS_XATTR inode-i_op = ext4_special_inode_operations; #endif + EXT4_I(inode)-i_flags = ~EXT4_EXTENTS_FL; err = ext4_add_nondir(handle, dentry, inode); } ext4_journal_stop(handle); now that I think about it; perhaps it would be better to put this logic into ext4_new_inode, rather than setting it by default and clearing it here... that way new_inode() has all the logic about whether or not a particular type of file is in extents format. How about enabling it only for directory and regular files rather than enabling it globally and then disabling the flag for symlink and device files ? I think that makes sense. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] clear extents flag on inodes created in ext4_mknod
Aneesh Kumar K.V wrote: How about enabling it only for directory and regular files rather than enabling it globally and then disabling the flag for symlink and device files ? how about something like below. There are two reason for not inheriting the i_flag from directory. Looks about right to me... see below diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index d2c2e55..f430939 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -794,7 +794,12 @@ got: ei-i_dir_start_lookup = 0; ei-i_disksize = 0; - ei-i_flags = EXT4_I(dir)-i_flags ~EXT4_INDEX_FL; + /* + * Don't inherit extent flag from directory. We set extent flag on + * newly created directory and file only if -o extent mount option is + * specfied nitpick: typo + */ + ei-i_flags = EXT4_I(dir)-i_flags ~ (EXT4_INDEX_FL|EXT4_EXTENTS_FL); if (S_ISLNK(mode)) ei-i_flags = ~(EXT4_IMMUTABLE_FL|EXT4_APPEND_FL); /* dirsync only applies to directories */ @@ -836,8 +841,10 @@ got: ext4_std_error(sb, err); goto fail_free_drop; } - if (test_opt(sb, EXTENTS) !S_ISLNK(mode)) { - EXT4_I(inode)-i_flags |= EXT4_EXTENTS_FL; + if (test_opt(sb, EXTENTS)) { + /* set extent flag only for diretory and file */ + if (S_ISDIR(mode) || S_ISREG(mode)) + EXT4_I(inode)-i_flags |= EXT4_EXTENTS_FL; ext4_ext_tree_init(handle, inode); err = ext4_update_incompat_feature(handle, sb, EXT4_FEATURE_INCOMPAT_EXTENTS); Hm you only set the extents flag for dir reg, but you call the next two functions for all files? I think the next 2 lines need to be part of the conditional, no? (and then there can be just one big test, indentation can come out a bit...?) But, the spirit of the patch looks right. -Eric diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 23902ba..da942bc 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1771,7 +1771,6 @@ retry: #ifdef CONFIG_EXT4DEV_FS_XATTR inode-i_op = ext4_special_inode_operations; #endif - EXT4_I(inode)-i_flags = ~EXT4_EXTENTS_FL; err = ext4_add_nondir(handle, dentry, inode); } ext4_journal_stop(handle); - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] clear extents flag on inodes created in ext4_mknod
Eric Sandeen wrote: e2fsck doesn't expect to find char, block, fifo, or socket files with the extent flag set, so clear that in ext4_mknod. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24/fs/ext4/namei.c === --- linux-2.6.24.orig/fs/ext4/namei.c +++ linux-2.6.24/fs/ext4/namei.c @@ -1766,6 +1766,7 @@ retry: #ifdef CONFIG_EXT4DEV_FS_XATTR inode-i_op = ext4_special_inode_operations; #endif + EXT4_I(inode)-i_flags = ~EXT4_EXTENTS_FL; err = ext4_add_nondir(handle, dentry, inode); } ext4_journal_stop(handle); now that I think about it; perhaps it would be better to put this logic into ext4_new_inode, rather than setting it by default and clearing it here... that way new_inode() has all the logic about whether or not a particular type of file is in extents format. Think it's worth changing? -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ext4: set EXT4_EXTENTS_FL only for directory and regular files
Aneesh Kumar K.V wrote: Also don't inherit EXT4_EXTENTS_FL from parent directory. If we have a directory with extent flag set and later mount the file system with -o noextents, the files created in that directory will also have extent flag set but we would not have called ext4_ext_tree_init for them. This will cause error later when we are verifying the extent header Also we don't want to set extent flag for symlinks, char, block, fifo or socket Minor typo in comments, diretory, but otherwise: Acked-by: Eric Sandeen [EMAIL PROTECTED] Signed-off-by: Aneesh Kumar K.V [EMAIL PROTECTED] --- fs/ext4/ialloc.c | 22 +++--- fs/ext4/namei.c |1 - 2 files changed, 15 insertions(+), 8 deletions(-) diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 028e601..78d1094 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -794,7 +794,12 @@ got: ei-i_dir_start_lookup = 0; ei-i_disksize = 0; - ei-i_flags = EXT4_I(dir)-i_flags ~EXT4_INDEX_FL; + /* + * Don't inherit extent flag from directory. We set extent flag on + * newly created directory and file only if -o extent mount option is + * specified + */ + ei-i_flags = EXT4_I(dir)-i_flags ~(EXT4_INDEX_FL|EXT4_EXTENTS_FL); if (S_ISLNK(mode)) ei-i_flags = ~(EXT4_IMMUTABLE_FL|EXT4_APPEND_FL); /* dirsync only applies to directories */ @@ -837,12 +842,15 @@ got: goto fail_free_drop; } if (test_opt(sb, EXTENTS)) { - EXT4_I(inode)-i_flags |= EXT4_EXTENTS_FL; - ext4_ext_tree_init(handle, inode); - err = ext4_update_incompat_feature(handle, sb, - EXT4_FEATURE_INCOMPAT_EXTENTS); - if (err) - goto fail; + /* set extent flag only for diretory and file */ + if (S_ISDIR(mode) || S_ISREG(mode)) { + EXT4_I(inode)-i_flags |= EXT4_EXTENTS_FL; + ext4_ext_tree_init(handle, inode); + err = ext4_update_incompat_feature(handle, sb, + EXT4_FEATURE_INCOMPAT_EXTENTS); + if (err) + goto fail; + } } ext4_debug(allocating inode %lu\n, inode-i_ino); diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 39d4af4..da942bc 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -2225,7 +2225,6 @@ retry: inode-i_op = ext4_fast_symlink_inode_operations; memcpy((char*)EXT4_I(inode)-i_data,symname,l); inode-i_size = l-1; - EXT4_I(inode)-i_flags = ~EXT4_EXTENTS_FL; } EXT4_I(inode)-i_disksize = inode-i_size; err = ext4_add_nondir(handle, dentry, inode); - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH e2fsprogs] print mostly-printable xattr strings in debugfs
Eric Sandeen wrote: Taking a cue from getfattr... if a string is mostly printable characters, go ahead print as a string, and escape what's left over. Ted, ping on this? Since we use selinux a lot, it'd be nice to have something readable in debugfs output. If there's anything you don't like about the patch I'll be happy to fix it up. Thanks, -Eric so we get: Extended attributes stored in inode body: selinux = system_u:object_r:root_t:s0\000 (28) instead of: Extended attributes stored in inode body: selinux = 73 79 73 74 65 6d 5f 75 3a 6f 62 6a 65 63 74 5f 72 3a 72 6f 6f 74 5f 74 3a 73 30 00 (28) (selinux includes the trailing null in len so it never prints as a string today) Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: e2fsprogs-1.40.5/debugfs/debugfs.c === --- e2fsprogs-1.40.5.orig/debugfs/debugfs.c +++ e2fsprogs-1.40.5/debugfs/debugfs.c @@ -434,19 +434,21 @@ static int list_blocks_proc(ext2_filsys static void dump_xattr_string(FILE *out, const char *str, int len) { - int printable = 1; + int printable = 0; int i; - /* check is string printable? */ + /* check: is string printable enough? */ for (i = 0; i len; i++) - if (!isprint(str[i])) { - printable = 0; - break; - } + if (isprint(str[i])) + printable++; + + if (printable = len*7/8) + printable = 0; for (i = 0; i len; i++) if (printable) - fprintf(out, %c, (unsigned char)str[i]); + fprintf(out, isprint(str[i]) ? %c : \\%03o, + (unsigned char)str[i]); else fprintf(out, %02x , (unsigned char)str[i]); } - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][7/28] e2fsprogs-extents.patch
Eric Sandeen wrote: So this trips up on things like sockets, fifos, and block char nodes. Just to demonstrate; doing this on ext4: mknod mnt/block b 1 1 mknod mnt/char c 1 1 mknod mnt/fifo p mksock mnt/sock mkdir -p mnt/verylongdir12345678901234567890/verylongdir12345678901234567890/verylongdir12345678901234567890 ln -s mnt/verylongdir12345678901234567890/verylongdir12345678901234567890/verylongdir12345678901234567890 mnt/longlink yields an unhappy fsck w/ e2fsprogs-interim: e2fsck 1.40.6 (09-Feb-2008) Pass 1: Checking inodes, blocks, and sizes Inode 12 has EXTENT_FL set, but is not in extents format Fix? no Inode 13 has EXTENT_FL set, but is not in extents format Fix? no Inode 14 has EXTENT_FL set, but is not in extents format Fix? no Inode 15 has EXTENT_FL set, but is not in extents format Fix? no Inode 17 has EXTENT_FL set, but is not in extents format Fix? no Pass 2: Checking directory structure Inode 12 (/block) is an illegal block device. Clear? no Inode 13 (/char) is an illegal character device. Clear? no Inode 14 (/fifo) is an illegal FIFO. Clear? no Inode 15 (/sock) is an illegal socket. Clear? no Symlink /longlink (inode #17) is invalid. Clear? no Pass 3: Checking directory connectivity Pass 4: Checking reference counts Pass 5: Checking group summary information -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][7/28] e2fsprogs-extents.patch
Andreas Dilger wrote: Support for checking 32-bit extents format inodes and the INCOMPAT_EXTENTS feature. Clear the high 16 bits of extents and index entries, since the extents patches did not do this explicitly. Some parts of this code need fixing for checking 32-bit block filesystems (when INCOMPAT_64BIT support is added), marked FIXME: 48-bit support. Verify extent headers in blocks, logical ordering of extents, logical ordering of indexes. Add explicit checking of {d,t,}indirect and index blocks to detect corruption instead of implicitly doing this by checking the referred blocks and only block-at-a-time correctness. This avoids incorrectly invoking the very lengthy duplicate blocks pass for bad indirect/index blocks. We may want to tune the threshold for how many errors make a bad indirect/index block. Add ability to split or remove extents in order to allow extent reallocation during the duplicate blocks pass. ... @@ -904,21 +910,75 @@ void e2fsck_pass1(e2fsck_t ctx) ctx-fs_sockets_count++; } else mark_inode_bad(ctx, ino); - if (inode-i_block[EXT2_IND_BLOCK]) - ctx-fs_ind_count++; - if (inode-i_block[EXT2_DIND_BLOCK]) - ctx-fs_dind_count++; - if (inode-i_block[EXT2_TIND_BLOCK]) - ctx-fs_tind_count++; - if (inode-i_block[EXT2_IND_BLOCK] || - inode-i_block[EXT2_DIND_BLOCK] || - inode-i_block[EXT2_TIND_BLOCK] || - inode-i_file_acl) { - inodes_to_process[process_inode_count].ino = ino; - inodes_to_process[process_inode_count].inode = *inode; - process_inode_count++; - } else - check_blocks(ctx, pctx, block_buf); + + eh = (struct ext3_extent_header *)inode-i_block; + if ((inode-i_flags EXT4_EXTENTS_FL)) { + if ((LINUX_S_ISREG(inode-i_mode) || + LINUX_S_ISDIR(inode-i_mode)) So this trips up on things like sockets, fifos, and block char nodes. Also this is unhappy: @@ -137,7 +141,7 @@ int e2fsck_pass1_check_device_inode(ext2 * If the index flag is set, then this is a bogus * device/fifo/socket */ - if (inode-i_flags EXT2_INDEX_FL) + if (inode-i_flags (EXT2_INDEX_FL | EXT4_EXTENTS_FL)) return 0; Do we really care if these have the extents flag set? IOW should we make sure the kernel doesn't set the flag, or should we make e2fsck not care... There are enough checks in e2fsck to show the intent was that these files should not have the extents flag set, but I'm not sure why it matters enough that the kernel needs to run around being sure to clear it Or... (rambling on now) it seems odd to me that zero-length files have the extents flag set at all; should we only set extents when we actually get a block allocated to the file? That would also take care of this from the kernel side I think. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][7/28] e2fsprogs-extents.patch
Theodore Tso wrote: On Mon, Feb 18, 2008 at 11:56:53AM -0600, Eric Sandeen wrote: So this trips up on things like sockets, fifos, and block char nodes. Also this is unhappy: @@ -137,7 +141,7 @@ int e2fsck_pass1_check_device_inode(ext2 * If the index flag is set, then this is a bogus * device/fifo/socket */ - if (inode-i_flags EXT2_INDEX_FL) + if (inode-i_flags (EXT2_INDEX_FL | EXT4_EXTENTS_FL)) return 0; Do we really care if these have the extents flag set? IOW should we make sure the kernel doesn't set the flag, or should we make e2fsck not care... Sigh I think we need to get kernel patches into mainline ASAP not to set the EXTENTS_FL You mean on devices/fifos/sockets ? Ok. But today, with 2.6.25-rc1 and e2fsprogs-interim, long (non-fast) symlinks get clobbered by e2fsck, because: Pass 1: Checking inodes, blocks, and sizes Inode 12 has EXTENT_FL set, but is not in extents format Fix? yes Inode 12 has illegal block(s). Clear? yes Illegal block #0 (127754) in inode 12. CLEARED. Inode 12 is too big. Truncate? yes Block #1 (4) causes symlink to be too big. CLEARED. Block #4 (1) causes symlink to be too big. CLEARED. Block #5 (4772) causes symlink to be too big. CLEARED. Inode 12, i_blocks is 2, should be 0. Fix? yes Pass 2: Checking directory structure Symlink /longlink (inode #12) is invalid. Clear? yes Pass 3: Checking directory connectivity Pass 4: Checking reference counts Pass 5: Checking group summary information Block bitmap differences: -4772 Fix? yes Free blocks count wrong for group #0 (3420, counted=3421). Fix? yes Free blocks count wrong (26192, counted=26193). Fix? yes and *poof* it's gone. That one concerns me more... This *should* be in extents format, right, even though it's limited to one block... and at least for now, e2fsck needs to accept (and not complain or core dump) if EXTENTS_FL is set for files where ext2fs_inode_has_valid_blocks() returns false well, if any filetypes are not supposed to have the extents flag set, and they're zero-length, I'd say go ahead clear it, and even complain if you like - it's the design intent after all - I wouldn't worry about the noise at this stage. FWIW, I haven't seen a core dump. :) -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] clear extents flag on inodes created in ext4_mknod
e2fsck doesn't expect to find char, block, fifo, or socket files with the extent flag set, so clear that in ext4_mknod. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24/fs/ext4/namei.c === --- linux-2.6.24.orig/fs/ext4/namei.c +++ linux-2.6.24/fs/ext4/namei.c @@ -1766,6 +1766,7 @@ retry: #ifdef CONFIG_EXT4DEV_FS_XATTR inode-i_op = ext4_special_inode_operations; #endif + EXT4_I(inode)-i_flags = ~EXT4_EXTENTS_FL; err = ext4_add_nondir(handle, dentry, inode); } ext4_journal_stop(handle); - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] don't set extents flag for _any_ symlinks
As symlinks are limited to a single block anyway, and e2fsck doesn't expect to find it set, don't set the extents flag on any type of symlinks at all: fast/in-inode, or the external-block flavor. There are a lot of filesystems out there by now w/ exent-style symlink blocks though, so e2fsck should probably be able to repair that at some point... Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24/fs/ext4/namei.c === --- linux-2.6.24.orig/fs/ext4/namei.c +++ linux-2.6.24/fs/ext4/namei.c @@ -2223,7 +2226,6 @@ retry: inode-i_op = ext4_fast_symlink_inode_operations; memcpy((char*)EXT4_I(inode)-i_data,symname,l); inode-i_size = l-1; - EXT4_I(inode)-i_flags = ~EXT4_EXTENTS_FL; } EXT4_I(inode)-i_disksize = inode-i_size; err = ext4_add_nondir(handle, dentry, inode); Index: linux-2.6.24/fs/ext4/ialloc.c === --- linux-2.6.24.orig/fs/ext4/ialloc.c +++ linux-2.6.24/fs/ext4/ialloc.c @@ -744,7 +744,7 @@ got: ext4_std_error(sb, err); goto fail_free_drop; } - if (test_opt(sb, EXTENTS)) { + if (test_opt(sb, EXTENTS) !S_ISLNK(mode)) { EXT4_I(inode)-i_flags |= EXT4_EXTENTS_FL; ext4_ext_tree_init(handle, inode); err = ext4_update_incompat_feature(handle, sb, - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RESEND] ext4: Fix kernel BUG at fs/ext4/mballoc.c:910!
Valerie Clement wrote: Eric Sandeen wrote: Valerie Clement wrote: Fix kernel BUG at fs/ext4/mballoc.c:910! From: Valerie Clement [EMAIL PROTECTED] With the flex_bg feature enabled, a large file creation oopses the kernel. Valerie, what's the specific testcase for this? Thanks, -Eric I just did a simple dd, dd if=/dev/zero of=/mnt/test/foo bs=1M count=8k on an ext4 filesystem with the uninit_groups, lazy_bg and flex_bg features enabled and the default mount options. Ok, thanks Valerie. It's just always nice to know exactly how to reproduce a problem that a patch fixes, IMHO. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in e2fsprogs.git (topics)
Theodore Tso wrote: I'ts been a while since I've pushed one of these out. This updates pu against the latest next branch, so we can make sure all of the changes that had been folded into maint is included on the pu branch. It also includes more work on the tt/extents patches. Tune2fs and e2image are now functional with extents, and e2fsck has more extents checking added. Still a bit more work to be done there, and I also have to add code to deal with symlinks that have the EXTENTS_FL flag set. I've dropped the js/flex-bg branch since it had conflicts with the on-disk bg flags, and it had conflicts with the exsting patches that caused me concern. Jose, once you regenerate the patch and deal with comments, I'll add it back in. My TODO list: finish e2fsck extents support more flexible mke2fs.conf handling EXT4 -- allow mke2fs 8TB without -f ext3 should be fine as well, right? And I think Josef sent a patch... -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] allocate struct ext4_allocation_context from a kmem cache to save stack space
Mingming Cao wrote: printk could be removed...so as long as it builds fine. I had looked at the history yesterday and find this fix http://lists.openwall.net/linux-ext4/2007/10/10/2 so I was under impression that the ifdefs was added to fix compile issue. I did not look more closely. Maybe that's not a issue any more. I guess I should look into it. For now let's just drop the #ifdef removal, it's not related anyway. Would you like me to send a fresh patch? Thanks, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BUG_ON at mballoc.c:3752
Eric Sesterhenn wrote: * Eric Sandeen ([EMAIL PROTECTED]) wrote: Eric Sesterhenn wrote: Eric , can you run the test with below patch and see if this makes any difference ?. I know we are not fixing any bugs in the below patch. ok, i checked out the old version again and applied both patches, the BUG is gone (no surprise) In the case where it would have hit the BUG i now get the following message: [ 740.400288] Aborting journal on device loop0. No message before that about *why* it aborted? assumed the stuff was from the previous runs [ 735.424574] EXT4-fs error (device loop0): ext4_mb_generate_buddy: EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd [ 740.400288] Aborting journal on device loop0. There it is, thanks. Just making sure :) -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
call for brave ext4 testers in F9 Alpha, with caveats
(I sent this to fedora-devel, but of course anyone here who would like to give it a whirl should also do so! So far there haven't been any bad reports. :) See the wiki link at the bottom for a couple of updates, caveats gotchas.) .. Hi Gang - The ext4 filesystem is a feature slated for F9, and we (rh, ibm, bull, clusterfs, and others) are working feverishly to get it ready for a prime-time F9 fs option. I'd like to get a bit more exposure if possible, so if you're feeling like living on the cutting edge of filesystems, read on: Rawhide and the F9 alpha can install onto an ext4dev filesystem root; first you need to tell (a.k.a. lie to) the installer with iamanext4developer on the boot commandline. This is akin to the jfs or reiserfs options for those filesystems, but a higher hurdle (more characters to type!). Then you can do custom partitioning, and select ext4dev for any filesystem (except /boot - no grub support (yet)). The install should proceed Just Fine(tm). If not, let me know. And now for the (known) caveats: 1) Due to bug 429857: Root inode of ext4dev root filesystem does not get selinux label - booting with selinux enabled enforcing will probably fail. Boot with enforcing=0, and use restorecon or chcon on / to (hopefully) properly update the root inode's selinux attrs. I have a fix for this bug, so soon, kernel updates will resolve this and allow it to be properly set (and retained). Please do test w/ selinux enabled though, as the new larger inodes and in-inode xattrs could use airtime. 2) There is no readily-available e2fsprogs which can repair your shiny new ext4dev filesystem. If something goes badly I'll help out because we need to know what went wrong, but so far there is no released upstream e2fsprogs which can handle the new ext4 features. So please consider anything you put on ext4dev for now to be disposable, just to be on the safe side. extents-capable e2fsprogs should be available Real Soon Now. 3) misc stuff - I've not yet tested ext4 over an encrypted block device, or even over an lvm volume. There may be some stack issues on x86 boxes still, I'm working on slimming that down. I hope that more real-world use will shake out any remaining problems. ... and I suppose I should put this into a wiki page: http://fedoraproject.org/wiki/FedoraExt4 We can keep it updated with any further issues or resolutions. Thanks! -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH, E2FSPROGS] blkid: Automatically chose between ext4 and ext4dev as appropriate
Theodore Ts'o wrote: I plan to release this in an upcoming 1.40.6 release, along with code that allows tune2fs -E test_fs to work on ext4 filesystems. The idea is that if I can get this into community distro's, it will make it easier for them to experiment with ext4/ext4dev, with the appropriate forward compatibility when sometime (hopefully 2.6.26 or 2.6.27?) we rename ext4dev to ext4. Any comments before I push this change out to the maint branch? Seems logically correct to me... although it also seems like this has gotten fairly complex by now, what with blkid rummaging around in modules.dep, /proc files, etc... -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BUG_ON at mballoc.c:3752
Eric Sesterhenn wrote: Eric , can you run the test with below patch and see if this makes any difference ?. I know we are not fixing any bugs in the below patch. ok, i checked out the old version again and applied both patches, the BUG is gone (no surprise) In the case where it would have hit the BUG i now get the following message: [ 740.400288] Aborting journal on device loop0. No message before that about *why* it aborted? [ 740.405032] ext4_abort called. [ 740.405097] EXT4-fs error (device loop0): ext4_journal_start_sb: Detected aborted journal [ 740.405178] Remounting filesystem read-only [ 740.410974] EXT4-fs error (device loop0) in ext4_ordered_write_end: IO failure [ 740.414300] EXT4-fs error (device loop0) in ext4_reserve_inode_write: Journal has aborted [ 740.414422] pa cba56990: logic 16, phys. 1953, len 16 [ 740.414447] EXT4-fs error (device loop0): ext4_mb_release_inode_pa: free 4, pa_free 3 [ 740.414548] EXT4-fs error (device loop0) in ext4_mb_free_blocks: Journal has aborted Greetings, Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] allocate struct ext4_allocation_context from a kmem cache to save stack space
Mingming Cao wrote: Do you intend to remove the #ifdef CONFIG_PROC_FS, or it's a accident? I think we need keep that to allow ext4 build without procfs configured. Other than this, the patch looks fine to me.:) oh, it kind of snuck in. It actually should still build, as remove_proc_entry is a no-op function w/o the config option. Feel free to leave it in if you like though. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] allocate struct ext4_allocation_context from a kmem cache to save stack space
Mingming Cao wrote: On Thu, 2008-02-07 at 19:06 -0600, Eric Sandeen wrote: Mingming Cao wrote: Do you intend to remove the #ifdef CONFIG_PROC_FS, or it's a accident? I think we need keep that to allow ext4 build without procfs configured. Other than this, the patch looks fine to me.:) oh, it kind of snuck in. It actually should still build, as remove_proc_entry is a no-op function w/o the config option. Oh, I mean the proc_mkdir(EXT4_ROOT, proc_root_fs) will complain w/o CONFIG_PROC_FS configured. Mingming it'll build: static inline struct proc_dir_entry *proc_mkdir(const char *name, struct proc_dir_entry *parent) {return NULL;} yes, it'll issue a printk though. *shrug* I like fewer #ifdefs better than more, but doesn't matter much to me. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BUG_ON at mballoc.c:3752
Aneesh Kumar K.V wrote: On Thu, Feb 07, 2008 at 05:30:48PM -0800, Mingming Cao wrote: On Thu, 2008-02-07 at 18:25 +0530, Aneesh Kumar K.V wrote: ext4: Don't panic in case of corrupt bitmap From: Aneesh Kumar K.V [EMAIL PROTECTED] Multiblock allocator was calling BUG_ON in many case if the free and used blocks count obtained looking at the bitmap is different from what the allocator internally accounted for. Use ext4_error in such case and don't panic the system. There seems a lot of BUG_ON() and BUG() in mballoc code, other than this case. Should it always panic the whole system in those cases? Perhaps replacing with ext4_error() or some cases just WARN_ON is enough. I had looked at the BUG_ON in mballoc code and found them very useful while stabilizing the mballoc code. It helped to catch wrong usage of functions. Most of the BUG_ON are there to make sure we call the API with the lock held or the API should not return value greater than 'x' Should not call the function with a particular argument as NULL ...etc kind of thing. So i would suggest to keep them as such. Yep - as long as the BUG_ONs are for programming errors, and not things like memory allocation failures or corrupted disks, I think it's ok. Not that I've audited them all. :) I asked about this early on, and got that answer... I'm reasonably satisfied with it. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] allocate struct ext4_allocation_context from a kmem cache to save stack space
struct ext4_allocation_context is rather large, and this bloats the stack of many functions which use it. Allocating it from a named slab cache will alleviate this. For example, with this change (on top of the noinline patch sent earlier): -ext4_mb_new_blocks 200 +ext4_mb_new_blocks 40 -ext4_mb_free_blocks344 +ext4_mb_free_blocks168 -ext4_mb_release_inode_pa 216 +ext4_mb_release_inode_pa40 -ext4_mb_release_group_pa 192 +ext4_mb_release_group_pa24 Most of these stack-allocated structs are actually used only for mballoc history; and in those cases often a smaller struct would do. So changing that may be another way around it, at least for those functions, if preferred. For now, in those cases where the ac is only for history, an allocation failure simply skips the history recording, and does not cause any other failures. Comments? Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc6-mm1/fs/ext4/mballoc.c === --- linux-2.6.24-rc6-mm1.orig/fs/ext4/mballoc.c +++ linux-2.6.24-rc6-mm1/fs/ext4/mballoc.c @@ -419,6 +419,7 @@ #define MB_DEFAULT_GROUP_PREALLOC 512 static struct kmem_cache *ext4_pspace_cachep; +static struct kmem_cache *ext4_ac_cachep; #ifdef EXT4_BB_MAX_BLOCKS #undef EXT4_BB_MAX_BLOCKS @@ -2958,11 +2959,18 @@ int __init init_ext4_mballoc(void) if (ext4_pspace_cachep == NULL) return -ENOMEM; -#ifdef CONFIG_PROC_FS + ext4_ac_cachep = + kmem_cache_create(ext4_alloc_context, +sizeof(struct ext4_allocation_context), +0, SLAB_RECLAIM_ACCOUNT, NULL); + if (ext4_ac_cachep == NULL) { + kmem_cache_destroy(ext4_pspace_cachep); + return -ENOMEM; + } + proc_root_ext4 = proc_mkdir(EXT4_ROOT, proc_root_fs); if (proc_root_ext4 == NULL) printk(KERN_ERR EXT4-fs: Unable to create %s\n, EXT4_ROOT); -#endif return 0; } @@ -2971,9 +2979,8 @@ void exit_ext4_mballoc(void) { /* XXX: synchronize_rcu(); */ kmem_cache_destroy(ext4_pspace_cachep); -#ifdef CONFIG_PROC_FS + kmem_cache_destroy(ext4_ac_cachep); remove_proc_entry(EXT4_ROOT, proc_root_fs); -#endif } @@ -3698,7 +3705,7 @@ static noinline int ext4_mb_release_inod struct buffer_head *bitmap_bh, struct ext4_prealloc_space *pa) { - struct ext4_allocation_context ac; + struct ext4_allocation_context *ac; struct super_block *sb = e4b-bd_sb; struct ext4_sb_info *sbi = EXT4_SB(sb); unsigned long end; @@ -3714,9 +3721,13 @@ static noinline int ext4_mb_release_inod BUG_ON(group != e4b-bd_group pa-pa_len != 0); end = bit + pa-pa_len; - ac.ac_sb = sb; - ac.ac_inode = pa-pa_inode; - ac.ac_op = EXT4_MB_HISTORY_DISCARD; + ac = kmem_cache_alloc(ext4_ac_cachep, GFP_NOFS); + + if (ac) { + ac-ac_sb = sb; + ac-ac_inode = pa-pa_inode; + ac-ac_op = EXT4_MB_HISTORY_DISCARD; + } while (bit end) { bit = ext4_find_next_zero_bit(bitmap_bh-b_data, end, bit); @@ -3732,11 +3743,13 @@ static noinline int ext4_mb_release_inod (unsigned) group); free += next - bit; - ac.ac_b_ex.fe_group = group; - ac.ac_b_ex.fe_start = bit; - ac.ac_b_ex.fe_len = next - bit; - ac.ac_b_ex.fe_logical = 0; - ext4_mb_store_history(ac); + if (ac) { + ac-ac_b_ex.fe_group = group; + ac-ac_b_ex.fe_start = bit; + ac-ac_b_ex.fe_len = next - bit; + ac-ac_b_ex.fe_logical = 0; + ext4_mb_store_history(ac); + } mb_free_blocks(pa-pa_inode, e4b, bit, next - bit); bit = next + 1; @@ -3750,6 +3763,8 @@ static noinline int ext4_mb_release_inod } BUG_ON(free != pa-pa_free); atomic_add(free, sbi-s_mb_discarded); + if (ac) + kmem_cache_free(ext4_ac_cachep, ac); return err; } @@ -3757,12 +3772,15 @@ static noinline int ext4_mb_release_inod static noinline int ext4_mb_release_group_pa(struct ext4_buddy *e4b, struct ext4_prealloc_space *pa) { - struct ext4_allocation_context ac; + struct ext4_allocation_context *ac; struct super_block *sb = e4b-bd_sb; ext4_group_t group; ext4_grpblk_t bit; - ac.ac_op = EXT4_MB_HISTORY_DISCARD; + ac = kmem_cache_alloc(ext4_ac_cachep, GFP_NOFS); + + if (ac) + ac-ac_op = EXT4_MB_HISTORY_DISCARD; BUG_ON(pa-pa_deleted == 0
[PATCH] reduce mballoc stack usage with noinline
mballoc.c is a whole lot of static functions, which gcc seems to really like to inline. With the changes below, on x86, I can at least get from: 432 ext4_mb_new_blocks 240 ext4_mb_free_blocks 208 ext4_mb_discard_group_preallocations 188 ext4_mb_seq_groups_show 164 ext4_mb_init_cache 152 ext4_mb_release_inode_pa 136 ext4_mb_seq_history_show ... to 220 ext4_mb_free_blocks 188 ext4_mb_seq_groups_show 176 ext4_mb_regular_allocator 164 ext4_mb_init_cache 156 ext4_mb_new_blocks 152 ext4_mb_release_inode_pa 136 ext4_mb_seq_history_show 124 ext4_mb_release_group_pa ... which still has some big functions in there, but not 432 bytes! I wonder if a zone would make sense for struct ext4_allocation_context, it's 108 bytes by itself. I haven't honestly done any investigation of any performance impact of the noinlines; FWIW xfs just took control of inlining, and marked almost every non-trivial function as noinline, (although the xfs situation was a little more dire) and I don't think they had any noticeable performance impact. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-git9/fs/ext4/mballoc.c === --- linux-2.6.24-git9.orig/fs/ext4/mballoc.c +++ linux-2.6.24-git9/fs/ext4/mballoc.c @@ -1146,7 +1146,7 @@ out: return err; } -static int ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group, +static noinline int ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group, struct ext4_buddy *e4b) { struct ext4_sb_info *sbi = EXT4_SB(sb); @@ -1926,7 +1926,7 @@ static int ext4_mb_good_group(struct ext return 0; } -static int ext4_mb_regular_allocator(struct ext4_allocation_context *ac) +static noinline int ext4_mb_regular_allocator(struct ext4_allocation_context *ac) { ext4_group_t group; ext4_group_t i; @@ -2433,7 +2433,7 @@ static void ext4_mb_history_init(struct /* if we can't allocate history, then we simple won't use it */ } -static void ext4_mb_store_history(struct ext4_allocation_context *ac) +static noinline void ext4_mb_store_history(struct ext4_allocation_context *ac) { struct ext4_sb_info *sbi = EXT4_SB(ac-ac_sb); struct ext4_mb_history h; @@ -2769,7 +2769,7 @@ int ext4_mb_release(struct super_block * return 0; } -static void ext4_mb_free_committed_blocks(struct super_block *sb) +static noinline void ext4_mb_free_committed_blocks(struct super_block *sb) { struct ext4_sb_info *sbi = EXT4_SB(sb); int err; @@ -2982,7 +2982,7 @@ void exit_ext4_mballoc(void) * Check quota and mark choosed space (ac-ac_b_ex) non-free in bitmaps * Returns 0 if success or error code */ -static int ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac, +static noinline int ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac, handle_t *handle) { struct buffer_head *bitmap_bh = NULL; @@ -3099,7 +3099,7 @@ static void ext4_mb_normalize_group_requ * Normalization means making request better in terms of * size and alignment */ -static void ext4_mb_normalize_request(struct ext4_allocation_context *ac, +static noinline void ext4_mb_normalize_request(struct ext4_allocation_context *ac, struct ext4_allocation_request *ar) { int bsbits, max; @@ -3368,7 +3368,7 @@ static void ext4_mb_use_group_pa(struct /* * search goal blocks in preallocated space */ -static int ext4_mb_use_preallocated(struct ext4_allocation_context *ac) +static noinline int ext4_mb_use_preallocated(struct ext4_allocation_context *ac) { struct ext4_inode_info *ei = EXT4_I(ac-ac_inode); struct ext4_locality_group *lg; @@ -3535,7 +3535,7 @@ static void ext4_mb_put_pa(struct ext4_a /* * creates new preallocated space for given inode */ -static int ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) +static noinline int ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) { struct super_block *sb = ac-ac_sb; struct ext4_prealloc_space *pa; @@ -3622,7 +3622,7 @@ static int ext4_mb_new_inode_pa(struct e /* * creates new preallocated space for locality group inodes belongs to */ -static int ext4_mb_new_group_pa(struct ext4_allocation_context *ac) +static noinline int ext4_mb_new_group_pa(struct ext4_allocation_context *ac) { struct super_block *sb = ac-ac_sb; struct ext4_locality_group *lg; @@ -3695,7 +3695,7 @@ static int ext4_mb_new_preallocation(str * the caller MUST hold group/inode locks. * TODO: optimize the case when there are no in-core structures yet */ -static int ext4_mb_release_inode_pa(struct ext4_buddy *e4b, +static noinline int ext4_mb_release_inode_pa(struct ext4_buddy *e4b, struct buffer_head *bitmap_bh, struct ext4_prealloc_space *pa) { @@ -3755,7 +3755,7 @@ static int
[PATCH] make mballoc history a config option
mballoc history is likely a great debugging tool, but it seems a bit heavyweight. If I make it a config option and turn it off, things lighten up considerably, from: 220 ext4_mb_free_blocks 188 ext4_mb_seq_groups_show 176 ext4_mb_regular_allocator 164 ext4_mb_init_cache 156 ext4_mb_new_blocks 152 ext4_mb_release_inode_pa 136 ext4_mb_seq_history_show 124 ext4_mb_release_group_pa 124 ext4_mb_init ... to: 176 ext4_mb_regular_allocator 164 ext4_mb_init_cache 156 ext4_mb_new_blocks 124 ext4_mb_init 112 ext4_mb_free_blocks ... 44 ext4_mb_release_inode_pa ... 16 ext4_mb_release_group_pa It's a bit shocking how much this matters to the size of ext4_mb_release_inode_pa etc; I've not yet found the reason why. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-git9/fs/ext4/mballoc.c === --- linux-2.6.24-git9.orig/fs/ext4/mballoc.c +++ linux-2.6.24-git9/fs/ext4/mballoc.c @@ -367,10 +367,9 @@ #endif /* - * with EXT4_MB_HISTORY mballoc stores last N allocations in memory - * and you can monitor it in /proc/fs/ext4/dev/mb_history + * with CONFIG_EXT4DEV_FS_MBHISTORY mballoc stores last N allocations in + * memory and you can monitor it in /proc/fs/ext4/dev/mb_history */ -#define EXT4_MB_HISTORY #define EXT4_MB_HISTORY_ALLOC 1 /* allocation */ #define EXT4_MB_HISTORY_PREALLOC 2 /* preallocated blocks used */ #define EXT4_MB_HISTORY_DISCARD4 /* preallocation discarded */ @@ -562,7 +561,7 @@ struct ext4_buddy { #define EXT4_MB_BITMAP(e4b)((e4b)-bd_bitmap) #define EXT4_MB_BUDDY(e4b) ((e4b)-bd_buddy) -#ifndef EXT4_MB_HISTORY +#ifndef CONFIG_EXT4DEV_FS_MBHISTORY static inline void ext4_mb_store_history(struct ext4_allocation_context *ac) { return; @@ -2093,7 +2092,7 @@ out: return err; } -#ifdef EXT4_MB_HISTORY +#ifdef CONFIG_EXT4DEV_FS_MBHISTORY struct ext4_mb_proc_session { struct ext4_mb_history *history; struct super_block *sb; Index: linux-2.6.24-rc6-mm1/fs/Kconfig === --- linux-2.6.24-rc6-mm1.orig/fs/Kconfig +++ linux-2.6.24-rc6-mm1/fs/Kconfig @@ -202,6 +202,13 @@ config EXT4DEV_FS_SECURITY If you are not using a security module that requires using extended attributes for file security labels, say N. +config EXT4DEV_FS_MBHISTORY + bool Ext4dev mballoc allocator history + help + Enabling this option make it possible to monitor mballoc + allocator history via /proc/fs/ext4/dev/mb_history. + Disabling this option will save memory and stack space. + config JBD tristate help - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] make mballoc history a config option
Eric Sandeen wrote: It's a bit shocking how much this matters to the size of ext4_mb_release_inode_pa etc; I've not yet found the reason why. Oh, well, duh - its' the big 108 byte allocation context on the stack and if it's not used it gets optimized away. I think it's worth looking at allocating that one. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bugme-new] [Bug 9855] New: ext3 ACL corruption
Andreas Dilger wrote: On Jan 30, 2008 14:49 -0800, Andrew Morton wrote: Problem Description: Inode size: 256 This is a bit interesting, since it isn't very common to use large inodes. I suspect this relates to the problem. I think it is somewhat common on samba servers, though. And it's the new default in the latest e2fsprogs... maybe something will shake out in the F9 development cycle. These are production Samba servers making fairly extensive use of file and directory ACLs. Thus far, I've only noticed the corruptions when it came time to upgrade to a new kernel and reboot (and the boot scripts then run fsck). Note that I've never noticed any issues at runtime because of this - only when I later realised that ACLs had been removed from random files and/or directories. I think I will implement some scripts to unmount and run fsck nightly from cron, so I can at least detect the corruption a little earlier. If there is some more helpful debugging output I can provide, please let me know. There is just such a script in the thread forced fsck (again?). Since you are using LVs for the filesystem. Which is on the ext3-users list btw... If you are able to reproduce this, could you please dump the inode and EA block before fixing the problem. Do you need instructions on doing that? -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RFC: file magic for ext4
This seems to work fine; any comments, either about the logic, or the text descriptions, or the various options I've shown/not shown? To test, copy your magic file somewhere local, replace the ext2/3 section with the following, and do something like: # file -C -m magic; file -m ./magic -s /dev/sda1 -Eric # ext2/ext3 filesystems - Andreas Dilger [EMAIL PROTECTED] 0x438 leshort 0xEF53 Linux 0x44c lelong x rev %d 0x43e leshort x \b.%d # No journal? ext2 0x45c lelong ^0x004 ext2 filesystem data 0x43a leshort ^0x001 (mounted or unclean) # Has a journal? ext3 or ext4 0x45c lelong 0x004 # and small INCOMPAT? 0x460 lelong 0x040 # and small RO_COMPAT? 0x464 lelong 0x008 ext3 filesystem data # else large RO_COMPAT? 0x464 lelong 0x007 ext4 filesystem data # else large INCOMPAT? 0x460 lelong 0x03f ext4 filesystem data # General flags for any ext* fs 0x460 lelong 0x004 (needs journal recovery) 0x43a leshort 0x002 (errors) # INCOMPAT flags 0x460 lelong 0x001 (compressed) #0x460 lelong 0x002 (filetype) #0x460 lelong 0x010 (meta bg) 0x460 lelong 0x040 (extents) 0x460 lelong 0x080 (64bit) #0x460 lelong 0x100 (mmp) #0x460 lelong 0x200 (flex bg) # RO_INCOMPAT flags #0x464 lelong 0x001 (sparse super) 0x464 lelong 0x002 (large files) 0x464 lelong 0x008 (huge files) #0x464 lelong 0x010 (gdt checksum) #0x464 lelong 0x020 (many subdirs) #0x463 lelong 0x040 (extra isize) - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: e2fsck ext4 with extents possible?
supersud501 wrote: hello, i'm testing the ext4-filesystem for months now (no data loss experienced ), but i noticed e2fsck not being able to check a ext4-filesystem with extents (e2fsck: Filesystem has unsupported feature(s)). i just pulled linus' kernel tree (with the newly added patches to ext4) and e2fsprogs git tree (from git.kernel.org), but still e2fsprogs is not able to check my ext4-fs. so i wonder whether there's somewhere a patch for e2fsprogs flying around i've missed or whether checking ext4 with extents is simply not supported yet? if so, how's development going on with this? No released e2fsprogs supports ext4 filesystems with new on-disk features yet; that's partly what the dev on the end of ext4dev.ko is for. Ted's working on it. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH e2fsprogs] - detect LVM2 PVs in libblkid
The anaconda folks are now using blkid instead of hand-rolled tests for filesystem type at install time, but they had one more request: Bugzilla Bug 409321: RFE: information on blkdevs formatted as PVs https://bugzilla.redhat.com/show_bug.cgi?id=409321 The attached patch does the right thing for me on my sample set of exactly 1 PV... Any issues with reporting back something which is not actually a filesystem (lvm2pv) ? [EMAIL PROTECTED] misc/blkid -c /dev/null /dev/sda2 /dev/sda2: UUID=guOQGdcOE3IafCm0190XkPZTy5fCEanQ TYPE=lvm2pv [EMAIL PROTECTED] pvs -o pv_name,pv_uuid PV PV UUID /dev/sda2 guOQGd-cOE3-IafC-m019-0XkP-ZTy5-fCEanQ Bits liberally stolen from lvm2 userspace. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: e2fsprogs-1.40.5/lib/blkid/probe.h === --- e2fsprogs-1.40.5.orig/lib/blkid/probe.h +++ e2fsprogs-1.40.5/lib/blkid/probe.h @@ -531,6 +531,20 @@ struct hfs_mdb { __u16embed_blockcount; } __attribute__((packed)); +/* this is lvm's label_header pv_header combined. */ + +#define LVM2_ID_LEN 32 + +struct lvm2_pv_label_header { + /* label_header */ + __u8id[8]; /* LABELONE */ + __u64 sector_xl; /* Sector number of this label */ + __u32 crc_xl; /* From next field to end of sector */ + __u32 offset_xl; /* Offset from start of struct to contents */ + __u8type[8];/* LVM2 001 */ + /* pv_header */ + __u8pv_uuid[LVM2_ID_LEN]; +} __attribute__ ((packed)); /* * Byte swap functions Index: e2fsprogs-1.40.5/lib/blkid/probe.c === --- e2fsprogs-1.40.5.orig/lib/blkid/probe.c +++ e2fsprogs-1.40.5/lib/blkid/probe.c @@ -895,6 +895,64 @@ static int probe_hfsplus(struct blkid_pr return 1; } +#define LVM2_LABEL_SIZE 512 +static int lvm2_calc_crc(const void *buf, uint size) +{ + static const uint crctab[] = { + 0x, 0x1db71064, 0x3b6e20c8, 0x26d930ac, + 0x76dc4190, 0x6b6b51f4, 0x4db26158, 0x5005713c, + 0xedb88320, 0xf00f9344, 0xd6d6a3e8, 0xcb61b38c, + 0x9b64c2b0, 0x86d3d2d4, 0xa00ae278, 0xbdbdf21c + }; + uint i, crc = 0xf597a6cf; + const __u8 *data = (const __u8 *) buf; + + for (i = 0; i size; i++) { + crc ^= *data++; + crc = (crc 4) ^ crctab[crc 0xf]; + crc = (crc 4) ^ crctab[crc 0xf]; + } + return crc; +} + +static int probe_lvm2(struct blkid_probe *probe, + struct blkid_magic *id __BLKID_ATTR((unused)), + unsigned char *buf) +{ + int sector = (id-bim_kboff) 1;; + struct lvm2_pv_label_header *label; + label = (struct lvm2_pv_label_header *)buf; + + /* buf is at 0k or 1k offset; find label inside */ + if (memcmp(buf, LABELONE, 8) == 0) { + label = (struct lvm2_pv_label_header *)buf; + } else if (memcmp(buf + 512, LABELONE, 8) == 0) { + label = (struct lvm2_pv_label_header *)(buf + 512); + sector++; + } else { + return 1; + } + + if (blkid_le64(label-sector_xl) != sector) { + DBG(DEBUG_PROBE, + printf(LVM2: label for sector %d found at sector %d\n, + blkid_le64(label-sector_xl), sector)); + return 1; + } + + if (lvm2_calc_crc(label-offset_xl, LVM2_LABEL_SIZE - + ((void *)label-offset_xl - (void *)label)) != + blkid_le32(label-crc_xl)) { + DBG(DEBUG_PROBE, + printf(LVM2: label checksum incorrect at sector %d\n, + sector)); + return 1; + } + + blkid_set_tag(probe-dev, UUID, label-pv_uuid, LVM2_ID_LEN); + + return 0; +} /* * BLKID_BLK_OFFS is at least as large as the highest bim_kboff defined * in the type_array table below + bim_kbalign. @@ -988,6 +1046,10 @@ static struct blkid_magic type_array[] = { crypt_LUKS, 0,0, 6, LUKS\xba\xbe, probe_luks }, { squashfs, 0, 0, 4, sqsh, 0 }, { squashfs, 0, 0, 4, hsqs, 0 }, + { lvm2pv, 0, 0x218, 8, LVM2 001, probe_lvm2 }, + { lvm2pv, 0, 0x018, 8, LVM2 001, probe_lvm2 }, + { lvm2pv, 1, 0x018, 8, LVM2 001, probe_lvm2 }, + { lvm2pv, 1, 0x218, 8, LVM2 001, probe_lvm2 }, { NULL, 0, 0, 0, NULL, NULL } }; - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] allow in-inode EAs on ext4 root inode
The ext3 root inode was treated specially with respect to in-inode extended attributes, for reasons detailed in the removed comment below. The first mkfs-created inodes would not get extra_i_size or the EXT3_STATE_XATTR flag set in ext3_read_inode, which disallowed reading or setting in-inode EAs on the root. However, in ext4, ext4_mark_inode_dirty calls ext4_expand_extra_isize for all inodes; once this is done EAs may be placed in the root ext4 inode body. But for reasons above, it won't be found after a reboot. testcase: setfattr -n user.name -v value mntpt/ setfattr -n user.name2 -v value2 mntpt/ umount mntpt/; remount mntpt/ getfattr -d mntpt/ name2/value2 has gone missing; debugfs shows it in the inode body, but it is not found there by getattr. The following fixes it up; newer mkfs appears to properly zero the inodes, so this workaround isn't needed for ext4. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc6-mm1/fs/ext4/inode.c === --- linux-2.6.24-rc6-mm1.orig/fs/ext4/inode.c +++ linux-2.6.24-rc6-mm1/fs/ext4/inode.c @@ -2750,13 +2750,7 @@ void ext4_read_inode(struct inode * inod ei-i_data[block] = raw_inode-i_block[block]; INIT_LIST_HEAD(ei-i_orphan); - if (inode-i_ino = EXT4_FIRST_INO(inode-i_sb) + 1 - EXT4_INODE_SIZE(inode-i_sb) EXT4_GOOD_OLD_INODE_SIZE) { - /* -* When mke2fs creates big inodes it does not zero out -* the unused bytes above EXT4_GOOD_OLD_INODE_SIZE, -* so ignore those first few inodes. -*/ + if (EXT4_INODE_SIZE(inode-i_sb) EXT4_GOOD_OLD_INODE_SIZE) { ei-i_extra_isize = le16_to_cpu(raw_inode-i_extra_isize); if (EXT4_GOOD_OLD_INODE_SIZE + ei-i_extra_isize EXT4_INODE_SIZE(inode-i_sb)) { - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH e2fsprogs] print mostly-printable xattr strings in debugfs
Taking a cue from getfattr... if a string is mostly printable characters, go ahead print as a string, and escape what's left over. so we get: Extended attributes stored in inode body: selinux = system_u:object_r:root_t:s0\000 (28) instead of: Extended attributes stored in inode body: selinux = 73 79 73 74 65 6d 5f 75 3a 6f 62 6a 65 63 74 5f 72 3a 72 6f 6f 74 5f 74 3a 73 30 00 (28) (selinux includes the trailing null in len so it never prints as a string today) Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: e2fsprogs-1.40.5/debugfs/debugfs.c === --- e2fsprogs-1.40.5.orig/debugfs/debugfs.c +++ e2fsprogs-1.40.5/debugfs/debugfs.c @@ -434,19 +434,21 @@ static int list_blocks_proc(ext2_filsys static void dump_xattr_string(FILE *out, const char *str, int len) { - int printable = 1; + int printable = 0; int i; - /* check is string printable? */ + /* check: is string printable enough? */ for (i = 0; i len; i++) - if (!isprint(str[i])) { - printable = 0; - break; - } + if (isprint(str[i])) + printable++; + + if (printable = len*7/8) + printable = 0; for (i = 0; i len; i++) if (printable) - fprintf(out, %c, (unsigned char)str[i]); + fprintf(out, isprint(str[i]) ? %c : \\%03o, + (unsigned char)str[i]); else fprintf(out, %02x , (unsigned char)str[i]); } - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Integrating patches in SLES10 e2fsprogs
Theodore Tso wrote: On Mon, Jan 28, 2008 at 04:26:53PM +0100, Matthias Koenig wrote: Patch6: e2fsprogs-mdraid.patch This apparently adds a new environment variable, BLKID_SKIP_CHECK_MDRAID, which forces blkid to not detect mdraid devices. I'm not sure why. Workaround for people having stale RAID signature on their disk: https://bugzilla.novell.com/show_bug.cgi?id=100530 Hmm... there's got to be a better way around this. Won't help existing block devices, but it'd be nice to have a common library which could be called @ mkfs time to wipe out all known signatures... mkfs.xfs tries to do this, but it'd be silly to duplicate in every mkfs. On my one of these days list is to get another cheap/used laptop so I can try out the latest Fedora Core Rawhide without having to fire up a huge (noisy) x86_64 box Just partition... ;) -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Integrating patches in SLES10 e2fsprogs
Theodore Tso wrote: Patch10:close.patch I don't understand what this patch is trying to do. * Tue Nov 15 2005 [EMAIL PROTECTED] - added close.patch provided by Ted Tso (IBM) to fix bug #132708 *grin* Maybe obsolete by now? haven't looked closely. Patch12:e2fsprogs-mkinstalldirs.patch Why? Probably same as why we have something similar; for one reason or other need to rerun autoconf, and e2fsprogs isn't compatible with latest autoconf. (This is a patch I inherited, and haven't yet investigated all the details) Patch22:e2fsprogs-1.40.4-uuidd_pid_path.patch The problem with this patch is that /var/run is cleared via rm -rf, so it is highly problamtic to put the scratch directory for uuidd in /var/run. Hm, I had similar issues with uuidd too - common theme here? Patch32:libcom_err-no-e2fsck.static.patch This patch does two completely unrelated things. One is to disable the libcom_err regression test suite (probably because some of the other changes made) and the other is to disable building the e2fsck.static file. Why these two are bundled into a single patch I'm not sure. And I have a patch to do the latter as well. Interesting how we've arrived at similar needed changes, independently. :) and Patch99:e2fsprogs-no_cmd_hiding.patch honestly I like that; I should whip up a nice patch to emulate kbuild, with V=1 or something, unless there is some other easy way to show full build commands already? -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH e2fsprogs] UPDATED: ignore safe flag differences when fsck compares superblocks
Theodore Tso wrote: This is what I've checked in. See the comment about why we can't ignore a difference for the extended attribute feature. Ok. Unfortunately, that one really hurts on fedora, or any distro which writes xattrs during install. The installer mkfs's, mounts, installs with selinux, reboots = force recheck of the entire filesystem, because the kernel wrote extended attributes for selinux, but only to the primary superblock. Solely having this set in primary but not in the backups should hardly be a trigger for a full filesystem check, should it? -Eric - Ted commit a8cde73acbf6e0f9c0a3601e4f5fac2b01a27bd2 Author: Theodore Ts'o [EMAIL PROTECTED] Date: Sat Jan 26 23:17:50 2008 -0500 Ignore safe flag differences when e2fsck compares superblocks - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH e2fsprogs] UPDATED: ignore safe flag differences when fsck compares superblocks
(updated for thinko: when proper flag *is* set on both primary backup) Recent e2fsprogs (1.40.3 and higher) fsck compares primary superblock to backups, and if things differ, it forces a full check. However, the kernel has a penchant for updating flags the first time a feature is used - attributes, large files, etc. However, it only updates these on the primary sb. This then causes the new e2fsck behavior to trigger a full check. I think these flags can be safely ignored on this check; having them set on the primary but not the backups doesn't indicate corruption; if they're wrongly set on the primary, really no damage is done, and if the backup is used, but it doesn't have the flags set when it should, I'm pretty sure e2fsck can cope with that. I'll admit the patch below is not glamorous. Any comments, either on the style(sic) or the intent of the patch? Thanks, -Eric Signed-off-by: Eric Sandeen [EMAIL PROTECTED] Index: e2fsprogs-1.40.4/e2fsck/super.c === --- e2fsprogs-1.40.4.orig/e2fsck/super.c +++ e2fsprogs-1.40.4/e2fsck/super.c @@ -814,10 +814,32 @@ int check_backup_super_block(e2fsck_t ct continue; } -#define SUPER_DIFFERENT(x) (fs-super-x != tfs-super-x) - if (SUPER_DIFFERENT(s_feature_compat) || - SUPER_DIFFERENT(s_feature_incompat) || - SUPER_DIFFERENT(s_feature_ro_compat) || + /* +* A few flags are set on the fly by the kernel, but +* only in the primary superblock. They are safe +* to copy even if they differ. +*/ + +#define FEATURE_COMPAT_IGNORE (EXT2_FEATURE_COMPAT_EXT_ATTR) +#define FEATURE_RO_COMPAT_IGNORE (EXT2_FEATURE_RO_COMPAT_LARGE_FILE| \ +EXT4_FEATURE_RO_COMPAT_DIR_NLINK) +#define FEATURE_INCOMPAT_IGNORE(EXT3_FEATURE_INCOMPAT_EXTENTS) + +#define SUPER_COMPAT_DIFFERENT(x) \ + (( fs-super-x ~FEATURE_COMPAT_IGNORE) !=\ +(tfs-super-x ~FEATURE_COMPAT_IGNORE)) +#define SUPER_INCOMPAT_DIFFERENT(x)\ + (( fs-super-x ~FEATURE_INCOMPAT_IGNORE) != \ +(tfs-super-x ~FEATURE_INCOMPAT_IGNORE)) +#define SUPER_RO_COMPAT_DIFFERENT(x) \ + (( fs-super-x ~FEATURE_RO_COMPAT_IGNORE) != \ +(tfs-super-x ~FEATURE_RO_COMPAT_IGNORE)) +#define SUPER_DIFFERENT(x) \ + (fs-super-x != tfs-super-x) + + if (SUPER_COMPAT_DIFFERENT(s_feature_compat) || + SUPER_INCOMPAT_DIFFERENT(s_feature_incompat) || + SUPER_RO_COMPAT_DIFFERENT(s_feature_ro_compat) || SUPER_DIFFERENT(s_blocks_count) || SUPER_DIFFERENT(s_inodes_count) || memcmp(fs-super-s_uuid, tfs-super-s_uuid, - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] ext3 freeze feature
Theodore Tso wrote: The other approach would be to say, oh well, the freeze ioctl is inherently dangerous, and root is allowed to himself in the foot, so who cares. :-) I tend to agree. Either you need your fs frozen, or not, and if you do, be prepared for the consequences. But it was this concern which is why ext3 never exported freeze functionality to userspace, even though other commercial filesystems do support this. It wasn't that it wasn't considered, but the concern about whether or not it was sufficiently safe to make available. What's the safety concern; that the admin will forget to unfreeze? And I do agree that we probably should just implement this in filesystem independent way, in which case all of the filesystems that support this already have super_operations functions write_super_lockfs() and unlockfs(). That's what I was thinking; can't the path to freeze_bdev just be elevated out of dm-ioctl.c to fs/ioctl.c and exposed, such that any filesystem which implements .write_super_lockfs can be frozen? This is essentially what the xfs_freeze userspace does via xfs_ioctl/XFS_IOC_FREEZE - which, AFAIK, isn't used much now that the lvm hooks are in place. I'm also not sure I see the point of the timeout in the original patch; either you are done snapshotting and ready to unfreeze, or you're not; 1, or 2, or 3 seconds doesn't really matter. When you're done, you're done, and you can only unfreeze then. Shouldn't this be done programmatically, and not with some pre-determined timeout? -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH e2fsprogs] ignore safe flag differences when fsck compares superblocks
Recent e2fsprogs (1.40.3 and higher) fsck compares primary superblock to backups, and if things differ, it forces a full check. However, the kernel has a penchant for updating flags the first time a feature is used - attributes, large files, etc. However, it only updates these on the primary sb. This then causes the new e2fsck behavior to trigger a full check. I think these flags can be safely ignored on this check; having them set on the primary but not the backups doesn't indicate corruption; if they're wrongly set on the primary, really no damage is done, and if the backup is used, but it doesn't have the flags set when it should, I'm pretty sure e2fsck can cope with that. I'll admit the patch below is not glamorous. Any comments, either on the style(sic) or the intent of the patch? Thanks, -Eric Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: e2fsprogs-1.40.4/e2fsck/super.c === --- e2fsprogs-1.40.4.orig/e2fsck/super.c +++ e2fsprogs-1.40.4/e2fsck/super.c @@ -814,10 +814,29 @@ int check_backup_super_block(e2fsck_t ct continue; } -#define SUPER_DIFFERENT(x) (fs-super-x != tfs-super-x) - if (SUPER_DIFFERENT(s_feature_compat) || - SUPER_DIFFERENT(s_feature_incompat) || - SUPER_DIFFERENT(s_feature_ro_compat) || + /* +* A few flags are set on the fly by the kernel, but +* only in the primary superblock. They are safe +* to copy even if they differ. +*/ + +#define FEATURE_COMPAT_IGNORE (EXT2_FEATURE_COMPAT_EXT_ATTR) +#define FEATURE_RO_COMPAT_IGNORE (EXT2_FEATURE_RO_COMPAT_LARGE_FILE | \ +EXT4_FEATURE_RO_COMPAT_DIR_NLINK) +#define FEATURE_INCOMPAT_IGNORE(EXT3_FEATURE_INCOMPAT_EXTENTS) + +#define SUPER_COMPAT_DIFFERENT(x) \ + ((fs-super-x ~FEATURE_COMPAT_IGNORE) != tfs-super-x) +#define SUPER_INCOMPAT_DIFFERENT(x)\ + ((fs-super-x ~FEATURE_INCOMPAT_IGNORE) != tfs-super-x) +#define SUPER_RO_COMPAT_DIFFERENT(x) \ + ((fs-super-x ~FEATURE_RO_COMPAT_IGNORE) != tfs-super-x) +#define SUPER_DIFFERENT(x) \ + (fs-super-x != tfs-super-x) + + if (SUPER_COMPAT_DIFFERENT(s_feature_compat) || + SUPER_INCOMPAT_DIFFERENT(s_feature_incompat) || + SUPER_RO_COMPAT_DIFFERENT(s_feature_ro_compat) || SUPER_DIFFERENT(s_blocks_count) || SUPER_DIFFERENT(s_inodes_count) || memcmp(fs-super-s_uuid, tfs-super-s_uuid, - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH e2fsprogs] teach libblkid about ext4(dev)
Returns ext4dev for now; will need to change to ext4 at the appropriate time I guess. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: e2fsprogs-1.40.4/lib/blkid/probe.c === --- e2fsprogs-1.40.4.orig/lib/blkid/probe.c +++ e2fsprogs-1.40.4/lib/blkid/probe.c @@ -148,6 +148,38 @@ static void get_ext2_info(blkid_dev dev, set_uuid(dev, es-s_uuid, 0); } +static int probe_ext4(struct blkid_probe *probe, + struct blkid_magic *id __BLKID_ATTR((unused)), + unsigned char *buf) +{ + struct ext2_super_block *es; + es = (struct ext2_super_block *)buf; + + /* Distinguish between jbd and ext2/3/4 fs */ + if (blkid_le32(es-s_feature_incompat) + EXT3_FEATURE_INCOMPAT_JOURNAL_DEV) + return -BLKID_ERR_PARAM; + + /* Distinguish between ext3/4 and ext2 */ + if (!(blkid_le32(es-s_feature_compat) + EXT3_FEATURE_COMPAT_HAS_JOURNAL)) + return -BLKID_ERR_PARAM; + + /* Distinguish between ext4 and ext3 */ + if (!(blkid_le32(es-s_feature_ro_compat) + EXT4_FEATURES_RO_COMPAT) + !(blkid_le32(es-s_feature_incompat) + EXT4_FEATURES_INCOMPAT)) + return -BLKID_ERR_PARAM; + + get_ext2_info(probe-dev, buf); + + if ((es-s_feature_compat EXT3_FEATURE_COMPAT_HAS_JOURNAL) + !uuid_is_null(es-s_journal_uuid)) + set_uuid(probe-dev, es-s_journal_uuid, EXT_JOURNAL); + + return 0; +} static int probe_ext3(struct blkid_probe *probe, struct blkid_magic *id __BLKID_ATTR((unused)), unsigned char *buf) @@ -833,6 +865,7 @@ static struct blkid_magic type_array[] = { oracleasm, 0,32, 8, ORCLDISK, probe_oracleasm }, { ntfs, 0, 3, 8, NTFS, probe_ntfs }, { jbd, 1, 0x38, 2, \123\357, probe_jbd }, + { ext4dev, 1, 0x38, 2, \123\357, probe_ext4 }, { ext3, 1, 0x38, 2, \123\357, probe_ext3 }, { ext2, 1, 0x38, 2, \123\357, probe_ext2 }, { reiserfs, 8, 0x34, 8, ReIsErFs, probe_reiserfs }, Index: e2fsprogs-1.40.4/lib/blkid/probe.h === --- e2fsprogs-1.40.4.orig/lib/blkid/probe.h +++ e2fsprogs-1.40.4/lib/blkid/probe.h @@ -88,6 +88,26 @@ struct ext2_super_block { #define EXT3_FEATURE_INCOMPAT_RECOVER 0x0004 #define EXT3_FEATURE_INCOMPAT_JOURNAL_DEV 0x0008 +#define EXT4_FEATURE_RO_COMPAT_HUGE_FILE 0x0008 +#define EXT4_FEATURE_RO_COMPAT_GDT_CSUM0x0010 +#define EXT4_FEATURE_RO_COMPAT_DIR_NLINK 0x0020 +#define EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE 0x0040 + +#define EXT4_FEATURE_INCOMPAT_EXTENTS 0x0040 /* extents support */ +#define EXT4_FEATURE_INCOMPAT_64BIT0x0080 +#define EXT4_FEATURE_INCOMPAT_MMP 0x0100 +#define EXT4_FEATURE_INCOMPAT_FLEX_BG 0x0200 + +#define EXT4_FEATURES_RO_COMPAT(EXT4_FEATURE_RO_COMPAT_HUGE_FILE| \ +EXT4_FEATURE_RO_COMPAT_GDT_CSUM| \ +EXT4_FEATURE_RO_COMPAT_DIR_NLINK| \ +EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE) + +#define EXT4_FEATURES_INCOMPAT (EXT4_FEATURE_INCOMPAT_EXTENTS| \ +EXT4_FEATURE_INCOMPAT_64BIT| \ +EXT4_FEATURE_INCOMPAT_MMP| \ +EXT4_FEATURE_INCOMPAT_FLEX_BG) + struct xfs_super_block { unsigned char xs_magic[4]; __u32 xs_blocksize; - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH, RFC] Add new development flag to the ext4 filesystem
Theodore Tso wrote: As discussed on RFC, this flag is simply a generic this is a crash/burn test filesystem marker. If it is set, then filesystem code which is in development will be allowed to mount the filesystem. Filesystem code which is not considered ready for prime-time will check for this flag, and if it is not set, it will refuse to touch the filesystem. As we start rolling ext4 out to distro's like Fedora, et. al, this makes it less likely that a user might accidentally start using ext4 on a production filesystem; a bad thing, since that will essentially make it be unfsckable until e2fsprogs catches up. - Ted Overall, seems ok. One other question though, when ext4 is a fully-fledged production filesystem, and the flag requirement is gone, what stops ext3 filesystems from being silently mounted as ext4, just as happened with ext4dev today? Will users need to be sure everything explicitly mounts -t ext3 to avoid this? +static void parse_extended_opts(ext2_filsys fs, const char *opts) ... + if (!strcmp(token, test_fs)) { + fs-super-s_flags |= EXT2_FLAGS_TEST_FILESYS; + printf(Setting test filesystem flag\n); + ext2fs_mark_super_dirty(fs); + } else if (!strcmp(token, production_fs)) { + fs-super-s_flags = ~EXT2_FLAGS_TEST_FILESYS; + printf(Clearing test filesystem flag\n); + ext2fs_mark_super_dirty(fs); + } else + r_usage++; + } + if (r_usage) { + fprintf(stderr, _(\nBad options specified.\n\n + Extended options are separated by commas, + and may take an argument which\n + \tis set off by an equals ('=') sign.\n\n + Valid extended options are:\n + \ttestfs\n)); help text doesn't match reality here, missing a _ -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
why are acls user xattrs off by default?
I was looking through Anaconda to see what it's doing when it makes ext3 filesystems: rc = iutil.execWithRedirect(tune2fs, [-c0, -i0, -Odir_index, -ouser_xattr,acl, devicePath], stdout = /dev/tty5, stderr = /dev/tty5, searchPath = 1) -Odir_index is redundant by now; it's default in e2fsprogs. -c0 -i0 are probably subject to some debate, but... why are user_xattr acl required to be mount options for ext3? (added to defaults or explicit?) I'm sure there is some history here, but why build it all into the module, and then turn them off without some extra magic passed in? What's the advantage to disabling user xattrs acls at runtime by default? Thanks just curious, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[Fwd: [Bug 9732] New: oops in extent code via ext4_fallocate]
---BeginMessage--- http://bugzilla.kernel.org/show_bug.cgi?id=9732 Summary: oops in extent code via ext4_fallocate Product: File System Version: 2.5 KernelVersion: 2.6.24-rc7 Platform: All OS/Version: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: ext4 AssignedTo: [EMAIL PROTECTED] ReportedBy: [EMAIL PROTECTED] Latest working kernel version: Unknown Earliest failing kernel version: 2.6.24-rc7 Distribution: Fedora Rawhide Hardware Environment: x86_64 Software Environment: Problem Description: a simple fallocate; truncate; fallocate series oopses the kernel. 2.6.24-rc7 with the latest ext4 patch stack as of about Jan 9, 2008, minus delalloc patches. Steps to reproduce, using the test program at http://oss.sgi.com/archives/xfs/2007-07/msg00092.html [EMAIL PROTECTED] sdb8]# touch testfile [EMAIL PROTECTED] sdb8]# ./testfallocate -f testfile 0 65536 Trying to preallocate blocks (offset=0, len=65536) fallocate system call succedded ! ret=0 # FALLOCATE TEST REPORT # New blocks preallocated = 16. Number of bytes preallocated = 65536 Old file size = 0, New file size 65536. Old num blocks = 0, New num blocks 64. ### TESTS PASSED ### [EMAIL PROTECTED] sdb8]# ls -lh testfile; du -hc testfile -rw-r--r-- 1 root root 64K Jan 11 14:12 testfile 64K testfile 64K total [EMAIL PROTECTED] sdb8]# /root/truncate testfile 32768 Truncating testfile to 32768 [EMAIL PROTECTED] sdb8]# ls -lh testfile; du -hc testfile -rw-r--r-- 1 root root 32K Jan 11 14:12 testfile 32K testfile 32K total [EMAIL PROTECTED] sdb8]# ./testfallocate -f testfile 0 65536 Trying to preallocate blocks (offset=0, len=65536) Segmentation fault and yields: EXT4-fs: file extents enabled EXT4-fs: mballoc enabled [ cut here ] kernel BUG at fs/ext4/extents.c:1056! invalid opcode: [1] SMP CPU 2 Modules linked in: ext4dev jbd2 crc16 autofs4 hidp nfs lockd nfs_acl rfcomm l2cap bluetooth sunrpc ipv6 cpufreq_ondemand dm_multipath video output sbs sbshc battery ac power_supply parport_pc lp parport sg pata_acpi cfi_cmdset_0002 ata_generic cfi_util button jedec_probe serio_raw cfi_probe tg3 gen_probe ck804xrom mtd rtc_cmos chipreg pata_amd map_funcs k8temp libata hwmon i2c_nforce2 shpchp i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc scsi_tgt mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd Pid: 3554, comm: testfallocate Not tainted 2.6.24-0.147.rc7.git2.fc9 #1 RIP: 0010:[88364497] [88364497] :ext4dev:ext4_ext_search_left+0x97/0xbc RSP: 0018:81012dc65cf0 EFLAGS: 00010287 RAX: 8008 RBX: 81012dc65d78 RCX: RDX: 81012dc65d90 RSI: 81012e5f37e0 RDI: 81012e04c00c RBP: 81012e04c180 R08: 0008 R09: R10: R11: 81012dc65d98 R12: 0008 R13: 81012e04c180 R14: 0008 R15: FS: 2aab96f0() GS:81013fc01a28() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 00327d0414a0 CR3: 000130ce8000 CR4: 06a0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process testfallocate (pid: 3554, threadinfo 81012dc64000, task 81012dd5a000) Stack: 88366073 8101349e71d0 81053c91 00022dcc30b8 81012dc65e88 0008 81012e549000 81012dcc30b8 81012dcc3090 81012e04c000 81012dcc30b8 0008 Call Trace: [88366073] :ext4dev:ext4_ext_get_blocks+0x5ba/0x8c1 [81053c91] lock_release_holdtime+0x27/0x49 [812748f6] _spin_unlock+0x17/0x20 [883400a6] :jbd2:start_this_handle+0x4e0/0x4fe [88366564] :ext4dev:ext4_fallocate+0x175/0x39a [81053c91] lock_release_holdtime+0x27/0x49 [81056480] __lock_acquire+0x4e7/0xc4d [81053c91] lock_release_holdtime+0x27/0x49 [810a8de7] sys_fallocate+0xe4/0x10d [8100c043] tracesys+0xd5/0xda Code: 0f 0b eb fe ff c8 89 02 0f b7 47 06 8b 4f 08 0f b7 57 04 48 RIP [88364497] :ext4dev:ext4_ext_search_left+0x97/0xbc RSP 81012dc65cf0 ---[ end trace 9a60a6a6c694770a ]--- SysRq : Resetting The BUG_ON is: BUG_ON(*logical le32_to_cpu(ex-ee_block) + le16_to_cpu(ex-ee_len)); where these were the values: logical 8 ee_block 0 ee_len 32776 Haven't looked further into it yet. -- Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You reported the bug, or are watching the reporter. ---End Message---
[PATCH] Fix up uuidd man page
uuidd(8) man page had a typo, and a couple of stale defaults documented. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: e2fsprogs-1.40.4/misc/uuidd.8.in === --- e2fsprogs-1.40.4.orig/misc/uuidd.8.in +++ e2fsprogs-1.40.4/misc/uuidd.8.in @@ -64,11 +64,11 @@ UUID's. .TP .BI \-p pidfile Specify the pathname where the pid file should be written. By default, -the pid file is written to /var/run/uuidd.pid. +the pid file is written to /var/lib/libuuid/uuidd.pid. .TP .BI \-s socketpath Specify the pathname used for the unix-domain socket used by uuidd. By -qdefault, the pathname used is /var/run/uuidd.sock. This is primarily +default, the pathname used is /var/lib/libuuid/request. This is primarily for debugging purposes, since the pathname is hard-coded in the libuuid library. .TP - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Fix up uuidd man page
Eric Sandeen wrote: uuidd(8) man page had a typo, and a couple of stale defaults documented. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: e2fsprogs-1.40.4/misc/uuidd.8.in === --- e2fsprogs-1.40.4.orig/misc/uuidd.8.in +++ e2fsprogs-1.40.4/misc/uuidd.8.in @@ -64,11 +64,11 @@ UUID's. .TP .BI \-p pidfile Specify the pathname where the pid file should be written. By default, -the pid file is written to /var/run/uuidd.pid. +the pid file is written to /var/lib/libuuid/uuidd.pid. Actually, hang on. Why is this in /var/lib, not /var/run? I can see keeping state files in /var/lib so they're persistent across boots, but the pidfile? -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ext4: Use superblock s_raid_stripe_width as stripe size during block allocation.
Aneesh Kumar K.V wrote: The stipe size used during block allocation is calculated as below. a) if we specify a stripe=value option using mount time. Use that value. b) if not use the value specified in super block. b) If the value specfied at mount time is greater than blocks per group use the value specified ini the super block. Signed-off-by: Aneesh Kumar K.V [EMAIL PROTECTED] --- fs/ext4/super.c | 10 ++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index db1edc8..10330eb 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -2136,6 +2136,16 @@ static int ext4_fill_super (struct super_block *sb, void *data, int silent) sbi-s_rsv_window_head.rsv_alloc_hit = 0; sbi-s_rsv_window_head.rsv_goal_size = 0; ext4_rsv_window_add(sb, sbi-s_rsv_window_head); + /* + * set the stripe size. If we have specified it via mount option, then + * use the mount option value. If the value specified at mount time is + * greater than the blocks per group use the super block value. + * Allocator needs it be less than blocks per group. + */ + if (!sbi-s_stripe || + sbi-s_stripe = sbi-s_blocks_per_group) { Is stripe == s_blocks_per_group problematic? Should we warn/printk in the case that the specified size is too big? Thanks, -Eric + sbi-s_stripe = le32_to_cpu(es-s_raid_stripe_width); + } /* * set up enough so that it can read an inode - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] UPDATED3: types fixup for mballoc
4th time's the charm? Note, the calculations Andreas I were discussing only work properly for stripe = blocks per group... I don't know if we'd need to enforce that at mount time? - I ran into a potential overflow in ext4_mb_scan_aligned, and went looking for others in mballoc. This patch hits a few spots, compile-tested only at this point, comments welcome. This patch: changes fe_len to an int, I don't think we need it to be a long, looking at how it's used (should it be a grpblk_t?) Also change anything assigned to return value of mb_find_extent, since it returns fe_len. changes anything that does groupno * EXT4_BLOCKS_PER_GROUP or pa-pa_pstart + whatever to an ext4_fsblk_t avoids 64-bit divides modulos, and... fixes up any related formats Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/ext4/mballoc.c === --- linux-2.6.24-rc3.orig/fs/ext4/mballoc.c +++ linux-2.6.24-rc3/fs/ext4/mballoc.c @@ -363,7 +363,7 @@ struct ext4_free_extent { ext4_lblk_t fe_logical; ext4_grpblk_t fe_start; ext4_group_t fe_group; - unsigned long fe_len; + int fe_len; }; /* @@ -586,14 +586,14 @@ static void mb_free_blocks_double(struct BUG_ON(!ext4_is_group_locked(sb, e4b-bd_group)); for (i = 0; i count; i++) { if (!mb_test_bit(first + i, e4b-bd_info-bb_bitmap)) { - unsigned long blocknr; + ext4_fsblk_t blocknr; blocknr = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb); blocknr += first + i; blocknr += le32_to_cpu(EXT4_SB(sb)-s_es-s_first_data_block); ext4_error(sb, __FUNCTION__, double-free of inode - %lu's block %lu(bit %u in group %lu)\n, + %lu's block %llu(bit %u in group %lu)\n, inode ? inode-i_ino : 0, blocknr, first + i, e4b-bd_group); } @@ -1226,14 +1226,14 @@ static int mb_free_blocks(struct inode * order = 0; if (!mb_test_bit(block, EXT4_MB_BITMAP(e4b))) { - unsigned long blocknr; + ext4_fsblk_t blocknr; blocknr = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb); blocknr += block; blocknr += le32_to_cpu(EXT4_SB(sb)-s_es-s_first_data_block); ext4_error(sb, __FUNCTION__, double-free of inode - %lu's block %lu(bit %u in group %lu)\n, + %lu's block %llu(bit %u in group %lu)\n, inode ? inode-i_ino : 0, blocknr, block, e4b-bd_group); } @@ -1416,7 +1416,7 @@ static void ext4_mb_use_best_found(struc struct ext4_buddy *e4b) { struct ext4_sb_info *sbi = EXT4_SB(ac-ac_sb); - unsigned long ret; + int ret; BUG_ON(ac-ac_b_ex.fe_group != e4b-bd_group); BUG_ON(ac-ac_status == AC_STATUS_FOUND); @@ -1609,10 +1609,12 @@ static int ext4_mb_find_by_goal(struct e ac-ac_g_ex.fe_len, ex); if (max = ac-ac_g_ex.fe_len ac-ac_g_ex.fe_len == sbi-s_stripe) { - unsigned long start; - start = (e4b-bd_group * EXT4_BLOCKS_PER_GROUP(ac-ac_sb) + - ex.fe_start + le32_to_cpu(es-s_first_data_block)); - if (start % sbi-s_stripe == 0) { + ext4_fsblk_t start; + + start = (e4b-bd_group * EXT4_BLOCKS_PER_GROUP(ac-ac_sb)) + + ex.fe_start + le32_to_cpu(es-s_first_data_block); + /* use do_div to get remainder (would be 64-bit modulo) */ + if (do_div(start, sbi-s_stripe) == 0) { ac-ac_found++; ac-ac_b_ex = ex; ext4_mb_use_best_found(ac, e4b); @@ -1724,6 +1726,7 @@ static void ext4_mb_complex_scan_group(s /* * This is a special case for storages like raid5 * we try to find stripe-aligned chunks for stripe-size requests + * XXX should do so at least for multiples of stripe size as well */ static void ext4_mb_scan_aligned(struct ext4_allocation_context *ac, struct ext4_buddy *e4b) @@ -1732,17 +1735,19 @@ static void ext4_mb_scan_aligned(struct struct ext4_sb_info *sbi = EXT4_SB(sb); void *bitmap = EXT4_MB_BITMAP(e4b); struct ext4_free_extent ex; - unsigned long i; - unsigned long max; + ext4_fsblk_t first_group_block; + ext4_fsblk_t a; + ext4_grpblk_t i; + int max; BUG_ON(sbi-s_stripe == 0
Re: [PATCH] UPDATED: types fixup for mballoc
Andreas Dilger wrote: On Jan 02, 2008 14:01 -0600, Eric Sandeen wrote: This patch: changes fe_len to an int, I don't think we need it to be a long, looking at how it's used (should it be a grpblk_t?) Also change anything assigned to return value of mb_find_extent, since it returns fe_len. changes anything that does groupno * EXT4_BLOCKS_PER_GROUP or pa-pa_pstart + whatever to an ext4_fsblk_t fixes up any related formats The change to ext4_mb_scan_aligned to avoid a 64-bit modulo could use an extra look I think. @@ -1732,17 +1735,18 @@ static void ext4_mb_scan_aligned(struct BUG_ON(sbi-s_stripe == 0); -/* find first stripe-aligned block */ -i = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb) -+ le32_to_cpu(sbi-s_es-s_first_data_block); -i = ((i + sbi-s_stripe - 1) / sbi-s_stripe) * sbi-s_stripe; -i = (i - le32_to_cpu(sbi-s_es-s_first_data_block)) -% EXT4_BLOCKS_PER_GROUP(sb); +/* find first stripe-aligned block in group */ +a = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb) ++ le32_to_cpu(sbi-s_es-s_first_data_block); +a = (a + sbi-s_stripe - 1) ~((ext4_fsblk_t)sbi-s_stripe - 1); +a = a - le32_to_cpu(sbi-s_es-s_first_data_block); +i = do_div(a, EXT4_BLOCKS_PER_GROUP(sb)); I don't think this is correct... This code should only be used if s_stripe is NOT a power-of-two value, otherwise we can just use the buddy maps to find aligned chunks. As a result I don't think that (s_stripe - 1) change is equivalent to / s_stripe * s_stripe. Hmmm ok let me re-check that then. (...curses 64-bit math trickiness...) while (i sb-s_blocksize * 8) { if (!mb_test_bit(i, bitmap)) { Hrmmm, I thought this should be while (i EXT4_BLOCKS_PER_GROUP(sb)) and not sb-s_blocksize * 8? Did that fix get lost somewhere? I did wonder about that wondered about the magic 8 number but forgot to comment. Thanks, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] UPDATED2: types fixup for mballoc
Andreas Dilger wrote: On Jan 03, 2008 14:10 -0600, Eric Sandeen wrote: @@ -1732,17 +1735,19 @@ static void ext4_mb_scan_aligned(struct +/* find first stripe-aligned block in group */ +a = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb) + le32_to_cpu(sbi-s_es-s_first_data_block); +a += sbi-s_stripe - 1; Why not just have + sbi-s_stipe - 1 on the previous line, instead of +=? hmm good point. Also minor coding style nit: the + is normally at the end of the previous line instead of at the start of the next line, so: /* find first stripe-aligned block in group */ a = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb) + le32_to_cpu(sbi-s_es-s_first_data_block) + sbi-s_stripe - 1; Ok, that was inherited :) +a = (a * sbi-s_stripe) - le32_to_cpu(sbi-s_es-s_first_data_block); +i = do_div(a, EXT4_BLOCKS_PER_GROUP(sb)); I guess this doesn't stricly need to be a modulus either, because we know which group this will be in and can just subtract the start. You can just do: /* find first stripe-aligned block in this group */ group_start = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb) +; le32_to_cpu(sbi-s_es-s_first_data_block); a = group_start + sbi-s_stripe - 1; do_div(a, sbi-s_stripe); a = (a * sbi-s_stripe) - group_start; Ok, I knew I had stared at this for too long :) 4th try coming soon. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] UPDATED: types fixup for mballoc
Eric Sandeen wrote: Eric Sandeen wrote: I ran into a potential overflow in ext4_mb_scan_aligned, and went looking for others in mballoc. This patch hits a few spots, compile-tested only at this point, comments welcome. This patch: ... introduces a 64-bit divide. Oops... will fix that up resend. Took a while to get back to this :) But just used do_div return value for remainder instead of modulo (which would have been a 64-bit modulo) There's another do_div trick, and a bitwise round-up in there too, now that there are more 64-bit containers... Anyway, patch follows. - I ran into a potential overflow in ext4_mb_scan_aligned, and went looking for others in mballoc. This patch hits a few spots, compile-tested only at this point, comments welcome. This patch: changes fe_len to an int, I don't think we need it to be a long, looking at how it's used (should it be a grpblk_t?) Also change anything assigned to return value of mb_find_extent, since it returns fe_len. changes anything that does groupno * EXT4_BLOCKS_PER_GROUP or pa-pa_pstart + whatever to an ext4_fsblk_t fixes up any related formats The change to ext4_mb_scan_aligned to avoid a 64-bit modulo could use an extra look I think. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/ext4/mballoc.c === --- linux-2.6.24-rc3.orig/fs/ext4/mballoc.c +++ linux-2.6.24-rc3/fs/ext4/mballoc.c @@ -363,7 +363,7 @@ struct ext4_free_extent { ext4_lblk_t fe_logical; ext4_grpblk_t fe_start; ext4_group_t fe_group; - unsigned long fe_len; + int fe_len; }; /* @@ -586,14 +586,14 @@ static void mb_free_blocks_double(struct BUG_ON(!ext4_is_group_locked(sb, e4b-bd_group)); for (i = 0; i count; i++) { if (!mb_test_bit(first + i, e4b-bd_info-bb_bitmap)) { - unsigned long blocknr; + ext4_fsblk_t blocknr; blocknr = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb); blocknr += first + i; blocknr += le32_to_cpu(EXT4_SB(sb)-s_es-s_first_data_block); ext4_error(sb, __FUNCTION__, double-free of inode - %lu's block %lu(bit %u in group %lu)\n, + %lu's block %llu(bit %u in group %lu)\n, inode ? inode-i_ino : 0, blocknr, first + i, e4b-bd_group); } @@ -1226,14 +1226,14 @@ static int mb_free_blocks(struct inode * order = 0; if (!mb_test_bit(block, EXT4_MB_BITMAP(e4b))) { - unsigned long blocknr; + ext4_fsblk_t blocknr; blocknr = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb); blocknr += block; blocknr += le32_to_cpu(EXT4_SB(sb)-s_es-s_first_data_block); ext4_error(sb, __FUNCTION__, double-free of inode - %lu's block %lu(bit %u in group %lu)\n, + %lu's block %llu(bit %u in group %lu)\n, inode ? inode-i_ino : 0, blocknr, block, e4b-bd_group); } @@ -1416,7 +1416,7 @@ static void ext4_mb_use_best_found(struc struct ext4_buddy *e4b) { struct ext4_sb_info *sbi = EXT4_SB(ac-ac_sb); - unsigned long ret; + int ret; BUG_ON(ac-ac_b_ex.fe_group != e4b-bd_group); BUG_ON(ac-ac_status == AC_STATUS_FOUND); @@ -1609,10 +1609,12 @@ static int ext4_mb_find_by_goal(struct e ac-ac_g_ex.fe_len, ex); if (max = ac-ac_g_ex.fe_len ac-ac_g_ex.fe_len == sbi-s_stripe) { - unsigned long start; - start = (e4b-bd_group * EXT4_BLOCKS_PER_GROUP(ac-ac_sb) + - ex.fe_start + le32_to_cpu(es-s_first_data_block)); - if (start % sbi-s_stripe == 0) { + ext4_fsblk_t start; + + start = (e4b-bd_group * EXT4_BLOCKS_PER_GROUP(ac-ac_sb)) + + ex.fe_start + le32_to_cpu(es-s_first_data_block); + /* use do_div to get remainder (would be 64-bit modulo) */ + if (do_div(start, sbi-s_stripe) == 0) { ac-ac_found++; ac-ac_b_ex = ex; ext4_mb_use_best_found(ac, e4b); @@ -1724,6 +1726,7 @@ static void ext4_mb_complex_scan_group(s /* * This is a special case for storages like raid5 * we try to find stripe-aligned chunks for stripe-size requests + * XXX should do so at least for multiples of stripe size as well */ static void ext4_mb_scan_aligned
Re: [2.6.24 patch] let EXT4DEV_FS depend on BROKEN
Adrian Bunk wrote: Most people and all distributions use CONFIG_EXPERIMENTAL=y simply because too many options (including options required for hardware support) depend on it. Compare e.g.: - Marvell SATA support (HIGHLY EXPERIMENTAL) - Provide NFSv4 client support (EXPERIMENTAL) - Ext4dev/ext4 extended fs support development (EXPERIMENTAL) tristate Snapshot target (EXPERIMENTAL) depends on BLK_DEV_DM EXPERIMENTAL tristate Mirror target (EXPERIMENTAL) depends on BLK_DEV_DM EXPERIMENTAL ... It does seem that it might be a good goal to revisit options marked EXPERIMENTAL, and see if they still should be marked as such, rather than removing the option altogether. init/Kconfig describes things in EXPERIMENTAL as alpha-test - I bet there are a few things which have moved beyond this, but are still marked as such. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] export iov_shorten for ext4's use
(And, take 2... follow the coding style on export declarations...) ext4 needs to deal with 2 different max file offsets for block- and extent-allocated file formats, whereas the s_maxbytes scheme can only deal with one. So, for block-allocated files, we must catch and fix up too-large offsets from within the filesystem. Having iov_shorten exported allows such things as: if (pos + length sbi-s_bitmap_maxbytes) { nr_segs = iov_shorten((struct iovec *)iov, nr_segs, sbi-s_bitmap_maxbytes - pos); } to fix up too-large writes to these files in ext4_file_write(). This patch is currently living in the ext4 patch queue. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/read_write.c === --- linux-2.6.24-rc3.orig/fs/read_write.c +++ linux-2.6.24-rc3/fs/read_write.c @@ -450,6 +450,7 @@ unsigned long iov_shorten(struct iovec * } return seg; } +EXPORT_SYMBOL(iov_shorten) ssize_t do_sync_readv_writev(struct file *filp, const struct iovec *iov, unsigned long nr_segs, size_t len, loff_t *ppos, iov_fn_t fn) - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ext3: issue warning when bad inode found via ext3_lookup
I have a hand-crafted bad filesystem image which has corruption: [EMAIL PROTECTED] ~]# ls mnt/dir file1 file2 file3 file4 file5 [EMAIL PROTECTED] ~]# ls mnt/dir/file4 ls: cannot access mnt/dir/file4: No such file or directory [EMAIL PROTECTED] ~]# ls -l mnt/dir ls: cannot access mnt/dir/file4: No such file or directory total 8 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 file1 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 file2 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 file3 d? ? ?? ?? file4 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 file5 e2fsck also knows it's corrupted: Pass 2: Checking directory structure Entry 'file4' in /dir (2049) has deleted/unused inode 13. Clear? no Entry 'file4' in /dir (2049) has an incorrect filetype (was 2, should be 1). Fix? no Pass 3: Checking directory connectivity Unconnected directory inode 2053 (/dir/???) BUT there are no kernel messages logged anywhere because ext3_read_inode silently makes a bad_inode in this case, so that stale NFS filehandles aren't noisy. However, when we encounter such a problem after a by-name lookup, I think a warning is appropriate, as it indicates filesystem corruption. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/ext3/namei.c === --- linux-2.6.24-rc3.orig/fs/ext3/namei.c +++ linux-2.6.24-rc3/fs/ext3/namei.c @@ -1049,6 +1049,9 @@ static struct dentry *ext3_lookup(struct return ERR_PTR(-EACCES); if (is_bad_inode(inode)) { + ext3_warning(inode-i_sb, __FUNCTION__, +bad inode %lu in dir #%lu, + inode-i_ino, dir-i_ino); iput(inode); return ERR_PTR(-ENOENT); - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ext4: issue warning when bad inode found via ext4_lookup
I have a hand-crafted bad filesystem image which has corruption: [EMAIL PROTECTED] ~]# ls mnt/dir file1 file2 file3 file4 file5 [EMAIL PROTECTED] ~]# ls mnt/dir/file4 ls: cannot access mnt/dir/file4: No such file or directory [EMAIL PROTECTED] ~]# ls -l mnt/dir ls: cannot access mnt/dir/file4: No such file or directory total 8 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 file1 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 file2 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 file3 d? ? ?? ?? file4 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 file5 e2fsck also knows it's corrupted: Pass 2: Checking directory structure Entry 'file4' in /dir (2049) has deleted/unused inode 13. Clear? no Entry 'file4' in /dir (2049) has an incorrect filetype (was 2, should be 1). Fix? no Pass 3: Checking directory connectivity Unconnected directory inode 2053 (/dir/???) BUT there are no kernel messages logged anywhere because ext4_read_inode silently makes a bad_inode in this case, so that stale NFS filehandles aren't noisy. However, when we encounter such a problem after a by-name lookup, I think a warning is appropriate, as it indicates filesystem corruption. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/ext4/namei.c === --- linux-2.6.24-rc3.orig/fs/ext4/namei.c +++ linux-2.6.24-rc3/fs/ext4/namei.c @@ -1050,6 +1050,9 @@ static struct dentry *ext4_lookup(struct return ERR_PTR(-EACCES); if (is_bad_inode(inode)) { + ext4_warning(inode-i_sb, __FUNCTION__, +bad inode %lu in dir #%lu, +inode-i_ino, dir-i_ino); iput(inode); return ERR_PTR(-ENOENT); } - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] export iov_shorten for ext4's use
ext4 needs to deal with 2 different max file offsets for block- and extent-allocated file formats, whereas the s_maxbytes scheme can only deal with one. So, for block-allocated files, we must catch and fix up too-large offsets from within the filesystem. Having iov_shorten exported allows such things as: if (pos + length sbi-s_bitmap_maxbytes) { nr_segs = iov_shorten((struct iovec *)iov, nr_segs, sbi-s_bitmap_maxbytes - pos); } to fix up too-large writes to these files in ext4_file_write(). This patch is currently living in the ext4 patch queue. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/read_write.c === --- linux-2.6.24-rc3.orig/fs/read_write.c +++ linux-2.6.24-rc3/fs/read_write.c @@ -451,6 +451,8 @@ unsigned long iov_shorten(struct iovec * return seg; } +EXPORT_SYMBOL(iov_shorten); + ssize_t do_sync_readv_writev(struct file *filp, const struct iovec *iov, unsigned long nr_segs, size_t len, loff_t *ppos, iov_fn_t fn) { - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH, RFC] - issue warning when bad inode found via ext3_lookup
I have a hand-crafted bad filesystem image which has a corrupted directory entry: [EMAIL PROTECTED] ~]# ls mnt/Picture LINKS_20 OBEN_20 VORNE_20 VORN_LINKS_20 VORN_RECHTS_20 [EMAIL PROTECTED] ~]# ls mnt/Picture/VORN_LINKS_20 ls: cannot access mnt/Picture/VORN_LINKS_20: No such file or directory [EMAIL PROTECTED] ~]# stat mnt/Picture/VORN_LINKS_20 stat: cannot stat `mnt/Picture/VORN_LINKS_20': No such file or directory [EMAIL PROTECTED] ~]# ls -l mnt/Picture ls: cannot access mnt/Picture/VORN_LINKS_20: No such file or directory total 8 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 LINKS_20 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 OBEN_20 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 VORNE_20 d? ? ?? ?? VORN_LINKS_20 drwxr-xr-x 2 root root 1024 2007-09-04 13:36 VORN_RECHTS_20 e2fsck also knows it's corrupted: Pass 2: Checking directory structure Entry 'VORN_LINKS_20' in /Picture (2049) has deleted/unused inode 13. Clear? no Entry 'VORN_LINKS_20' in /Picture (2049) has an incorrect filetype (was 2, should be 1). Fix? no Pass 3: Checking directory connectivity Unconnected directory inode 2053 (/Picture/???) BUT there are no kernel messages anywhere. I think the below makes sense; I don't think there are any instances where an inode found via a filename passed to ext3_lookup should be returning a bad inode without warning; NFS may pass in a stale filehandle which also makes a bad inode, but that shouldn't go through lookup... Comments? (if it looks good I'll resubmit for ext*_lookup) Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/ext3/namei.c === --- linux-2.6.24-rc3.orig/fs/ext3/namei.c +++ linux-2.6.24-rc3/fs/ext3/namei.c @@ -1049,6 +1049,10 @@ static struct dentry *ext3_lookup(struct return ERR_PTR(-EACCES); if (is_bad_inode(inode)) { + ext3_warning(inode-i_sb, __FUNCTION__, + bad inode %lu for file %s in dir #%lu, + inode-i_ino, dentry-d_name.name, + dir-i_ino); iput(inode); return ERR_PTR(-ENOENT); } - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] ext4: different maxbytes functions for bitmap extent files
Andreas Dilger wrote: On Dec 04, 2007 09:58 -0600, Eric Sandeen wrote: +static loff_t ext4_max_size(int blkbits) +{ +loff_t res; +loff_t upper_limit = MAX_LFS_FILESIZE; + +/* small i_blocks in vfs inode? */ +if (sizeof(blkcnt_t) sizeof(u64)) { It would probably be more future-proof by having a struct inode here and using if (sizeof(inode-i_blocks) sizeof(__u64)). Ok. Honestly these were both just lifted from the original max_size function for bitmap files :) -Eric +upper_limit = (1LL 32) - 1; + +/* total blocks in file system block size */ +upper_limit = (blkbits - 9); +upper_limit = blkbits; There is probably a cleaner way to do this, maybe: upper_limit = (1LL (32 + 9) - 1) ~((1 blkbits) - 1); Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] ext4: different maxbytes functions for bitmap extent files
use 2 different maxbytes functions for bitmapped extent-based files. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/ext4/super.c === --- linux-2.6.24-rc3.orig/fs/ext4/super.c +++ linux-2.6.24-rc3/fs/ext4/super.c @@ -1656,17 +1656,57 @@ static void ext4_orphan_cleanup (struct } /* - * Maximal file size. There is a direct, and {,double-,triple-}indirect + * Maximal extent format file size. + * Resulting logical blkno at s_maxbytes must fit in our on-disk + * extent format containers, within a sector_t, and within i_blocks + * in the vfs. ext4 inode has 48 bits of i_block in fsblock units, + * so that won't be a limiting factor. + * + * Note, this does *not* consider any metadata overhead for vfs i_blocks. + */ +static loff_t ext4_max_size(int blkbits) +{ + loff_t res; + loff_t upper_limit = MAX_LFS_FILESIZE; + + /* small i_blocks in vfs inode? */ + if (sizeof(blkcnt_t) sizeof(u64)) { + /* +* CONFIG_LSF is not enabled implies the inode +* i_block represent total blocks in 512 bytes +* 32 == size of vfs inode i_blocks * 8 +*/ + upper_limit = (1LL 32) - 1; + + /* total blocks in file system block size */ + upper_limit = (blkbits - 9); + upper_limit = blkbits; + } + + /* 32-bit extent-start container, ee_block */ + res = 1LL 32; + res = blkbits; + res -= 1; + + /* Sanity check against vm- vfs- imposed limits */ + if (res upper_limit) + res = upper_limit; + + return res; +} + +/* + * Maximal bitmap file size. There is a direct, and {,double-,triple-}indirect * block limit, and also a limit of (248 - 1) 512-byte sectors in i_blocks. * We need to be 1 filesystem block less than the 248 sector limit. */ -static loff_t ext4_max_size(int bits) +static loff_t ext4_max_bitmap_size(int bits) { loff_t res = EXT4_NDIR_BLOCKS; int meta_blocks; loff_t upper_limit; /* This is calculated to be the largest file size for a -* dense, file such that the total number of +* dense, bitmapped file such that the total number of * sectors in the file, including data and all indirect blocks, * does not exceed 248 -1 * __u32 i_blocks_lo and _u16 i_blocks_high representing the - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/3] ext4: Handle different max offsets for bitmap extent-based files
Basic approach: have both ext4_max_bitmap_size() and ext4_max_size() functions to compute max offsets for both types of formats. Use vfs sb-s_maxbytes for the native maxbytes, i.e. extent-format files. Put the smaller bitmap limit in a new sbi-s_bitmap_maxbytes in the ext4 superblock info structure. Catch bitmap files in ext4_file_write() and ext4_setattr() to limit extending writes, llseeks, and truncates to too-large offsets which the VFS let through due to the extent-format maxbytes. On write, allow writes up to the max, but then stop, by using iov_shorten() to limit the size of the write to the maximum. 3 patches follow: ext4_two_maxbytes_functions.patch - differentiate the maxbytes f'ns ext4_bitmap_maxbytes_vfs.patch - export iov_shorten from kernel ext4_bitmap_maxbytes.patch - store, and limit to, bitmap_maxbytes Comments? Thanks, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] ext4: (V2) store maxbytes for bitmapped files and return EFBIG as appropriate
Crud, minor off-by-one in ext4_file_write. V2 below: === Calculate store the max offset for bitmapped files, and catch too-large seeks, truncates, and writes in ext4, shortening or rejecting as appropriate. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/ext4/super.c === --- linux-2.6.24-rc3.orig/fs/ext4/super.c +++ linux-2.6.24-rc3/fs/ext4/super.c @@ -1979,6 +1979,7 @@ static int ext4_fill_super (struct super } } + sbi-s_bitmap_maxbytes = ext4_max_bitmap_size(sb-s_blocksize_bits); sb-s_maxbytes = ext4_max_size(sb-s_blocksize_bits); if (le32_to_cpu(es-s_rev_level) == EXT4_GOOD_OLD_REV) { Index: linux-2.6.24-rc3/fs/ext4/file.c === --- linux-2.6.24-rc3.orig/fs/ext4/file.c +++ linux-2.6.24-rc3/fs/ext4/file.c @@ -56,8 +56,29 @@ ext4_file_write(struct kiocb *iocb, cons ssize_t ret; int err; - ret = generic_file_aio_write(iocb, iov, nr_segs, pos); + /* +* If we have encountered a bitmap-format file, the size limit +* is smaller than s_maxbytes, which is for extent-mapped files. +*/ + if (!(EXT4_I(inode)-i_flags EXT4_EXTENTS_FL)) { + struct ext4_sb_info *sbi = EXT4_SB(inode-i_sb); + size_t length = iov_length(iov, nr_segs); + + if (pos = sbi-s_bitmap_maxbytes) { + if (length || pos sbi-s_bitmap_maxbytes) { + return -EFBIG; + } + /* zero-length writes at maxbytes are OK */ + } + + if (pos + length sbi-s_bitmap_maxbytes) { + nr_segs = iov_shorten((struct iovec *)iov, nr_segs, + sbi-s_bitmap_maxbytes - pos); + } + } + + ret = generic_file_aio_write(iocb, iov, nr_segs, pos); /* * Skip flushing if there was an error, or if nothing was written. */ Index: linux-2.6.24-rc3/fs/ext4/inode.c === --- linux-2.6.24-rc3.orig/fs/ext4/inode.c +++ linux-2.6.24-rc3/fs/ext4/inode.c @@ -309,7 +309,9 @@ static int ext4_block_to_path(struct ino offsets[n++] = i_block (ptrs - 1); final = ptrs; } else { - ext4_warning(inode-i_sb, ext4_block_to_path, block big); + ext4_warning(inode-i_sb, ext4_block_to_path, block %u max, + i_block + direct_blocks + + indirect_blocks + double_blocks); } if (boundary) *boundary = final - 1 - (i_block (ptrs - 1)); @@ -3212,6 +3214,17 @@ int ext4_setattr(struct dentry *dentry, ext4_journal_stop(handle); } + if (attr-ia_valid ATTR_SIZE) { + if (!(EXT4_I(inode)-i_flags EXT4_EXTENTS_FL)) { + struct ext4_sb_info *sbi = EXT4_SB(inode-i_sb); + + if (attr-ia_size sbi-s_bitmap_maxbytes) { + error = -EFBIG; + goto err_out; + } + } + } + if (S_ISREG(inode-i_mode) attr-ia_valid ATTR_SIZE attr-ia_size inode-i_size) { handle_t *handle; Index: linux-2.6.24-rc3/include/linux/ext4_fs_sb.h === --- linux-2.6.24-rc3.orig/include/linux/ext4_fs_sb.h +++ linux-2.6.24-rc3/include/linux/ext4_fs_sb.h @@ -38,6 +38,7 @@ struct ext4_sb_info { ext4_group_t s_groups_count;/* Number of groups in the fs */ unsigned long s_overhead_last; /* Last calculated overhead */ unsigned long s_blocks_last;/* Last seen block count */ + loff_t s_bitmap_maxbytes; /* max bytes for bitmap files */ struct buffer_head * s_sbh; /* Buffer containing the super block */ struct ext4_super_block * s_es; /* Pointer to the super block in the buffer */ struct buffer_head ** s_group_desc; - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ext4: Check for the correct error return from ext4_ext_get_blocks
Aneesh Kumar K.V wrote: ext4_ext_get_blocks returns negative values on error. We should check for = 0 Signed-off-by: Aneesh Kumar K.V [EMAIL PROTECTED] --- fs/ext4/extents.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index a2475d4..ce57245 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -2558,8 +2558,8 @@ retry: ret = ext4_ext_get_blocks(handle, inode, block, max_blocks, map_bh, EXT4_CREATE_UNINITIALIZED_EXT, 0); - WARN_ON(!ret); - if (!ret) { + WARN_ON(ret = 0); + if (ret = 0) { ext4_error(inode-i_sb, ext4_fallocate, ext4_ext_get_blocks returned 0! inode#%lu , block=%llu, max_blocks=%llu, Need to change the returned 0! part of the error to returned %d! then too, I guess? -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/4] ext4: accumulate jbd2 stats in jiffies
Accumulate jbd2 stats in jiffies not msecs, per akpm's suggestion. Convert to msecs on when displayed. jbd2_time_diff() should still be moved to a common header file. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/jbd2/checkpoint.c === --- linux-2.6.24-rc3.orig/fs/jbd2/checkpoint.c +++ linux-2.6.24-rc3/fs/jbd2/checkpoint.c @@ -326,7 +326,7 @@ int jbd2_log_do_checkpoint(journal_t *jo goto out; transaction = journal-j_checkpoint_transactions; if (transaction-t_chp_stats.cs_chp_time == 0) - transaction-t_chp_stats.cs_chp_time = CURRENT_MSECS; + transaction-t_chp_stats.cs_chp_time = jiffies; this_tid = transaction-t_tid; restart: /* Index: linux-2.6.24-rc3/fs/jbd2/commit.c === --- linux-2.6.24-rc3.orig/fs/jbd2/commit.c +++ linux-2.6.24-rc3/fs/jbd2/commit.c @@ -340,7 +340,7 @@ void jbd2_journal_commit_transaction(jou commit_transaction-t_state = T_LOCKED; stats.ts_wait = commit_transaction-t_max_wait; - stats.ts_locked = CURRENT_MSECS; + stats.ts_locked = jiffies; stats.ts_running = jbd2_time_diff(commit_transaction-t_start, stats.ts_locked); @@ -414,7 +414,7 @@ void jbd2_journal_commit_transaction(jou */ jbd2_journal_switch_revoke_table(journal); - stats.ts_flushing = CURRENT_MSECS; + stats.ts_flushing = jiffies; stats.ts_locked = jbd2_time_diff(stats.ts_locked, stats.ts_flushing); commit_transaction-t_state = T_FLUSH; @@ -508,7 +508,7 @@ void jbd2_journal_commit_transaction(jou */ commit_transaction-t_state = T_COMMIT; - stats.ts_logging = CURRENT_MSECS; + stats.ts_logging = jiffies; stats.ts_flushing = jbd2_time_diff(stats.ts_flushing, stats.ts_logging); stats.ts_blocks = commit_transaction-t_outstanding_credits; stats.ts_blocks_logged = 0; @@ -907,7 +907,7 @@ restart_loop: J_ASSERT(commit_transaction-t_state == T_COMMIT); - commit_transaction-t_start = CURRENT_MSECS; + commit_transaction-t_start = jiffies; stats.ts_logging = jbd2_time_diff(stats.ts_logging, commit_transaction-t_start); Index: linux-2.6.24-rc3/fs/jbd2/transaction.c === --- linux-2.6.24-rc3.orig/fs/jbd2/transaction.c +++ linux-2.6.24-rc3/fs/jbd2/transaction.c @@ -60,7 +60,7 @@ jbd2_get_transaction(journal_t *journal, J_ASSERT(journal-j_running_transaction == NULL); journal-j_running_transaction = transaction; transaction-t_max_wait = 0; - transaction-t_start = CURRENT_MSECS; + transaction-t_start = jiffies; return transaction; } @@ -87,7 +87,7 @@ static int start_this_handle(journal_t * int nblocks = handle-h_buffer_credits; transaction_t *new_transaction = NULL; int ret = 0; - unsigned long ts = CURRENT_MSECS; + unsigned long ts = jiffies; if (nblocks journal-j_max_transaction_buffers) { printk(KERN_ERR JBD: %s wants too many credits (%d %d)\n, Index: linux-2.6.24-rc3/include/linux/jbd2.h === --- linux-2.6.24-rc3.orig/include/linux/jbd2.h +++ linux-2.6.24-rc3/include/linux/jbd2.h @@ -626,16 +626,13 @@ struct transaction_stats_s #define ts_written u.chp.cs_written #define ts_dropped u.chp.cs_dropped -#define CURRENT_MSECS (jiffies_to_msecs(jiffies)) - -static inline unsigned int -jbd2_time_diff(unsigned int start, unsigned int end) +static inline unsigned long +jbd2_time_diff(unsigned long start, unsigned long end) { - if (unlikely(start end)) - end = end + (~0UL - start); - else - end -= start; - return end; + if (end = start) + return end - start; + + return end + (MAX_JIFFY_OFFSET - start); } /** Index: linux-2.6.24-rc3/fs/jbd2/journal.c === --- linux-2.6.24-rc3.orig/fs/jbd2/journal.c +++ linux-2.6.24-rc3/fs/jbd2/journal.c @@ -712,15 +712,19 @@ static int jbd2_seq_history_show(struct return 0; } if (ts-ts_type == JBD2_STATS_RUN) - seq_printf(seq, %-4s %-5lu %-5lu %-5lu %-5lu %-5lu %-5lu + seq_printf(seq, %-4s %-5lu %-5u %-5u %-5u %-5u %-5u %-6lu %-5lu %-5lu\n, R, ts-ts_tid, - ts-ts_wait, ts-ts_running, ts-ts_locked, - ts-ts_flushing, ts-ts_logging, + jiffies_to_msecs(ts-ts_wait), + jiffies_to_msecs(ts-ts_running
[PATCH 2/4] ext4: Address various akpm jbd2 stats comments
(Andrew, cc'ing you since these address your original review comments; feel free to just pick the patch up via Ted/Mingming if you prefer) Address several of akpm's comments on the jbd2 stats patch: o return -ENOMEM not -EIO on memory failure o avoid unneeded casts of void pointers o minor formatting changes o size bdevname char arrays with BDEVNAME_SIZE o use #ifdef vs. #if defined() for single test Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc3/fs/jbd2/journal.c === --- linux-2.6.24-rc3.orig/fs/jbd2/journal.c +++ linux-2.6.24-rc3/fs/jbd2/journal.c @@ -747,12 +747,12 @@ static int jbd2_seq_history_open(struct s = kmalloc(sizeof(*s), GFP_KERNEL); if (s == NULL) - return -EIO; + return -ENOMEM; size = sizeof(struct transaction_stats_s) * journal-j_history_max; s-stats = kmalloc(size, GFP_KERNEL); if (s-stats == NULL) { kfree(s); - return -EIO; + return -ENOMEM; } spin_lock(journal-j_history_lock); memcpy(s-stats, journal-j_history, size); @@ -762,7 +762,7 @@ static int jbd2_seq_history_open(struct rc = seq_open(file, jbd2_seq_history_ops); if (rc == 0) { - struct seq_file *m = (struct seq_file *)file-private_data; + struct seq_file *m = file-private_data; m-private = s; } else { kfree(s-stats); @@ -774,8 +774,9 @@ static int jbd2_seq_history_open(struct static int jbd2_seq_history_release(struct inode *inode, struct file *file) { - struct seq_file *seq = (struct seq_file *)file-private_data; + struct seq_file *seq = file-private_data; struct jbd2_stats_proc_session *s = seq-private; + kfree(s-stats); kfree(s); return seq_release(inode, file); @@ -802,6 +803,7 @@ static void *jbd2_seq_info_next(struct s static int jbd2_seq_info_show(struct seq_file *seq, void *v) { struct jbd2_stats_proc_session *s = seq-private; + if (v != SEQ_START_TOKEN) return 0; seq_printf(seq, %lu transaction, each upto %u blocks\n, @@ -847,12 +849,12 @@ static int jbd2_seq_info_open(struct ino s = kmalloc(sizeof(*s), GFP_KERNEL); if (s == NULL) - return -EIO; + return -ENOMEM; size = sizeof(struct transaction_stats_s); s-stats = kmalloc(size, GFP_KERNEL); if (s-stats == NULL) { kfree(s); - return -EIO; + return -ENOMEM; } spin_lock(journal-j_history_lock); memcpy(s-stats, journal-j_stats, size); @@ -861,7 +863,7 @@ static int jbd2_seq_info_open(struct ino rc = seq_open(file, jbd2_seq_info_ops); if (rc == 0) { - struct seq_file *m = (struct seq_file *)file-private_data; + struct seq_file *m = file-private_data; m-private = s; } else { kfree(s-stats); @@ -873,7 +875,7 @@ static int jbd2_seq_info_open(struct ino static int jbd2_seq_info_release(struct inode *inode, struct file *file) { - struct seq_file *seq = (struct seq_file *)file-private_data; + struct seq_file *seq = file-private_data; struct jbd2_stats_proc_session *s = seq-private; kfree(s-stats); kfree(s); @@ -892,7 +894,7 @@ static struct proc_dir_entry *proc_jbd2_ static void jbd2_stats_proc_init(journal_t *journal) { - char name[64]; + char name[BDEVNAME_SIZE]; snprintf(name, sizeof(name) - 1, %s, bdevname(journal-j_dev, name)); journal-j_proc_entry = proc_mkdir(name, proc_jbd2_stats); @@ -915,7 +917,7 @@ static void jbd2_stats_proc_init(journal static void jbd2_stats_proc_exit(journal_t *journal) { - char name[64]; + char name[BDEVNAME_SIZE]; snprintf(name, sizeof(name) - 1, %s, bdevname(journal-j_dev, name)); remove_proc_entry(info, journal-j_proc_entry); @@ -2207,7 +2209,7 @@ static void __exit jbd2_remove_debugfs_e #endif -#if defined(CONFIG_PROC_FS) +#ifdef CONFIG_PROC_FS #define JBD2_STATS_PROC_NAME fs/jbd2 - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC] support multiple max offset limits for a single superblock
Reiserfs, and now ext4, both have the possibility of encountering older on-disk format files which do not support the s_maxbytes of the newer formats. Reiserfs currently has spots in various places to catch these too-large offsets and reject them, but it's replicating a bit of code in the process. I could do the same for ext4, but Chris Mason prodded me to think of something more generic... this is what I came up with. A filesystem could then define a maxbytes i_op, and if present, it would return the max offset for that particular inode, based on format. I'm not wedded to this, but thought I'd send it out for comment. (BTW another option would be to convert old-format files when accessed, but that has its own set of tradeoffs...) Thanks, -Eric Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc1/fs/buffer.c === --- linux-2.6.24-rc1.orig/fs/buffer.c +++ linux-2.6.24-rc1/fs/buffer.c @@ -2175,7 +2175,7 @@ int generic_cont_expand_simple(struct in send_sig(SIGXFSZ, current, 0); goto out; } - if (size inode-i_sb-s_maxbytes) + if (size inode_maxbytes(inode)) goto out; err = pagecache_write_begin(NULL, mapping, size, 0, Index: linux-2.6.24-rc1/fs/open.c === --- linux-2.6.24-rc1.orig/fs/open.c +++ linux-2.6.24-rc1/fs/open.c @@ -399,7 +399,7 @@ asmlinkage long sys_fallocate(int fd, in ret = -EFBIG; /* Check for wrap through zero too */ - if (((offset + len) inode-i_sb-s_maxbytes) || ((offset + len) 0)) + if (((offset + len) inode_maxbytes(inode)) || ((offset + len) 0)) goto out_fput; if (inode-i_op inode-i_op-fallocate) Index: linux-2.6.24-rc1/fs/read_write.c === --- linux-2.6.24-rc1.orig/fs/read_write.c +++ linux-2.6.24-rc1/fs/read_write.c @@ -45,7 +45,7 @@ loff_t generic_file_llseek(struct file * offset += file-f_pos; } retval = -EINVAL; - if (offset=0 offset=inode-i_sb-s_maxbytes) { + if (offset = 0 offset = inode_maxbytes(inode)) { if (offset != file-f_pos) { file-f_pos = offset; file-f_version = 0; @@ -71,7 +71,7 @@ loff_t remote_llseek(struct file *file, offset += file-f_pos; } retval = -EINVAL; - if (offset=0 offset=file-f_path.dentry-d_inode-i_sb-s_maxbytes) { + if (offset=0 offset=inode_maxbytes(file-f_path.dentry-d_inode)) { if (offset != file-f_pos) { file-f_pos = offset; file-f_version = 0; @@ -764,7 +764,7 @@ static ssize_t do_sendfile(int out_fd, i goto fput_out; if (!max) - max = min(in_inode-i_sb-s_maxbytes, out_inode-i_sb-s_maxbytes); + max = min(inode_maxbytes(in_inode), inode_maxbytes(out_inode)); pos = *ppos; retval = -EINVAL; Index: linux-2.6.24-rc1/include/linux/fs.h === --- linux-2.6.24-rc1.orig/include/linux/fs.h +++ linux-2.6.24-rc1/include/linux/fs.h @@ -1217,8 +1217,17 @@ struct inode_operations { void (*truncate_range)(struct inode *, loff_t, loff_t); long (*fallocate)(struct inode *inode, int mode, loff_t offset, loff_t len); + unsigned long long (*maxbytes)(struct inode *inode); }; +static inline unsigned long long inode_maxbytes(struct inode *inode) +{ + if (unlikely(inode-i_op-maxbytes)) { + return inode-i_op-maxbytes(inode); + } else + return inode-i_sb-s_maxbytes; +} + struct seq_file; ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector, Index: linux-2.6.24-rc1/mm/filemap.c === --- linux-2.6.24-rc1.orig/mm/filemap.c +++ linux-2.6.24-rc1/mm/filemap.c @@ -1827,15 +1827,15 @@ inline int generic_write_checks(struct f * Linus frestrict idea will clean these up nicely.. */ if (likely(!isblk)) { - if (unlikely(*pos = inode-i_sb-s_maxbytes)) { - if (*count || *pos inode-i_sb-s_maxbytes) { + if (unlikely(*pos = inode_maxbytes(inode))) { + if (*count || *pos inode_maxbytes(inode)) { return -EFBIG; } - /* zero-length writes at -s_maxbytes are OK */ + /* zero-length writes at inode_maxbytes are OK */ } - if (unlikely(*pos + *count inode-i_sb-s_maxbytes)) - *count = inode-i_sb-s_maxbytes - *pos; + if (unlikely(*pos + *count
Re: Coverity testing freely available
Andreas Dilger wrote: Hi, I've recently found scan.coverity.com, which allows OSS-licensed projects to be run under Coverity to find defects available in the code. I think that e2fsprogs should be included there, even if some people run e2fsprogs under Coverity separately. This allows the e2fsprogs developers to find and fix the bugs more easily. I don't see e2fsprogs on the website yet. Anyone care if I go ahead ask that e2fsprogs be included? -Eric Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [EXT4 set 6][PATCH 1/1]Export jbd stats through procfs
Mingming Cao wrote: [PATCH] jbd2 stats through procfs The patch below updates the jbd stats patch to 2.6.20/jbd2. The initial patch was posted by Alex Tomas in December 2005 (http://marc.info/?l=linux-ext4m=113538565128617w=2). It provides statistics via procfs such as transaction lifetime and size. [ This probably should be rewritten to use debugfs? -- Ted] Signed-off-by: Johann Lombardi [EMAIL PROTECTED] I've started going through this one to clean it up to the point where it can go forward. It's been sitting at the top of the unstable portion of the patch queue for long enough, I think :) I've converted the msecs to jiffies until the user boundary, changed the union #defines as suggested by Andrew, and various other little issues etc. Remaining to do is a generic time-difference calculator (instead of jbd2_time_diff), and looking into whether it should be made a config option; I tend to think it should, but it's fairly well sprinkled through the code, so I'll see how well that works. Also did we ever decided if this should go to debugfs? Thanks, -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] fix up vfs inode version patch for modules
ext4 calls inode_inc_iversion(), but it's not exported, so modular ext4 has trouble. Any reason not to just make it an inline, as below? Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc1/fs/inode.c === --- linux-2.6.24-rc1.orig/fs/inode.c +++ linux-2.6.24-rc1/fs/inode.c @@ -1243,23 +1243,6 @@ void touch_atime(struct vfsmount *mnt, s EXPORT_SYMBOL(touch_atime); /** - * inode_inc_iversion - increments i_version - * @inode: inode that need to be updated - * - * Every time the inode is modified, the i_version field - * will be incremented. - * The filesystem has to be mounted with i_version flag - * - */ - -void inode_inc_iversion(struct inode *inode) -{ - spin_lock(inode-i_lock); - inode-i_version++; - spin_unlock(inode-i_lock); -} - -/** * file_update_time- update mtime and ctime time * @file: file accessed * Index: linux-2.6.24-rc1/include/linux/fs.h === --- linux-2.6.24-rc1.orig/include/linux/fs.h +++ linux-2.6.24-rc1/include/linux/fs.h @@ -1396,7 +1396,21 @@ static inline void inode_dec_link_count( mark_inode_dirty(inode); } -extern void inode_inc_iversion(struct inode *inode); +/** + * inode_inc_iversion - increments i_version + * @inode: inode that need to be updated + * + * Every time the inode is modified, the i_version field will be incremented. + * The filesystem has to be mounted with i_version flag + */ + +static inline void inode_inc_iversion(struct inode *inode) +{ + spin_lock(inode-i_lock); + inode-i_version++; + spin_unlock(inode-i_lock); +} + extern void touch_atime(struct vfsmount *mnt, struct dentry *dentry); static inline void file_accessed(struct file *file) { - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ext4: fix up EXT4FS_DEBUG builds
Builds with EXT4FS_DEBUG defined (to enable ext4_debug()) fail without these changes. Clean up some format warnings too. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc1/fs/ext4/balloc.c === --- linux-2.6.24-rc1.orig/fs/ext4/balloc.c +++ linux-2.6.24-rc1/fs/ext4/balloc.c @@ -1579,7 +1579,7 @@ ext4_fsblk_t ext4_new_blocks_old(handle_ sbi = EXT4_SB(sb); es = EXT4_SB(sb)-s_es; - ext4_debug(goal=%lu.\n, goal); + ext4_debug(goal=%llu.\n, goal); /* * Allocate a block from reservation only when * filesystem is mounted with reservation(default,-o reservation), and @@ -1689,7 +1689,7 @@ retry_alloc: allocated: - ext4_debug(using block group %d(%d)\n, + ext4_debug(using block group %lu(%d)\n, group_no, gdp-bg_free_blocks_count); BUFFER_TRACE(gdp_bh, get_write_access); @@ -1880,7 +1880,7 @@ ext4_fsblk_t ext4_count_free_blocks(stru brelse(bitmap_bh); printk(ext4_count_free_blocks: stored = %llu , computed = %llu, %llu\n, - EXT4_FREE_BLOCKS_COUNT(es), + ext4_free_blocks_count(es), desc_count, bitmap_count); return bitmap_count; #else Index: linux-2.6.24-rc1/fs/ext4/ialloc.c === --- linux-2.6.24-rc1.orig/fs/ext4/ialloc.c +++ linux-2.6.24-rc1/fs/ext4/ialloc.c @@ -857,7 +857,7 @@ unsigned long ext4_count_free_inodes (st continue; x = ext4_count_free(bitmap_bh, EXT4_INODES_PER_GROUP(sb) / 8); - printk(group %d: stored = %d, counted = %lu\n, + printk(group %lu: stored = %d, counted = %lu\n, i, le16_to_cpu(gdp-bg_free_inodes_count), x); bitmap_count += x; } Index: linux-2.6.24-rc1/fs/ext4/mballoc.c === --- linux-2.6.24-rc1.orig/fs/ext4/mballoc.c +++ linux-2.6.24-rc1/fs/ext4/mballoc.c @@ -2953,9 +2953,6 @@ static int ext4_mb_mark_diskspace_used(s sbi = EXT4_SB(sb); es = sbi-s_es; - ext4_debug(using block group %d(%d)\n, ac-ac_b_group.group, - gdp-bg_free_blocks_count); - err = -EIO; bitmap_bh = read_block_bitmap(sb, ac-ac_b_ex.fe_group); if (!bitmap_bh) @@ -2970,6 +2967,9 @@ static int ext4_mb_mark_diskspace_used(s if (!gdp) goto out_err; + ext4_debug(using block group %lu(%d)\n, ac-ac_b_ex.fe_group, + gdp-bg_free_blocks_count); + err = ext4_journal_get_write_access(handle, gdp_bh); if (err) goto out_err; Index: linux-2.6.24-rc1/fs/ext4/resize.c === --- linux-2.6.24-rc1.orig/fs/ext4/resize.c +++ linux-2.6.24-rc1/fs/ext4/resize.c @@ -206,7 +206,7 @@ static int setup_new_group_blocks(struct } if (ext4_bg_has_super(sb, input-group)) { - ext4_debug(mark backup superblock %#04lx (+0)\n, start); + ext4_debug(mark backup superblock %#04llx (+0)\n, start); ext4_set_bit(0, bh-b_data); } @@ -215,7 +215,7 @@ static int setup_new_group_blocks(struct i gdblocks; i++, block++, bit++) { struct buffer_head *gdb; - ext4_debug(update backup group %#04lx (+%d)\n, block, bit); + ext4_debug(update backup group %#04llx (+%d)\n, block, bit); if ((err = extend_or_restart_transaction(handle, 1, bh))) goto exit_bh; @@ -243,7 +243,7 @@ static int setup_new_group_blocks(struct i reserved_gdb; i++, block++, bit++) { struct buffer_head *gdb; - ext4_debug(clear reserved block %#04lx (+%d)\n, block, bit); + ext4_debug(clear reserved block %#04llx (+%d)\n, block, bit); if ((err = extend_or_restart_transaction(handle, 1, bh))) goto exit_bh; @@ -256,10 +256,10 @@ static int setup_new_group_blocks(struct ext4_set_bit(bit, bh-b_data); brelse(gdb); } - ext4_debug(mark block bitmap %#04x (+%ld)\n, input-block_bitmap, + ext4_debug(mark block bitmap %#04llx (+%llu)\n, input-block_bitmap, input-block_bitmap - start); ext4_set_bit(input-block_bitmap - start, bh-b_data); - ext4_debug(mark inode bitmap %#04x (+%ld)\n, input-inode_bitmap, + ext4_debug(mark inode bitmap %#04llx (+%llu)\n, input-inode_bitmap, input-inode_bitmap - start); ext4_set_bit(input-inode_bitmap - start, bh-b_data); @@ -268,7 +268,7 @@ static int setup_new_group_blocks(struct i sbi-s_itb_per_group; i++, bit++, block++) { struct buffer_head
Re: e2fsprogs-interim scm tree?
Theodore Tso wrote: On Tue, Nov 20, 2007 at 11:24:03AM +0100, Christian Kujau wrote: Note that the userspace code still needs a lot of work. It's for this reason that I haven't been recommending people use it for production systems just yet. Understood. But the more ppl testing your stuff the better, right? Oh, absolutely. I just don't think it's fair to encourage people to use something that might cause them to risk their data unless they are going into it with their eyes wide open. Please do report any problems you find, though. Ted has said that the sourceforge bugtracker is the right place to do this for now. -Eric (remembering he has a bug to report, too) - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] types fixup for mballoc
I ran into a potential overflow in ext4_mb_scan_aligned, and went looking for others in mballoc. This patch hits a few spots, compile-tested only at this point, comments welcome. This patch: changes fe_len to an int, I don't think we need it to be a long, looking at how it's used (should it be a grpblk_t?) Also change anything assigned to return value of mb_find_extent, since it returns fe_len. changes anything that does groupno * EXT4_BLOCKS_PER_GROUP or pa-pa_pstart + whatever to an ext4_fsblk_t fixes up any related formats The change to ext4_mb_scan_aligned to add a modulo to avoid an overflow may be too clever... it could use an extra look I think. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] --- Index: linux-2.6.24-rc1/fs/ext4/mballoc.c === --- linux-2.6.23.orig/fs/ext4/mballoc.c +++ linux-2.6.23/fs/ext4/mballoc.c @@ -363,7 +363,7 @@ struct ext4_free_extent { ext4_lblk_t fe_logical; ext4_grpblk_t fe_start; ext4_grpnum_t fe_group; - unsigned long fe_len; + int fe_len; }; /* @@ -586,14 +586,14 @@ static void mb_free_blocks_double(struct BUG_ON(!ext4_is_group_locked(sb, e4b-bd_group)); for (i = 0; i count; i++) { if (!mb_test_bit(first + i, e4b-bd_info-bb_bitmap)) { - unsigned long blocknr; + ext4_fsblk_t blocknr; blocknr = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb); blocknr += first + i; blocknr += le32_to_cpu(EXT4_SB(sb)-s_es-s_first_data_block); ext4_error(sb, __FUNCTION__, double-free of inode - %lu's block %lu(bit %u in group %lu)\n, + %lu's block %llu(bit %u in group %lu)\n, inode ? inode-i_ino : 0, blocknr, first + i, e4b-bd_group); } @@ -1226,14 +1226,14 @@ static int mb_free_blocks(struct inode * order = 0; if (!mb_test_bit(block, EXT4_MB_BITMAP(e4b))) { - unsigned long blocknr; + ext4_fsblk_t blocknr; blocknr = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb); blocknr += block; blocknr += le32_to_cpu(EXT4_SB(sb)-s_es-s_first_data_block); ext4_error(sb, __FUNCTION__, double-free of inode - %lu's block %lu(bit %u in group %lu)\n, + %lu's block %llu(bit %u in group %lu)\n, inode ? inode-i_ino : 0, blocknr, block, e4b-bd_group); } @@ -1416,7 +1416,7 @@ static void ext4_mb_use_best_found(struc struct ext4_buddy *e4b) { struct ext4_sb_info *sbi = EXT4_SB(ac-ac_sb); - unsigned long ret; + int ret; BUG_ON(ac-ac_b_ex.fe_group != e4b-bd_group); BUG_ON(ac-ac_status == AC_STATUS_FOUND); @@ -1609,7 +1609,7 @@ static int ext4_mb_find_by_goal(struct e ac-ac_g_ex.fe_len, ex); if (max = ac-ac_g_ex.fe_len ac-ac_g_ex.fe_len == sbi-s_stripe) { - unsigned long start; + ext4_fsblk_t start; start = (e4b-bd_group * EXT4_BLOCKS_PER_GROUP(ac-ac_sb) + ex.fe_start + le32_to_cpu(es-s_first_data_block)); if (start % sbi-s_stripe == 0) { @@ -1732,13 +1732,14 @@ static void ext4_mb_scan_aligned(struct struct ext4_sb_info *sbi = EXT4_SB(sb); void *bitmap = EXT4_MB_BITMAP(e4b); struct ext4_free_extent ex; - unsigned long i; - unsigned long max; + ext4_grpblk_t i; + int max; BUG_ON(sbi-s_stripe == 0); - /* find first stripe-aligned block */ - i = e4b-bd_group * EXT4_BLOCKS_PER_GROUP(sb) + /* find first stripe-aligned block in group */ + /* early modulo here to avoid 32-bit overflow */ + i = (e4b-bd_group % sbi-s_stripe) * EXT4_BLOCKS_PER_GROUP(sb) + le32_to_cpu(sbi-s_es-s_first_data_block); i = ((i + sbi-s_stripe - 1) / sbi-s_stripe) * sbi-s_stripe; i = (i - le32_to_cpu(sbi-s_es-s_first_data_block)) @@ -2026,35 +2027,35 @@ static int ext4_mb_seq_history_show(stru if (hs-op == EXT4_MB_HISTORY_ALLOC) { fmt = %-5u %-8u %-23s %-23s %-23s %-5u %-5u %-2u %-5u %-5s %-5u %-6u\n; - sprintf(buf2, %lu/%d/[EMAIL PROTECTED], hs-result.fe_group, + sprintf(buf2, %lu/%d/[EMAIL PROTECTED], hs-result.fe_group, hs-result.fe_start, hs-result.fe_len, - (unsigned long)hs-result.fe_logical
Re: [PATCH] types fixup for mballoc
Eric Sandeen wrote: I ran into a potential overflow in ext4_mb_scan_aligned, and went looking for others in mballoc. This patch hits a few spots, compile-tested only at this point, comments welcome. This patch: ... introduces a 64-bit divide. Oops... will fix that up resend. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
delalloc space accounting problem.
It appears that delalloc lets me copy 50M of data onto a 30M filesystem; at least I never get ENOSPC back, although I wind up with several files that have 1M length but 0 blocks. I've filed a bug in the kernel bug tracker, I think we could use a central place to track issues: http://bugzilla.kernel.org/show_bug.cgi?id=9329 I'll try to find time to look into this unless someone knows offhand where the problem is... Thanks, -Eric (p.s. should I get ext[234] bug mail routed to this list, or would that be annoying?) - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix check_mntent_file() to pass mode for open(O_CREAT)
Andreas Dilger wrote: On my FC8 install, ismounted.c fails to build because open(O_CREAT) is used without passing a mode. The following trivial patch fixes it. You can add: Acked-by: Eric Sandeen [EMAIL PROTECTED] 'cause it's an awful lot like the patch I sent for the same issue back on 8/16 ;-) Guess I should have followed that up with a ping. (though your 0600 mode is probably better than my 0644 was) Andreas, did you also run into trouble with struct_io_manager's -open calls triggering this test? I sent a patch for that, [PATCH] rename -open and -close ops in struct_io_manager too... maybe the glibc #define tricks got smarter and don't trigger that now? -Eric Signed-off-by: Andreas Dilger [EMAIL PROTECTED] Index: e2fsprogs-1.40.2/lib/ext2fs/ismounted.c === --- e2fsprogs-1.40.2.orig/lib/ext2fs/ismounted.c +++ e2fsprogs-1.40.2/lib/ext2fs/ismounted.c @@ -147,7 +147,7 @@ static errcode_t check_mntent_file(const is_root: #define TEST_FILE /.ismount-test-file *mount_flags |= EXT2_MF_ISROOT; - fd = open(TEST_FILE, O_RDWR|O_CREAT); + fd = open(TEST_FILE, O_RDWR|O_CREAT, 0600); if (fd 0) { if (errno == EROFS) *mount_flags |= EXT2_MF_READONLY; Cheers, Andreas -- Andreas Dilger Sr. Software Engineer, Lustre Group Sun Microsystems of Canada, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delalloc fragmenting files?
Andreas Dilger wrote: On Nov 06, 2007 13:54 -0600, Eric Sandeen wrote: Hmm bad news is when I add uninit_groups into the mix, it goes a little south again, with some out-of-order extents. Not the end of the world, but a little unexpected? I think part of the issue is that by default the groups marked BLOCK_UNINIT are skipped, to avoid dirtying those groups if they have never been used before. This policy could be changed in the mballoc code pretty easily if you think it is a net loss. Note that the size of the extents is large enough (120MB or more) that some small reordering is probably not going to affect the performance in any meaningful way. You're probably right; on the other hand, this is about the simplest test an allocator could wish for - a single-threaded large linear write in big IO chunks. In this case it's probably not a big deal; I do wonder how it might affect the bigger picture though, with more writing threads, aged filesystems, and the like. Just thought it was worth pointing out, as I started looking at allocator behavior in the simple/isolated/unrealistic :) cases. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix oops on corrupted ext4 mount
Eric Sandeen wrote: When mounting an ext4 filesystem with corrupted s_first_data_block, things can go very wrong and oops. Because blocks_count in ext4_fill_super is a u64, and we must use do_div, the calculation of db_count is done differently than on ext4. Urgh... than on ext3 -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
More testing: 4x parallel 2G writes, sequential reads
I tried ext4 vs. xfs doing 4 parallel 2G IO writes in 1M units to 4 different subdirectories of the root of the filesystem: http://people.redhat.com/esandeen/seekwatcher/ext4_4_threads.png http://people.redhat.com/esandeen/seekwatcher/xfs_4_threads.png http://people.redhat.com/esandeen/seekwatcher/ext4_xfs_4_threads.png and then read them back sequentially: http://people.redhat.com/esandeen/seekwatcher/ext4_4_threads_read.png http://people.redhat.com/esandeen/seekwatcher/xfs_4_threads_read.png http://people.redhat.com/esandeen/seekwatcher/ext4_xfs_4_read_threads.png At the end of the write, ext4 had on the order of 400 extents/file, xfs had on the order of 30 extents/file. It's clear especially from the read graph that ext4 is interleaving the 4 files, in about 5M chunks on average. Throughput seems comparable between ext4 xfs nonetheless. Again this was on a decent HW raid so seek penalties are probably not too bad. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More testing: 4x parallel 2G writes, sequential reads
Andreas Dilger wrote: The question is what the best result is for this kind of workload? In HPC applications the common case is that you will also have the data files read back in parallel instead of serially. Agreed, I'm not trying to argue what's better or worse, I'm just seeing what it's doing. The main reason I did sequential reads back is that it more clearly shows the file layout for each file on the graph. :) I'm just getting a handle on how the allocations are going for various types of writes. The test shows ext4 finishing marginally faster in the write case, and marginally slower in the read case. What happens if you have 4 parallel readers? I'll test that a bit later (have to run now); I expect parallel readers may go faster, since the blocks are interleaved, and it might be able to suck them up pretty much in order across all 4 files. I'd also like to test some of this under a single head, rather than on HW raid... -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More testing: 4x parallel 2G writes, sequential reads
Shapor Naghibzadeh wrote: On Wed, Nov 07, 2007 at 04:42:59PM -0600, Eric Sandeen wrote: Again this was on a decent HW raid so seek penalties are probably not too bad. You may want to verify that by doing a benchmark on the raw device. I recently did some benchmarks doing random I/O on a Dell 2850 w/ a PERC (megaraid) RAID5 w/ 128MB onboard writeback cache and 6x 15krpm drives and noticed appoximately one order of magnitude throughput drop on small (stripe-sized) random reads versus linear. It maxed out at ~100 random read IOPs or seeks/sec (suprisingly low). Out of curiousity, how are you counting the seeks? Chris Mason's seekwatcher (google can find it for you) is doing the graphing, it uses blocktrace for the raw data. -Eric Shapor - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More testing: 4x parallel 2G writes, sequential reads
Andreas Dilger wrote: The test shows ext4 finishing marginally faster in the write case, and marginally slower in the read case. What happens if you have 4 parallel readers? http://people.redhat.com/esandeen/seekwatcher/ext4_4_thread_par_read.png http://people.redhat.com/esandeen/seekwatcher/xfs_4_thread_par_read.png http://people.redhat.com/esandeen/seekwatcher/ext4_xfs_4_thread_par_read.png -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delalloc fragmenting files?
Eric Sandeen wrote: Alex Tomas wrote: thanks for the data. the attached patch should fix couple issues: broken history output and policy in that smaller usable chunk to be used. can you give it a spin, please? Hi Alex - This looks *much* better: Hmm bad news is when I add uninit_groups into the mix, it goes a little south again, with some out-of-order extents. Not the end of the world, but a little unexpected? Discontinuity: Block 1430784 is at 24183810 (was 24181761) Discontinuity: Block 1461760 is at 24216578 (was 24214785) Discontinuity: Block 1492480 is at 37888 (was 24247297) Discontinuity: Block 1519616 is at 850944 (was 65023) Discontinuity: Block 1520640 is at 883712 (was 851967) Discontinuity: Block 1521664 is at 1670144 (was 884735) Discontinuity: Block 1522688 is at 2685952 (was 1671167) Discontinuity: Block 1523712 is at 4226048 (was 2686975) Discontinuity: Block 1524736 is at 11271168 (was 4227071) Discontinuity: Block 1525760 is at 23952384 (was 11272191) ... -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delalloc fragmenting files?
Alex Tomas wrote: thanks for the data. the attached patch should fix couple issues: broken history output and policy in that smaller usable chunk to be used. can you give it a spin, please? Hi Alex - This looks *much* better: First block: 100355 Last block: 2342657 Discontinuity: Block 30208 is at 133120 (was 130562) Discontinuity: Block 60928 is at 165891 (was 163839) Discontinuity: Block 90368 is at 197634 (was 195330) Discontinuity: Block 121856 is at 232448 (was 229121) Discontinuity: Block 150528 is at 263170 (was 261119) Discontinuity: Block 181504 is at 296963 (was 294145) Discontinuity: Block 210944 is at 328706 (was 326402) Discontinuity: Block 241664 is at 361474 (was 359425) Discontinuity: Block 272896 is at 395264 (was 392705) Discontinuity: Block 303360 is at 427010 (was 425727) Discontinuity: Block 334080 is at 459778 (was 457729) Discontinuity: Block 365568 is at 493568 (was 491265) Discontinuity: Block 395264 is at 525314 (was 523263) Discontinuity: Block 426240 is at 558082 (was 556289) for an 8192x1M (8G) buffered dd, I now get 88 extents, in order. Do you want the mballoc history? Now I can try some multithreaded tests :) Thanks, -Eric thanks, Alex - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delalloc fragmenting files?
Eric Sandeen wrote: Alex Tomas wrote: thanks for the data. the attached patch should fix couple issues: broken history output and policy in that smaller usable chunk to be used. can you give it a spin, please? Hi Alex - This looks *much* better: (nb: that's running w/ both of the patches you sent on this thread) -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delalloc fragmenting files?
Andreas Dilger wrote: On a related note - the FIEMAP patches to filefrag also include a new output format that is much more useful, IMHO. The new format is like: {filename} ext: [logical start.. end kB]: phys start..end kB: kB:lun: flags 0: [ 0.. 30207]: 401416..522251: 120828: 0 : 1: [ 30208.. 60927]: 532480..655359: 122880: 0 : 2: [ 60928..121855]: 790536..916484: 125948: 0 : Hopefully Kalpak will be able to post the updated patches here soon. yep, I hacked existing filefrag to do something like this, the existing format is pretty hard to glance over :) One thing I like about xfs_bmap is that it can tell you which Allocation Group the blocks are in; most filesystems have some concept of sub-regions of the filesystem, such as BGs or resource groups or whatnot - do you think there is room for this in the FIEMAP interface? Hm, or should this just be calculated from knowing the size of the sub-regions... -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] allow tune2fs to set/clear resize_inode
(this patch has been carried in Red Hat / Fedora rpms for a while, not sure why it never got sent upstream... clearing this allows for mounting filesystems with a resize_inode on older systems) - Allow tune2fs to set clear resize_inode; requires fsck afterwards. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] Addresses-Red-Hat-Bugzilla: #167816 Index: e2fsprogs-git/misc/tune2fs.c === --- e2fsprogs-git.orig/misc/tune2fs.c +++ e2fsprogs-git/misc/tune2fs.c @@ -96,6 +96,7 @@ static void usage(void) static __u32 ok_features[3] = { EXT3_FEATURE_COMPAT_HAS_JOURNAL | + EXT2_FEATURE_COMPAT_RESIZE_INODE | EXT2_FEATURE_COMPAT_DIR_INDEX, /* Compat */ EXT2_FEATURE_INCOMPAT_FILETYPE, /* Incompat */ EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER /* R/O compat */ @@ -283,6 +284,7 @@ static void update_feature_set(ext2_fils { int sparse, old_sparse, filetype, old_filetype; int journal, old_journal, dxdir, old_dxdir; + int resize_inode, old_resize_inode; struct ext2_super_block *sb= fs-super; __u32 old_compat, old_incompat, old_ro_compat; @@ -298,6 +300,8 @@ static void update_feature_set(ext2_fils EXT3_FEATURE_COMPAT_HAS_JOURNAL; old_dxdir = sb-s_feature_compat EXT2_FEATURE_COMPAT_DIR_INDEX; + old_resize_inode = sb-s_feature_compat + EXT2_FEATURE_COMPAT_RESIZE_INODE; if (e2p_edit_feature(features, sb-s_feature_compat, ok_features)) { fprintf(stderr, _(Invalid filesystem option set: %s\n), @@ -312,6 +316,8 @@ static void update_feature_set(ext2_fils EXT3_FEATURE_COMPAT_HAS_JOURNAL; dxdir = sb-s_feature_compat EXT2_FEATURE_COMPAT_DIR_INDEX; + resize_inode = sb-s_feature_compat + EXT2_FEATURE_COMPAT_RESIZE_INODE; if (old_journal !journal) { if ((mount_flags EXT2_MF_MOUNTED) !(mount_flags EXT2_MF_READONLY)) { @@ -358,7 +364,8 @@ static void update_feature_set(ext2_fils sb-s_feature_incompat)) ext2fs_update_dynamic_rev(fs); if ((sparse != old_sparse) || - (filetype != old_filetype)) { + (filetype != old_filetype) || + (resize_inode != old_resize_inode)) { sb-s_state = ~EXT2_VALID_FS; printf(\n%s\n, _(please_fsck)); } Index: e2fsprogs-git/misc/tune2fs.8.in === --- e2fsprogs-git.orig/misc/tune2fs.8.in +++ e2fsprogs-git/misc/tune2fs.8.in @@ -392,12 +392,16 @@ option. .TP .B sparse_super Limit the number of backup superblocks to save space on large filesystems. +.TP +.B resize_inode +Reserve space so the block group descriptor table may grow in the future. .RE .IP After setting or clearing -.B sparse_super -and -.B filetype +.BR sparse_super , +.BR filetype , +or +.B resize_inode filesystem features, .BR e2fsck (8) must be run on the filesystem to return the filesystem to a consistent state. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH e2fsprogs] - remove timestamps from .po files
Another one that's been in RH/Fedora specfiles a while. Remove timestamps from .po files to avoid multilib conflicts. It ain't pretty but it works. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] Addresses-Red-Hat-Bugzilla: #245653 Index: e2fsprogs-git/po/Makefile.in.in === --- e2fsprogs-git.orig/po/Makefile.in.in +++ e2fsprogs-git/po/Makefile.in.in @@ -118,10 +118,12 @@ $(DOMAIN).pot-update: $(POTFILES) $(srcd rm -f $(DOMAIN).1po $(DOMAIN).2po $(DOMAIN).po; \ else \ rm -f $(DOMAIN).1po $(DOMAIN).2po $(srcdir)/$(DOMAIN).pot \ - mv $(DOMAIN).po $(srcdir)/$(DOMAIN).pot; \ + sed -f remove-potcdate.sed $(DOMAIN).po $(srcdir)/$(DOMAIN).pot \ + rm -f $(DOMAIN).po; \ fi; \ else \ - mv $(DOMAIN).po $(srcdir)/$(DOMAIN).pot; \ + sed -f remove-potcdate.sed $(DOMAIN).po $(srcdir)/$(DOMAIN).pot \ + rm -f $(DOMAIN).po; \ fi; \ } - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC e2fsprogs] - replace MKINSTALLDIRS with MKDIR_P for newer autotools
From: Thomas Woerner [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] (Thomas was the original author of this patch) I do not claim to be an autotools guru, hence the RFC. RH bugzilla #220715 claims that newer autoconf/automake no longer support MKINSTALLDIRS, and the changelog seems to confirm that, as does testing. :) The following is the patch we've been carrying for Fedora: Signed-off-by: Eric Sandeen [EMAIL PROTECTED] Addresses-Red-Hat-Bugzilla: #220715 --- Index: e2fsprogs-git/debugfs/Makefile.in === --- e2fsprogs-git.orig/debugfs/Makefile.in +++ e2fsprogs-git/debugfs/Makefile.in @@ -48,8 +48,8 @@ debugfs.8: $(DEP_SUBSTITUTE) $(srcdir)/d @$(SUBSTITUTE_UPTIME) $(srcdir)/debugfs.8.in debugfs.8 installdirs: - @echo MKINSTALLDIRS $(root_sbindir) $(man8dir) - @$(MKINSTALLDIRS) $(DESTDIR)$(root_sbindir) \ + @echo MKDIR_P $(root_sbindir) $(man8dir) + @$(MKDIR_P) $(DESTDIR)$(root_sbindir) \ $(DESTDIR)$(man8dir) install: $(PROGS) $(MANPAGES) installdirs Index: e2fsprogs-git/intl/Makefile.in === --- e2fsprogs-git.orig/intl/Makefile.in +++ e2fsprogs-git/intl/Makefile.in @@ -40,8 +40,8 @@ subdir = intl INSTALL = @INSTALL@ INSTALL_DATA = @INSTALL_DATA@ -MKINSTALLDIRS = @MKINSTALLDIRS@ -mkinstalldirs = $(SHELL) $(MKINSTALLDIRS) +MKDIR_P = @MKDIR_P@ +mkdir_p = @MKDIR_P@ l = @INTL_LIBTOOL_SUFFIX_PREFIX@ Index: e2fsprogs-git/po/Makefile.in.in === --- e2fsprogs-git.orig/po/Makefile.in.in +++ e2fsprogs-git/po/Makefile.in.in @@ -26,11 +26,10 @@ datarootdir = @datarootdir@ datadir = @datadir@ localedir = $(datadir)/locale gettextsrcdir = $(datadir)/gettext/po +mkdir_p = @MKDIR_P@ INSTALL = @INSTALL@ INSTALL_DATA = @INSTALL_DATA@ -MKINSTALLDIRS = @MKINSTALLDIRS@ -mkinstalldirs = $(SHELL) $(MKINSTALLDIRS) GMSGFMT = @GMSGFMT@ MSGFMT = @MSGFMT@ @@ -150,7 +149,7 @@ install: install-exec install-data install-exec: install-data: [EMAIL PROTECTED]@ if test $(PACKAGE) = gettext-tools; then \ - $(mkinstalldirs) $(DESTDIR)$(gettextsrcdir); \ + $(mkdir_p) $(DESTDIR)$(gettextsrcdir); \ for file in $(DISTFILES.common) Makevars.template; do \ $(INSTALL_DATA) $(srcdir)/$$file \ $(DESTDIR)$(gettextsrcdir)/$$file; \ @@ -163,13 +162,13 @@ install-data: [EMAIL PROTECTED]@ fi install-data-no: all install-data-yes: all - $(mkinstalldirs) $(DESTDIR)$(datadir) + $(mkdir_p) $(DESTDIR)$(datadir) @catalogs='$(CATALOGS)'; \ for cat in $$catalogs; do \ cat=`basename $$cat`; \ lang=`echo $$cat | sed -e 's/\.gmo$$//'`; \ dir=$(localedir)/$$lang/LC_MESSAGES; \ - $(mkinstalldirs) $(DESTDIR)$$dir; \ + $(mkdir_p) $(DESTDIR)$$dir; \ if test -r $$cat; then realcat=$$cat; else realcat=$(srcdir)/$$cat; fi; \ $(INSTALL_DATA) $$realcat $(DESTDIR)$$dir/$(DOMAIN).mo; \ echo installing $$realcat as $(DESTDIR)$$dir/$(DOMAIN).mo; \ @@ -209,19 +208,19 @@ installdirs: installdirs-exec installdir installdirs-exec: installdirs-data: [EMAIL PROTECTED]@ if test $(PACKAGE) = gettext-tools; then \ - $(mkinstalldirs) $(DESTDIR)$(gettextsrcdir); \ + $(mkdir_p) $(DESTDIR)$(gettextsrcdir); \ else \ : ; \ fi installdirs-data-no: installdirs-data-yes: - $(mkinstalldirs) $(DESTDIR)$(datadir) + $(mkdir_p) $(DESTDIR)$(datadir) @catalogs='$(CATALOGS)'; \ for cat in $$catalogs; do \ cat=`basename $$cat`; \ lang=`echo $$cat | sed -e 's/\.gmo$$//'`; \ dir=$(localedir)/$$lang/LC_MESSAGES; \ - $(mkinstalldirs) $(DESTDIR)$$dir; \ + $(mkdir_p) $(DESTDIR)$$dir; \ for lc in '' $(EXTRA_LOCALE_CATEGORIES); do \ if test -n $$lc; then \ if (cd $(DESTDIR)$(localedir)/$$lang LC_ALL=C ls -l -d $$lc 2/dev/null) | grep ' - ' /dev/null; then \ Index: e2fsprogs-git/misc/Makefile.in === --- e2fsprogs-git.orig/misc/Makefile.in +++ e2fsprogs-git/misc/Makefile.in @@ -234,8 +234,8 @@ filefrag.8: $(DEP_SUBSTITUTE) $(srcdir)/ @$(SUBSTITUTE_UPTIME) $(srcdir)/filefrag.8.in filefrag.8 installdirs: - @echo MKINSTALLDIRS $(sbindir) $(root_sbindir) $(bindir) $(man1dir) $(man8dir) $(libdir) $(root_sysconfdir) - @$(MKINSTALLDIRS) $(DESTDIR)$(sbindir) \ + @echo MKDIR_P $(sbindir) $(root_sbindir) $(bindir) $(man1dir) $(man8dir) $(libdir) $(root_sysconfdir) + @$(MKDIR_P) $(DESTDIR)$(sbindir) \ $(DESTDIR)$(root_sbindir) $(DESTDIR)$(bindir) \ $(DESTDIR)$(man1dir) $(DESTDIR)$(man8dir) \ $(DESTDIR)$(man1dir) $(DESTDIR)$(man5dir) \ Index: e2fsprogs-git/e2fsck
Re: [PATCH] allow tune2fs to set/clear resize_inode
Andreas Dilger wrote: On Nov 05, 2007 14:45 -0600, Eric Sandeen wrote: (this patch has been carried in Red Hat / Fedora rpms for a while, not sure why it never got sent upstream... clearing this allows for mounting filesystems with a resize_inode on older systems) Allow tune2fs to set clear resize_inode; requires fsck afterwards. Two things that are a bit confusing: - since RESIZE_INODE is a COMPAT feature, I don't see how this affects mounting the filesystem on older systems? Is there a bug somewhere? Hmm there is a RHEL3 bug, long since closed with a patch like this one (actually less fleshed out)... but you're right, this should work. (quick test) ...and it does. Grrr, sorry, assumed the patch was there for a reason! I'll retract this patch, I don't think it's needed. Sorry for the noise. -Eric - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delalloc fragmenting files?
Alex Tomas wrote: Eric, would you mind to repeat the run and then grab /proc/fs/ext4/dev/mb_history? thanks in advance, Alex Sure thing, attached; this is from a 1024x1M run, wound up with 32 fragments, out of order: First block: 122880 Last block: 491519 Discontinuity: Block 7424 is at 114688 (was 130303) Discontinuity: Block 15616 is at 106496 (was 122879) Discontinuity: Block 23552 is at 102400 (was 114431) Discontinuity: Block 26624 is at 155648 (was 105471) Discontinuity: Block 33792 is at 147456 (was 162815) Discontinuity: Block 41472 is at 162816 (was 155135) Discontinuity: Block 42496 is at 188416 (was 163839) Discontinuity: Block 50176 is at 180224 (was 196095) Discontinuity: Block 58368 is at 172032 (was 188415) Discontinuity: Block 66560 is at 167936 (was 180223) pid 3695 is pdflush, I assume 7634 was the dd itself. -Eric mb_history.bz2 Description: application/bzip