Re: btrfs hang on brd
On 01/06/11 13:07, Adrian Hunter wrote:
> On 01/06/11 11:54, David Sterba wrote:
>> On Tue, May 31, 2011 at 10:03:12AM +0300, Adrian Hunter wrote:
>>> Hi
>>>
>>> I seem to be able to get btrfs reproducibly to produce warnings
>>> and finally hang when running a stress test on a ramdisk.
>>>
>>> Testing was done using the integration-test branch of btrfs-unstable.
>>> Note that I also tested v2.6.39 and integration-test took much
>>> longer to hang, i.e. it is an improvement.
>>>
>>> The test script and stack dumps are below.
>>>
>>> Is this a valid test? Is it worth me investigating these?
>>
>> I've tried to reproduce myself, but the fsstress utility (taken from
>> latest LTP suite) crashes sometimes and I cannot take it as a proper
>> reproduction. Can you point me to the exact version you used?
>
> The LTP version does not compile properly:
>
> make[4]: Entering directory `/home/ahunter/Desktop/Projects/ltp/ltp-full-20110228/testcases/kernel/fs/fsstress'
> gcc -g -O2 -g -O2 -fno-strict-aliasing -pipe -Wall -DNO_XFS -I/home/ahunter/Desktop/Projects/ltp/ltp-full-20110228/testcases/kernel/fs/fsstress -D_LARGEFILE64_SOURCE -D_GNU_SOURCE -Wno-error -I../../../../include -I../../../../include -L../../../../lib fsstress.c -o fsstress
> fsstress.c: In function 'dread_f':
> fsstress.c:1829:2: warning: implicit declaration of function 'memalign'
> fsstress.c:1829:6: warning: assignment makes pointer from integer without a cast
> fsstress.c: In function 'dwrite_f':
> fsstress.c:1912:6: warning: assignment makes pointer from integer without a cast
> fsstress.c:1844:17: warning: 'diob.d_miniosz' may be used uninitialized in this function
> fsstress.c:1844:17: warning: 'diob.d_maxiosz' may be used uninitialized in this function
> fsstress.c:1844:17: warning: 'diob.d_mem' may be used uninitialized in this function
> fsstress.c: In function 'dread_f':
> fsstress.c:1750:17: warning: 'diob.d_miniosz' may be used uninitialized in this function
> fsstress.c:1750:17: warning: 'diob.d_maxiosz' may be used uninitialized in this function
> fsstress.c:1750:17: warning: 'diob.d_mem' may be used uninitialized in this function
>
> I hacked a couple of changes but I need to check them before mailing
> to the ltp-list:
>
> From: Adrian Hunter adrian.hun...@intel.com
> Date: Wed, 1 Jun 2011 13:01:48 +0300
> Subject: [PATCH] fsstress: quick fix for compile errors
>
> Signed-off-by: Adrian Hunter adrian.hun...@intel.com
> ---
>  testcases/kernel/fs/fsstress/fsstress.c |    2 ++
>  testcases/kernel/fs/fsstress/global.h   |    1 +
>  2 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/testcases/kernel/fs/fsstress/fsstress.c b/testcases/kernel/fs/fsstress/fsstress.c
> index e3b48ea..83c23ed 100644
> --- a/testcases/kernel/fs/fsstress/fsstress.c
> +++ b/testcases/kernel/fs/fsstress/fsstress.c
> @@ -1757,6 +1757,7 @@ dread_f(int opno, long r)
>  	struct stat64	stb;
>  	int		v;
>  
> +	memset(&diob, 0, sizeof(struct dioattr));
>  	init_pathname(&f);
>  	if (!get_fname(FT_REGFILE, r, &f, NULL, NULL, &v)) {
>  		if (v)
> @@ -1851,6 +1852,7 @@ dwrite_f(int opno, long r)
>  	struct stat64	stb;
>  	int		v;
>  
> +	memset(&diob, 0, sizeof(struct dioattr));
>  	init_pathname(&f);
>  	if (!get_fname(FT_REGFILE, r, &f, NULL, NULL, &v)) {
>  		if (v)
> diff --git a/testcases/kernel/fs/fsstress/global.h b/testcases/kernel/fs/fsstress/global.h
> index f788395..5ab5d56 100644
> --- a/testcases/kernel/fs/fsstress/global.h
> +++ b/testcases/kernel/fs/fsstress/global.h
> @@ -58,6 +58,7 @@
>  #include <stdlib.h>
>  #include <stdio.h>
>  #include <unistd.h>
> +#include <malloc.h>
>  
>  #ifndef O_DIRECT
>  #define O_DIRECT 04
> --
> 1.7.4.4
>
>> (But no warning or hang observed, on top of 3.0-rc1 + cmason/for-linus)
>
> I will try it tonight.

No improvement on 3.0-rc1+ (commit 5c6cce92bc8aee751aafe82c5d9caf7553226a3d).
Logs follow:

Warnings

[ 2857.023360] WARNING: at fs/btrfs/extent-tree.c:5648 btrfs_alloc_free_block+0x14e/0x357 [btrfs]()
[ 2857.023364] Hardware name: XPS 8300
[ 2857.023367] Modules linked in: tun btrfs zlib_deflate libcrc32c brd fuse cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 uinput snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device broadcom snd_pcm snd_timer snd tg3 iTCO_wdt serio_raw dcdbas iTCO_vendor_support microcode soundcore pcspkr snd_page_alloc i2c_i801 joydev usb_storage i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[ 2857.023431] Pid: 8809, comm: btrfs-endio-wri Not tainted 3.0.0-rc1-work-2011-06-01-01+ #11
[ 2857.023435] Call Trace:
[ 2857.023461] [8104db2e] warn_slowpath_common+0x85/0x9d
[ 2857.023471] [8104db60] warn_slowpath_null+0x1a/0x1c
[ 2857.023494] [a029cc98] btrfs_alloc_free_block+0x14e/0x357 [btrfs]
[ 2857.023526] [a02c75bb] ? map_private_extent_buffer+0xb1/0xd5 [btrfs]
[ 2857.023547] [a028f99f] __btrfs_cow_block+0x102/0x31e [btrfs]
[ 2857.023565] [a028e500] ?
Re: linux-next: build warnings in Linus' tree
On Wed, Jun 01, 2011 at 10:16:48AM -0500, Mitch Harder wrote:
> I've been playing around with resurrecting the basic sysfs
> capabilities that had been previously incorporated into btrfs.
>
> As it stands right now, it was relatively easy to re-implement sysfs
> as it was originally. However, that implementation of sysfs wasn't
> populated with much information (only total_blocks, blocks_used, and
> blocksize).

Goffredo Baroncelli (CCed) posted a patch to enhance the sysfs interface:

https://patchwork.kernel.org/patch/308902/
(http://www.spinics.net/lists/linux-btrfs/msg06777.html)

> I also had to reverse a small portion of code that was in the last
> clean-up.

Restoring the code should not be a problem. The cleanup was too eager,
and I think a sysfs interface would be good, not only for debugging
purposes or tuning.

> If a CONFIG_BTRFS_DEBUG type configuration flag is ever introduced,
> it would be interesting to resurrect btrfs' sysfs capabilities.

Hearing about CONFIG_BTRFS_DEBUG again, it seems worth adding.

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux-next: build warnings in Linus' tree
On Fri, Jun 03, 2011 at 01:10:49PM +0200, David Sterba wrote:
> On Wed, Jun 01, 2011 at 10:16:48AM -0500, Mitch Harder wrote:
>> I've been playing around with resurrecting the basic sysfs
>> capabilities that had been previously incorporated into btrfs.
>>
>> As it stands right now, it was relatively easy to re-implement sysfs
>> as it was originally. However, that implementation of sysfs wasn't
>> populated with much information (only total_blocks, blocks_used, and
>> blocksize).
>
> Goffredo Baroncelli (CCed) posted a patch to enhance sysfs interface:
>
> https://patchwork.kernel.org/patch/308902/
> (http://www.spinics.net/lists/linux-btrfs/msg06777.html)
>
>> I also had to reverse a small portion of code that was in the last
>> clean-up.
>
> Restoring the code should not be a problem, the cleanup was too eager
> and I think a sysfs interface would be good, not only for debugging
> purposes or tuning.

Indeed. There are a few parts of the balance API that would be
significantly enhanced by being able to put things in sysfs. I could
drop at least one (if not two) of the three ioctls if I had somewhere
in sysfs to put the relevant files.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- The glass is neither half-full nor half-empty; it is twice as ---
                        large as it needs to be.
[PATCH] btrfs: move extra checks under debug option in btrfs_search_slot
CC: Josef Bacik jo...@redhat.com
Signed-off-by: David Sterba dste...@suse.cz
---
this patch is in conflict with josef's patch
http://git.kernel.org/?p=linux/kernel/git/josef/btrfs-work.git;a=commit;h=98cdd9ffc5da7aa4c516347f7fc8f65cb08df6ae

 fs/btrfs/ctree.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index b0e18d9..4fe7634 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1648,9 +1648,11 @@ again:
 		}
 cow_done:
 		BUG_ON(!cow && ins_len);
+#ifdef CONFIG_BTRFS_DEBUG
 		if (level != btrfs_header_level(b))
 			WARN_ON(1);
 		level = btrfs_header_level(b);
+#endif
 
 		p->nodes[level] = b;
 		if (!p->skip_locking)
--
1.7.5.2.353.g5df3e
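To make the effect of the patch above easier to see, here is a minimal userspace sketch of a check that is only compiled in when a debug option is defined. DEMO_DEBUG and demo_pick_level() are hypothetical stand-ins invented for this illustration; they are not btrfs code, and the sketch only assumes the shape of the guarded region shown in the diff.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the region guarded in the patch.
 * Build with -DDEMO_DEBUG to compile the check in.
 */
int demo_pick_level(int level, int header_level)
{
#ifdef DEMO_DEBUG
	/* Debug builds: report a mismatch, then trust the header. */
	if (level != header_level)
		fprintf(stderr, "warning: level %d != header level %d\n",
			level, header_level);
	level = header_level;
#else
	/* Non-debug builds: the check and the reassignment both vanish. */
	(void)header_level;
#endif
	return level;
}
```

With the option off, demo_pick_level(2, 3) returns 2; with -DDEMO_DEBUG it prints a warning and returns 3. The guarded region contains an assignment as well as a check, so the two builds can behave differently on a mismatch, which is worth keeping in mind when reviewing this kind of change.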
Re: linux-next: build warnings in Linus' tree
On Fri, Jun 03, 2011 at 01:10:49PM +0200, David Sterba wrote:
> On Wed, Jun 01, 2011 at 10:16:48AM -0500, Mitch Harder wrote:
>> I've been playing around with resurrecting the basic sysfs
>> capabilities that had been previously incorporated into btrfs.
>>
>> As it stands right now, it was relatively easy to re-implement sysfs
>> as it was originally. However, that implementation of sysfs wasn't
>> populated with much information (only total_blocks, blocks_used, and
>> blocksize).
>
> Goffredo Baroncelli (CCed) posted a patch to enhance sysfs interface:
>
> https://patchwork.kernel.org/patch/308902/
> (http://www.spinics.net/lists/linux-btrfs/msg06777.html)
>
>> I also had to reverse a small portion of code that was in the last
>> clean-up.
>
> Restoring the code should not be a problem, the cleanup was too eager
> and I think a sysfs interface would be good, not only for debugging
> purposes or tuning.
>
>> If a CONFIG_BTRFS_DEBUG type configuration flag is ever introduced,
>> it would be interesting to resurrect btrfs' sysfs capabilities.
>
> Hearing about CONFIG_BTRFS_DEBUG again, seems worth to add it.

For debugging stuff, please use debugfs instead of sysfs, as that is
what it is there for.

thanks,

greg k-h
Re: [PATCH] btrfs: move extra checks under debug option in btrfs_search_slot
On 06/03/2011 08:09 AM, David Sterba wrote:
> CC: Josef Bacik jo...@redhat.com
> Signed-off-by: David Sterba dste...@suse.cz
> ---

Let's use this instead, I'll drop mine. Thanks,

Reviewed-by: Josef Bacik jo...@redhat.com
[PATCH] btrfs: fix uninitialized variable warning
From: David Sterba dste...@suse.cz

With Linus' tree, today's linux-next build (powerpc ppc64_defconfig)
produced this warning:

fs/btrfs/delayed-inode.c: In function 'btrfs_delayed_update_inode':
fs/btrfs/delayed-inode.c:1598:6: warning: 'ret' may be used uninitialized in this function

Introduced by commit 16cdcec736cd (btrfs: implement delayed inode items
operation).

This fixes a bug in btrfs_update_inode(): if the value returned from
btrfs_delayed_update_inode() is nonzero garbage, inode stat data are
not updated and several call paths may hit a BUG_ON or fail with a
strange error code.

Reported-by: Stephen Rothwell s...@canb.auug.org.au
Signed-off-by: David Sterba dste...@suse.cz
---
patch pushed to git://repo.or.cz/linux-2.6/btrfs-unstable.git #fixes

 fs/btrfs/delayed-inode.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 01e2950..8cb012f 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1595,7 +1595,7 @@ int btrfs_delayed_update_inode(struct btrfs_trans_handle *trans,
 			       struct btrfs_root *root, struct inode *inode)
 {
 	struct btrfs_delayed_node *delayed_node;
-	int ret;
+	int ret = 0;
 
 	delayed_node = btrfs_get_or_create_delayed_node(inode);
 	if (IS_ERR(delayed_node))
--
1.7.5.2.353.g5df3e
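The class of bug fixed above can be sketched in a few lines of plain C. This is an illustration only: struct demo_node and demo_update() are hypothetical names, and the control flow is an assumed, simplified shape of a function with an early-exit path, not a copy of the btrfs function.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cached per-inode node. */
struct demo_node {
	int dirty;
};

/*
 * Returns 0 on success, negative on error. The early "goto out" path
 * never assigns ret, so without the "= 0" initializer the caller would
 * see whatever happened to be on the stack.
 */
int demo_update(struct demo_node *node)
{
	int ret = 0;	/* the one-line fix */

	if (node == NULL)
		return -1;

	if (node->dirty)
		goto out;	/* already queued: nothing to do, ret must be 0 */

	node->dirty = 1;	/* real code would reserve space and set ret here */
out:
	return ret;
}
```

A caller that treats any nonzero return as failure would skip its own bookkeeping whenever the early-exit path returned garbage, which matches the symptom described in the commit message.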
Re: [PATCH] btrfs: fix uninitialized variable warning
Excerpts from David Sterba's message of 2011-06-03 10:50:14 -0400:
> From: David Sterba dste...@suse.cz
>
> With Linus' tree, today's linux-next build (powerpc ppc64_defconfig)
> produced this warning:
>
> fs/btrfs/delayed-inode.c: In function 'btrfs_delayed_update_inode':
> fs/btrfs/delayed-inode.c:1598:6: warning: 'ret' may be used uninitialized in this function
>
> Introduced by commit 16cdcec736cd (btrfs: implement delayed inode items
> operation).
>
> This fixes a bug in btrfs_update_inode(): if the returned value from
> btrfs_delayed_update_inode is a nonzero garbage, inode stat data are
> not updated and several call paths may hit a BUG_ON or fail with
> strange code.

Ugh, thanks! It looks like the gcc uninit stuff isn't as verbose as it
used to be, but it does catch a bunch of allocated/set but not used
vars.

I have a nitems = 0 fix in my tree as well.

-chris
Re: Quota Implementation
On Fri, Jun 03, 2011 at 06:24:41PM +0200, Arne Jansen wrote:
> Hi,
>
> If no one is already working on it, I'd like to take the Quota lock
> and see how far I come. Let me sketch out in short what I'm planning
> to do:
>
>  - Quota will be subvolume based. Only the FS-trees and data extents
>    will be accounted.
>  - Quota Groups can be defined. Every quota group can comprise any
>    number of subvolumes. A subvolume can be assigned to any number of
>    quota groups.
>  - A Quota Group can account/limit the total amount of space that is
>    referenced by it and/or the amount of space that is exclusively
>    referenced (i.e. referenced by no other quota group).
>  - With this it is possible to define a hierarchical quota that need
>    not necessarily reflect the filesystem hierarchy.
>  - It is also possible to decide for each snapshot if it should be
>    accounted into the parent group. So in a scenario where each
>    subvolume reflects a user home, it's possible to have some
>    snapshots accounted to the user and others not (e.g. the ones
>    needed for system backups).
>  - Quota information will be stored in new records, possibly in a
>    separate tree.
>  - It should be possible to change the Quota config and group
>    assignments online, though this might need a full re-scan of the fs.
>  - It does NOT include any kind of user/group (UID/GID) quota.
>
> Any addenda or arguments why it's impossible or insane welcome.

There's a problem in that in some cases, it's possible to get into a
situation where you can't *delete* files because you're going over
quota. If I have two subvolumes that share most of their data (e.g.
one is a snapshot of the other), and both subvolumes have a limit
under the "exclusive use" clause, then deleting material from
subvolume A could cause subvolume B to go over quota.

If users can create their own subvolumes, then using the "exclusive
use" form is also pointless, because as a user, I can simply snapshot
(or otherwise CoW copy) all my data into a snapshot, and I then don't
pay for it. That one probably comes under the "admin shot himself in
the foot", though.

Getting out the bike-shed brush, I might suggest the use of some name
other than "quota", because inevitably people will think of
UID/GID-type quotas, and we've got enough confusingly-modified
terminology already. "Size bounds", "storage bounds", possibly?

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
    --- Is it true that "last known good" on Windows XP ---
                        boots into CP/M?
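The deletion problem described above can be worked through with a toy accounting model. Everything here is an assumption made for illustration: extents carry a bitmask of the subvolumes that reference them, each subvolume is treated as its own quota group, and the demo_* names are hypothetical. It is not a proposal for how btrfs should store or compute this.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy extent: a size plus a bitmask of referencing subvolumes. */
struct demo_extent {
	uint64_t bytes;
	uint32_t ref_mask;	/* bit i set: subvolume i references it */
};

/* Space referenced by a subvolume, shared or not. */
uint64_t demo_referenced(const struct demo_extent *e, size_t n, unsigned subvol)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		if (e[i].ref_mask & (UINT32_C(1) << subvol))
			total += e[i].bytes;
	return total;
}

/* Space referenced by this subvolume and by no other. */
uint64_t demo_exclusive(const struct demo_extent *e, size_t n, unsigned subvol)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		if (e[i].ref_mask == (UINT32_C(1) << subvol))
			total += e[i].bytes;
	return total;
}

/* A subvolume deletes its reference to one extent. */
void demo_drop_ref(struct demo_extent *e, unsigned subvol)
{
	e->ref_mask &= ~(UINT32_C(1) << subvol);
}
```

Take subvolumes A (bit 0) and B (bit 1) sharing one 100-unit extent, each with a private 10-unit extent. B's exclusive usage starts at 10. After A drops its reference to the shared extent, B's exclusive usage becomes 110 although B wrote nothing, so an exclusive limit of, say, 50 on B is exceeded by a delete in A.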
[PATCH] btrfs-progs: Avoid buffer overflow for device name
btrfs overwrites memory when given a device parameter that is too long.
Try:

  btrfs device scan $(awk 'BEGIN{$5090=OFS="x";print}')
  ...
  *** buffer overflow detected ***: btrfs terminated
  ======= Backtrace: =========
  /lib64/libc.so.6(__fortify_fail+0x37)[0x7f0ef2ea0607]
  /lib64/libc.so.6(+0xf6580)[0x7f0ef2e9e580]
  btrfs[0x402ec4]
  btrfs[0x401b48]
  /lib64/libc.so.6(__libc_start_main+0xed)[0x7f0ef2dc943d]
  btrfs[0x401df1]

The patch just adds the obvious strncpy() checks to several users of
this parameter; some path length check is probably still needed to
report the error properly.

See https://bugzilla.redhat.com/show_bug.cgi?id=710534

Signed-off-by: Milan Broz mb...@redhat.com
---
 btrfs-vol.c  |    2 +-
 btrfs_cmds.c |   14 +++++++-------
 btrfsctl.c   |    2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/btrfs-vol.c b/btrfs-vol.c
index 4ed799d..e06a54e 100644
--- a/btrfs-vol.c
+++ b/btrfs-vol.c
@@ -151,7 +151,7 @@ int main(int ac, char **av)
 	}
 	fd = dirfd(dirstream);
 	if (device)
-		strcpy(args.name, device);
+		strncpy(args.name, device, sizeof(args.name));
 	else
 		args.name[0] = '\0';
 
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 8031c58..6f5c634 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -375,7 +375,7 @@ int do_clone(int argc, char **argv)
 	printf("Create a snapshot of '%s' in '%s/%s'\n",
 	       subvol, dstdir, newname);
 	args.fd = fd;
-	strcpy(args.name, newname);
+	strncpy(args.name, newname, sizeof(args.name));
 	res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE, &args);
 
 	close(fd);
@@ -436,7 +436,7 @@ int do_delete_subvolume(int argc, char **argv)
 	}
 
 	printf("Delete subvolume '%s/%s'\n", dname, vname);
-	strcpy(args.name, vname);
+	strncpy(args.name, vname, sizeof(args.name));
 	res = ioctl(fd, BTRFS_IOC_SNAP_DESTROY, &args);
 
 	close(fd);
@@ -490,7 +490,7 @@ int do_create_subvol(int argc, char **argv)
 	}
 
 	printf("Create subvolume '%s/%s'\n", dstdir, newname);
-	strcpy(args.name, newname);
+	strncpy(args.name, newname, sizeof(args.name));
 	res = ioctl(fddst, BTRFS_IOC_SUBVOL_CREATE, &args);
 
 	close(fddst);
@@ -553,7 +553,7 @@ int do_scan(int argc, char **argv)
 
 		printf("Scanning for Btrfs filesystems in '%s'\n", argv[i]);
 
-		strcpy(args.name, argv[i]);
+		strncpy(args.name, argv[i], sizeof(args.name));
 		/*
 		 * FIXME: which are the error code returned by this ioctl ?
 		 * it seems that is impossible to understand if there no is
@@ -593,7 +593,7 @@ int do_resize(int argc, char **argv)
 	}
 
 	printf("Resize '%s' of '%s'\n", path, amount);
-	strcpy(args.name, amount);
+	strncpy(args.name, amount, sizeof(args.name));
 	res = ioctl(fd, BTRFS_IOC_RESIZE, &args);
 	close(fd);
 	if( res < 0 ){
@@ -736,7 +736,7 @@ int do_add_volume(int nargs, char **args)
 		}
 		close(devfd);
 
-		strcpy(ioctl_args.name, args[i]);
+		strncpy(ioctl_args.name, args[i], sizeof(ioctl_args.name));
 		res = ioctl(fdmnt, BTRFS_IOC_ADD_DEV, &ioctl_args);
 		if(res<0){
 			fprintf(stderr, "ERROR: error adding the device '%s'\n", args[i]);
@@ -792,7 +792,7 @@ int do_remove_volume(int nargs, char **args)
 		struct btrfs_ioctl_vol_args arg;
 		int res;
 
-		strcpy(arg.name, args[i]);
+		strncpy(arg.name, args[i], sizeof(arg.name));
 		res = ioctl(fdmnt, BTRFS_IOC_RM_DEV, &arg);
 		if(res<0){
 			fprintf(stderr, "ERROR: error removing the device '%s'\n", args[i]);
diff --git a/btrfsctl.c b/btrfsctl.c
index 92bdf39..29210f5 100644
--- a/btrfsctl.c
+++ b/btrfsctl.c
@@ -237,7 +237,7 @@ int main(int ac, char **av)
 	}
 
 	if (name)
-		strcpy(args.name, name);
+		strncpy(args.name, name, sizeof(args.name));
 	else
 		args.name[0] = '\0';
 
--
1.7.5.3
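The patch description notes that a path length check is probably still needed. One detail worth spelling out is that strncpy() does not write a terminating NUL when the source is at least as long as the destination, so a bounded copy alone leaves the name unterminated and silently truncated. The following is a minimal sketch of a checked copy; demo_copy_name() is a hypothetical helper, not part of btrfs-progs.

```c
#include <string.h>

/*
 * Copy src into dst only if it fits including the terminating NUL.
 * Returns 0 on success, -1 if src is too long (dst is then set to the
 * empty string so it is never left unterminated).
 */
int demo_copy_name(char *dst, size_t dst_size, const char *src)
{
	size_t len;

	if (dst_size == 0)
		return -1;

	len = strlen(src);
	if (len >= dst_size) {
		dst[0] = '\0';
		return -1;
	}

	memcpy(dst, src, len + 1);	/* len + 1 copies the NUL too */
	return 0;
}
```

A caller could then print a clear error and exit instead of passing a truncated device name to the ioctl, for example by checking demo_copy_name(args.name, sizeof(args.name), device) for a nonzero result.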
[PATCH] btrfs: Null terminate string in scan device ioctl
btrfs_scan_one_device() uses vol->name directly, without additional
checks, so if the string passed in the ioctl is unterminated it can
access memory outside of the btrfs_ioctl_vol_args struct.

Always terminate the name string (the same as other users already do).

Signed-off-by: Milan Broz mb...@redhat.com
---
 fs/btrfs/super.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 9b2e7e5..2bb1a99 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1148,6 +1148,8 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 	if (IS_ERR(vol))
 		return PTR_ERR(vol);
 
+	vol->name[BTRFS_PATH_NAME_MAX] = '\0';
+
 	switch (cmd) {
 	case BTRFS_IOC_SCAN_DEV:
 		ret = btrfs_scan_one_device(vol->name, FMODE_READ,
--
1.7.5.3
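The receiving side of the same problem can be shown in userspace. This sketch assumes a fixed-size argument struct whose name field has one byte more than the maximum path length; DEMO_PATH_NAME_MAX, struct demo_vol_args and demo_accept_args() are hypothetical names for illustration, with a deliberately tiny limit.

```c
#include <string.h>

#define DEMO_PATH_NAME_MAX 15	/* tiny on purpose, for illustration */

struct demo_vol_args {
	long long fd;
	char name[DEMO_PATH_NAME_MAX + 1];
};

/*
 * Receiver side: the buffer came from an untrusted caller, so force
 * termination before any string function touches it. Returns the
 * length of the (now safely terminated) name.
 */
size_t demo_accept_args(struct demo_vol_args *args)
{
	args->name[DEMO_PATH_NAME_MAX] = '\0';
	return strlen(args->name);
}
```

If the caller fills all 16 bytes with non-NUL characters, strlen() without the forced terminator would read past the struct; with it, the name is cut to 15 characters and every later consumer sees a valid string.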
Delayed inode operations not doing the right thing with enospc
Hello,

I got a lot of these when running stress.sh on my test box:

[ 9792.654889] ------------[ cut here ]------------
[ 9792.654898] WARNING: at fs/btrfs/extent-tree.c:5681 btrfs_alloc_free_block+0xca/0x27c [btrfs]()
[ 9792.654899] Hardware name: To Be Filled By O.E.M.
[ 9792.654900] Modules linked in: btrfs zlib_deflate libcrc32c ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables arc4 rt61pci rt2x00pci rt2x00lib snd_hda_codec_hdmi mac80211 snd_hda_codec_realtek cfg80211 snd_hda_intel edac_core snd_seq rfkill pcspkr serio_raw snd_hda_codec eeprom_93cx6 edac_mce_amd sp5100_tco i2c_piix4 k10temp snd_hwdep snd_seq_device snd_pcm floppy r8169 xhci_hcd mii snd_timer snd soundcore snd_page_alloc ipv6 firewire_ohci pata_acpi ata_generic firewire_core pata_via crc_itu_t radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[ 9792.654919] Pid: 2762, comm: rm Tainted: GW 2.6.39+ #1
[ 9792.654920] Call Trace:
[ 9792.654922] [81053c4a] warn_slowpath_common+0x83/0x9b
[ 9792.654925] [81053c7c] warn_slowpath_null+0x1a/0x1c
[ 9792.654933] [a038e747] btrfs_alloc_free_block+0xca/0x27c [btrfs]
[ 9792.654945] [a03b8562] ? map_extent_buffer+0x6e/0xa8 [btrfs]
[ 9792.654953] [a038189b] __btrfs_cow_block+0xfc/0x30c [btrfs]
[ 9792.654963] [a0396aa6] ? btrfs_buffer_uptodate+0x47/0x58 [btrfs]
[ 9792.654970] [a0382e48] ? read_block_for_search+0x94/0x368 [btrfs]
[ 9792.654978] [a0381ba9] btrfs_cow_block+0xfe/0x146 [btrfs]
[ 9792.654986] [a03848b0] btrfs_search_slot+0x14d/0x4b6 [btrfs]
[ 9792.654997] [a03b8562] ? map_extent_buffer+0x6e/0xa8 [btrfs]
[ 9792.655022] [a03938e8] btrfs_lookup_inode+0x2f/0x8f [btrfs]
[ 9792.655025] [8147afac] ? _cond_resched+0xe/0x22
[ 9792.655027] [8147b892] ? mutex_lock+0x29/0x50
[ 9792.655039] [a03d41b1] btrfs_update_delayed_inode+0x72/0x137 [btrfs]
[ 9792.655051] [a03d4ea2] btrfs_run_delayed_items+0x90/0xdb [btrfs]
[ 9792.655062] [a039a69b] btrfs_commit_transaction+0x228/0x654 [btrfs]
[ 9792.655064] [8106e8da] ? remove_wait_queue+0x3a/0x3a
[ 9792.655075] [a03a2fa5] btrfs_evict_inode+0x14d/0x202 [btrfs]
[ 9792.655077] [81132bd6] evict+0x71/0x111
[ 9792.655079] [81132de0] iput+0x12a/0x132
[ 9792.655081] [8112aa3a] do_unlinkat+0x106/0x155
[ 9792.655083] [81127b83] ? path_put+0x1f/0x23
[ 9792.655085] [8109c53c] ? audit_syscall_entry+0x145/0x171
[ 9792.655087] [81128410] ? putname+0x34/0x36
[ 9792.655090] [8112b441] sys_unlinkat+0x29/0x2b
[ 9792.655092] [81482c42] system_call_fastpath+0x16/0x1b
[ 9792.655093] ---[ end trace 02b696eb02b3f768 ]---

This is because use_block_rsv() is having to do a
reserve_metadata_bytes(), which shouldn't happen, as we should have
reserved enough space for those operations to complete. This is
happening because use_block_rsv() calls get_block_rsv(), which, if
root->ref_cows is set (which is the case on all fs roots), uses
trans->block_rsv, which only has what the current transaction starter
had reserved.

What needs to be done instead is this: we need a special block reserve,
any reservation that is done at create time for these inodes is
migrated to that reserve, and then when you run the delayed inode items
stuff you set trans->block_rsv to the special block reserve so the
accounting is all done properly.

This is just off the top of my head, there may be a better way to do
it; I've not actually looked at the delayed inode code at all.

I would do this myself but I have an ever-increasing list of shit to do,
so will somebody pick this up and fix it please? Thanks,

Josef
Re: Quota Implementation
On Fri, Jun 3, 2011 at 8:47 PM, Hugo Mills h...@carfax.org.uk wrote:
> On Fri, Jun 03, 2011 at 06:24:41PM +0200, Arne Jansen wrote:
>> Hi,
>>
>> If no one is already working on it, I'd like to take the Quota lock
>> and see how far I come. Let me sketch out in short what I'm planning
>> to do:
>>
>>  - Quota will be subvolume based. Only the FS-trees and data extents
>>    will be accounted.
>>  - Quota Groups can be defined. Every quota group can comprise any
>>    number of subvolumes. A subvolume can be assigned to any number of
>>    quota groups.
>>  - A Quota Group can account/limit the total amount of space that is
>>    referenced by it and/or the amount of space that is exclusively
>>    referenced (i.e. referenced by no other quota group).
>>  - With this it is possible to define a hierarchical quota that need
>>    not necessarily reflect the filesystem hierarchy.
>>  - It is also possible to decide for each snapshot if it should be
>>    accounted into the parent group. So in a scenario where each
>>    subvolume reflects a user home, it's possible to have some
>>    snapshots accounted to the user and others not (e.g. the ones
>>    needed for system backups).
>>  - Quota information will be stored in new records, possibly in a
>>    separate tree.
>>  - It should be possible to change the Quota config and group
>>    assignments online, though this might need a full re-scan of the fs.
>>  - It does NOT include any kind of user/group (UID/GID) quota.
>>
>> Any addenda or arguments why it's impossible or insane welcome.
>
> There's a problem in that in some cases, it's possible to get into a
> situation where you can't *delete* files because you're going over
> quota. If I have two subvolumes that share most of their data (e.g.
> one is a snapshot of the other), and both subvolumes have a limit
> under the "exclusive use" clause, then deleting material from
> subvolume A could cause subvolume B to go over quota.
>
> If users can create their own subvolumes, then using the "exclusive
> use" form is also pointless, because as a user, I can simply snapshot
> (or otherwise CoW copy) all my data into a snapshot, and I then don't
> pay for it. That one probably comes under the "admin shot himself in
> the foot", though.
>
> Getting out the bike-shed brush, I might suggest the use of some name
> other than "quota", because inevitably people will think of
> UID/GID-type quotas, and we've got enough confusingly-modified
> terminology already. "Size bounds", "storage bounds", possibly?

Budget :)?

Regards,
Andrey

> Hugo.
>
> --
> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
>   PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
>     --- Is it true that "last known good" on Windows XP ---
>                         boots into CP/M?
Re: Quota Implementation
On Friday 03 June 2011 18:24:41 Arne Jansen wrote:
> Hi,
>
> If no one is already working on it, I'd like to take the Quota lock
> and see how far I come. Let me sketch out in short what I'm planning
> to do:
>
>  - Quota will be subvolume based. Only the FS-trees and data extents
>    will be accounted.
>  - Quota Groups can be defined. Every quota group can comprise any
>    number of subvolumes. A subvolume can be assigned to any number of
>    quota groups.
>  - A Quota Group can account/limit the total amount of space that is
>    referenced by it and/or the amount of space that is exclusively
>    referenced (i.e. referenced by no other quota group).
>  - With this it is possible to define a hierarchical quota that need
>    not necessarily reflect the filesystem hierarchy.
>  - It is also possible to decide for each snapshot if it should be
>    accounted into the parent group. So in a scenario where each
>    subvolume reflects a user home, it's possible to have some
>    snapshots accounted to the user and others not (e.g. the ones
>    needed for system backups).
>  - Quota information will be stored in new records, possibly in a
>    separate tree.
>  - It should be possible to change the Quota config and group
>    assignments online, though this might need a full re-scan of the fs.
>  - It does NOT include any kind of user/group (UID/GID) quota.
>
> Any addenda or arguments why it's impossible or insane welcome.

What's the benefit of this complexity? Why not a simpler
quota/reservation per subvolume?

The semantics you described can be achieved by user/group quotas too.
And we need them anyway. Perhaps this can be implemented together,
reusing the code.

Then we have the question of whether user/group quotas are per
filesystem or per subvolume.

regards,
Johannes
Re: Announcing btrfs-gui
On Thu, Jun 02, 2011 at 09:41:08AM +0100, Hugo Mills wrote:
> On Thu, Jun 02, 2011 at 03:31:16PM +0700, Fajar A. Nugraha wrote:
>> On Thu, Jun 2, 2011 at 6:20 AM, Hugo Mills h...@carfax.org.uk wrote:
>>> Over the last few weeks, I've been playing with a foolish idea,
>>> mostly triggered by a cluster of people being confused by btrfs's
>>> free space reporting (df vs btrfs fi df vs btrfs fi show). I also
>>> wanted an excuse, and some code, to mess around in the depths of
>>> the FS data structures.
>>>
>>> Like all silly ideas, this one got a bit out of hand, and seems to
>>> have turned into something vaguely useful. I'm therefore pleased to
>>> announce the first major public release of btrfs-gui[1]: a
>>> point-and-click tool for managing btrfs filesystems.
>>>
>>> The tool currently can scan for and list btrfs filesystems and the
>>> volumes they live on. It can show the allocation and usage of data
>>> in a selected filesystem, categorised by use, replication, and
>>> device. It can show and manipulate subvolumes and snapshots:
>>> creation, deletion, and setting the default.
>>
>> Some comments:
>>
>> (1) Currently it needs to be run from the directory where it's
>> downloaded, even after a python3 setup.py install. When run from
>> other directory, it bails with
>>
>> [snip]
>> OSError: [Errno 2] No such file or directory: './btrfs-gui-helper'
>>
>> Is this intentional?
>
> No, and will be fixed later today. I foresee an emergency 0.2.1
> coming shortly. :)

OK, it's fixed in git in the stable-v0.2 branch. Unless anyone else
reports something that needs fixing over the weekend, I'll tag it as
0.2.1 on Sunday and roll another release tarball. (The fix is actually
pretty ugly, and has some poor UX in it for one case, but I've run out
of brain this evening, and can't face the shell hackery necessary to
do it nicely right now.)

>> (2) When showing space usage for a single-device FS, selecting "Show
>> unallocated space as raw space", why is the top and bottom graph
>> different? Shouldn't it be the same, since there's only one device?
>
> Good question. I shall investigate what's going on.

OK, on investigation and reflection, it shouldn't be identical,
because metadata is DUP. The per-disk displays show actual physical
disk usage; the filesystem display at the top shows unique data.
Therefore, the top display will show half as much metadata as the
bottom.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Comic Sans goes into a bar, and the barman says, "We don't ---
                         serve your type here."
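The DUP explanation above comes down to simple arithmetic, sketched here with made-up numbers. The struct, the profile enum and the demo_* functions are hypothetical names for illustration and are not taken from btrfs-gui.

```c
#include <stdint.h>

/* Number of copies stored on disk for each (hypothetical) profile. */
enum demo_profile { DEMO_SINGLE = 1, DEMO_DUP = 2 };

struct demo_usage {
	uint64_t data_bytes;		/* unique data */
	uint64_t metadata_bytes;	/* unique metadata */
	enum demo_profile data_profile;
	enum demo_profile metadata_profile;
};

/* Filesystem-level view: unique bytes, replication ignored. */
uint64_t demo_logical_used(const struct demo_usage *u)
{
	return u->data_bytes + u->metadata_bytes;
}

/* Per-device view: every stored copy counts. */
uint64_t demo_raw_used(const struct demo_usage *u)
{
	return u->data_bytes * (uint64_t)u->data_profile +
	       u->metadata_bytes * (uint64_t)u->metadata_profile;
}
```

With 1024 units of single-profile data and 256 units of DUP metadata on one device, the filesystem view reports 1024 + 256 = 1280 while the device view reports 1024 + 2 * 256 = 1536. The data portions match and the metadata portion of the top graph is half that of the bottom one, as described.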
Fwd: btrfs causing reboots and kernel oops on SL 6 (RHEL 6)
Hi,

I'm using SL 6 (RHEL 6) and I've been playing around with running
PostgreSQL on btrfs. Snapshotting works ok, but the computer keeps
rebooting without warning (can be 5 mins or 1.5 hours). Finally I
actually managed to get a kernel crash instead of just a reboot.

I took a picture of the screen:
http://imageshack.us/photo/my-images/716/img0143y.jpg/

The important bits are:

IP: [a032c471] btrfs_print_leaf +0x31/0x820 [btrfs]
PGD 0
Oops: [#1] SMP
last sysfs file: /sys/devices/virtual/block/dm-3/dm/name

The crashes aren't predictable either. Like it doesn't always happen
when I do a snapshot or anything like that.

Is this a known problem that is fixed in a later kernel or something
like that?

btrfs seems cool though. I hope there is something I just
misconfigured so that I can get it to be more reliable, although I do
acknowledge that this is an experimental filesystem.

Cheers,

-Joel

--
Joel Pearson
Software Engineer
Agile Digital Engineering Pty Ltd
A.B.N. 98 106 361 273
A: 5/28 Eyre St Kingston ACT 2604
P: +61 1300-858-277
F: +61 1300-858-477