Re: btrfs hang on brd

2011-06-03 Thread Adrian Hunter

On 01/06/11 13:07, Adrian Hunter wrote:

On 01/06/11 11:54, David Sterba wrote:

On Tue, May 31, 2011 at 10:03:12AM +0300, Adrian Hunter wrote:

Hi

I seem to be able to get btrfs reproducibly to
produce warnings and finally hang when running
a stress test on a ramdisk.

Testing was done using the integration-test
branch of btrfs-unstable.  Note that I also tested
v2.6.39 and integration-test took much longer to
hang i.e. it is an improvement

The test script and stack dumps are below.

Is this a valid test?

Is it worth me investigating these?


I've tried to reproduce myself, but the fsstress utility (taken from
latest LTP suite) crashes sometimes and I cannot take it as a proper
reproduction. Can you point me to the exact version you used?


The LTP version does not compile properly:

make[4]: Entering directory
`/home/ahunter/Desktop/Projects/ltp/ltp-full-20110228/testcases/kernel/fs/fsstress'

gcc -g -O2 -g -O2 -fno-strict-aliasing -pipe -Wall  -DNO_XFS
-I/home/ahunter/Desktop/Projects/ltp/ltp-full-20110228/testcases/kernel/fs/fsstress
-D_LARGEFILE64_SOURCE -D_GNU_SOURCE -Wno-error -I../../../../include
-I../../../../include   -L../../../../lib  fsstress.c   -o fsstress
fsstress.c: In function 'dread_f':
fsstress.c:1829:2: warning: implicit declaration of function 'memalign'
fsstress.c:1829:6: warning: assignment makes pointer from integer
without a cast
fsstress.c: In function 'dwrite_f':
fsstress.c:1912:6: warning: assignment makes pointer from integer
without a cast
fsstress.c:1844:17: warning: 'diob.d_miniosz' may be used uninitialized
in this function
fsstress.c:1844:17: warning: 'diob.d_maxiosz' may be used uninitialized
in this function
fsstress.c:1844:17: warning: 'diob.d_mem' may be used uninitialized in
this function
fsstress.c: In function 'dread_f':
fsstress.c:1750:17: warning: 'diob.d_miniosz' may be used uninitialized
in this function
fsstress.c:1750:17: warning: 'diob.d_maxiosz' may be used uninitialized
in this function
fsstress.c:1750:17: warning: 'diob.d_mem' may be used uninitialized in
this function


I hacked a couple of changes but I need to check them before
mailing to the ltp-list:


From: Adrian Hunter <adrian.hun...@intel.com>
Date: Wed, 1 Jun 2011 13:01:48 +0300
Subject: [PATCH] fsstress: quick fix for compile errors

Signed-off-by: Adrian Hunter <adrian.hun...@intel.com>
---
  testcases/kernel/fs/fsstress/fsstress.c |2 ++
  testcases/kernel/fs/fsstress/global.h   |1 +
  2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/testcases/kernel/fs/fsstress/fsstress.c
b/testcases/kernel/fs/fsstress/fsstress.c
index e3b48ea..83c23ed 100644
--- a/testcases/kernel/fs/fsstress/fsstress.c
+++ b/testcases/kernel/fs/fsstress/fsstress.c
@@ -1757,6 +1757,7 @@ dread_f(int opno, long r)
 	struct stat64	stb;
 	int		v;

+	memset(&diob, 0, sizeof(struct dioattr));
 	init_pathname(&f);
 	if (!get_fname(FT_REGFILE, r, &f, NULL, NULL, &v)) {
 		if (v)
@@ -1851,6 +1852,7 @@ dwrite_f(int opno, long r)
 	struct stat64	stb;
 	int		v;

+	memset(&diob, 0, sizeof(struct dioattr));
 	init_pathname(&f);
 	if (!get_fname(FT_REGFILE, r, &f, NULL, NULL, &v)) {
 		if (v)
diff --git a/testcases/kernel/fs/fsstress/global.h
b/testcases/kernel/fs/fsstress/global.h
index f788395..5ab5d56 100644
--- a/testcases/kernel/fs/fsstress/global.h
+++ b/testcases/kernel/fs/fsstress/global.h
@@ -58,6 +58,7 @@
  #include <stdlib.h>
  #include <stdio.h>
  #include <unistd.h>
+#include <malloc.h>

  #ifndef O_DIRECT
  #define O_DIRECT 04
--
1.7.4.4
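
To illustrate the two kinds of warning the patch above addresses, here is a minimal user-space sketch. It is not the fsstress code: `struct dioattr_sketch` and `alloc_dio_buffer()` are hypothetical names, and `posix_memalign()` is swapped in for `memalign()` because it is declared in `<stdlib.h>`. The point is that an allocator with no visible declaration is assumed to return `int` (hence "assignment makes pointer from integer"), and that a struct filled only on some paths should be zeroed before it is read.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>   /* posix_memalign, free */
#include <string.h>   /* memset */

/* Hypothetical stand-in for struct dioattr; field names follow the warnings. */
struct dioattr_sketch {
	unsigned int d_mem;      /* required memory alignment */
	unsigned int d_miniosz;  /* minimum I/O size */
	unsigned int d_maxiosz;  /* maximum I/O size */
};

/* Returns a buffer of len bytes aligned to `align`, or NULL on failure. */
static void *alloc_dio_buffer(size_t align, size_t len)
{
	struct dioattr_sketch diob;
	void *buf = NULL;

	/* Zero the struct first so no field is ever read uninitialized. */
	memset(&diob, 0, sizeof(diob));

	/* No real query of the file's limits here: fall back to the caller's value. */
	if (diob.d_mem == 0)
		diob.d_mem = (unsigned int)align;

	/* posix_memalign needs a power-of-two multiple of sizeof(void *). */
	assert(diob.d_mem % sizeof(void *) == 0);
	if (posix_memalign(&buf, diob.d_mem, len) != 0)
		return NULL;
	return buf;
}
```

A caller would request, say, `alloc_dio_buffer(4096, 8192)` and release the result with `free()`.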



(But no warning or hang observed, on top of 3.0-rc1 + cmason/for-linus)


I will try it tonight.


No improvement on 3.0-rc1+ (commit 
5c6cce92bc8aee751aafe82c5d9caf7553226a3d).


Logs follow:


Warnings


[ 2857.023360] WARNING: at fs/btrfs/extent-tree.c:5648 
btrfs_alloc_free_block+0x14e/0x357 [btrfs]()

[ 2857.023364] Hardware name: XPS 8300
[ 2857.023367] Modules linked in: tun btrfs zlib_deflate libcrc32c brd 
fuse cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT 
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 uinput 
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec 
snd_hwdep snd_seq snd_seq_device broadcom snd_pcm snd_timer snd tg3 
iTCO_wdt serio_raw dcdbas iTCO_vendor_support microcode soundcore pcspkr 
snd_page_alloc i2c_i801 joydev usb_storage i915 drm_kms_helper drm 
i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[ 2857.023431] Pid: 8809, comm: btrfs-endio-wri Not tainted 
3.0.0-rc1-work-2011-06-01-01+ #11

[ 2857.023435] Call Trace:
[ 2857.023461]  [8104db2e] warn_slowpath_common+0x85/0x9d
[ 2857.023471]  [8104db60] warn_slowpath_null+0x1a/0x1c
[ 2857.023494]  [a029cc98] btrfs_alloc_free_block+0x14e/0x357 
[btrfs]
[ 2857.023526]  [a02c75bb] ? 
map_private_extent_buffer+0xb1/0xd5 [btrfs]

[ 2857.023547]  [a028f99f] __btrfs_cow_block+0x102/0x31e [btrfs]
[ 2857.023565]  [a028e500] ? 

Re: linux-next: build warnings in Linus' tree

2011-06-03 Thread David Sterba
On Wed, Jun 01, 2011 at 10:16:48AM -0500, Mitch Harder wrote:
> I've been playing around with resurrecting the basic sysfs
> capabilities that had been previously incorporated into btrfs.
>
> As it stands right now, it was relatively easy to re-implement sysfs
> as it was originally.  However, that implementation of sysfs wasn't
> populated with much information (only total_blocks, blocks_used, and
> blocksize).

Goffredo Baroncelli (CCed) posted a patch to enhance sysfs interface:

https://patchwork.kernel.org/patch/308902/
(http://www.spinics.net/lists/linux-btrfs/msg06777.html)

> I also had to reverse a small portion of code that was in the last
> clean-up.

Restoring the code should not be a problem, the cleanup was too eager
and I think a sysfs interface would be good, not only for debugging
purposes or tuning.

> If a CONFIG_BTRFS_DEBUG type configuration flag is ever introduced, it
> would be interesting to resurrect btrfs' sysfs capabilities.

Hearing about CONFIG_BTRFS_DEBUG again, seems worth to add it.


david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: linux-next: build warnings in Linus' tree

2011-06-03 Thread Hugo Mills
On Fri, Jun 03, 2011 at 01:10:49PM +0200, David Sterba wrote:
> On Wed, Jun 01, 2011 at 10:16:48AM -0500, Mitch Harder wrote:
> > I've been playing around with resurrecting the basic sysfs
> > capabilities that had been previously incorporated into btrfs.
> >
> > As it stands right now, it was relatively easy to re-implement sysfs
> > as it was originally.  However, that implementation of sysfs wasn't
> > populated with much information (only total_blocks, blocks_used, and
> > blocksize).
>
> Goffredo Baroncelli (CCed) posted a patch to enhance sysfs interface:
>
> https://patchwork.kernel.org/patch/308902/
> (http://www.spinics.net/lists/linux-btrfs/msg06777.html)
>
> > I also had to reverse a small portion of code that was in the last
> > clean-up.
>
> Restoring the code should not be a problem, the cleanup was too eager
> and I think a sysfs interface would be good, not only for debugging
> purposes or tuning.

   Indeed. There's a few parts of the balance API that would be
significantly enhanced by being able to put things in sysfs. I could
drop at least one (if not two) of the three ioctls if I had somewhere
in sysfs to put the relevant files.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- The glass is neither half-full nor half-empty; it is twice as ---  
large as it needs to be. 




[PATCH] btrfs: move extra checks under debug option in btrfs_search_slot

2011-06-03 Thread David Sterba
CC: Josef Bacik <jo...@redhat.com>
Signed-off-by: David Sterba <dste...@suse.cz>
---

this patch is in conflict with josef's patch

http://git.kernel.org/?p=linux/kernel/git/josef/btrfs-work.git;a=commit;h=98cdd9ffc5da7aa4c516347f7fc8f65cb08df6ae

 fs/btrfs/ctree.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index b0e18d9..4fe7634 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1648,9 +1648,11 @@ again:
 		}
 cow_done:
 		BUG_ON(!cow && ins_len);
+#ifdef CONFIG_BTRFS_DEBUG
 		if (level != btrfs_header_level(b))
 			WARN_ON(1);
 		level = btrfs_header_level(b);
+#endif
 
 		p->nodes[level] = b;
 		if (!p->skip_locking)
-- 
1.7.5.2.353.g5df3e

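
The pattern in the patch above, moving a consistency check behind a compile-time option, can be sketched in plain C. This is only an illustration under stated assumptions: `SKETCH_DEBUG` is a hypothetical macro standing in for CONFIG_BTRFS_DEBUG, and `check_level()` is an invented helper, not a btrfs function. As in the patch, both the check and the re-read of the level sit inside the guarded region, so a non-debug build trusts the level it already has.

```c
#include <assert.h>
#include <stdio.h>

/* Counts mismatches seen in debug builds (hypothetical bookkeeping). */
static int sketch_warnings;

static int sketch_warning_count(void)
{
	return sketch_warnings;
}

/*
 * Returns the level to use for the next step. With SKETCH_DEBUG defined,
 * the level stored in the block header is compared and takes precedence;
 * without it, the caller's level is returned unchanged.
 */
static int check_level(int level, int header_level)
{
	assert(level >= 0 && header_level >= 0);
#ifdef SKETCH_DEBUG
	if (level != header_level) {
		sketch_warnings++;
		fprintf(stderr, "level mismatch: %d != %d\n",
			level, header_level);
	}
	level = header_level;
#endif
	return level;
}
```

Building with `-DSKETCH_DEBUG` enables the check; a default build compiles it out entirely.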


Re: linux-next: build warnings in Linus' tree

2011-06-03 Thread Greg KH
On Fri, Jun 03, 2011 at 01:10:49PM +0200, David Sterba wrote:
> On Wed, Jun 01, 2011 at 10:16:48AM -0500, Mitch Harder wrote:
> > I've been playing around with resurrecting the basic sysfs
> > capabilities that had been previously incorporated into btrfs.
> >
> > As it stands right now, it was relatively easy to re-implement sysfs
> > as it was originally.  However, that implementation of sysfs wasn't
> > populated with much information (only total_blocks, blocks_used, and
> > blocksize).
>
> Goffredo Baroncelli (CCed) posted a patch to enhance sysfs interface:
>
> https://patchwork.kernel.org/patch/308902/
> (http://www.spinics.net/lists/linux-btrfs/msg06777.html)
>
> > I also had to reverse a small portion of code that was in the last
> > clean-up.
>
> Restoring the code should not be a problem, the cleanup was too eager
> and I think a sysfs interface would be good, not only for debugging
> purposes or tuning.
>
> > If a CONFIG_BTRFS_DEBUG type configuration flag is ever introduced, it
> > would be interesting to resurrect btrfs' sysfs capabilities.
>
> Hearing about CONFIG_BTRFS_DEBUG again, seems worth to add it.

For debugging stuff, please use debugfs instead of sysfs, as that is
what it is there for.

thanks,

greg k-h


Re: [PATCH] btrfs: move extra checks under debug option in btrfs_search_slot

2011-06-03 Thread Josef Bacik
On 06/03/2011 08:09 AM, David Sterba wrote:
> CC: Josef Bacik <jo...@redhat.com>
> Signed-off-by: David Sterba <dste...@suse.cz>
> ---
>

Let's use this instead, I'll drop mine.  Thanks,

Reviewed-by: Josef Bacik <jo...@redhat.com>


[PATCH] btrfs: fix uninitialized variable warning

2011-06-03 Thread David Sterba
From: David Sterba <dste...@suse.cz>

With Linus' tree, today's linux-next build (powerpc ppc64_defconfig)
produced this warning:

fs/btrfs/delayed-inode.c: In function 'btrfs_delayed_update_inode':
fs/btrfs/delayed-inode.c:1598:6: warning: 'ret' may be used
uninitialized in this function

Introduced by commit 16cdcec736cd (btrfs: implement delayed inode items
operation).

This fixes a bug in btrfs_update_inode(): if the returned value from
btrfs_delayed_update_inode is a nonzero garbage, inode stat data are not
updated and several call paths may hit a BUG_ON or fail with strange
code.

Reported-by: Stephen Rothwell <s...@canb.auug.org.au>
Signed-off-by: David Sterba <dste...@suse.cz>
---

patch pushed to git://repo.or.cz/linux-2.6/btrfs-unstable.git #fixes

 fs/btrfs/delayed-inode.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 01e2950..8cb012f 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1595,7 +1595,7 @@ int btrfs_delayed_update_inode(struct btrfs_trans_handle *trans,
 			       struct btrfs_root *root, struct inode *inode)
 {
 	struct btrfs_delayed_node *delayed_node;
-	int ret;
+	int ret = 0;
 
 	delayed_node = btrfs_get_or_create_delayed_node(inode);
 	if (IS_ERR(delayed_node))
-- 
1.7.5.2.353.g5df3e

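
Why the one-line change above matters can be shown with a small sketch. It assumes the function has an early-exit path that jumps to the common return without ever assigning `ret`; the structure and all names here (`node_sketch`, `reserve_sketch`, `delayed_update_sketch`) are hypothetical and only model that shape. Without the initializer, the early exit would return whatever happened to be on the stack, and a caller testing for nonzero would treat it as an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a delayed node. */
struct node_sketch {
	int inode_dirty;  /* nonzero once the inode has been queued */
	int reserved;     /* units of metadata space reserved so far */
};

/* Stand-in for a reservation step that can fail; here it always succeeds. */
static int reserve_sketch(struct node_sketch *n)
{
	n->reserved++;
	return 0;
}

static int delayed_update_sketch(struct node_sketch *n)
{
	int ret = 0;  /* the fix: the early exit below never assigns ret */

	assert(n != NULL);
	if (n->inode_dirty)
		goto out;  /* already queued, nothing to reserve */

	ret = reserve_sketch(n);
	if (ret)
		goto out;
	n->inode_dirty = 1;
out:
	return ret;
}
```

Calling `delayed_update_sketch()` twice on the same node exercises both paths: the first call reserves space, the second takes the early exit and still reports success.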


Re: [PATCH] btrfs: fix uninitialized variable warning

2011-06-03 Thread Chris Mason
Excerpts from David Sterba's message of 2011-06-03 10:50:14 -0400:
> From: David Sterba <dste...@suse.cz>
>
> With Linus' tree, today's linux-next build (powerpc ppc64_defconfig)
> produced this warning:
>
> fs/btrfs/delayed-inode.c: In function 'btrfs_delayed_update_inode':
> fs/btrfs/delayed-inode.c:1598:6: warning: 'ret' may be used
> uninitialized in this function
>
> Introduced by commit 16cdcec736cd (btrfs: implement delayed inode items
> operation).
>
> This fixes a bug in btrfs_update_inode(): if the returned value from
> btrfs_delayed_update_inode is a nonzero garbage, inode stat data are not
> updated and several call paths may hit a BUG_ON or fail with strange
> code.

Ugh, thanks!  It looks like the gcc uninit stuff isn't as verbose as it
used to be, but it does catch a bunch of allocated/set but not used
vars.

I have a nitems = 0 fix in my tree as well.

-chris


Re: Quota Implementation

2011-06-03 Thread Hugo Mills
On Fri, Jun 03, 2011 at 06:24:41PM +0200, Arne Jansen wrote:
> Hi,
>
> If no one is already working on it, I'd like to take the Quota lock and
> see how far I come.
> Let me sketch out in short what I'm planning to do:
>
>  - Quota will be subvolume based. Only the FS-trees and data extents
>    will be accounted.
>  - Quota Groups can be defined. Every quota group can comprise any
>    number of subvolumes. A subvolume can be assigned to any number
>    of quota groups.
>  - A Quota Group can account/limit the total amount of space that is
>    referenced by it and/or the amount of space that is exclusively
>    referenced (i.e. referenced by no other quota group).
>  - With this it is possible to define a hierarchical quota that need
>    not necessarily reflect the filesystem hierarchy.
>  - It is also possible to decide for each snapshot if it should be
>    accounted into the parent group. So in a scenario where each
>    subvolume reflect a user home, it's possible to have some snapshots
>    accounted to the user and others not (e.g. the ones needed for system
>    backups).
>  - Quota information will be stored in new records, possibly in a
>    separate tree.
>  - It should be possible to change the Quota config and group
>    assignments online, though this might need a full re-scan of the fs.
>  - It does NOT include any kind of user/group (UID/GID) quota.
>
> Any addenda or arguments why it's impossible or insane welcome.

   There's a problem in that in some cases, it's possible to get into
a situation where you can't *delete* files because you're going over
quota. If I have two subvolumes that share most of their data
(e.g. one is a snapshot of the other), and both subvolumes have a
limit under the exclusive use clause, then deleting material from
subvolume A could cause subvolume B to go over quota.

   If users can create their own subvolumes, then using the exclusive
use form is also pointless, because as a user, I can simply snapshot
(or otherwise CoW copy) all my data into a snapshot, and I then don't
pay for it. That one probably comes under "the admin shot himself in
the foot", though.

   Getting out the bike-shed brush, I might suggest the use of some
name other than "quota", because inevitably people will think of
UID/GID-type quotas, and we've got enough confusingly-modified
terminology already. "Size bounds", "storage bounds", possibly?

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
 --- Is it true that last known good on Windows XP --- 
boots into CP/M? 
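
The effect described in this message, where deleting data from one subvolume pushes another over its exclusive limit, can be reproduced with a toy model. This is a sketch under simplifying assumptions, not the proposed implementation: each quota group is taken to hold exactly one subvolume, each extent records its referencing subvolumes as a bitmask, and the names `extent_sketch`, `referenced()` and `exclusive()` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extent record: a size plus a bitmask of referencing subvolumes. */
struct extent_sketch {
	unsigned long size;
	unsigned int refs;  /* bit i set: subvolume i references this extent */
};

/* Total size of all extents referenced by subvolume `subvol`. */
static unsigned long referenced(const struct extent_sketch *e, size_t n,
				unsigned int subvol)
{
	unsigned long sum = 0;

	assert(subvol < 32);
	for (size_t i = 0; i < n; i++)
		if (e[i].refs & (1u << subvol))
			sum += e[i].size;
	return sum;
}

/* Total size of extents referenced by `subvol` and by no other subvolume. */
static unsigned long exclusive(const struct extent_sketch *e, size_t n,
			       unsigned int subvol)
{
	unsigned long sum = 0;

	assert(subvol < 32);
	for (size_t i = 0; i < n; i++)
		if (e[i].refs == (1u << subvol))
			sum += e[i].size;
	return sum;
}
```

With a 100-unit extent shared by subvolumes 0 and 1, dropping subvolume 0's reference leaves subvolume 1's referenced total unchanged but raises its exclusive total by 100, which is exactly the case where an exclusive limit on subvolume 1 can be exceeded by a delete in subvolume 0.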




[PATCH] btrfs-progs: Avoid buffer overflow for device name

2011-06-03 Thread Milan Broz
btrfs overwrites memory when the device parameter is too long.

Try:
btrfs device scan $(awk 'BEGIN{$5090=OFS="x";print}')
...

*** buffer overflow detected ***: btrfs terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7f0ef2ea0607]
/lib64/libc.so.6(+0xf6580)[0x7f0ef2e9e580]
btrfs[0x402ec4]
btrfs[0x401b48]
/lib64/libc.so.6(__libc_start_main+0xed)[0x7f0ef2dc943d]
btrfs[0x401df1]

The patch just adds obvious strncpy() checks to several users
of this parameter; some path length check is probably still
needed to report the error properly.

See https://bugzilla.redhat.com/show_bug.cgi?id=710534

Signed-off-by: Milan Broz <mb...@redhat.com>
---
 btrfs-vol.c  |2 +-
 btrfs_cmds.c |   14 +++---
 btrfsctl.c   |2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/btrfs-vol.c b/btrfs-vol.c
index 4ed799d..e06a54e 100644
--- a/btrfs-vol.c
+++ b/btrfs-vol.c
@@ -151,7 +151,7 @@ int main(int ac, char **av)
 	}
 	fd = dirfd(dirstream);
 	if (device)
-		strcpy(args.name, device);
+		strncpy(args.name, device, sizeof(args.name));
 	else
 		args.name[0] = '\0';
 
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 8031c58..6f5c634 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -375,7 +375,7 @@ int do_clone(int argc, char **argv)
 	printf("Create a snapshot of '%s' in '%s/%s'\n",
 	       subvol, dstdir, newname);
 	args.fd = fd;
-	strcpy(args.name, newname);
+	strncpy(args.name, newname, sizeof(args.name));
 	res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE, &args);
 
 	close(fd);
@@ -436,7 +436,7 @@ int do_delete_subvolume(int argc, char **argv)
 	}
 
 	printf("Delete subvolume '%s/%s'\n", dname, vname);
-	strcpy(args.name, vname);
+	strncpy(args.name, vname, sizeof(args.name));
 	res = ioctl(fd, BTRFS_IOC_SNAP_DESTROY, &args);
 
 	close(fd);
@@ -490,7 +490,7 @@ int do_create_subvol(int argc, char **argv)
 	}
 
 	printf("Create subvolume '%s/%s'\n", dstdir, newname);
-	strcpy(args.name, newname);
+	strncpy(args.name, newname, sizeof(args.name));
 	res = ioctl(fddst, BTRFS_IOC_SUBVOL_CREATE, &args);
 
 	close(fddst);
@@ -553,7 +553,7 @@ int do_scan(int argc, char **argv)
 
 		printf("Scanning for Btrfs filesystems in '%s'\n", argv[i]);
 
-		strcpy(args.name, argv[i]);
+		strncpy(args.name, argv[i], sizeof(args.name));
 		/*
 		 * FIXME: which are the error code returned by this ioctl ?
 		 * it seems that is impossible to understand if there no is
@@ -593,7 +593,7 @@ int do_resize(int argc, char **argv)
 	}
 
 	printf("Resize '%s' of '%s'\n", path, amount);
-	strcpy(args.name, amount);
+	strncpy(args.name, amount, sizeof(args.name));
 	res = ioctl(fd, BTRFS_IOC_RESIZE, &args);
 	close(fd);
 	if( res < 0 ){
@@ -736,7 +736,7 @@ int do_add_volume(int nargs, char **args)
 		}
 		close(devfd);
 
-		strcpy(ioctl_args.name, args[i]);
+		strncpy(ioctl_args.name, args[i], sizeof(ioctl_args.name));
 		res = ioctl(fdmnt, BTRFS_IOC_ADD_DEV, &ioctl_args);
 		if(res<0){
 			fprintf(stderr, "ERROR: error adding the device '%s'\n", args[i]);
@@ -792,7 +792,7 @@ int do_remove_volume(int nargs, char **args)
 		struct	btrfs_ioctl_vol_args arg;
 		int	res;
 
-		strcpy(arg.name, args[i]);
+		strncpy(arg.name, args[i], sizeof(arg.name));
 		res = ioctl(fdmnt, BTRFS_IOC_RM_DEV, &arg);
 		if(res<0){
 			fprintf(stderr, "ERROR: error removing the device '%s'\n", args[i]);
diff --git a/btrfsctl.c b/btrfsctl.c
index 92bdf39..29210f5 100644
--- a/btrfsctl.c
+++ b/btrfsctl.c
@@ -237,7 +237,7 @@ int main(int ac, char **av)
 	}
 
 	if (name)
-		strcpy(args.name, name);
+		strncpy(args.name, name, sizeof(args.name));
 	else
 		args.name[0] = '\0';
 
-- 
1.7.5.3

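
One caveat on the patch above: strncpy() stops at the buffer size but does not write a terminating NUL when the source is that long or longer, which is presumably why the commit message says a length check is still needed. A minimal sketch of such a check follows; `copy_name()` is a hypothetical helper, not part of btrfs-progs, and rejecting over-long names (rather than truncating them) is a design choice made for this example.

```c
#include <assert.h>
#include <string.h>

/*
 * Copies src into dst (capacity dstsize, including the terminator).
 * Returns 0 on success. If src does not fit, dst is set to the empty
 * string and -1 is returned so the caller can report the error.
 */
static int copy_name(char *dst, size_t dstsize, const char *src)
{
	size_t len;

	assert(dst != NULL && src != NULL && dstsize > 0);
	len = strlen(src);
	if (len >= dstsize) {
		dst[0] = '\0';
		return -1;
	}
	memcpy(dst, src, len + 1);  /* len + 1 copies the terminator too */
	return 0;
}
```

A call site would then read `if (copy_name(args.name, sizeof(args.name), device) < 0)` followed by an error message, instead of passing a silently truncated name to the ioctl.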


[PATCH] btrfs: Null terminate string in scan device ioctl

2011-06-03 Thread Milan Broz
btrfs_scan_one_device() uses vol->name directly, without
additional checks, so if the string passed in the ioctl is
unterminated it can access memory outside of the
btrfs_ioctl_vol_args struct.

Always terminate the name string (as other users already do).

Signed-off-by: Milan Broz <mb...@redhat.com>
---
 fs/btrfs/super.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 9b2e7e5..2bb1a99 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1148,6 +1148,8 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 	if (IS_ERR(vol))
 		return PTR_ERR(vol);
 
+	vol->name[BTRFS_PATH_NAME_MAX] = '\0';
+
 	switch (cmd) {
 	case BTRFS_IOC_SCAN_DEV:
 		ret = btrfs_scan_one_device(vol->name, FMODE_READ,
-- 
1.7.5.3

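
The receiving side of the same problem can be modelled in user space. The sketch below assumes an argument block whose name field is one byte larger than the maximum name length, which is how the patch indexes it; `vol_args_sketch`, `SKETCH_NAME_MAX` and `import_args()` are hypothetical stand-ins, not the kernel's definitions. Whatever bytes arrive from the untrusted caller, the last byte of the field is overwritten before the field is treated as a string.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_NAME_MAX 15  /* hypothetical; stands in for BTRFS_PATH_NAME_MAX */

/* Hypothetical argument block: name holds SKETCH_NAME_MAX chars plus a NUL. */
struct vol_args_sketch {
	long fd;
	char name[SKETCH_NAME_MAX + 1];
};

/* Copies a raw, untrusted block and forces the name to be terminated. */
static void import_args(struct vol_args_sketch *dst, const void *raw)
{
	assert(dst != NULL && raw != NULL);
	memcpy(dst, raw, sizeof(*dst));
	dst->name[SKETCH_NAME_MAX] = '\0';  /* string functions now stay in bounds */
}
```

After `import_args()`, `strlen(dst->name)` can never exceed `SKETCH_NAME_MAX`, even if the sender filled the whole field with non-NUL bytes.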


Delayed inode operations not doing the right thing with enospc

2011-06-03 Thread Josef Bacik
Hello,

I got a lot of these when running stress.sh on my test box

[ 9792.654889] ------------[ cut here ]------------
[ 9792.654898] WARNING: at fs/btrfs/extent-tree.c:5681
btrfs_alloc_free_block+0xca/0x27c [btrfs]()
[ 9792.654899] Hardware name: To Be Filled By O.E.M.
[ 9792.654900] Modules linked in: btrfs zlib_deflate libcrc32c
ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables
arc4 rt61pci rt2x00pci rt2x00lib snd_hda_codec_hdmi mac80211
snd_hda_codec_realtek cfg80211 snd_hda_intel edac_core snd_seq rfkill
pcspkr serio_raw snd_hda_codec eeprom_93cx6 edac_mce_amd sp5100_tco
i2c_piix4 k10temp snd_hwdep snd_seq_device snd_pcm floppy r8169 xhci_hcd
mii snd_timer snd soundcore snd_page_alloc ipv6 firewire_ohci pata_acpi
ata_generic firewire_core pata_via crc_itu_t radeon ttm drm_kms_helper
drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[ 9792.654919] Pid: 2762, comm: rm Tainted: GW   2.6.39+ #1
[ 9792.654920] Call Trace:
[ 9792.654922]  [81053c4a] warn_slowpath_common+0x83/0x9b
[ 9792.654925]  [81053c7c] warn_slowpath_null+0x1a/0x1c
[ 9792.654933]  [a038e747] btrfs_alloc_free_block+0xca/0x27c
[btrfs]
[ 9792.654945]  [a03b8562] ? map_extent_buffer+0x6e/0xa8 [btrfs]
[ 9792.654953]  [a038189b] __btrfs_cow_block+0xfc/0x30c [btrfs]
[ 9792.654963]  [a0396aa6] ? btrfs_buffer_uptodate+0x47/0x58
[btrfs]
[ 9792.654970]  [a0382e48] ? read_block_for_search+0x94/0x368
[btrfs]
[ 9792.654978]  [a0381ba9] btrfs_cow_block+0xfe/0x146 [btrfs]
[ 9792.654986]  [a03848b0] btrfs_search_slot+0x14d/0x4b6 [btrfs]
[ 9792.654997]  [a03b8562] ? map_extent_buffer+0x6e/0xa8 [btrfs]
[ 9792.655022]  [a03938e8] btrfs_lookup_inode+0x2f/0x8f [btrfs]
[ 9792.655025]  [8147afac] ? _cond_resched+0xe/0x22
[ 9792.655027]  [8147b892] ? mutex_lock+0x29/0x50
[ 9792.655039]  [a03d41b1]
btrfs_update_delayed_inode+0x72/0x137 [btrfs]
[ 9792.655051]  [a03d4ea2] btrfs_run_delayed_items+0x90/0xdb
[btrfs]
[ 9792.655062]  [a039a69b]
btrfs_commit_transaction+0x228/0x654 [btrfs]
[ 9792.655064]  [8106e8da] ? remove_wait_queue+0x3a/0x3a
[ 9792.655075]  [a03a2fa5] btrfs_evict_inode+0x14d/0x202 [btrfs]
[ 9792.655077]  [81132bd6] evict+0x71/0x111
[ 9792.655079]  [81132de0] iput+0x12a/0x132
[ 9792.655081]  [8112aa3a] do_unlinkat+0x106/0x155
[ 9792.655083]  [81127b83] ? path_put+0x1f/0x23
[ 9792.655085]  [8109c53c] ? audit_syscall_entry+0x145/0x171
[ 9792.655087]  [81128410] ? putname+0x34/0x36
[ 9792.655090]  [8112b441] sys_unlinkat+0x29/0x2b
[ 9792.655092]  [81482c42] system_call_fastpath+0x16/0x1b
[ 9792.655093] ---[ end trace 02b696eb02b3f768 ]---


This is because use_block_rsv() is having to do a
reserve_metadata_bytes(), which shouldn't happen as we should have
reserved enough space for those operations to complete.  This is
happening because use_block_rsv() will call get_block_rsv(), which, if
root->ref_cows is set (which is the case on all fs roots), will use
trans->block_rsv, which will only have what the current transaction
starter had reserved.

What needs to be done instead is to have a special block reserve: any
reservation that is done at create time for these inodes is migrated
to this special reserve, and then when you run the delayed inode items
stuff you set trans->block_rsv to the special block reserve so the
accounting is all done properly.

This is just off the top of my head, there may be a better way to do it,
I've not actually looked at the delayed inode code at all.

I would do this myself but I have an ever-increasing list of shit to do
so will somebody pick this up and fix it please?  Thanks,

Josef
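
The scheme suggested in this message can be sketched as a toy accounting model. Nothing here is btrfs code: `rsv_sketch`, `trans_sketch`, `migrate_rsv()`, `use_rsv()` and `run_delayed()` are hypothetical names, and a reserve is reduced to a single byte counter. The sketch shows the two steps: space reserved when the inode is created is migrated into a dedicated reserve, and the transaction is pointed at that reserve only while the delayed items run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reserve: just a count of reserved bytes. */
struct rsv_sketch {
	unsigned long reserved;
};

/* Hypothetical transaction handle: points at the reserve to charge. */
struct trans_sketch {
	struct rsv_sketch *block_rsv;
};

/* Moves `bytes` from src to dst; returns -1 if src does not hold enough. */
static int migrate_rsv(struct rsv_sketch *src, struct rsv_sketch *dst,
		       unsigned long bytes)
{
	assert(src != NULL && dst != NULL);
	if (src->reserved < bytes)
		return -1;
	src->reserved -= bytes;
	dst->reserved += bytes;
	return 0;
}

/* Charges `bytes` to whichever reserve the transaction points at. */
static int use_rsv(struct trans_sketch *t, unsigned long bytes)
{
	assert(t != NULL && t->block_rsv != NULL);
	if (t->block_rsv->reserved < bytes)
		return -1;  /* the shortfall that shows up as a warning in the log */
	t->block_rsv->reserved -= bytes;
	return 0;
}

/* Runs delayed work against the dedicated reserve, then restores the old one. */
static int run_delayed(struct trans_sketch *t, struct rsv_sketch *delayed,
		       unsigned long bytes)
{
	struct rsv_sketch *saved = t->block_rsv;
	int ret;

	t->block_rsv = delayed;
	ret = use_rsv(t, bytes);
	t->block_rsv = saved;
	return ret;
}
```

In this model, charging the delayed work to the committing transaction's own reserve fails, while routing it through the dedicated reserve succeeds and leaves the transaction's reserve untouched.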


Re: Quota Implementation

2011-06-03 Thread Andrey Kuzmin
On Fri, Jun 3, 2011 at 8:47 PM, Hugo Mills <h...@carfax.org.uk> wrote:
> On Fri, Jun 03, 2011 at 06:24:41PM +0200, Arne Jansen wrote:
>> Hi,
>>
>> If no one is already working on it, I'd like to take the Quota lock and
>> see how far I come.
>> Let me sketch out in short what I'm planning to do:
>>
>>  - Quota will be subvolume based. Only the FS-trees and data extents
>>    will be accounted.
>>  - Quota Groups can be defined. Every quota group can comprise any
>>    number of subvolumes. A subvolume can be assigned to any number
>>    of quota groups.
>>  - A Quota Group can account/limit the total amount of space that is
>>    referenced by it and/or the amount of space that is exclusively
>>    referenced (i.e. referenced by no other quota group).
>>  - With this it is possible to define a hierarchical quota that need
>>    not necessarily reflect the filesystem hierarchy.
>>  - It is also possible to decide for each snapshot if it should be
>>    accounted into the parent group. So in a scenario where each
>>    subvolume reflect a user home, it's possible to have some snapshots
>>    accounted to the user and others not (e.g. the ones needed for system
>>    backups).
>>  - Quota information will be stored in new records, possibly in a
>>    separate tree.
>>  - It should be possible to change the Quota config and group
>>    assignments online, though this might need a full re-scan of the fs.
>>  - It does NOT include any kind of user/group (UID/GID) quota.
>>
>> Any addenda or arguments why it's impossible or insane welcome.
>
>    There's a problem in that in some cases, it's possible to get into
> a situation where you can't *delete* files because you're going over
> quota. If I have two subvolumes that share most of their data
> (e.g. one is a snapshot of the other), and both subvolumes have a
> limit under the exclusive use clause, then deleting material from
> subvolume A could cause subvolume B to go over quota.
>
>    If users can create their own subvolumes, then using the exclusive
> use form is also pointless, because as a user, I can simply snapshot
> (or otherwise CoW copy) all my data into a snapshot, and I then don't
> pay for it. That one probably comes under "the admin shot himself in
> the foot", though.
>
>    Getting out the bike-shed brush, I might suggest the use of some
> name other than "quota", because inevitably people will think of
> UID/GID-type quotas, and we've got enough confusingly-modified
> terminology already. "Size bounds", "storage bounds", possibly?

Budget :)?

Regards,
Andrey

>
>    Hugo.
>
> --
> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
>  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
>         --- Is it true that last known good on Windows XP ---
>                            boots into CP/M?
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
>
> iD8DBQFN6RAiIKyzvlFcI40RAkkQAKCAulO65dL1F/vaO7W20qJEAKuonwCghfvH
> XlliA+eCfmLmP/G0quVALe0=
> =m513
> -----END PGP SIGNATURE-----




Re: Quota Implementation

2011-06-03 Thread Johannes Hirte
On Friday 03 June 2011 18:24:41 Arne Jansen wrote:
> Hi,
>
> If no one is already working on it, I'd like to take the Quota lock and
> see how far I come.
> Let me sketch out in short what I'm planning to do:
>
>  - Quota will be subvolume based. Only the FS-trees and data extents
>    will be accounted.
>  - Quota Groups can be defined. Every quota group can comprise any
>    number of subvolumes. A subvolume can be assigned to any number
>    of quota groups.
>  - A Quota Group can account/limit the total amount of space that is
>    referenced by it and/or the amount of space that is exclusively
>    referenced (i.e. referenced by no other quota group).
>  - With this it is possible to define a hierarchical quota that need
>    not necessarily reflect the filesystem hierarchy.
>  - It is also possible to decide for each snapshot if it should be
>    accounted into the parent group. So in a scenario where each
>    subvolume reflect a user home, it's possible to have some snapshots
>    accounted to the user and others not (e.g. the ones needed for system
>    backups).
>  - Quota information will be stored in new records, possibly in a
>    separate tree.
>  - It should be possible to change the Quota config and group
>    assignments online, though this might need a full re-scan of the fs.
>  - It does NOT include any kind of user/group (UID/GID) quota.
>
> Any addenda or arguments why it's impossible or insane welcome.

What's the benefit of this complexity? Why not a simpler quota/reservation
per subvolume? The semantics you described can be achieved by user/group
quotas too. And we need them anyway. Perhaps this can be implemented together,
reusing the code. Then we have the question of whether user/group quotas are
per filesystem or per subvolume.

regards,
  Johannes


Re: Announcing btrfs-gui

2011-06-03 Thread Hugo Mills
On Thu, Jun 02, 2011 at 09:41:08AM +0100, Hugo Mills wrote:
> On Thu, Jun 02, 2011 at 03:31:16PM +0700, Fajar A. Nugraha wrote:
> > On Thu, Jun 2, 2011 at 6:20 AM, Hugo Mills <h...@carfax.org.uk> wrote:
> > >    Over the last few weeks, I've been playing with a foolish idea,
> > > mostly triggered by a cluster of people being confused by btrfs's free
> > > space reporting (df vs btrfs fi df vs btrfs fi show). I also wanted an
> > > excuse, and some code, to mess around in the depths of the FS data
> > > structures.
> > >
> > >    Like all silly ideas, this one got a bit out of hand, and seems to
> > > have turned into something vaguely useful. I'm therefore pleased to
> > > announce the first major public release of btrfs-gui[1]: a point-and-
> > > click tool for managing btrfs filesystems.
> > >
> > >    The tool currently can scan for and list btrfs filesystems and the
> > > volumes they live on. It can show the allocation and usage of data in
> > > a selected filesystem, categorised by use, replication, and device. It
> > > can show and manipulate subvolumes and snapshots: creation, deletion,
> > > and setting the default.
> > >
> >
> > Some comments:
> > (1) Currently it needs to be run from the directory where it's
> > downloaded, even after a python3 setup.py install. When run from
> > other directory, it bails with
[snip]
> > OSError: [Errno 2] No such file or directory: './btrfs-gui-helper'
> >
> > Is this intentional?
>
>    No, and will be fixed later today. I foresee an emergency 0.2.1
> coming shortly. :)

   OK, it's fixed in git in the stable-v0.2 branch. Unless anyone else
reports something that needs fixing over the weekend, I'll tag it as
0.2.1 on Sunday and roll another release tarball.

   (The fix is actually pretty ugly, and has some poor UX in it for
one case, but I've run out of brain this evening, and can't face the
shell hackery necessary to do it nicely right now.)

> > (2) When showing space usage for a single-device FS, selecting "Show
> > unallocated space as raw space", why is the top and bottom graph
> > different? Shouldn't it be the same, since there's only one device?
>
>    Good question. I shall investigate what's going on.

   OK, on investigation and reflection, it shouldn't be identical,
because metadata is DUP. The per-disk displays show actual physical
disk usage; the filesystem display at the top shows unique
data. Therefore, the top display will show half as much metadata
as the bottom.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Comic Sans goes into a bar,  and the barman says, We don't ---   
 serve your type here.  




Fwd: btrfs causing reboots and kernel oops on SL 6 (RHEL 6)

2011-06-03 Thread Joel Pearson
Hi,

I'm using SL 6 (RHEL 6) and I've been playing around with running
PostgreSQL on btrfs. Snapshotting works ok, but the computer keeps
rebooting without warning (can be 5 mins or 1.5 hours), finally I
actually managed to get a Kernel Crash instead of just a reboot.

I took a picture of the screen:
http://imageshack.us/photo/my-images/716/img0143y.jpg/

The important bits are:

IP: [a032c471] btrfs_print_leaf +0x31/0x820 [btrfs]
PGD 0
Oops:  [#1] SMP
last sysfs file: /sys/devices/virtual/block/dm-3/dm/name

The crashes aren't predictable either. Like it doesn't always happen
when I do a snapshot or anything like that.

Is this a known problem, that is fixed in a later kernel or something like that?

btrfs seems cool though, I hope there is something I just
misconfigured or something so that I can get it to be more reliable,
although I do acknowledge that this is an experimental filesystem.

Cheers,

-Joel

--
Joel Pearson
Software Engineer
Agile Digital Engineering Pty Ltd
A.B.N. 98 106 361 273
A: 5/28 Eyre St Kingston ACT 2604
P: +61 1300-858-277
F: +61 1300-858-477