[PATCH 0/4] Btrfs: A few small bug fixes

2011-04-13 Thread Li Zefan
Hi Chris,

Those bugs are small, and the fixes are simple and straightforward.

You can pull from:

git://repo.or.cz/linux-btrfs-devel.git for-chris



Li Zefan (2):
  Btrfs: Check if btrfs_next_leaf() returns error in btrfs_listxattr()
  Btrfs: Check if btrfs_next_leaf() returns error in btrfs_real_readdir()

Miao Xie (2):
  Btrfs: Fix incorrect inode nlink in btrfs_link()
  Btrfs: Check validity before setting an acl

---
 fs/btrfs/acl.c   |9 +
 fs/btrfs/inode.c |   34 +-
 fs/btrfs/xattr.c |   33 -
 3 files changed, 30 insertions(+), 46 deletions(-)

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] Btrfs: Check if btrfs_next_leaf() returns error in btrfs_listxattr()

2011-04-13 Thread Li Zefan
btrfs_next_leaf() can return -errno, and we should propagate
it to userspace.

This also simplifies how we walk the btree path.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/xattr.c |   33 -
 1 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/xattr.c b/fs/btrfs/xattr.c
index e5d22f2..07b9bc3 100644
--- a/fs/btrfs/xattr.c
+++ b/fs/btrfs/xattr.c
@@ -180,11 +180,10 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
 	struct btrfs_path *path;
 	struct extent_buffer *leaf;
 	struct btrfs_dir_item *di;
-	int ret = 0, slot, advance;
+	int ret = 0, slot;
 	size_t total_size = 0, size_left = size;
 	unsigned long name_ptr;
 	size_t name_len;
-	u32 nritems;
 
 	/*
 	 * ok we want all objects associated with this id.
@@ -204,34 +203,24 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
 	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
 	if (ret < 0)
 		goto err;
-	advance = 0;
+
 	while (1) {
 		leaf = path->nodes[0];
-		nritems = btrfs_header_nritems(leaf);
 		slot = path->slots[0];
 
 		/* this is where we start walking through the path */
-		if (advance || slot >= nritems) {
+		if (slot >= btrfs_header_nritems(leaf)) {
 			/*
 			 * if we've reached the last slot in this leaf we need
 			 * to go to the next leaf and reset everything
 			 */
-			if (slot >= nritems-1) {
-				ret = btrfs_next_leaf(root, path);
-				if (ret)
-					break;
-				leaf = path->nodes[0];
-				nritems = btrfs_header_nritems(leaf);
-				slot = path->slots[0];
-			} else {
-				/*
-				 * just walking through the slots on this leaf
-				 */
-				slot++;
-				path->slots[0]++;
-			}
+			ret = btrfs_next_leaf(root, path);
+			if (ret < 0)
+				goto err;
+			else if (ret > 0)
+				break;
+			continue;
 		}
-		advance = 1;
 
 		btrfs_item_key_to_cpu(leaf, &found_key, slot);
 
@@ -250,7 +239,7 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
 
 		/* we are just looking for how big our buffer needs to be */
 		if (!size)
-			continue;
+			goto next;
 
 		if (!buffer || (name_len + 1) > size_left) {
 			ret = -ERANGE;
@@ -263,6 +252,8 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
 
 		size_left -= name_len + 1;
 		buffer += name_len + 1;
+next:
+		path->slots[0]++;
 	}
 	ret = total_size;
 
-- 
1.7.3.1
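The loop shape that patch 1/4 arrives at can be tried outside the kernel. The following is only a userspace sketch: struct toy_leaf, struct toy_path, toy_next_leaf() and toy_walk() are hypothetical stand-ins and not btrfs APIs. The one thing borrowed from the patch is the return convention: negative means error, positive means there are no more leaves, zero means the path now points at the next leaf.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a btree path: a list of leaves, each with items. */
struct toy_leaf {
	const int *items;
	size_t nritems;
};

struct toy_path {
	const struct toy_leaf *leaves;
	size_t nleaves;
	size_t leaf;      /* index of the current leaf */
	size_t slot;      /* index of the current item in that leaf */
	size_t fail_at;   /* leaf index whose read fails, (size_t)-1 for none */
};

/* Returns <0 on error, >0 when there are no more leaves, 0 on success. */
static int toy_next_leaf(struct toy_path *p)
{
	if (p->leaf + 1 >= p->nleaves)
		return 1;
	if (p->leaf + 1 == p->fail_at)
		return -EIO;
	p->leaf++;
	p->slot = 0;
	return 0;
}

/* Sums every item; returns 0 on success or a negative errno. */
static int toy_walk(struct toy_path *p, long *sum)
{
	int ret;

	assert(p->nleaves > 0);
	*sum = 0;
	while (1) {
		const struct toy_leaf *leaf = &p->leaves[p->leaf];

		if (p->slot >= leaf->nritems) {
			ret = toy_next_leaf(p);
			if (ret < 0)
				return ret;     /* propagate the error */
			else if (ret > 0)
				break;          /* walked off the last leaf */
			continue;               /* re-read leaf and slot */
		}
		*sum += leaf->items[p->slot];
		p->slot++;                      /* the only place that advances */
	}
	return 0;
}
```

Because the slot is advanced in exactly one place, an empty leaf in the middle of the walk needs no special case: the bounds check fails immediately and the loop moves on to the next leaf.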


[PATCH 2/4] Btrfs: Check if btrfs_next_leaf() returns error in btrfs_real_readdir()

2011-04-13 Thread Li Zefan
btrfs_next_leaf() can return -errno, and we should propagate
it to userspace.

This also simplifies how we walk the btree path.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/inode.c |   28 ++--
 1 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 55a6a0b..b9f7f52 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4221,10 +4221,8 @@ static int btrfs_real_readdir(struct file *filp, void *dirent,
 	struct btrfs_key found_key;
 	struct btrfs_path *path;
 	int ret;
-	u32 nritems;
 	struct extent_buffer *leaf;
 	int slot;
-	int advance;
 	unsigned char d_type;
 	int over = 0;
 	u32 di_cur;
@@ -4267,27 +4265,19 @@ static int btrfs_real_readdir(struct file *filp, void *dirent,
 	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
 	if (ret < 0)
 		goto err;
-	advance = 0;
 
 	while (1) {
 		leaf = path->nodes[0];
-		nritems = btrfs_header_nritems(leaf);
 		slot = path->slots[0];
-		if (advance || slot >= nritems) {
-			if (slot >= nritems - 1) {
-				ret = btrfs_next_leaf(root, path);
-				if (ret)
-					break;
-				leaf = path->nodes[0];
-				nritems = btrfs_header_nritems(leaf);
-				slot = path->slots[0];
-			} else {
-				slot++;
-				path->slots[0]++;
-			}
+		if (slot >= btrfs_header_nritems(leaf)) {
+			ret = btrfs_next_leaf(root, path);
+			if (ret < 0)
+				goto err;
+			else if (ret > 0)
+				break;
+			continue;
 		}
 
-		advance = 1;
 		item = btrfs_item_nr(leaf, slot);
 		btrfs_item_key_to_cpu(leaf, &found_key, slot);
 
@@ -4296,7 +4286,7 @@ static int btrfs_real_readdir(struct file *filp, void *dirent,
 		if (btrfs_key_type(&found_key) != key_type)
 			break;
 		if (found_key.offset < filp->f_pos)
-			continue;
+			goto next;
 
 		filp->f_pos = found_key.offset;
 
@@ -4349,6 +4339,8 @@ skip:
 			di_cur += di_len;
 			di = (struct btrfs_dir_item *)((char *)di + di_len);
 		}
+next:
+		path->slots[0]++;
 	}
 
 	/* Reached end of directory/root. Bump pos past the last item. */
-- 
1.7.3.1


[PATCH 3/4] Btrfs: Fix incorrect inode nlink in btrfs_link()

2011-04-13 Thread Li Zefan
From: Miao Xie mi...@cn.fujitsu.com

Link count of the inode is not decreased if btrfs_set_inode_index()
fails.

Signed-off-by: Miao Xie mi...@cn.fujitsu.com
Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/inode.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b9f7f52..a4157cf 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4846,9 +4846,6 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
 	if (inode->i_nlink == ~0U)
 		return -EMLINK;
 
-	btrfs_inc_nlink(inode);
-	inode->i_ctime = CURRENT_TIME;
-
 	err = btrfs_set_inode_index(dir, &index);
 	if (err)
 		goto fail;
@@ -4864,6 +4861,9 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
 		goto fail;
 	}
 
+	btrfs_inc_nlink(inode);
+	inode->i_ctime = CURRENT_TIME;
+
 	btrfs_set_trans_block_group(trans, dir);
 	ihold(inode);
 
-- 
1.7.3.1
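The ordering problem fixed by patch 3/4 can be reproduced in a few lines of plain C. This is a sketch under stated assumptions: struct toy_inode, toy_reserve_index() and the two toy_link_*() functions are hypothetical, and -ENOSPC is just a placeholder error, not necessarily what btrfs_set_inode_index() returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_inode {
	unsigned int nlink;
};

/* Hypothetical fallible step, standing in for index allocation. */
static int toy_reserve_index(int should_fail)
{
	return should_fail ? -ENOSPC : 0;
}

/* Old ordering: the counter is bumped before a step that can fail. */
static int toy_link_early(struct toy_inode *inode, int should_fail)
{
	int err;

	assert(inode != NULL);
	inode->nlink++;
	err = toy_reserve_index(should_fail);
	if (err)
		return err;     /* nlink is left one too high */
	return 0;
}

/* Ordering after the patch: mutate only once the fallible steps passed. */
static int toy_link_late(struct toy_inode *inode, int should_fail)
{
	int err;

	assert(inode != NULL);
	err = toy_reserve_index(should_fail);
	if (err)
		return err;     /* inode is untouched */
	inode->nlink++;
	return 0;
}
```

With the late ordering the error path needs no undo step, which is why the patch can move the two lines instead of adding a decrement to the fail label.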


[PATCH 2/5] Support the new parameters in do_clone(int argc, char** argv).

2011-04-13 Thread Andreas Philipp
Now 'btrfs subvolume snapshot' takes at least two parameters instead of
exactly two. The help message is updated accordingly.

Signed-off-by: Andreas Philipp philipp.andr...@gmail.com
---
 btrfs.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 46314cf..f70d64b 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -44,9 +44,9 @@ static struct Command commands[] = {
 	/*
 		avoid short commands different for the case only
 	*/
-	{ do_clone, 2,
-	  "subvolume snapshot", "<source> [<dest>/]<name>\n"
-		"Create a writable snapshot of the subvolume <source> with\n"
+	{ do_clone, -2,
+	  "subvolume snapshot", "[-r] <source> [<dest>/]<name>\n"
+		"Create a writable/readonly snapshot of the subvolume <source> with\n"
 		"the name <name> in the <dest> directory."
 	},
 	{ do_delete_subvolume, 1,
-- 
1.7.4.1



[PATCH 3/5] Added support for an additional ioctl.

2011-04-13 Thread Andreas Philipp
Added BTRFS_IOC_SNAP_CREATE_V2 and struct btrfs_ioctl_vol_args_v2 as
defined in fs/btrfs/ioctl.h in the kernel sources.

Signed-off-by: Andreas Philipp philipp.andr...@gmail.com
---
 ioctl.h |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/ioctl.h b/ioctl.h
index 776d7a9..358f814 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -30,6 +30,17 @@ struct btrfs_ioctl_vol_args {
char name[BTRFS_PATH_NAME_MAX + 1];
 };
 
+#define BTRFS_SUBVOL_RDONLY		(1ULL << 1)
+#define BTRFS_SUBVOL_NAME_MAX 4039
+
+struct btrfs_ioctl_vol_args_v2 {
+   __s64 fd;
+   __u64 transid;
+   __u64 flags;
+   __u64 unused[4];
+   char name[BTRFS_SUBVOL_NAME_MAX + 1];
+};
+
 struct btrfs_ioctl_search_key {
/* which root are we searching.  0 is the tree of tree roots */
__u64 tree_id;
@@ -132,6 +143,7 @@ struct btrfs_ioctl_space_args {
struct btrfs_ioctl_space_info spaces[0];
 };
 
+/* BTRFS_IOC_SNAP_CREATE is no longer used by the btrfs command */
 #define BTRFS_IOC_SNAP_CREATE _IOW(BTRFS_IOCTL_MAGIC, 1, \
   struct btrfs_ioctl_vol_args)
 #define BTRFS_IOC_DEFRAG _IOW(BTRFS_IOCTL_MAGIC, 2, \
@@ -169,4 +181,6 @@ struct btrfs_ioctl_space_args {
 #define BTRFS_IOC_DEFAULT_SUBVOL _IOW(BTRFS_IOCTL_MAGIC, 19, u64)
 #define BTRFS_IOC_SPACE_INFO _IOWR(BTRFS_IOCTL_MAGIC, 20, \
struct btrfs_ioctl_space_args)
+#define BTRFS_IOC_SNAP_CREATE_V2 _IOW(BTRFS_IOCTL_MAGIC, 23, \
+  struct btrfs_ioctl_vol_args_v2)
 #endif
-- 
1.7.4.1

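Adding up the members of struct btrfs_ioctl_vol_args_v2 as posted gives 8 + 8 + 8 + 32 + 4040 = 4096 bytes, which is presumably why BTRFS_SUBVOL_NAME_MAX is 4039. The sketch below mirrors the layout with fixed-width userspace types so the arithmetic can be checked. The toy_ names and the toy_fill_args() helper are hypothetical and not part of btrfs-progs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SUBVOL_RDONLY   (1ULL << 1)
#define TOY_SUBVOL_NAME_MAX 4039

/* Same member order and sizes as the structure added to ioctl.h. */
struct toy_vol_args_v2 {
	int64_t  fd;
	uint64_t transid;
	uint64_t flags;
	uint64_t unused[4];
	char     name[TOY_SUBVOL_NAME_MAX + 1];
};

/* Hypothetical helper: returns 0 on success, -1 if the name does not fit. */
static int toy_fill_args(struct toy_vol_args_v2 *args, int fd,
			 const char *name, int readonly)
{
	size_t len;

	assert(args != NULL && name != NULL);
	len = strlen(name);
	if (len > TOY_SUBVOL_NAME_MAX)
		return -1;
	memset(args, 0, sizeof(*args)); /* flags and unused[] start at zero */
	args->fd = fd;
	if (readonly)
		args->flags |= TOY_SUBVOL_RDONLY;
	memcpy(args->name, name, len + 1);
	return 0;
}
```

Zeroing the whole structure before setting individual bits is a precaution taken by this sketch, so that flags and unused[] never carry stack garbage into the ioctl.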


[PATCH 4/5] Test the additional ioctl.

2011-04-13 Thread Andreas Philipp

Signed-off-by: Andreas Philipp philipp.andr...@gmail.com
---
 ioctl-test.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/ioctl-test.c b/ioctl-test.c
index 7cf3bc2..1c27d61 100644
--- a/ioctl-test.c
+++ b/ioctl-test.c
@@ -22,6 +22,7 @@ unsigned long ioctls[] = {
BTRFS_IOC_INO_LOOKUP,
BTRFS_IOC_DEFAULT_SUBVOL,
BTRFS_IOC_SPACE_INFO,
+   BTRFS_IOC_SNAP_CREATE_V2,
0 };
 
 int main(int ac, char **av)
-- 
1.7.4.1



[PATCH 1/5] Add support for read-only subvolumes.

2011-04-13 Thread Andreas Philipp
Use BTRFS_IOC_SNAP_CREATE_V2 instead of BTRFS_IOC_SNAP_CREATE and add
an option for the creation of a readonly snapshot.

Signed-off-by: Andreas Philipp philipp.andr...@gmail.com
---
 btrfs_cmds.c |   44 
 1 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 8031c58..baec675 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -43,7 +43,7 @@
 
 #ifdef __CHECKER__
 #define BLKGETSIZE64 0
-#define BTRFS_IOC_SNAP_CREATE 0
+#define BTRFS_IOC_SNAP_CREATE_V2 0
 #define BTRFS_VOL_NAME_MAX 255
 struct btrfs_ioctl_vol_args { char name[BTRFS_VOL_NAME_MAX]; };
 static inline int ioctl(int fd, int define, void *arg) { return 0; }
@@ -310,13 +310,34 @@ int do_subvol_list(int argc, char **argv)
 int do_clone(int argc, char **argv)
 {
 	char	*subvol, *dst;
-	int	res, fd, fddst, len;
+	int	res, fd, fddst, len, optind = 0, readonly = 0;
 	char	*newname;
 	char	*dstdir;
 
-	subvol = argv[1];
-	dst = argv[2];
-	struct btrfs_ioctl_vol_args	args;
+	while(1) {
+		int c = getopt(argc, argv, "r");
+		if (c < 0)
+			break;
+		switch(c) {
+		case 'r':
+			optind++;
+			readonly = 1;
+			break;
+		default:
+			fprintf(stderr, "Invalid arguments for subvolume snapshot\n");
+			free(argv);
+			return 1;
+		}
+	}
+	if (argc - optind < 2) {
+		fprintf(stderr, "Invalid arguments for defragment\n");
+		free(argv);
+		return 1;
+	}
+
+	subvol = argv[optind+1];
+	dst = argv[optind+2];
+	struct btrfs_ioctl_vol_args_v2	args;
 
 	res = test_issubvolume(subvol);
 	if(res<0){
@@ -371,12 +392,19 @@ int do_clone(int argc, char **argv)
 		fprintf(stderr, "ERROR: can't access to '%s'\n", dstdir);
 		return 12;
 	}
+	
+	if (readonly) {
+		args.flags |= BTRFS_SUBVOL_RDONLY;
+		printf("Create a readonly snapshot of '%s' in '%s/%s'\n",
+		       subvol, dstdir, newname);
+	}
+	else
+		printf("Create a snapshot of '%s' in '%s/%s'\n",
+		       subvol, dstdir, newname);
 
-	printf("Create a snapshot of '%s' in '%s/%s'\n",
-	       subvol, dstdir, newname);
 	args.fd = fd;
 	strcpy(args.name, newname);
-	res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE, &args);
+	res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE_V2, &args);
 
 	close(fd);
 	close(fddst);
-- 
1.7.4.1



[PATCH 5/5] Updated documentation for btrfs subvolume snapshot.

2011-04-13 Thread Andreas Philipp

Signed-off-by: Andreas Philipp philipp.andr...@gmail.com
---
 man/btrfs.8.in |   11 ++-
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index 26ef982..b59bc6f 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -5,7 +5,7 @@
 .SH NAME
 btrfs \- control a btrfs filesystem
 .SH SYNOPSIS
-\fBbtrfs\fP \fBsubvolume snapshot\fP\fI <source> [<dest>/]<name>\fP
+\fBbtrfs\fP \fBsubvolume snapshot\fP\fI [-r] <source> [<dest>/]<name>\fP
 .PP
 \fBbtrfs\fP \fBsubvolume delete\fP\fI <subvolume>\fP
 .PP
@@ -70,10 +70,11 @@ command.
 .SH COMMANDS
 .TP
 
-\fBsubvolume snapshot\fR\fI <source> [<dest>/]<name>\fR
-Create a writable snapshot of the subvolume \fI<source>\fR with the name
-\fI<name>\fR in the \fI<dest>\fR directory. If \fI<source>\fR is not a
-subvolume, \fBbtrfs\fR returns an error.
+\fBsubvolume snapshot\fR\fI [-r] <source> [<dest>/]<name>\fR
+Create a writable/readonly snapshot of the subvolume \fI<source>\fR with the
+name \fI<name>\fR in the \fI<dest>\fR directory. If \fI<source>\fR is not a
+subvolume, \fBbtrfs\fR returns an error. If \fI-r\fR is given, the snapshot
+will be readonly.
 .TP
 
 \fBsubvolume delete\fR\fI <subvolume>\fR
-- 
1.7.4.1



Re: Warning when mounting btrfs partition, kernel unaligned access

2011-04-13 Thread David Sterba
Hi

On Wed, Apr 13, 2011 at 01:03:56AM +0200, Sébastien Bernard wrote:
> Then, after writing on the disk, I got a lot of warning:
> [  822.515875] Kernel unaligned access at TPC[103c2204]
>
> I peeked a look at the btrfs_csum_final and here's the function :
> void btrfs_csum_final(u32 crc, char *result)
> {
> 	*(__le32 *)result = ~cpu_to_le32(crc);
> }

FYI, this has been fixed and is already merged into Linus' tree. Commit
7e75bf3ff3a716d7b21d8fb43bf823115801c1e9.

dave
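The cast in the quoted function assumes that result is 4-byte aligned, which a char pointer does not guarantee. On architectures that trap unaligned accesses this produces warnings like the one above. The referenced commit is not quoted in this thread, so the following is only a userspace sketch of one portable way to avoid the unaligned store, writing the value one byte at a time. toy_csum_final() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the bitwise complement of crc as little-endian bytes at result. */
static void toy_csum_final(uint32_t crc, unsigned char *result)
{
	uint32_t v = ~crc;

	assert(result != NULL);
	result[0] = (unsigned char)(v & 0xff);
	result[1] = (unsigned char)((v >> 8) & 0xff);
	result[2] = (unsigned char)((v >> 16) & 0xff);
	result[3] = (unsigned char)((v >> 24) & 0xff);
}
```

Since only single bytes are written, the destination may start at any address, and the byte order of the result does not depend on the host.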


Re: [PATCH 0/4] Btrfs: A few small bug fixes

2011-04-13 Thread Chris Mason
Excerpts from Li Zefan's message of 2011-04-13 03:42:01 -0400:
> Hi Chris,
>
> Those bugs are small, and the fixes are simple and straightforward.
>
> You can pull from:
>
> git://repo.or.cz/linux-btrfs-devel.git for-chris

Thanks, these are now in my master branch.

-chris


Re: [PATCH v3] Re: btrfs does not work on usermode linux

2011-04-13 Thread Chris Mason
Excerpts from Sergei Trofimovich's message of 2011-04-12 17:23:33 -0400:
> On Mon, 11 Apr 2011 15:50:48 -0400
> Josef Bacik jo...@redhat.com wrote:
>
> > On 04/11/2011 03:44 PM, Sergei Trofimovich wrote:
> > > Fix data corruption caused by memcpy() usage on overlapping data.
> > > I've observed it first when found out usermode linux crash on btrfs.
>
> ...
>
> > Fair enough, BUG_ON() it is.  Repost that version and you can add my
> >
> > Reviewed-by: Josef Bacik jo...@redhat.com
>
> Thank you! Added and resent as:
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg09357.html
>

This is in the master branch now, please give it another test.  Thanks a
lot for bisecting down and patching!

-chris
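The patch itself is only linked in this thread, not quoted, so the following is a generic userspace sketch of the underlying issue rather than the btrfs fix. memcpy() has undefined behaviour when source and destination overlap, while memmove() is defined for that case. toy_ranges_overlap() and toy_copy_within() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Nonzero if [dst, dst+len) and [src, src+len) share at least one byte. */
static int toy_ranges_overlap(size_t dst, size_t src, size_t len)
{
	size_t distance = dst < src ? src - dst : dst - src;

	return distance < len;
}

/* Copy len bytes from offset src to offset dst inside one buffer. */
static void toy_copy_within(char *buf, size_t dst, size_t src, size_t len)
{
	assert(buf != NULL);
	if (toy_ranges_overlap(dst, src, len))
		memmove(buf + dst, buf + src, len);  /* safe for overlap */
	else
		memcpy(buf + dst, buf + src, len);   /* ranges are disjoint */
}
```

An overlap check of this kind is also what an assertion in front of a plain memcpy() would have to test before copying.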


Re: [PATCH] Btrfs: avoid taking the chunk_mutex in do_chunk_alloc

2011-04-13 Thread David Sterba
On Tue, Apr 12, 2011 at 08:42:39AM -0400, Josef Bacik wrote:
> > hmm, the goto will lead to problems, cause in out clause there is a
> > mutex_unlock(), which we do not have a mutex_lock yet.
>
>
> Hrm I wonder why xfstests didn't trip over that, thats what I get for patching
> while watching the kid.  Thanks,

a 'dad lock' :)

dave


Re: [PATCH v5 1/8] btrfs: Balance progress monitoring

2011-04-13 Thread David Sterba
On Tue, Apr 12, 2011 at 07:42:07PM +0100, Hugo Mills wrote:
>    There will be savings in the future, however -- when I add Li's
> suggestion for tracking the number of bytes (in the block groups as a
> whole, and in terms of useful data stored), plus the vaddr of the
> last-moved block group, the size of the btrfs_balance_info struct will
> go up from its current 8 bytes to 48. I've just not quite finished
> that patch yet, and wanted to get the rest of the patches settled
> while I work on the new one...

makes sense, no objections.

david


Re: New btrfsck status

2011-04-13 Thread Ernst Sjöstrand
A very good question indeed! ;-)

Regards
//Ernst Sjöstrand

On Tue, Mar 29, 2011 at 14:13, Thomas Backlund t...@mandriva.org wrote:
> Chris Mason wrote on 10.2.2011 14:17:
> >
> > Excerpts from Ben Gamari's message of 2011-02-09 21:52:20 -0500:
> > >
> > > Hey all,
> > >
> > > Over the last several months there have been many claims regarding the
> > > release of the rewritten btrfsck. Unfortunately, despite numerous
> > > claims that it will be released Real Soon Now(c), I have yet to see
> > > even a repository with preliminary code. Did I miss an announcement?
> > > There is something to be said for release early, release often. Is
> > > there a timeline for getting btrfsck into some sort of usable form?
> >
> > Yes, but its still real soon now.  I've been at about 90% done since
> > Christmas.  It would have been out last week but I've been chasing a
> > debugging a very difficult corruption under load.
> >
> > I finally found a race in btrfs causing the corruption and now I'm back
> > on fsck full time again.
> >
>
> Any status updates on this ?
>
> Checking:
> http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git;a=shortlog;h=refs/heads/next
>
> I see last commit is 3+ months ago
>
> --
> Thomas



[PATCH] btrfs: separate superblock items out of fs_info

2011-04-13 Thread David Sterba
fs_info has now ~9kb, more than fits into one page. This will cause
mount failure when memory is too fragmented. Top space consumers are
super block structures super_copy and super_for_commit, ~2.8kb each.
Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64)

Add a wrapper for freeing fs_info.

Signed-off-by: David Sterba dste...@suse.cz
---
 fs/btrfs/compression.c |2 +-
 fs/btrfs/ctree.h   |   12 ++--
 fs/btrfs/disk-io.c |   20 ++--
 fs/btrfs/extent-tree.c |   14 +++---
 fs/btrfs/file-item.c   |   12 ++--
 fs/btrfs/inode.c   |2 +-
 fs/btrfs/ioctl.c   |6 +++---
 fs/btrfs/super.c   |   20 ++--
 fs/btrfs/sysfs.c   |6 +++---
 fs/btrfs/transaction.c |   10 +-
 fs/btrfs/tree-log.c|4 ++--
 fs/btrfs/volumes.c |   24 
 12 files changed, 74 insertions(+), 58 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 41d1d7c..377d581 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -85,7 +85,7 @@ struct compressed_bio {
 static inline int compressed_bio_size(struct btrfs_root *root,
 				      unsigned long disk_size)
 {
-	u16 csum_size = btrfs_super_csum_size(&root->fs_info->super_copy);
+	u16 csum_size = btrfs_super_csum_size(root->fs_info->super_copy);
 	return sizeof(struct compressed_bio) +
 		((disk_size + root->sectorsize - 1) / root->sectorsize) *
 		csum_size;
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 3458b57..9d83ee4 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -928,8 +928,8 @@ struct btrfs_fs_info {
 	wait_queue_head_t transaction_blocked_wait;
 	wait_queue_head_t async_submit_wait;
 
-	struct btrfs_super_block super_copy;
-	struct btrfs_super_block super_for_commit;
+	struct btrfs_super_block *super_copy;
+	struct btrfs_super_block *super_for_commit;
 	struct block_device *__bdev;
 	struct super_block *sb;
 	struct inode *btree_inode;
@@ -2333,6 +2333,14 @@ int btrfs_drop_subtree(struct btrfs_trans_handle *trans,
 			struct btrfs_root *root,
 			struct extent_buffer *node,
 			struct extent_buffer *parent);
+
+static inline void free_fs_info(struct btrfs_fs_info *fs_info)
+{
+	kfree(fs_info->super_copy);
+	kfree(fs_info->super_for_commit);
+	kfree(fs_info);
+}
+
 /* root-item.c */
 int btrfs_find_root_ref(struct btrfs_root *tree_root,
 			struct btrfs_path *path,
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8f1d44b..4fced6b 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -210,7 +210,7 @@ static int csum_tree_block(struct btrfs_root *root, struct extent_buffer *buf,
 			   int verify)
 {
 	u16 csum_size =
-		btrfs_super_csum_size(&root->fs_info->super_copy);
+		btrfs_super_csum_size(root->fs_info->super_copy);
 	char *result = NULL;
 	unsigned long len;
 	unsigned long cur_len;
@@ -1763,14 +1763,14 @@ struct btrfs_root *open_ctree(struct super_block *sb,
 		goto fail_iput;
 	}
 
-	memcpy(&fs_info->super_copy, bh->b_data, sizeof(fs_info->super_copy));
-	memcpy(&fs_info->super_for_commit, &fs_info->super_copy,
-	       sizeof(fs_info->super_for_commit));
+	memcpy(fs_info->super_copy, bh->b_data, sizeof(*fs_info->super_copy));
+	memcpy(fs_info->super_for_commit, fs_info->super_copy,
+	       sizeof(*fs_info->super_for_commit));
 	brelse(bh);
 
-	memcpy(fs_info->fsid, fs_info->super_copy.fsid, BTRFS_FSID_SIZE);
+	memcpy(fs_info->fsid, fs_info->super_copy->fsid, BTRFS_FSID_SIZE);
 
-	disk_super = &fs_info->super_copy;
+	disk_super = fs_info->super_copy;
 	if (!btrfs_super_root(disk_super))
 		goto fail_iput;
 
@@ -2151,10 +2151,10 @@ fail_srcu:
 fail:
 	kfree(extent_root);
 	kfree(tree_root);
-	kfree(fs_info);
 	kfree(chunk_root);
 	kfree(dev_root);
 	kfree(csum_root);
+	free_fs_info(fs_info);
 	return ERR_PTR(err);
 }
 
@@ -2325,10 +2325,10 @@ int write_all_supers(struct btrfs_root *root, int max_mirrors)
 	int total_errors = 0;
 	u64 flags;
 
-	max_errors = btrfs_super_num_devices(&root->fs_info->super_copy) - 1;
+	max_errors = btrfs_super_num_devices(root->fs_info->super_copy) - 1;
 	do_barriers = !btrfs_test_opt(root, NOBARRIER);
 
-	sb = &root->fs_info->super_for_commit;
+	sb = root->fs_info->super_for_commit;
 	dev_item = &sb->dev_item;
 
 	mutex_lock(&root->fs_info->fs_devices->device_list_mutex);
@@ -2601,7 +2601,7 @@ int close_ctree(struct btrfs_root *root)
 	kfree(fs_info->chunk_root);
 	kfree(fs_info->dev_root);
 	kfree(fs_info->csum_root);
-	kfree(fs_info);
+	free_fs_info(fs_info);
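When an embedded structure becomes a pointer, every sizeof on that member has to dereference it, otherwise only a pointer's worth of bytes is copied. The userspace sketch below shows that detail together with a single free wrapper. struct toy_super, struct toy_fs_info and both functions are hypothetical, and the 2800-byte size is illustrative only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_super {
	unsigned char bytes[2800];      /* illustrative size */
};

struct toy_fs_info {
	struct toy_super *super_copy;
	struct toy_super *super_for_commit;
};

/* One wrapper releases the container and everything it owns. */
static void toy_free_fs_info(struct toy_fs_info *fs_info)
{
	if (!fs_info)
		return;
	free(fs_info->super_copy);
	free(fs_info->super_for_commit);
	free(fs_info);
}

/* Returns a filled structure, or NULL if any allocation failed. */
static struct toy_fs_info *toy_alloc_fs_info(const void *disk_data)
{
	struct toy_fs_info *fs_info;

	assert(disk_data != NULL);
	fs_info = calloc(1, sizeof(*fs_info));
	if (!fs_info)
		return NULL;
	fs_info->super_copy = malloc(sizeof(*fs_info->super_copy));
	fs_info->super_for_commit = malloc(sizeof(*fs_info->super_for_commit));
	if (!fs_info->super_copy || !fs_info->super_for_commit) {
		toy_free_fs_info(fs_info);
		return NULL;
	}
	/* sizeof(*ptr) is the structure size; sizeof(ptr) is a pointer size */
	memcpy(fs_info->super_copy, disk_data, sizeof(*fs_info->super_copy));
	memcpy(fs_info->super_for_commit, fs_info->super_copy,
	       sizeof(*fs_info->super_for_commit));
	return fs_info;
}
```

Unlike the kernel wrapper in the patch, this sketch tolerates a NULL argument so that it can double as the error path of the allocator.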
 
   

[PATCH] Btrfs: don't reserve metadata when we're using the delalloc reserve

2011-04-13 Thread Josef Bacik
There are a bunch of places where we do btrfs_join_transaction(root, 1), but
really we set the block_rsv to the delalloc reserve because our metadata was
reserved at delalloc time.  This means we don't need to reserve space at all,
and can just join the transaction and go.  This patch also fixes a few places
where we weren't actually setting the block_rsv to the delalloc reserve.
Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/inode.c |   27 +--
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c4b914e..e9bda50 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -420,7 +420,7 @@ again:
 		}
 	}
 	if (start == 0) {
-		trans = btrfs_join_transaction(root, 1);
+		trans = btrfs_join_transaction(root, 0);
 		BUG_ON(IS_ERR(trans));
 		btrfs_set_trans_block_group(trans, inode);
 		trans->block_rsv = &root->fs_info->delalloc_block_rsv;
@@ -617,8 +617,9 @@ retry:
 			    async_extent->start + async_extent->ram_size - 1,
 			    GFP_NOFS);
 
-		trans = btrfs_join_transaction(root, 1);
+		trans = btrfs_join_transaction(root, 0);
 		BUG_ON(IS_ERR(trans));
+		trans->block_rsv = &root->fs_info->delalloc_block_rsv;
 		ret = btrfs_reserve_extent(trans, root,
 					   async_extent->compressed_size,
 					   async_extent->compressed_size,
@@ -778,7 +779,12 @@ static noinline int cow_file_range(struct inode *inode,
 	int ret = 0;
 
 	BUG_ON(root == root->fs_info->tree_root);
-	trans = btrfs_join_transaction(root, 1);
+
+	/*
+	 * Our metadata reservations should have been taken care of in the
+	 * delalloc stuff, so we don't need to reserve space here.
+	 */
+	trans = btrfs_join_transaction(root, 0);
 	BUG_ON(IS_ERR(trans));
 	btrfs_set_trans_block_group(trans, inode);
 	trans->block_rsv = &root->fs_info->delalloc_block_rsv;
@@ -1054,11 +1060,12 @@ static noinline int run_delalloc_nocow(struct inode *inode,
 	BUG_ON(!path);
 	if (root == root->fs_info->tree_root) {
 		nolock = true;
-		trans = btrfs_join_transaction_nolock(root, 1);
+		trans = btrfs_join_transaction_nolock(root, 0);
 	} else {
-		trans = btrfs_join_transaction(root, 1);
+		trans = btrfs_join_transaction(root, 0);
 	}
 	BUG_ON(IS_ERR(trans));
+	trans->block_rsv = &root->fs_info->delalloc_block_rsv;
 
 	cow_start = (u64)-1;
 	cur_offset = start;
@@ -1715,9 +1722,9 @@ static int btrfs_finish_ordered_io(struct inode *inode, u64 start, u64 end)
 		ret = btrfs_ordered_update_i_size(inode, 0, ordered_extent);
 		if (!ret) {
 			if (nolock)
-				trans = btrfs_join_transaction_nolock(root, 1);
+				trans = btrfs_join_transaction_nolock(root, 0);
 			else
-				trans = btrfs_join_transaction(root, 1);
+				trans = btrfs_join_transaction(root, 0);
 			BUG_ON(IS_ERR(trans));
 			btrfs_set_trans_block_group(trans, inode);
 			trans->block_rsv = &root->fs_info->delalloc_block_rsv;
@@ -1732,9 +1739,9 @@ static int btrfs_finish_ordered_io(struct inode *inode, u64 start, u64 end)
 			 0, &cached_state, GFP_NOFS);
 
 	if (nolock)
-		trans = btrfs_join_transaction_nolock(root, 1);
+		trans = btrfs_join_transaction_nolock(root, 0);
 	else
-		trans = btrfs_join_transaction(root, 1);
+		trans = btrfs_join_transaction(root, 0);
 	BUG_ON(IS_ERR(trans));
 	btrfs_set_trans_block_group(trans, inode);
 	trans->block_rsv = &root->fs_info->delalloc_block_rsv;
@@ -5839,7 +5846,7 @@ again:
 
 	BUG_ON(!ordered);
 
-	trans = btrfs_join_transaction(root, 1);
+	trans = btrfs_join_transaction(root, 0);
 	if (IS_ERR(trans)) {
 		err = -ENOMEM;
 		goto out;
-- 
1.7.2.3



Re: [PATCH] Btrfs: don't reserve metadata when we're using the delalloc reserve

2011-04-13 Thread Arne Jansen
On 13.04.2011 18:06, Josef Bacik wrote:
> There are a bunch of places where we do btrfs_join_transaction(root, 1), but
> really we set the block_rsv to the delalloc reserve because our metadata was
> reserved at delalloc time.  This means we don't need to reserve space at all,
> and can just join the transaction and go.  This patch also fixes a few places
> where we weren't actually setting the block_rsv to the delalloc reserve.
> Thanks,
>
> Signed-off-by: Josef Bacik jo...@redhat.com
> ---
>  fs/btrfs/inode.c |   27 +--
>  1 files changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index c4b914e..e9bda50 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -420,7 +420,7 @@ again:
>  		}
>  	}
>  	if (start == 0) {
> -		trans = btrfs_join_transaction(root, 1);
> +		trans = btrfs_join_transaction(root, 0);

btrfs_join_transaction ignores the num_blocks parameter, so this
shouldn't change anything. Maybe it's cleaner to just eradicate
the parameter.

-Arne



>  		BUG_ON(IS_ERR(trans));
>  		btrfs_set_trans_block_group(trans, inode);
>  		trans->block_rsv = &root->fs_info->delalloc_block_rsv;
> @@ -617,8 +617,9 @@ retry:
>  			    async_extent->start + async_extent->ram_size - 1,
>  			    GFP_NOFS);
>  
> -		trans = btrfs_join_transaction(root, 1);
> +		trans = btrfs_join_transaction(root, 0);
>  		BUG_ON(IS_ERR(trans));
> +		trans->block_rsv = &root->fs_info->delalloc_block_rsv;
>  		ret = btrfs_reserve_extent(trans, root,
>  					   async_extent->compressed_size,
>  					   async_extent->compressed_size,
> @@ -778,7 +779,12 @@ static noinline int cow_file_range(struct inode *inode,
>  	int ret = 0;
>  
>  	BUG_ON(root == root->fs_info->tree_root);
> -	trans = btrfs_join_transaction(root, 1);
> +
> +	/*
> +	 * Our metadata reservations should have been taken care of in the
> +	 * delalloc stuff, so we don't need to reserve space here.
> +	 */
> +	trans = btrfs_join_transaction(root, 0);
>  	BUG_ON(IS_ERR(trans));
>  	btrfs_set_trans_block_group(trans, inode);
>  	trans->block_rsv = &root->fs_info->delalloc_block_rsv;
> @@ -1054,11 +1060,12 @@ static noinline int run_delalloc_nocow(struct inode *inode,
>  	BUG_ON(!path);
>  	if (root == root->fs_info->tree_root) {
>  		nolock = true;
> -		trans = btrfs_join_transaction_nolock(root, 1);
> +		trans = btrfs_join_transaction_nolock(root, 0);
>  	} else {
> -		trans = btrfs_join_transaction(root, 1);
> +		trans = btrfs_join_transaction(root, 0);
>  	}
>  	BUG_ON(IS_ERR(trans));
> +	trans->block_rsv = &root->fs_info->delalloc_block_rsv;
>  
>  	cow_start = (u64)-1;
>  	cur_offset = start;
> @@ -1715,9 +1722,9 @@ static int btrfs_finish_ordered_io(struct inode *inode, u64 start, u64 end)
>  		ret = btrfs_ordered_update_i_size(inode, 0, ordered_extent);
>  		if (!ret) {
>  			if (nolock)
> -				trans = btrfs_join_transaction_nolock(root, 1);
> +				trans = btrfs_join_transaction_nolock(root, 0);
>  			else
> -				trans = btrfs_join_transaction(root, 1);
> +				trans = btrfs_join_transaction(root, 0);
>  			BUG_ON(IS_ERR(trans));
>  			btrfs_set_trans_block_group(trans, inode);
>  			trans->block_rsv = &root->fs_info->delalloc_block_rsv;
> @@ -1732,9 +1739,9 @@ static int btrfs_finish_ordered_io(struct inode *inode, u64 start, u64 end)
>  			 0, &cached_state, GFP_NOFS);
>  
>  	if (nolock)
> -		trans = btrfs_join_transaction_nolock(root, 1);
> +		trans = btrfs_join_transaction_nolock(root, 0);
>  	else
> -		trans = btrfs_join_transaction(root, 1);
> +		trans = btrfs_join_transaction(root, 0);
>  	BUG_ON(IS_ERR(trans));
>  	btrfs_set_trans_block_group(trans, inode);
>  	trans->block_rsv = &root->fs_info->delalloc_block_rsv;
> @@ -5839,7 +5846,7 @@ again:
>  
>  	BUG_ON(!ordered);
>  
> -	trans = btrfs_join_transaction(root, 1);
> +	trans = btrfs_join_transaction(root, 0);
>  	if (IS_ERR(trans)) {
>  		err = -ENOMEM;
>  		goto out;



Re: [PATCH] Btrfs: don't reserve metadata when we're using the delalloc reserve

2011-04-13 Thread Josef Bacik

On 04/13/2011 12:34 PM, Arne Jansen wrote:

On 13.04.2011 18:06, Josef Bacik wrote:

There are a bunch of places where we do btrfs_join_transaction(root, 1), but
really we set the block_rsv to the delalloc reserve because our metadata was
reserved at delalloc time.  This means we don't need to reserve space at all,
and can just join the transaction and go.  This patch also fixes a few places
where we weren't actually setting the block_rsv to the delalloc reserve.
Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
  fs/btrfs/inode.c |   27 +--
  1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c4b914e..e9bda50 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -420,7 +420,7 @@ again:
}
}
if (start == 0) {
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root, 0);


btrfs_join_transaction ignores the num_blocks parameter, so this
shouldn't change anything. Maybe it's cleaner to just eradicate
the parameter.



Balls, I forgot about that.  We should still be using the delalloc block
reserve in the places where I set it, though, so I'll just fix that up.  Thanks,


Josef


[PATCH] Btrfs: make sure to use the delalloc reserve when filling delalloc

2011-04-13 Thread Josef Bacik
In the prealloc filling code and the compressed write code we don't set
trans->block_rsv to the delalloc block reserve properly, so we end up using
metadata from the wrong pool.  This patch fixes that.  Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/inode.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c4b914e..a23e9ee 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -619,6 +619,7 @@ retry:
 
trans = btrfs_join_transaction(root, 1);
BUG_ON(IS_ERR(trans));
+   trans->block_rsv = &root->fs_info->delalloc_block_rsv;
ret = btrfs_reserve_extent(trans, root,
   async_extent->compressed_size,
   async_extent->compressed_size,
@@ -1059,6 +1060,7 @@ static noinline int run_delalloc_nocow(struct inode 
*inode,
trans = btrfs_join_transaction(root, 1);
}
BUG_ON(IS_ERR(trans));
+   trans->block_rsv = &root->fs_info->delalloc_block_rsv;
 
cow_start = (u64)-1;
cur_offset = start;
-- 
1.7.2.3
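
The effect of pointing a handle at the right reserve can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct toy_rsv, struct toy_trans and toy_use_metadata() are hypothetical names invented for illustration, not the kernel structures, and the fallback pool is an assumption about what happens when no reserve is set on the handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; these are NOT the kernel structures. */
struct toy_rsv {
	uint64_t reserved;		/* bytes set aside in this pool */
};

struct toy_trans {
	struct toy_rsv *block_rsv;	/* pool this handle charges, or NULL */
};

/*
 * Charge `bytes` of metadata to the pool the handle points at.  When the
 * handle has no pool set, the charge lands on `fallback` (a hypothetical
 * default pool).  Returns 0 on success, -1 if the chosen pool is too small.
 */
static int toy_use_metadata(struct toy_trans *trans, struct toy_rsv *fallback,
			    uint64_t bytes)
{
	struct toy_rsv *rsv;

	assert(trans != NULL && fallback != NULL);
	rsv = trans->block_rsv ? trans->block_rsv : fallback;
	if (rsv->reserved < bytes)
		return -1;
	rsv->reserved -= bytes;
	return 0;
}
```

In this model, a handle whose block_rsv was never set drains the fallback pool even though the bytes were reserved elsewhere, which is the kind of accounting mismatch the patch description talks about.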



[PATCH] Btrfs: take away the num_items argument from btrfs_join_transaction

2011-04-13 Thread Josef Bacik
I keep forgetting that btrfs_join_transaction() just ignores the num_items
argument, which leads me to send pointless patches and look stupid :).  So
just kill the num_items argument from btrfs_join_transaction and
btrfs_start_ioctl_transaction, since neither of them uses it.  Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/disk-io.c |6 +++---
 fs/btrfs/extent-tree.c |   12 ++--
 fs/btrfs/inode.c   |   34 +-
 fs/btrfs/ioctl.c   |4 ++--
 fs/btrfs/relocation.c  |   12 ++--
 fs/btrfs/transaction.c |   13 +
 fs/btrfs/transaction.h |9 +++--
 7 files changed, 42 insertions(+), 48 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 68c84c8..0a141df 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1568,7 +1568,7 @@ static int transaction_kthread(void *arg)
transid = cur->transid;
spin_unlock(&root->fs_info->new_trans_lock);
 
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
BUG_ON(IS_ERR(trans));
if (transid == trans->transid) {
ret = btrfs_commit_transaction(trans, root);
@@ -2495,13 +2495,13 @@ int btrfs_commit_super(struct btrfs_root *root)
down_write(&root->fs_info->cleanup_work_sem);
up_write(&root->fs_info->cleanup_work_sem);
 
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
if (IS_ERR(trans))
return PTR_ERR(trans);
ret = btrfs_commit_transaction(trans, root);
BUG_ON(ret);
/* run commit again to drop the original snapshot */
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
if (IS_ERR(trans))
return PTR_ERR(trans);
btrfs_commit_transaction(trans, root);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 362cc9b..0714a57 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3155,7 +3155,7 @@ again:
spin_unlock(&data_sinfo->lock);
 alloc:
alloc_target = btrfs_get_alloc_profile(root, 1);
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
if (IS_ERR(trans))
return PTR_ERR(trans);
 
@@ -3182,7 +3182,7 @@ alloc:
 commit_trans:
if (!committed && !root->fs_info->open_ioctl_trans) {
committed = 1;
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
if (IS_ERR(trans))
return PTR_ERR(trans);
ret = btrfs_commit_transaction(trans, root);
@@ -3543,7 +3543,7 @@ again:
goto out;
 
ret = -ENOSPC;
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
if (IS_ERR(trans))
goto out;
ret = btrfs_commit_transaction(trans, root);
@@ -3770,7 +3770,7 @@ int btrfs_block_rsv_check(struct btrfs_trans_handle 
*trans,
if (trans)
return -EAGAIN;
 
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
BUG_ON(IS_ERR(trans));
ret = btrfs_commit_transaction(trans, root);
return 0;
@@ -7600,7 +7600,7 @@ int btrfs_drop_dead_reloc_roots(struct btrfs_root *root)
 
BUG_ON(reloc_root->commit_root != NULL);
while (1) {
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
BUG_ON(IS_ERR(trans));
 
mutex_lock(&root->fs_info->drop_mutex);
@@ -8123,7 +8123,7 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
 
BUG_ON(cache->ro);
 
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
BUG_ON(IS_ERR(trans));
 
alloc_flags = update_block_group_flags(root, cache->flags);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a23e9ee..ade00e7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -420,7 +420,7 @@ again:
}
}
if (start == 0) {
-   trans = btrfs_join_transaction(root, 1);
+   trans = btrfs_join_transaction(root);
BUG_ON(IS_ERR(trans));
btrfs_set_trans_block_group(trans, inode);
trans->block_rsv = &root->fs_info->delalloc_block_rsv;
@@ -617,7 +617,7 @@ retry:
async_extent->start + async_extent->ram_size - 1,
GFP_NOFS);
 
-   trans = btrfs_join_transaction(root, 1);
+   

Re: Warning when mounting btrfs partition, kernel unaligned access

2011-04-13 Thread David Miller
From: David Sterba d...@jikos.cz
Date: Wed, 13 Apr 2011 11:40:37 +0200

> On Wed, Apr 13, 2011 at 01:03:56AM +0200, Sébastien Bernard wrote:
>> Then, after writing on the disk, I got a lot of warnings:
>> [  822.515875] Kernel unaligned access at TPC[103c2204]
>>
>> I took a look at btrfs_csum_final and here's the function:
>> void btrfs_csum_final(u32 crc, char *result)
>> {
>>         *(__le32 *)result = ~cpu_to_le32(crc);
>> }
>
> FYI, this has been fixed and is already merged into Linus' tree. Commit
> 7e75bf3ff3a716d7b21d8fb43bf823115801c1e9.

Might I suggest adding a BUG_ON() validation of the alignment or
similar here?

You can make the test really cheap, and this way no matter what kind
of systems the btrfs folks do their testing on this kind of regression
will be spotted fast.
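
To make the idea concrete, here is a minimal userspace sketch of both halves: a cheap alignment test of the sort that could sit behind a BUG_ON(), and a store that is safe at any alignment. The names toy_is_aligned32() and toy_csum_final() are hypothetical, chosen for illustration only; this is not the kernel code and not necessarily how the merged commit fixes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero when `p` is suitably aligned for a 32-bit store. */
static int toy_is_aligned32(const void *p)
{
	return ((uintptr_t)p % sizeof(uint32_t)) == 0;
}

/*
 * Store the bitwise complement of `crc` in little-endian byte order,
 * one byte at a time, so the store is valid at any alignment and on
 * any host endianness.
 */
static void toy_csum_final(uint32_t crc, unsigned char *result)
{
	uint32_t v = ~crc;

	assert(result != NULL);
	result[0] = (unsigned char)(v & 0xff);
	result[1] = (unsigned char)((v >> 8) & 0xff);
	result[2] = (unsigned char)((v >> 16) & 0xff);
	result[3] = (unsigned char)((v >> 24) & 0xff);
}
```

The alignment test is a single modulo on the pointer value, which is why such a check can be kept cheap enough to leave enabled on every architecture.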


[PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Josef Bacik
There have been many sporadic reports of the following panic

------------[ cut here ]------------
kernel BUG at fs/btrfs/extent-tree.c:5498!
invalid opcode:  [#1] PREEMPT SMP
last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
CPU 7
Modules linked in: btrfs zlib_deflate libcrc32c netconsole configfs 
ipt_MASQUERADE iptable_nat nf_nat bridge stp llc sunrpc cpufreq_ondemand 
acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 
nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath kvm uinput 
snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq 
snd_seq_device snd_pcm snd_timer snd hp_wmi i5400_edac sparse_keymap iTCO_wdt 
rfkill edac_core tg3 shpchp iTCO_vendor_support soundcore wmi floppy 
snd_page_alloc pcspkr i5k_amb [last unloaded: btrfs]

Pid: 28504, comm: btrfs-endio-wri Tainted: GW   2.6.39-rc2+ #35 
Hewlett-Packard HP xw6600 Workstation/0A9Ch
RIP: 0010:[a044ec34]  [a044ec34] 
alloc_reserved_file_extent+0x9a/0x1e5 [btrfs]
RSP: 0018:88000b4319f0  EFLAGS: 00010286
RAX: ffe4 RBX: 880009fdc438 RCX: 880020c216d0
RDX: 88000b4318c0 RSI: 00d5 RDI: 
RBP: 88000b431a70 R08: ffe4 R09: 880020c216d0
R10: 0001 R11: 88000b431b10 R12: 88000b431b10
R13: 00b2 R14:  R15: 88002225f2f8
FS:  () GS:88003e40() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 003738ca6940 CR3: 2a39a000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process btrfs-endio-wri (pid: 28504, threadinfo 88000b43, task 
880032278000)
Stack:
 0001 88002a92 881d 038d
  0005 88003aa38000 81481012
 88000c3bb480 8800241d01c8 88000b431a60 880031a040a8
Call Trace:
 [81481012] ? sub_preempt_count+0x97/0xaa
 [a044f92e] run_clustered_refs+0x61b/0x700 [btrfs]
 [81480f89] ? sub_preempt_count+0xe/0xaa
 [a0446ee9] ? spin_lock+0xe/0x10 [btrfs]
 [a044fae4] btrfs_run_delayed_refs+0xd1/0x1ab [btrfs]
 [8147dc1c] ? _raw_spin_unlock+0x4a/0x57
 [a045af1b] __btrfs_end_transaction+0x89/0x1ed [btrfs]
 [a045b0c2] btrfs_end_transaction+0x15/0x17 [btrfs]
 [a0466932] btrfs_finish_ordered_io+0x29c/0x2bf [btrfs]
 [a04669d6] btrfs_writepage_end_io_hook+0x81/0x8d [btrfs]
 [a0477fd5] end_bio_extent_writepage+0xae/0x159 [btrfs]
 [811457e3] bio_endio+0x2d/0x2f
 [a0456c44] end_workqueue_fn+0x111/0x120 [btrfs]
 [a0480a0e] worker_loop+0x192/0x4d1 [btrfs]
 [a048087c] ? btrfs_queue_worker+0x22c/0x22c [btrfs]
 [81068a69] kthread+0xa0/0xa8
 [8107a847] ? trace_hardirqs_on_caller+0x111/0x135
 [81485364] kernel_thread_helper+0x4/0x10
 [8147e398] ? retint_restore_args+0x13/0x13
 [810689c9] ? __init_kthread_worker+0x5b/0x5b
 [81485360] ? gs_change+0x13/0x13
Code: 44 8b 45 90 0f 84 58 01 00 00 80 88 88 00 00 00 08 41 83 c0 18 4c 89 e1 
48 8b 72 20 4c 89 ff 48 89 c2 e8 1f b4 ff ff 85 c0 74 04 0f 0b eb fe 48 8b 03 
48 89 45 c8 8b 73 40 48 89 c7 e8 bc 98 ff
RIP  [a044ec34] alloc_reserved_file_extent+0x9a/0x1e5 [btrfs]
 RSP 88000b4319f0
---[ end trace 81d1c68cb00af83e ]---

This is because we have been releasing the delalloc bytes before ending the
transaction.  However, because of the way we make allocations, any updates to
the extent tree are delayed and only run when we end the transaction, so at that
point we still need plenty of the reserved space.  So instead release the
delalloc bytes _after_ we end the transaction so that we don't get this false
ENOSPC.  Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/inode.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ade00e7..b1e5b11 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1783,9 +1783,13 @@ out:
if (trans)
btrfs_end_transaction_nolock(trans, root);
} else {
-   btrfs_delalloc_release_metadata(inode, ordered_extent->len);
if (trans)
btrfs_end_transaction(trans, root);
+   /*
+* Release after the transaction ends so it covers the delayed
+* ref updates
+*/
+   btrfs_delalloc_release_metadata(inode, ordered_extent->len);
}
 
/* once for us */
@@ -5897,8 +5901,8 @@ out_unlock:
 ordered->file_offset + ordered->len - 1,
 cached_state, GFP_NOFS);
 out:
-   btrfs_delalloc_release_metadata(inode, ordered->len);
btrfs_end_transaction(trans, root);
+  
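
The ordering argument in the commit message above can be sketched as a toy userspace model. Everything here is an assumption made for illustration: struct toy_reserve, struct toy_handle, toy_release_all() and toy_end_transaction() are hypothetical names, and the model simply takes the commit message at its word that the deferred updates draw on the same reserve. The replies in this thread question exactly that premise, so treat this as a picture of the claim, not a statement about how the kernel accounts for delayed refs.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy reserve: bytes still set aside for metadata. */
struct toy_reserve {
	uint64_t reserved;
};

/* Toy handle: remembers the cost of updates deferred until it ends. */
struct toy_handle {
	struct toy_reserve *rsv;
	uint64_t delayed_bytes;
};

/* Give back everything that is still reserved. */
static void toy_release_all(struct toy_reserve *rsv)
{
	assert(rsv != 0);
	rsv->reserved = 0;
}

/*
 * End the handle: run the deferred updates against the reserve.
 * Returns -ENOSPC when the reserve no longer covers them.
 */
static int toy_end_transaction(struct toy_handle *h)
{
	if (h->rsv->reserved < h->delayed_bytes)
		return -ENOSPC;
	h->rsv->reserved -= h->delayed_bytes;
	h->delayed_bytes = 0;
	return 0;
}
```

Releasing first and ending second leaves the deferred work with nothing to draw on in this model, while ending first and releasing afterwards succeeds; that is the "false ENOSPC" the description refers to.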

[PATCH] Btrfs: if we've already started a trans handle, use that one

2011-04-13 Thread Josef Bacik
We currently track trans handles in current->journal_info, but we don't actually
use it.  This patch fixes that, which covers the case where multiple callers
start transactions down the same call chain.  It keeps us from having to
allocate a new handle each time: we just increase the use count of the
current handle, save the old block_rsv, and return.  I tested this with xfstests
and it worked out fine.  Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/transaction.c |   17 +
 fs/btrfs/transaction.h |2 ++
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 70bfb26..46f4056 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -184,6 +184,15 @@ static struct btrfs_trans_handle *start_transaction(struct 
btrfs_root *root,
 
if (root->fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR)
return ERR_PTR(-EROFS);
+
+   if (current->journal_info) {
+   WARN_ON(type != TRANS_JOIN && type != TRANS_JOIN_NOLOCK);
+   h = current->journal_info;
+   h->use_count++;
+   h->orig_rsv = h->block_rsv;
+   h->block_rsv = NULL;
+   goto got_it;
+   }
 again:
h = kmem_cache_alloc(btrfs_trans_handle_cachep, GFP_NOFS);
if (!h)
@@ -213,7 +222,9 @@ again:
h->block_group = 0;
h->bytes_reserved = 0;
h->delayed_ref_updates = 0;
+   h->use_count = 1;
h->block_rsv = NULL;
+   h->orig_rsv = NULL;
 
smp_mb();
if (cur_trans->blocked && may_wait_transaction(root, type)) {
@@ -241,6 +252,7 @@ again:
}
}
 
+got_it:
if (type != TRANS_JOIN_NOLOCK)
mutex_lock(&root->fs_info->trans_mutex);
record_root_in_trans(h, root);
@@ -428,6 +440,11 @@ static int __btrfs_end_transaction(struct 
btrfs_trans_handle *trans,
struct btrfs_fs_info *info = root->fs_info;
int count = 0;
 
+   if (--trans->use_count) {
+   trans->block_rsv = trans->orig_rsv;
+   return 0;
+   }
+
while (count < 4) {
unsigned long cur = trans->delayed_ref_updates;
trans->delayed_ref_updates = 0;
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 1f573f0..154314f 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -47,11 +47,13 @@ struct btrfs_trans_handle {
u64 transid;
u64 block_group;
u64 bytes_reserved;
+   unsigned long use_count;
unsigned long blocks_reserved;
unsigned long blocks_used;
unsigned long delayed_ref_updates;
struct btrfs_transaction *transaction;
struct btrfs_block_rsv *block_rsv;
+   struct btrfs_block_rsv *orig_rsv;
 };
 
 struct btrfs_pending_snapshot {
-- 
1.7.2.3
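
The handle reuse described above can be modelled in a few lines of userspace C. This is a sketch only: toy_current stands in for current->journal_info, and struct toy_handle, toy_start() and toy_end() are hypothetical names, not the kernel functions. Like the patch, the model keeps a single orig_rsv slot, so it assumes at most one nested user has a reserve worth restoring.

```c
#include <assert.h>
#include <stddef.h>

struct toy_rsv {
	int id;
};

struct toy_handle {
	unsigned long use_count;
	struct toy_rsv *block_rsv;
	struct toy_rsv *orig_rsv;
};

/* Stand-in for current->journal_info: the handle this task already holds. */
static struct toy_handle *toy_current;

/*
 * Start a transaction.  If the task already holds a handle, reuse it:
 * bump the use count and stash the reserve it was pointing at.
 * Otherwise initialise `storage` as a fresh handle.
 */
static struct toy_handle *toy_start(struct toy_handle *storage)
{
	if (toy_current) {
		struct toy_handle *h = toy_current;

		h->use_count++;
		h->orig_rsv = h->block_rsv;
		h->block_rsv = NULL;
		return h;
	}
	storage->use_count = 1;
	storage->block_rsv = NULL;
	storage->orig_rsv = NULL;
	toy_current = storage;
	return storage;
}

/*
 * End a transaction.  Returns 0 when only a nested user dropped out
 * (the outer reserve is restored), 1 when the handle really ended.
 */
static int toy_end(struct toy_handle *h)
{
	assert(h->use_count > 0);
	if (--h->use_count) {
		h->block_rsv = h->orig_rsv;
		return 0;
	}
	toy_current = NULL;
	return 1;
}
```

A nested toy_start() returns the same pointer as the outer one, and the matching toy_end() puts the outer caller's reserve back before the outer caller continues.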



Re: [PATCH v3] Re: btrfs does not work on usermode linux

2011-04-13 Thread Sergei Trofimovich
On Wed, 13 Apr 2011 07:32:59 -0400
Chris Mason chris.ma...@oracle.com wrote:

> This is in the master branch now, please give it another test.  Thanks a
> lot for bisecting down and patching!

Tested on btrfs-unstable/master; it works correctly. Reverting
3387206f26e1b48703e810175b98611a4fd8e8ea on top of master (to make sure)
brings the panic back.

Thank you!

-- 

  Sergei




Re: [PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Arne Jansen

On 13.04.2011 20:54, Josef Bacik wrote:

There have been many sporadic reports of the following panic

[...]

This is because we have been releasing the delalloc bytes before ending the
transaction.  However the way we make allocations, any updates to the
extent_tree are delayed and then run when the transaction runs, so we still have
plenty of space that we need to use.  So instead release the delalloc bytes
_after_ we end the transaction so that we don't get this false ENOSPC.  Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
  fs/btrfs/inode.c |8 ++--
  1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ade00e7..b1e5b11 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1783,9 +1783,13 @@ out:
if (trans)
btrfs_end_transaction_nolock(trans, root);
} else {
-   btrfs_delalloc_release_metadata(inode, ordered_extent->len);
if (trans)
btrfs_end_transaction(trans, root);
+   /*
+* Release after the transaction ends so it covers the delayed
+* ref updates
+*/
+   btrfs_delalloc_release_metadata(inode, ordered_extent->len);


I think calling end_transaction doesn't guarantee that all delayed
refs have run; that only happens if end_transaction leads to a transaction commit.
Another problem I see is that commit_transaction just uses the block_rsv
of whatever trans happened to 

Re: [PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Yan, Zheng
On Thu, Apr 14, 2011 at 2:54 AM, Josef Bacik jo...@redhat.com wrote:
 There have been many sporadic reports of the following panic

 [...]

 This is because we have been releasing the delalloc bytes before ending the
 transaction.  However the way we make allocations, any updates to the
 extent_tree are delayed and then run when the transaction runs, so we still 
 have
 plenty of space that we need to use.  So instead release the delalloc bytes
 _after_ we end the transaction so that we don't get this false ENOSPC.  
 Thanks,


This is wrong, because btrfs_run_delayed_refs uses the global block reserve.


 Signed-off-by: Josef Bacik jo...@redhat.com
 ---
  fs/btrfs/inode.c |    8 ++--
  1 files changed, 6 insertions(+), 2 deletions(-)

 diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
 index ade00e7..b1e5b11 100644
 --- a/fs/btrfs/inode.c
 +++ b/fs/btrfs/inode.c
 @@ -1783,9 +1783,13 @@ out:
                if (trans)
                        btrfs_end_transaction_nolock(trans, root);
        } else {
 -               btrfs_delalloc_release_metadata(inode, ordered_extent->len);
                if (trans)
                        btrfs_end_transaction(trans, root);
 +               /*
 +                * Release after the transaction ends so it covers the delayed
 +                * ref updates
 +                */
 +               btrfs_delalloc_release_metadata(inode, ordered_extent->len);
        }

        /* once for us */
 @@ -5897,8 +5901,8 @@ out_unlock:
    

Re: [PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Josef Bacik

On 04/13/2011 06:08 PM, Arne Jansen wrote:

On 13.04.2011 20:54, Josef Bacik wrote:

There have been many sporadic reports of the following panic

[...]

This is because we have been releasing the delalloc bytes before
ending the
transaction. However the way we make allocations, any updates to the
extent_tree are delayed and then run when the transaction runs, so we
still have
plenty of space that we need to use. So instead release the delalloc
bytes
_after_ we end the transaction so that we don't get this false ENOSPC.
Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
fs/btrfs/inode.c | 8 ++--
1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ade00e7..b1e5b11 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1783,9 +1783,13 @@ out:
if (trans)
btrfs_end_transaction_nolock(trans, root);
} else {
- btrfs_delalloc_release_metadata(inode, ordered_extent->len);
if (trans)
btrfs_end_transaction(trans, root);
+ /*
+ * Release after the transaction ends so it covers the delayed
+ * ref updates
+ */
+ btrfs_delalloc_release_metadata(inode, ordered_extent->len);


I think calling end_transaction doesn't guarantee that all delayed
refs have run; that only happens if end_transaction leads to a transaction commit.
Another problem I see is that commit_transaction just uses the block_rsv
of whatever trans happened to call commit, even if the trans->block_rsv
has been set to a different block_rsv than trans_block_rsv or
delalloc_block_rsv. In other words, the delayed refs are run from a
non-deterministic block_rsv.
But it's late, 

Re: [PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Josef Bacik

On 04/13/2011 08:26 PM, Yan, Zheng wrote:

On Thu, Apr 14, 2011 at 2:54 AM, Josef Bacik jo...@redhat.com wrote:

There have been many sporadic reports of the following panic

[...]

This is because we have been releasing the delalloc bytes before ending the
transaction.  However the way we make allocations, any updates to the
extent_tree are delayed and then run when the transaction runs, so we still have
plenty of space that we need to use.  So instead release the delalloc bytes
_after_ we end the transaction so that we don't get this false ENOSPC.  Thanks,



This is wrong, because btrfs_run_delayed_refs uses the global block reserve.



I don't see anywhere in the delayed ref code that specifically uses the 
global block reserve, where is that?  And if that is what is supposed to 
happen, why are we charging the metadata we will use for modifying the 
extent tree to the delalloc reserve?  It seems to me we should either


1) Be using the delalloc block reserve for running the delayed refs 
that are created by inserting our extent, since that is where the 
reservation currently is made, or


2) Stop charging the reservations for modifying the extent tree to the 
delalloc block reserve and charge it instead to the global reserve, and 
then actually make sure that the global reserve is used when we do the 
delayed ref updating.


Thanks,

Josef