[PATCH 0/5] btrfs: Readonly snapshots

2010-11-29 Thread Li Zefan
(Cc: Sage Weil s...@newdream.net for changes in async snapshots)

This patchset adds readonly-snapshots support. You can create a
readonly snapshot, and you can also set a snapshot readonly/writable
on the fly.

A few readonly checks are added in setattr, permission, remove_xattr
and set_xattr callbacks, as well as in some ioctls.

You can also try it out by pulling (based on the master branch of
Chris' tree):
git://repo.or.cz/linux-btrfs-devel.git readonly-snapshots

Note: I changed the async snapshot ABI. So if the patchset is acceptable,
the first patch has to be merged into .37 to avoid ABI breakage.

---
 fs/btrfs/ctree.h   |3 +
 fs/btrfs/disk-io.c |   36 +++-
 fs/btrfs/inode.c   |8 +++
 fs/btrfs/ioctl.c   |  147 +---
 fs/btrfs/ioctl.h   |   16 -
 fs/btrfs/transaction.c |8 +++
 fs/btrfs/transaction.h |1 +
 fs/btrfs/xattr.c   |   18 ++
 8 files changed, 196 insertions(+), 41 deletions(-)
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/5] btrfs: Make async snapshot ioctl more generic

2010-11-29 Thread Li Zefan
So we don't have to add new structures as we create more ioctls
for snapshots.

Now to create async snapshot, set BTRFS_SNAPSHOT_CREATE_ASYNC bit of
vol_arg_v2-flags, and then call ioctl(BTRFS_IOCT_SNAP_CREATE_V2).

Note: this changes the async snapshot ioctl ABI, which was merged
in .37 merge window, so we have to make this change into .37.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/ioctl.c |   34 +-
 fs/btrfs/ioctl.h |   12 
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 463d91b..d3f1a60 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -935,23 +935,31 @@ out:
 
 static noinline int btrfs_ioctl_snap_create(struct file *file,
void __user *arg, int subvol,
-   int async)
+   bool v2)
 {
struct btrfs_ioctl_vol_args *vol_args = NULL;
-   struct btrfs_ioctl_async_vol_args *async_vol_args = NULL;
+   struct btrfs_ioctl_vol_args_v2 *vol_args_v2 = NULL;
char *name;
u64 fd;
u64 transid = 0;
+   bool async = false;
int ret;
 
-   if (async) {
-   async_vol_args = memdup_user(arg, sizeof(*async_vol_args));
-   if (IS_ERR(async_vol_args))
-   return PTR_ERR(async_vol_args);
+   if (v2) {
+   vol_args_v2 = memdup_user(arg, sizeof(*vol_args_v2));
+   if (IS_ERR(vol_args_v2))
+   return PTR_ERR(vol_args_v2);
 
-   name = async_vol_args-name;
-   fd = async_vol_args-fd;
-   async_vol_args-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
+   if (vol_args_v2-flags  ~BTRFS_SNAPSHOT_CREATE_ASYNC) {
+   ret = -EINVAL;
+   goto out;
+   }
+
+   name = vol_args_v2-name;
+   fd = vol_args_v2-fd;
+   vol_args_v2-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
+   if (vol_args_v2-flags  BTRFS_SNAPSHOT_CREATE_ASYNC)
+   async = true;
} else {
vol_args = memdup_user(arg, sizeof(*vol_args));
if (IS_ERR(vol_args))
@@ -966,13 +974,13 @@ static noinline int btrfs_ioctl_snap_create(struct file 
*file,
 
if (!ret  async) {
if (copy_to_user(arg +
-   offsetof(struct btrfs_ioctl_async_vol_args,
+   offsetof(struct btrfs_ioctl_vol_args_v2,
transid), transid, sizeof(transid)))
return -EFAULT;
}
-
+out:
kfree(vol_args);
-   kfree(async_vol_args);
+   kfree(vol_args_v2);
 
return ret;
 }
@@ -2235,7 +2243,7 @@ long btrfs_ioctl(struct file *file, unsigned int
return btrfs_ioctl_getversion(file, argp);
case BTRFS_IOC_SNAP_CREATE:
return btrfs_ioctl_snap_create(file, argp, 0, 0);
-   case BTRFS_IOC_SNAP_CREATE_ASYNC:
+   case BTRFS_IOC_SNAP_CREATE_V2:
return btrfs_ioctl_snap_create(file, argp, 0, 1);
case BTRFS_IOC_SUBVOL_CREATE:
return btrfs_ioctl_snap_create(file, argp, 1, 0);
diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h
index 17c99eb..bc70584 100644
--- a/fs/btrfs/ioctl.h
+++ b/fs/btrfs/ioctl.h
@@ -30,10 +30,14 @@ struct btrfs_ioctl_vol_args {
char name[BTRFS_PATH_NAME_MAX + 1];
 };
 
-#define BTRFS_SNAPSHOT_NAME_MAX 4079
-struct btrfs_ioctl_async_vol_args {
+#define BTRFS_SNAPSHOT_CREATE_ASYNC(1ULL  0)
+
+#define BTRFS_SNAPSHOT_NAME_MAX 4039
+struct btrfs_ioctl_vol_args_v2 {
__s64 fd;
__u64 transid;
+   __u64 flags;
+   __u64 unused[4];
char name[BTRFS_SNAPSHOT_NAME_MAX + 1];
 };
 
@@ -187,6 +191,6 @@ struct btrfs_ioctl_space_args {
struct btrfs_ioctl_space_args)
 #define BTRFS_IOC_START_SYNC _IOR(BTRFS_IOCTL_MAGIC, 24, __u64)
 #define BTRFS_IOC_WAIT_SYNC  _IOW(BTRFS_IOCTL_MAGIC, 22, __u64)
-#define BTRFS_IOC_SNAP_CREATE_ASYNC _IOW(BTRFS_IOCTL_MAGIC, 23, \
-  struct btrfs_ioctl_async_vol_args)
+#define BTRFS_IOC_SNAP_CREATE_V2 _IOW(BTRFS_IOCTL_MAGIC, 23, \
+  struct btrfs_ioctl_vol_args_v2)
 #endif
-- 
1.6.3

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/5] btrfs: Fix memory leak in a failure path

2010-11-29 Thread Li Zefan
In btrfs_ioctl_snap_create(), vol_args_v2 is not freed if
copy_to_user() returns failure.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/ioctl.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index d3f1a60..ba437ad 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -976,7 +976,7 @@ static noinline int btrfs_ioctl_snap_create(struct file 
*file,
if (copy_to_user(arg +
offsetof(struct btrfs_ioctl_vol_args_v2,
transid), transid, sizeof(transid)))
-   return -EFAULT;
+   ret = -EFAULT;
}
 out:
kfree(vol_args);
-- 
1.6.3

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/5] btrfs: Add helper __setup_root_post()

2010-11-29 Thread Li Zefan
This eliminates duplicate code in find_and_setup_root() and
btrfs_read_fs_root_no_radix().

(Prepare for the next patch)

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/disk-io.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b40dfe4..fb650e0 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -959,14 +959,25 @@ static int __setup_root(u32 nodesize, u32 leafsize, u32 
sectorsize,
return 0;
 }
 
+static void __setup_root_post(struct btrfs_root *root)
+{
+   u32 blocksize;
+   u64 generation;
+
+   generation = btrfs_root_generation(root-root_item);
+   blocksize = btrfs_level_size(root, btrfs_root_level(root-root_item));
+   root-node = read_tree_block(root, btrfs_root_bytenr(root-root_item),
+blocksize, generation);
+   BUG_ON(!root-node);
+   root-commit_root = btrfs_root_node(root);
+}
+
 static int find_and_setup_root(struct btrfs_root *tree_root,
   struct btrfs_fs_info *fs_info,
   u64 objectid,
   struct btrfs_root *root)
 {
int ret;
-   u32 blocksize;
-   u64 generation;
 
__setup_root(tree_root-nodesize, tree_root-leafsize,
 tree_root-sectorsize, tree_root-stripesize,
@@ -977,12 +988,7 @@ static int find_and_setup_root(struct btrfs_root 
*tree_root,
return -ENOENT;
BUG_ON(ret);
 
-   generation = btrfs_root_generation(root-root_item);
-   blocksize = btrfs_level_size(root, btrfs_root_level(root-root_item));
-   root-node = read_tree_block(root, btrfs_root_bytenr(root-root_item),
-blocksize, generation);
-   BUG_ON(!root-node);
-   root-commit_root = btrfs_root_node(root);
+   __setup_root_post(root);
return 0;
 }
 
@@ -1083,8 +1089,6 @@ struct btrfs_root *btrfs_read_fs_root_no_radix(struct 
btrfs_root *tree_root,
struct btrfs_fs_info *fs_info = tree_root-fs_info;
struct btrfs_path *path;
struct extent_buffer *l;
-   u64 generation;
-   u32 blocksize;
int ret = 0;
 
root = kzalloc(sizeof(*root), GFP_NOFS);
@@ -1121,12 +1125,7 @@ struct btrfs_root *btrfs_read_fs_root_no_radix(struct 
btrfs_root *tree_root,
return ERR_PTR(ret);
}
 
-   generation = btrfs_root_generation(root-root_item);
-   blocksize = btrfs_level_size(root, btrfs_root_level(root-root_item));
-   root-node = read_tree_block(root, btrfs_root_bytenr(root-root_item),
-blocksize, generation);
-   root-commit_root = btrfs_root_node(root);
-   BUG_ON(!root-node);
+   __setup_root_post(root);
 out:
if (location-objectid != BTRFS_TREE_LOG_OBJECTID)
root-ref_cows = 1;
-- 
1.6.3

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/5] btrfs: Add readonly snapshots support

2010-11-29 Thread Li Zefan
Usage:

Set BTRFS_SNAPSHOT_CREATE_RDONLY of btrfs_ioctl_vol_arg_v2-flags,
and call ioctl(BTRFS_I0CTL_SNAP_CREATE_V2).

Implementation:

- In disk set readonly bit of btrfs_root_item-flags, and in memory
set btrfs_root-readonly.

- Add readonly checks in btrfs_permission (inode_permission),
btrfs_setattr, btrfs_set/remove_xattr and some ioctls.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/ctree.h   |3 +++
 fs/btrfs/disk-io.c |5 +
 fs/btrfs/inode.c   |8 
 fs/btrfs/ioctl.c   |   33 +
 fs/btrfs/ioctl.h   |1 +
 fs/btrfs/transaction.c |8 
 fs/btrfs/transaction.h |1 +
 fs/btrfs/xattr.c   |   18 ++
 8 files changed, 69 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 8db9234..4b263ed 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -597,6 +597,8 @@ struct btrfs_dir_item {
u8 type;
 } __attribute__ ((__packed__));
 
+#define BTRFS_ROOT_SNAP_RDONLY (1ULL  0)
+
 struct btrfs_root_item {
struct btrfs_inode_item inode;
__le64 generation;
@@ -1116,6 +1118,7 @@ struct btrfs_root {
int defrag_running;
char *name;
int in_sysfs;
+   bool readonly;
 
/* the dirty list is only used by non-reference counted roots */
struct list_head dirty_list;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index fb650e0..5b88712 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -963,6 +963,7 @@ static void __setup_root_post(struct btrfs_root *root)
 {
u32 blocksize;
u64 generation;
+   u64 flags;
 
generation = btrfs_root_generation(root-root_item);
blocksize = btrfs_level_size(root, btrfs_root_level(root-root_item));
@@ -970,6 +971,10 @@ static void __setup_root_post(struct btrfs_root *root)
 blocksize, generation);
BUG_ON(!root-node);
root-commit_root = btrfs_root_node(root);
+
+   flags = btrfs_root_flags(root-root_item);
+   if (flags  BTRFS_ROOT_SNAP_RDONLY)
+   root-readonly = true;
 }
 
 static int find_and_setup_root(struct btrfs_root *tree_root,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5132c9a..08c3075 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3671,8 +3671,12 @@ static int btrfs_setattr_size(struct inode *inode, 
struct iattr *attr)
 static int btrfs_setattr(struct dentry *dentry, struct iattr *attr)
 {
struct inode *inode = dentry-d_inode;
+   struct btrfs_root *root = BTRFS_I(inode)-root;
int err;
 
+   if (root-readonly)
+   return -EROFS;
+
err = inode_change_ok(inode, attr);
if (err)
return err;
@@ -7028,6 +7032,10 @@ static int btrfs_set_page_dirty(struct page *page)
 
 static int btrfs_permission(struct inode *inode, int mask)
 {
+   struct btrfs_root *root = BTRFS_I(inode)-root;
+
+   if (root-readonly  (mask  MAY_WRITE))
+   return -EROFS;
if ((BTRFS_I(inode)-flags  BTRFS_INODE_READONLY)  (mask  
MAY_WRITE))
return -EACCES;
return generic_permission(inode, mask, btrfs_check_acl);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ba437ad..7f9c571 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -147,6 +147,9 @@ static int btrfs_ioctl_setflags(struct file *file, void 
__user *arg)
unsigned int flags, oldflags;
int ret;
 
+   if (root-readonly)
+   return -EROFS;
+
if (copy_from_user(flags, arg, sizeof(flags)))
return -EFAULT;
 
@@ -351,7 +354,8 @@ fail:
 }
 
 static int create_snapshot(struct btrfs_root *root, struct dentry *dentry,
-  char *name, int namelen, u64 *async_transid)
+  char *name, int namelen, u64 *async_transid,
+  bool readonly)
 {
struct inode *inode;
struct btrfs_pending_snapshot *pending_snapshot;
@@ -368,6 +372,7 @@ static int create_snapshot(struct btrfs_root *root, struct 
dentry *dentry,
btrfs_init_block_rsv(pending_snapshot-block_rsv);
pending_snapshot-dentry = dentry;
pending_snapshot-root = root;
+   pending_snapshot-readonly = readonly;
 
trans = btrfs_start_transaction(root-fs_info-extent_root, 5);
if (IS_ERR(trans)) {
@@ -497,7 +502,7 @@ static inline int btrfs_may_create(struct inode *dir, 
struct dentry *child)
 static noinline int btrfs_mksubvol(struct path *parent,
   char *name, int namelen,
   struct btrfs_root *snap_src,
-  u64 *async_transid)
+  u64 *async_transid, bool readonly)
 {
struct inode *dir  = parent-dentry-d_inode;
struct dentry *dentry;
@@ -529,7 +534,7 @@ static noinline int btrfs_mksubvol(struct path *parent,
 
if (snap_src) 

[PATCH 5/5] btrfs: Add ioctl to set snapshot readonly/writable

2010-11-29 Thread Li Zefan
This allows us to set a snapshot readonly or writable on the fly.

Usage:

Set BTRFS_SNAPSHOT_RDONLY/WRITABLE of btrfs_ioctl_vol_arg_v2-flags,
and then call ioctl(BTRFS_IOCTL_SNAP_SETFLAGS);

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/ioctl.c |   88 +++--
 fs/btrfs/ioctl.h |3 ++
 2 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 7f9c571..34d8683 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -939,6 +939,24 @@ out:
return ret;
 }
 
+static int snap_check_flags(u64 flags, bool create)
+{
+   u64 valid_flags = BTRFS_SNAPSHOT_RDONLY | BTRFS_SNAPSHOT_WRITABLE;
+
+   if (create)
+   valid_flags |= BTRFS_SNAPSHOT_CREATE_ASYNC;
+
+   if (flags  valid_flags)
+   return -EINVAL;
+
+   /* readonly and writable are mutually exclusive */
+   if ((flags  BTRFS_SNAPSHOT_RDONLY) 
+   (flags  BTRFS_SNAPSHOT_WRITABLE))
+   return -EINVAL;
+
+   return 0;
+}
+
 static noinline int btrfs_ioctl_snap_create(struct file *file,
void __user *arg, int subvol,
bool v2)
@@ -957,11 +975,9 @@ static noinline int btrfs_ioctl_snap_create(struct file 
*file,
if (IS_ERR(vol_args_v2))
return PTR_ERR(vol_args_v2);
 
-   if (vol_args_v2-flags 
-   ~(BTRFS_SNAPSHOT_CREATE_ASYNC | BTRFS_SNAPSHOT_RDONLY)) {
-   ret = -EINVAL;
+   ret = snap_check_flags(vol_args_v2-flags, true);
+   if (ret)
goto out;
-   }
 
name = vol_args_v2-name;
fd = vol_args_v2-fd;
@@ -995,6 +1011,65 @@ out:
return ret;
 }
 
+static noinline int btrfs_ioctl_snap_setflags(struct file *file,
+ void __user *arg)
+{
+   struct inode *inode = fdentry(file)-d_inode;
+   struct btrfs_root *root = BTRFS_I(inode)-root;
+   struct btrfs_trans_handle *trans;
+   struct btrfs_ioctl_vol_args_v2 *vol_args_v2;
+   u64 root_flags;
+   u64 flags;
+   int err;
+
+   if (root-fs_info-sb-s_flags  MS_RDONLY)
+   return -EROFS;
+
+   vol_args_v2 = memdup_user(arg, sizeof(*vol_args_v2));
+   if (IS_ERR(vol_args_v2))
+   return PTR_ERR(vol_args_v2);
+   flags = vol_args_v2-flags;
+
+   err = snap_check_flags(flags, false);
+   if (err)
+   goto out;
+
+   down_write(root-fs_info-subvol_sem);
+
+   /* nothing to do */
+   if ((BTRFS_SNAPSHOT_RDONLY  root-readonly) ||
+   (BTRFS_SNAPSHOT_WRITABLE  !root-readonly))
+   goto out_unlock;
+
+   root_flags = btrfs_root_flags(root-root_item);
+   if (flags  BTRFS_SNAPSHOT_RDONLY) {
+   btrfs_set_root_flags(root-root_item,
+root_flags | BTRFS_ROOT_SNAP_RDONLY);
+   root-readonly = true;
+   }
+   if (flags  BTRFS_SNAPSHOT_WRITABLE) {
+   btrfs_set_root_flags(root-root_item,
+root_flags  ~BTRFS_ROOT_SNAP_RDONLY);
+   root-readonly = false;
+   }
+
+   trans = btrfs_start_transaction(root, 1);
+   if (IS_ERR(trans)) {
+   err = PTR_ERR(trans);
+   goto out_unlock;
+   }
+
+   err = btrfs_update_root(trans, root,
+   root-root_key, root-root_item);
+
+   btrfs_commit_transaction(trans, root);
+out_unlock:
+   up_write(root-fs_info-subvol_sem);
+out:
+   kfree(vol_args_v2);
+   return err;
+}
+
 /*
  * helper to check if the subvolume references other subvolumes
  */
@@ -1503,6 +1578,9 @@ static int btrfs_ioctl_defrag(struct file *file, void 
__user *argp)
struct btrfs_ioctl_defrag_range_args *range;
int ret;
 
+   if (root-readonly)
+   return -EROFS;
+
ret = mnt_want_write(file-f_path.mnt);
if (ret)
return ret;
@@ -2266,6 +2344,8 @@ long btrfs_ioctl(struct file *file, unsigned int
return btrfs_ioctl_snap_create(file, argp, 1, 0);
case BTRFS_IOC_SNAP_DESTROY:
return btrfs_ioctl_snap_destroy(file, argp);
+   case BTRFS_IOC_SNAP_SETFLAGS:
+   return btrfs_ioctl_snap_setflags(file, argp);
case BTRFS_IOC_DEFAULT_SUBVOL:
return btrfs_ioctl_default_subvol(file, argp);
case BTRFS_IOC_DEFRAG:
diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h
index ff15fb2..559fd27 100644
--- a/fs/btrfs/ioctl.h
+++ b/fs/btrfs/ioctl.h
@@ -32,6 +32,7 @@ struct btrfs_ioctl_vol_args {
 
 #define BTRFS_SNAPSHOT_CREATE_ASYNC(1ULL  0)
 #define BTRFS_SNAPSHOT_RDONLY  (1ULL  1)
+#define BTRFS_SNAPSHOT_WRITABLE(1ULL  2)
 
 #define BTRFS_SNAPSHOT_NAME_MAX 4039
 

Re: [next-rc] Compile Error fs/btrfs/diskio.c

2010-11-29 Thread Chris Mason
Excerpts from Mitch Harder's message of 2010-11-27 17:53:23 -0500:
 I've been getting a compile error when building the 'next-rc' branch
 of btrfs-unstable.
 
   CC  fs/btrfs/disk-io.o
 fs/btrfs/disk-io.c: In function ‘btree_migratepage’:
 fs/btrfs/disk-io.c:716: error: called object ‘0u’ is not a function
 make[2]: *** [fs/btrfs/disk-io.o] Error 1
 make[1]: *** [fs/btrfs] Error 2
 make: *** [fs] Error 2
 
 Line 716 of fs/btrfs/disk-io.c is:
 
 return migrate_page(mapping, newpage, page);
 
 This is related to the Btrfs: add migrate page for metadata inode
 patch (the first patch in the set of patches added to the next-rc
 branch).

I've pushed out a similar fix in the next-rc branch and the master
branch.  Waiting on korg mirrors to update before I send the pull
request.

Thanks!

-chris
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[GIT PULL] Btrfs updates for 2.6.37-rc

2010-11-29 Thread Chris Mason
Hi everyone,

The master branch of the btrfs unstable tree:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git master

Has a collection of btrfs bug fixes.

The three most important fixes here address crashes in the btrfs
O_DIRECT code, add a migrate_page operation to avoid metadata corruption
as btree pages go through migration, and fix up our NFS support.

Otherwise we have assorted correctness fixes, many of which were kicked
out by xfsqa.

-chris

Josef Bacik (11) commits (+241/-54):
Btrfs: make btrfs_add_nondir take parent inode as an argument (+16/-22)
Btrfs: fix typo in fallocate to make it honor actual size (+5/-4)
Btrfs: hold i_mutex when calling btrfs_log_dentry_safe (+7/-0)
Btrfs: setup blank root and fs_info for mount time (+33/-7)
Btrfs: make sure new inode size is ok in fallocate (+4/-0)
Btrfs: use dget_parent where we can UPDATED (+43/-12)
Btrfs: handle the space_cache option properly (+1/-0)
Btrfs: update inode ctime when using links (+1/-0)
Btrfs: fix more ESTALE problems with NFS (+1/-0)
Btrfs: handle NFS lookups properly (+76/-0)
Btrfs: fix fiemap (+54/-9)

Chris Mason (4) commits (+124/-9):
Btrfs: deal with DIO bios that span more than one ordered extent (+89/-4)
Btrfs: avoid NULL pointer deref in try_release_extent_buffer (+4/-2)
Btrfs: don't use migrate page without CONFIG_MIGRATION (+6/-1)
Btrfs: add migrate page for metadata inode (+25/-2)

Li Zefan (3) commits (+6/-6):
btrfs: Check if dest_offset is block-size aligned before cloning file 
(+3/-4)
btrfs: Show device attr correctly for symlinks (+1/-0)
btrfs: Set file size correctly in file clone (+2/-2)

Miao Xie (3) commits (+195/-45):
btrfs: cleanup duplicate bio allocating functions (+8/-18)
btrfs: fix free dip and dip-csums twice (+3/-6)
btrfs: fix panic caused by direct IO (+184/-21)

Mariusz Kozlowski (1) commits (+3/-3):
btrfs: make 1-bit signed fileds unsigned

Arne Jansen (1) commits (+1/-1):
btrfs: Fix early enospc because 'unused' calculated with wrong sign.

Ian Kent (1) commits (+6/-0):
Btrfs - fix race between btrfs_get_sb() and umount

Total: (24) commits (+576/-118)

 fs/btrfs/compression.c  |   15 +---
 fs/btrfs/ctree.h|6 +-
 fs/btrfs/disk-io.c  |   38 +-
 fs/btrfs/export.c   |   76 
 fs/btrfs/extent-tree.c  |2 +-
 fs/btrfs/extent_io.c|   77 ++---
 fs/btrfs/extent_io.h|3 +
 fs/btrfs/file.c |7 +
 fs/btrfs/inode.c|  294 ++-
 fs/btrfs/ioctl.c|   31 --
 fs/btrfs/ordered-data.c |   67 +++
 fs/btrfs/ordered-data.h |3 +
 fs/btrfs/super.c|   41 ++-
 fs/btrfs/transaction.c  |5 +-
 fs/btrfs/tree-log.c |   21 +++-
 15 files changed, 572 insertions(+), 114 deletions(-)
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC-PATCH] Re: mounting arbitrary directories

2010-11-29 Thread C Anthony Risinger
On Sun, Nov 28, 2010 at 4:07 AM, C Anthony Risinger anth...@extof.me wrote:
 On Nov 27, 2010, at 10:22 PM, Calvin Walton calvin.wal...@gmail.com
 wrote:
 On Sat, 2010-11-27 at 21:19 -0600, C Anthony Risinger wrote:

 eg. if i have a regular directory (not a subvolume) in the top-
 level:

 /__boot

 i can mount it with:

 mount -o subvol=__boot /dev/sda /mnt

 The 'subvol' option actually works using the same mechanism as a bind
 mount. The fact that it works using a directory is purely a
 coincidence
 - I would not expect it to be officially supported, and you shouldn't
 rely on it.

 Hrrrm... Well I thought I'd be clever and use this little trick one
 time to update my kernel... Anyways I oops out like 3 times in a row
 (last func was something.clone in each IIRC) and now I'm stuck with
 only my mobile since my server isn't set up yet and my  SSD just
 failed on my netbook yesterday :-(

 So, if anyone can help me recover this partition long enough to
 grab a few things... I would be most grateful :-) I think I have
 everything critical, but I'd rather take another look :-) Right now I
 can't mount; mount is failing w/bad superblock.

 /me promises to never recklessly sabotage himself after being warned 
 6 hrs earlier

any suggestions for me?  i dd'ed the corrupt partition to an external
disk because i need the machine back up and running, but if possible
i'd like to get the image running again.  i mounted it via loopback
and as expected got the same errors:

(will post the dmesg output next message -- left at home)
open_ctree_fd failed
wanted  found  instead

the mount command itself fails with bad superblock or ...

i have seen others withe similar crash reports and IIRC correctly were
able to recover it (seems like something just didn't get updated right
before it locked...)

any ideas?

C Anthony
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL][PATCH v2 0/6] btrfs: Add lzo compression support

2010-11-29 Thread C Anthony Risinger
On Wed, Nov 17, 2010 at 8:08 PM, Li Zefan l...@cn.fujitsu.com wrote:
 Hi Chris,

 Here's the updated patchset. As I still haven't got a kernel.org
 account, I have set up a git tree in another public git repository,
 and I'll use it for now.

 You can pull from:

        git://repo.or.cz/linux-btrfs-devel.git lzo-support


 Lzo is a much faster compression algorithm than gzib, so would allow
 more users to enable transparent compression, and some users can
 choose from compression ratio and compression speed.

 Usage:

 # mount -t btrfs -o compress[=zlib,lzo] dev /mnt
 or
 # mount -t btrfs -o compress-force[=zlib,lzo] dev /mnt

 -o compress without argument is still allowed for compatability.

 Compatibility:

 If we mount a filesystem with lzo compression, it will not be able be
 mounted in old kernels. One reason is, otherwise btrfs will directly
 dump compressed data, which sits in inline extent, to user.

 Performance:

 The test copied a linux source tarball (~400M) from an ext4 partition
 to the btrfs partition, and then extracted the tarball.

 (time in second)
           lzo        zlib        nocompress
 copy:      10.6       21.7        14.9
 extract:   70.1       94.4        66.6

 (data size in MB)
           lzo        zlib        nocompress
 copy:      185.87     108.69      394.49
 extract:   193.80     132.36      381.21

 Test:

 Mitch has tested the patchset, and provided some positive feedback.
 According to him, the patchset works as expected and nothing bad
 has he experienced.

 Other:

 The defrag ioctl is also updated, so one can choose lzo or zlib when
 turning on compression in defrag operation.

 Main change from v1:

 - Add incompat flag.
 - Fix build issue by selecting kernel lzo module.
 - Check compression type in defrag ioctl.

 
 Li Zefan (6):
      btrfs: Fix bugs in zlib workspace
      btrfs: Fix error handling in zlib
      btrfs: Allow to add new compression algorithm
      btrfs: Add lzo compression support
      btrfs: Allow to specify compress method when defrag
      btrfs: Extract duplicate decompress code

  fs/btrfs/Makefile       |    2 +-
  fs/btrfs/btrfs_inode.h  |    2 +-
  fs/btrfs/compression.c  |  329 +-
  fs/btrfs/compression.h  |   72 ++--
  fs/btrfs/ctree.h        |   11 +-
  fs/btrfs/extent_io.c    |    5 +-
  fs/btrfs/extent_io.h    |   17 ++-
  fs/btrfs/extent_map.c   |    2 +
  fs/btrfs/extent_map.h   |    3 +-
  fs/btrfs/file.c         |    2 +
  fs/btrfs/inode.c        |   82 ++
  fs/btrfs/ioctl.c        |   10 +-
  fs/btrfs/ioctl.h        |    9 +-
  fs/btrfs/lzo.c          |  409 
 +++
  fs/btrfs/ordered-data.c |   18 ++-
  fs/btrfs/ordered-data.h |    8 +-
  fs/btrfs/super.c        |   47 +-
  fs/btrfs/zlib.c         |  361 +++--
  18 files changed, 1013 insertions(+), 376 deletions(-)

is this in a branch somewhere?  or for inclusion in .37/.38?  this is
a very attractive feature.

what's the proper place (repo/branch) to see what is pending?

C Anthony
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Root fs on raid1

2010-11-29 Thread Goffredo Baroncelli
On Monday, 29 November, 2010, you (Erik Jensen) wrote:
 I have a similar setup on one of my computers.  I set it up a while ago
 using an initramfs, but I could not get the device scan to work properly,
 for whatever reason.  I ended up having it just trying to mount every device
 in the array in turn.  When it gets to the last device, the array mounts
 since btrfs now knows about all of the other ones.  Really ugly, I know, but
 it works.


I also had a lot of problem to get a btrfs-aware initramfs working properly. 
One of the biggest proble was to find the right list of modules. Btrfs depends 
by crc32c, and I need a lot of retries to discover it.

The main problem was that if the module is not available (it is the case of a 
initramfs) btrfs doesn't start...

When you say  it just trying to mount every device in the array in turn, do 
you mean in the initarmfs ?



 -- Erik Jensen
 
 On Wed, Nov 17, 2010 at 10:59 PM, Goffredo Baroncelli 
kreij...@libero.itwrote:
 
  On Thursday, 18 November, 2010, ad...@prnet.org wrote:
   Hi,
  
   I have read that for using raid1 btrfs device scan must be run before
  mounting the fs. How do I proceed if root filesystem is mounted from btrfs
  ?
  
 
 
  See a my previous post
 
  http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg04709.html
 
 
   Thanks in advance
   Bye,
   David Arendt
  
   --
   To unsubscribe from this list: send the line unsubscribe linux-btrfs 
in
   the body of a message to majord...@vger.kernel.org
   More majordomo info at  http://vger.kernel.org/majordomo-info.html
  
 
 
  --
  gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) 
  kreij...@inwind.it
  Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512
  --
  To unsubscribe from this list: send the line unsubscribe linux-btrfs in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 


-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] btrfs: Make async snapshot ioctl more generic

2010-11-29 Thread Goffredo Baroncelli
Hi Li,

great work, but I have some suggestions:

On Monday, 29 November, 2010, Li Zefan wrote:
 So we don't have to add new structures as we create more ioctls
 for snapshots.
 
 Now to create async snapshot, set BTRFS_SNAPSHOT_CREATE_ASYNC bit of
 vol_arg_v2-flags, and then call ioctl(BTRFS_IOCT_SNAP_CREATE_V2).
 
 Note: this changes the async snapshot ioctl ABI, which was merged
 in .37 merge window, so we have to make this change into .37.
 
 Signed-off-by: Li Zefan l...@cn.fujitsu.com
 ---
  fs/btrfs/ioctl.c |   34 +-
  fs/btrfs/ioctl.h |   12 
  2 files changed, 29 insertions(+), 17 deletions(-)
 
 diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
 index 463d91b..d3f1a60 100644
 --- a/fs/btrfs/ioctl.c
 +++ b/fs/btrfs/ioctl.c
 @@ -935,23 +935,31 @@ out:
  
  static noinline int btrfs_ioctl_snap_create(struct file *file,
   void __user *arg, int subvol,
 - int async)
 + bool v2)

This is a aesthetic suggestion: instead of passing a flag called v2 I suggest 
to add two wrapper functions, which extract the parameters and passes all 
available parameter to the generic function. Something like:

static inline btrfs_ioctl_snap_create_v1(struct file *file, 
void __user *arg, int subvol)
{
vol_args = memdup_user(arg, sizeof(*vol_args));
if (IS_ERR(vol_args))
return PTR_ERR(vol_args);
name = vol_args-name;
fd = vol_args-fd;
vol_args-name[BTRFS_PATH_NAME_MAX] = '\0';

btrfs_ioctl_snap_create_generic(file, subvol, name, fd, 0, 0);

}

static inline btrfs_ioctl_snap_create_v2(struct file *file, 
void __user *arg, int subvol,
)
{
vol_args_v2 = memdup_user(arg, sizeof(*vol_args_v2));
if (IS_ERR(vol_args_v2))
return PTR_ERR(vol_args_v2);

ret = snap_check_flags(vol_args_v2-flags, true);
if (ret)
goto out;

name = vol_args_v2-name;
fd = vol_args_v2-fd;
vol_args_v2-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
if (vol_args_v2-flags  BTRFS_SNAPSHOT_CREATE_ASYNC)
async = true;
if (vol_args_v2-flags  BTRFS_SNAPSHOT_RDONLY)
readonly = true;

btrfs_ioctl_snap_create_generic(file, subvol, name, fd, async,
 readonly);

}

and 

case BTRFS_IOC_SNAP_CREATE:
return btrfs_ioctl_snap_create_v1(file, argp, 0);
case BTRFS_IOC_SNAP_CREATE_V2:
return btrfs_ioctl_snap_create_v2(file, argp, 0);


Frankly speaking, we could get rid of the subvol parameter adding another 
wrapper function like btrfs_ioctl_subvol_create( ), but this would be 
another story :)

  {
   struct btrfs_ioctl_vol_args *vol_args = NULL;
 - struct btrfs_ioctl_async_vol_args *async_vol_args = NULL;
 + struct btrfs_ioctl_vol_args_v2 *vol_args_v2 = NULL;
   char *name;
   u64 fd;
   u64 transid = 0;
 + bool async = false;
   int ret;
  
 - if (async) {
 - async_vol_args = memdup_user(arg, sizeof(*async_vol_args));
 - if (IS_ERR(async_vol_args))
 - return PTR_ERR(async_vol_args);
 + if (v2) {
 + vol_args_v2 = memdup_user(arg, sizeof(*vol_args_v2));
 + if (IS_ERR(vol_args_v2))
 + return PTR_ERR(vol_args_v2);
  
 - name = async_vol_args-name;
 - fd = async_vol_args-fd;
 - async_vol_args-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
 + if (vol_args_v2-flags  ~BTRFS_SNAPSHOT_CREATE_ASYNC) {
 + ret = -EINVAL;
 + goto out;
 + }
 +
 + name = vol_args_v2-name;
 + fd = vol_args_v2-fd;
 + vol_args_v2-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
 + if (vol_args_v2-flags  BTRFS_SNAPSHOT_CREATE_ASYNC)
 + async = true;
   } else {
   vol_args = memdup_user(arg, sizeof(*vol_args));
   if (IS_ERR(vol_args))
 @@ -966,13 +974,13 @@ static noinline int btrfs_ioctl_snap_create(struct 
file *file,
  
   if (!ret  async) {
   if (copy_to_user(arg +
 - offsetof(struct btrfs_ioctl_async_vol_args,
 + offsetof(struct btrfs_ioctl_vol_args_v2,
   transid), transid, sizeof(transid)))
   return -EFAULT;
   }
 -
 +out:
   kfree(vol_args);
 - kfree(async_vol_args);
 + kfree(vol_args_v2);
  
   return ret;
  }
 @@ -2235,7 

Re: [PATCH 5/5] btrfs: Add ioctl to set snapshot readonly/writable

2010-11-29 Thread Goffredo Baroncelli
Hi Li,

On Monday, 29 November, 2010, Li Zefan wrote:
 This allows us to set a snapshot readonly or writable on the fly.
 
 Usage:
 
 Set BTRFS_SNAPSHOT_RDONLY/WRITABLE of btrfs_ioctl_vol_arg_v2-flags,
 and then call ioctl(BTRFS_IOCTL_SNAP_SETFLAGS);

I really appreciate your work, but I have some doubt about this interface. In 
particolar:
- how get the flags of a subvolume ? I suggest to implement a pair of ioctls:
- subvolume_setflags  - get the flags
- subvolume_getflags  - set the flags
These ioctls would be more generic (there are a lot of flags which may be 
interested to put in the root of a subvolume: think about 
compress/nocompress, (no)datasum...)
- For the reason abowe, I suggest to replace SNAPSHOT with SUBVOLUME
- Finally, with a pair of  get/set_flags functions we can avoid the use of the 
flags BTRFS_SNAPSHOT_WRITABLE.


 
 Signed-off-by: Li Zefan l...@cn.fujitsu.com
 ---
  fs/btrfs/ioctl.c |   88 
+++--
  fs/btrfs/ioctl.h |3 ++
  2 files changed, 87 insertions(+), 4 deletions(-)
 
 diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
 index 7f9c571..34d8683 100644
 --- a/fs/btrfs/ioctl.c
 +++ b/fs/btrfs/ioctl.c
 @@ -939,6 +939,24 @@ out:
   return ret;
  }
  
 +static int snap_check_flags(u64 flags, bool create)
 +{
 + u64 valid_flags = BTRFS_SNAPSHOT_RDONLY | BTRFS_SNAPSHOT_WRITABLE;
 +
 + if (create)
 + valid_flags |= BTRFS_SNAPSHOT_CREATE_ASYNC;
 +
 + if (flags  valid_flags)
 + return -EINVAL;
 +
 + /* readonly and writable are mutually exclusive */
 + if ((flags  BTRFS_SNAPSHOT_RDONLY) 
 + (flags  BTRFS_SNAPSHOT_WRITABLE))
 + return -EINVAL;
 +
 + return 0;
 +}
 +
  static noinline int btrfs_ioctl_snap_create(struct file *file,
   void __user *arg, int subvol,
   bool v2)
 @@ -957,11 +975,9 @@ static noinline int btrfs_ioctl_snap_create(struct file 
*file,
   if (IS_ERR(vol_args_v2))
   return PTR_ERR(vol_args_v2);
  
 - if (vol_args_v2-flags 
 - ~(BTRFS_SNAPSHOT_CREATE_ASYNC | BTRFS_SNAPSHOT_RDONLY)) {
 - ret = -EINVAL;
 + ret = snap_check_flags(vol_args_v2-flags, true);
 + if (ret)
   goto out;
 - }
  
   name = vol_args_v2-name;
   fd = vol_args_v2-fd;
 @@ -995,6 +1011,65 @@ out:
   return ret;
  }
  
 +static noinline int btrfs_ioctl_snap_setflags(struct file *file,
 +   void __user *arg)
 +{
 + struct inode *inode = fdentry(file)-d_inode;
 + struct btrfs_root *root = BTRFS_I(inode)-root;
 + struct btrfs_trans_handle *trans;
 + struct btrfs_ioctl_vol_args_v2 *vol_args_v2;
 + u64 root_flags;
 + u64 flags;
 + int err;
 +
 + if (root-fs_info-sb-s_flags  MS_RDONLY)
 + return -EROFS;
 +
 + vol_args_v2 = memdup_user(arg, sizeof(*vol_args_v2));
 + if (IS_ERR(vol_args_v2))
 + return PTR_ERR(vol_args_v2);
 + flags = vol_args_v2-flags;
 +
 + err = snap_check_flags(flags, false);
 + if (err)
 + goto out;
 +
 + down_write(root-fs_info-subvol_sem);
 +
 + /* nothing to do */
 + if ((BTRFS_SNAPSHOT_RDONLY  root-readonly) ||
 + (BTRFS_SNAPSHOT_WRITABLE  !root-readonly))
 + goto out_unlock;
 +
 + root_flags = btrfs_root_flags(root-root_item);
 + if (flags  BTRFS_SNAPSHOT_RDONLY) {
 + btrfs_set_root_flags(root-root_item,
 +  root_flags | BTRFS_ROOT_SNAP_RDONLY);
 + root-readonly = true;
 + }
 + if (flags  BTRFS_SNAPSHOT_WRITABLE) {
 + btrfs_set_root_flags(root-root_item,
 +  root_flags  ~BTRFS_ROOT_SNAP_RDONLY);
 + root-readonly = false;
 + }
 +
 + trans = btrfs_start_transaction(root, 1);
 + if (IS_ERR(trans)) {
 + err = PTR_ERR(trans);
 + goto out_unlock;
 + }
 +
 + err = btrfs_update_root(trans, root,
 + root-root_key, root-root_item);
 +
 + btrfs_commit_transaction(trans, root);
 +out_unlock:
 + up_write(root-fs_info-subvol_sem);
 +out:
 + kfree(vol_args_v2);
 + return err;
 +}
 +
  /*
   * helper to check if the subvolume references other subvolumes
   */
 @@ -1503,6 +1578,9 @@ static int btrfs_ioctl_defrag(struct file *file, void 
__user *argp)
   struct btrfs_ioctl_defrag_range_args *range;
   int ret;
  
 + if (root-readonly)
 + return -EROFS;
 +
   ret = mnt_want_write(file-f_path.mnt);
   if (ret)
   return ret;
 @@ -2266,6 +2344,8 @@ long btrfs_ioctl(struct file *file, unsigned int
   return btrfs_ioctl_snap_create(file, argp, 1, 0);
   case BTRFS_IOC_SNAP_DESTROY:
   return 

Re: [PATCH 1/5] btrfs: Make async snapshot ioctl more generic

2010-11-29 Thread Goffredo Baroncelli
On Monday, 29 November, 2010, Goffredo Baroncelli wrote:
 Why the unused fields ? What happens if you use a more recent btrfs-tools 
 which take advantage of these fields but the kernel is an old one ? At the 
 minimum please check the flags so
 (flags^(BTRFS_SNAPSHOT_CREATE_ASYNC|BTRFS_SNAPSHOT_RDONLY)) == 0
 and
 unused[0..3] == 0


Sorry, the check should be

(flags  (~(BTRFS_SNAPSHOT_CREATE_ASYNC|BTRFS_SNAPSHOT_RDONLY))) == 0
 
 For future expansion I suggest to use a different ioctl. To me it seems a 
more 
 robust API.
 


-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] btrfs: Make async snapshot ioctl more generic

2010-11-29 Thread Sage Weil
Hi Li,

On Mon, 29 Nov 2010, Li Zefan wrote:
 So we don't have to add new structures as we create more ioctls
 for snapshots.
 
 Now to create async snapshot, set BTRFS_SNAPSHOT_CREATE_ASYNC bit of
 vol_arg_v2-flags, and then call ioctl(BTRFS_IOCT_SNAP_CREATE_V2).
 
 Note: this changes the async snapshot ioctl ABI, which was merged
 in .37 merge window, so we have to make this change into .37.

These changes all look good to me.

Thanks!
sage



 
 Signed-off-by: Li Zefan l...@cn.fujitsu.com
 ---
  fs/btrfs/ioctl.c |   34 +-
  fs/btrfs/ioctl.h |   12 
  2 files changed, 29 insertions(+), 17 deletions(-)
 
 diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
 index 463d91b..d3f1a60 100644
 --- a/fs/btrfs/ioctl.c
 +++ b/fs/btrfs/ioctl.c
 @@ -935,23 +935,31 @@ out:
  
  static noinline int btrfs_ioctl_snap_create(struct file *file,
   void __user *arg, int subvol,
 - int async)
 + bool v2)
  {
   struct btrfs_ioctl_vol_args *vol_args = NULL;
 - struct btrfs_ioctl_async_vol_args *async_vol_args = NULL;
 + struct btrfs_ioctl_vol_args_v2 *vol_args_v2 = NULL;
   char *name;
   u64 fd;
   u64 transid = 0;
 + bool async = false;
   int ret;
  
 - if (async) {
 - async_vol_args = memdup_user(arg, sizeof(*async_vol_args));
 - if (IS_ERR(async_vol_args))
 - return PTR_ERR(async_vol_args);
 + if (v2) {
 + vol_args_v2 = memdup_user(arg, sizeof(*vol_args_v2));
 + if (IS_ERR(vol_args_v2))
 + return PTR_ERR(vol_args_v2);
  
 - name = async_vol_args-name;
 - fd = async_vol_args-fd;
 - async_vol_args-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
 + if (vol_args_v2-flags  ~BTRFS_SNAPSHOT_CREATE_ASYNC) {
 + ret = -EINVAL;
 + goto out;
 + }
 +
 + name = vol_args_v2-name;
 + fd = vol_args_v2-fd;
 + vol_args_v2-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
 + if (vol_args_v2-flags  BTRFS_SNAPSHOT_CREATE_ASYNC)
 + async = true;
   } else {
   vol_args = memdup_user(arg, sizeof(*vol_args));
   if (IS_ERR(vol_args))
 @@ -966,13 +974,13 @@ static noinline int btrfs_ioctl_snap_create(struct file 
 *file,
  
   if (!ret  async) {
   if (copy_to_user(arg +
 - offsetof(struct btrfs_ioctl_async_vol_args,
 + offsetof(struct btrfs_ioctl_vol_args_v2,
   transid), transid, sizeof(transid)))
   return -EFAULT;
   }
 -
 +out:
   kfree(vol_args);
 - kfree(async_vol_args);
 + kfree(vol_args_v2);
  
   return ret;
  }
 @@ -2235,7 +2243,7 @@ long btrfs_ioctl(struct file *file, unsigned int
   return btrfs_ioctl_getversion(file, argp);
   case BTRFS_IOC_SNAP_CREATE:
   return btrfs_ioctl_snap_create(file, argp, 0, 0);
 - case BTRFS_IOC_SNAP_CREATE_ASYNC:
 + case BTRFS_IOC_SNAP_CREATE_V2:
   return btrfs_ioctl_snap_create(file, argp, 0, 1);
   case BTRFS_IOC_SUBVOL_CREATE:
   return btrfs_ioctl_snap_create(file, argp, 1, 0);
 diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h
 index 17c99eb..bc70584 100644
 --- a/fs/btrfs/ioctl.h
 +++ b/fs/btrfs/ioctl.h
 @@ -30,10 +30,14 @@ struct btrfs_ioctl_vol_args {
   char name[BTRFS_PATH_NAME_MAX + 1];
  };
  
 -#define BTRFS_SNAPSHOT_NAME_MAX 4079
 -struct btrfs_ioctl_async_vol_args {
 +#define BTRFS_SNAPSHOT_CREATE_ASYNC  (1ULL  0)
 +
 +#define BTRFS_SNAPSHOT_NAME_MAX 4039
 +struct btrfs_ioctl_vol_args_v2 {
   __s64 fd;
   __u64 transid;
 + __u64 flags;
 + __u64 unused[4];
   char name[BTRFS_SNAPSHOT_NAME_MAX + 1];
  };
  
 @@ -187,6 +191,6 @@ struct btrfs_ioctl_space_args {
   struct btrfs_ioctl_space_args)
  #define BTRFS_IOC_START_SYNC _IOR(BTRFS_IOCTL_MAGIC, 24, __u64)
  #define BTRFS_IOC_WAIT_SYNC  _IOW(BTRFS_IOCTL_MAGIC, 22, __u64)
 -#define BTRFS_IOC_SNAP_CREATE_ASYNC _IOW(BTRFS_IOCTL_MAGIC, 23, \
 -struct btrfs_ioctl_async_vol_args)
 +#define BTRFS_IOC_SNAP_CREATE_V2 _IOW(BTRFS_IOCTL_MAGIC, 23, \
 +struct btrfs_ioctl_vol_args_v2)
  #endif
 -- 
 1.6.3
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Default to read-only on snapshot creation and have a flag if snapshot should be writable (was: [PATCH 0/5] btrfs: Readonly snapshots)

2010-11-29 Thread Mike Fedyk
On Mon, Nov 29, 2010 at 12:02 AM, Li Zefan l...@cn.fujitsu.com wrote:
 (Cc: Sage Weil s...@newdream.net for changes in async snapshots)

 This patchset adds readonly-snapshots support. You can create a
 readonly snapshot, and you can also set a snapshot readonly/writable
 on the fly.

 A few readonly checks are added in setattr, permission, remove_xattr
 and set_xattr callbacks, as well as in some ioctls.


Great work!

I have a suggestion on defaults when snapshots are created.  I think
they should default to being read-only and if they are meant to be
read-write a flag can be set at creation time (and changable at a
later time as well of course).

This way user/admin preconceptions of a snapshot being read-only can
be enforced by default, and the exception when you want a read-write
snapshot can be available with a switch at the cli level (and probably
a flag at the ioctl level).

It gives one more natural distinction between a snapshot and a
subvolume at the user conceptual level.

What do you think?
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 0/4] Add readonly support to replace BUG_ON phrase

2010-11-29 Thread Josef Bacik
On Thu, Nov 25, 2010 at 05:52:47PM +0800, Miao Xie wrote:
 Btrfs has a number of BUG_ON()s, which may lead btrfs to unpleasant panic.
 Meanwhile, they are very ugly and should be handled more propriately.
 
 There are mainly two ways to deal with these BUG_ON()s.
 
 1. For those errors which can be handled well by callers, we just return their
 error number to callers.
 
 2. For others, We can force the filesystem readonly when it hits errors, which
  is what this patchset has done. Replaced BUG_ON() with the interface provided
  in this patchset, we will get error infomation via dmesg. Since btrfs is now 
 readonly, we can save our data safely and umount it, then a btrfsck is 
 recommended.
 
 By these ways, we can protect our filesystem from panic caused by those 
 BUG_ONs.
 
 ---
  fs/btrfs/ctree.h   |   21 ++
  fs/btrfs/disk-io.c |   23 +++
  fs/btrfs/super.c   |  100 ++-
  fs/btrfs/transaction.c |7 +++
  4 files changed, 148 insertions(+), 3 deletions(-)
 

Overall seems sane, but what about kernels that don't make these checks?  I'm ok
with well sucks for them as an answer, just want to make sure we've at least
though about it.

Also I'm not sure marking the fs as broken is the right move here.  Ext3/4 don't
do this, they just mount read-only, as long as you can still unmount the
filesystem everything comes out ok.  Think of the case where we just get a
spurious EIO, the fs should be fine the next time around, there's reason to
force the user to run fsck in this case.

Thanks,

Josef
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Default to read-only on snapshot creation and have a flag if snapshot should be writable (was: [PATCH 0/5] btrfs: Readonly snapshots)

2010-11-29 Thread David Arendt

On 11/29/10 21:02, Mike Fedyk wrote:

On Mon, Nov 29, 2010 at 12:02 AM, Li Zefanl...@cn.fujitsu.com  wrote:

(Cc: Sage Weils...@newdream.net  for changes in async snapshots)

This patchset adds readonly-snapshots support. You can create a
readonly snapshot, and you can also set a snapshot readonly/writable
on the fly.

A few readonly checks are added in setattr, permission, remove_xattr
and set_xattr callbacks, as well as in some ioctls.


Great work!

I have a suggestion on defaults when snapshots are created.  I think
they should default to being read-only and if they are meant to be
read-write a flag can be set at creation time (and changable at a
later time as well of course).

This way user/admin preconceptions of a snapshot being read-only can
be enforced by default, and the exception when you want a read-write
snapshot can be available with a switch at the cli level (and probably
a flag at the ioctl level).

It gives one more natural distinction between a snapshot and a
subvolume at the user conceptual level.

What do you think?
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hi,

I completely agree with you. I think lots of people use snapshots for 
backup purposes and these ones shouldn't be writable.


Bye,
David Arendt
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Errors during defragmentation

2010-11-29 Thread Andrej Podzimek

Hello,

I decided to test the 'defragment' feature on my system (after a huge number of 
system updates and prelinking):

find /bin /sbin /lib /usr/lib /usr/bin /usr/sbin -type d -exec btrfs 
filesystem defragment '{}' '+'

I have already defragmented a couple of (very large) directories with no errors 
at all, so this was expected to work somehow. Surprisingly, this time there 
were thousands of messages like this:

ioctl failed on directory name ret -1 errno 28

Most of the reported directories had zero files/subdirectories. However, *most* 
of them were *not* empty...

What does this error message mean? Could someone shed more light on this, 
please? Should I get ready for a bad crash? ;-) (There seems to be no data loss 
so far. No new messages in dmesg, no unexpected system behavior.)

Andrej




smime.p7s
Description: S/MIME Cryptographic Signature


Re: Default to read-only on snapshot creation and have a flag if snapshot should be writable (was: [PATCH 0/5] btrfs: Readonly snapshots)

2010-11-29 Thread Mike Fedyk
On Mon, Nov 29, 2010 at 12:41 PM, David Arendt ad...@prnet.org wrote:
 On 11/29/10 21:02, Mike Fedyk wrote:

 On Mon, Nov 29, 2010 at 12:02 AM, Li Zefanl...@cn.fujitsu.com  wrote:

 (Cc: Sage Weils...@newdream.net  for changes in async snapshots)

 This patchset adds readonly-snapshots support. You can create a
 readonly snapshot, and you can also set a snapshot readonly/writable
 on the fly.

 A few readonly checks are added in setattr, permission, remove_xattr
 and set_xattr callbacks, as well as in some ioctls.

 Great work!

 I have a suggestion on defaults when snapshots are created.  I think
 they should default to being read-only and if they are meant to be
 read-write a flag can be set at creation time (and changable at a
 later time as well of course).

 This way user/admin preconceptions of a snapshot being read-only can
 be enforced by default, and the exception when you want a read-write
 snapshot can be available with a switch at the cli level (and probably
 a flag at the ioctl level).

 It gives one more natural distinction between a snapshot and a
 subvolume at the user conceptual level.

 What do you think?

 I completely agree with you. I think lots of people use snapshots for backup
 purposes and these ones shouldn't be writable.

 by default.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 0/4] Add readonly support to replace BUG_ON phrase

2010-11-29 Thread Mike Fedyk
On Mon, Nov 29, 2010 at 12:10 PM, Josef Bacik jo...@redhat.com wrote:
 On Thu, Nov 25, 2010 at 05:52:47PM +0800, Miao Xie wrote:
 Btrfs has a number of BUG_ON()s, which may lead btrfs to unpleasant panic.
 Meanwhile, they are very ugly and should be handled more propriately.

 There are mainly two ways to deal with these BUG_ON()s.

 1. For those errors which can be handled well by callers, we just return 
 their
 error number to callers.

 2. For others, We can force the filesystem readonly when it hits errors, 
 which
  is what this patchset has done. Replaced BUG_ON() with the interface 
 provided
  in this patchset, we will get error infomation via dmesg. Since btrfs is now
 readonly, we can save our data safely and umount it, then a btrfsck is
 recommended.

 By these ways, we can protect our filesystem from panic caused by those
 BUG_ONs.

 ---
  fs/btrfs/ctree.h       |   21 ++
  fs/btrfs/disk-io.c     |   23 +++
  fs/btrfs/super.c       |  100 
 ++-
  fs/btrfs/transaction.c |    7 +++
  4 files changed, 148 insertions(+), 3 deletions(-)


 Overall seems sane, but what about kernels that don't make these checks?  I'm 
 ok
 with well sucks for them as an answer, just want to make sure we've at least
 though about it.

 Also I'm not sure marking the fs as broken is the right move here.  Ext3/4 
 don't
 do this, they just mount read-only, as long as you can still unmount the
 filesystem everything comes out ok.  Think of the case where we just get a
 spurious EIO, the fs should be fine the next time around, there's reason to
 force the user to run fsck in this case.


Did you mean there's no reason to?

Also I guess you mean this in the case when there is no redundancy
(single and raid0) as the other cases should recover from spurious EIO
at run time.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Errors during defragmentation

2010-11-29 Thread Hugo Mills
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Mon, Nov 29, 2010 at 10:02:56PM +0100, Andrej Podzimek wrote:
 Hello,
 
 I decided to test the 'defragment' feature on my system (after a huge number 
 of system updates and prelinking):
 
   find /bin /sbin /lib /usr/lib /usr/bin /usr/sbin -type d -exec btrfs 
 filesystem defragment '{}' '+'
 
 I have already defragmented a couple of (very large) directories with no 
 errors at all, so this was expected to work somehow. Surprisingly, this time 
 there were thousands of messages like this:
 
   ioctl failed on directory name ret -1 errno 28

   errno 28 is ENOSPC

   You've run out of disk space. (Or at least, btrfs thinks so).

 Most of the reported directories had zero files/subdirectories. However, 
 *most* of them were *not* empty...

 What does this error message mean? Could someone shed more light on
 this, please? Should I get ready for a bad crash? ;-) (There seems
 to be no data loss so far. No new messages in dmesg, no unexpected
 system behavior.)

   Hugo.

- -- 
=== Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- I believe that it's closely correlated with ---   
   the aeroswine coefficient.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)

iD8DBQFM9BoxIKyzvlFcI40RAv5SAJkB13ClPuTeRElrN1ARFhvDJ2C76gCghC/d
zFczJesxQGbd2jC2ildNNI0=
=i+tI
-END PGP SIGNATURE-


signature.asc
Description: Digital signature


Re: Default to read-only on snapshot creation and have a flag if snapshot should be writable (was: [PATCH 0/5] btrfs: Readonly snapshots)

2010-11-29 Thread Andrey Kuzmin
This may sound excessive as any new concept introduction that late in
development, but readonly/writable snapshots could be further
differentiated by naming the latter clones. This way end-user would
naturally perceive snapsot as read-only PIT fs image, while clone
would naturally refer to (writable) head fork.

Regards,
Andrey




On Tue, Nov 30, 2010 at 12:08 AM, Mike Fedyk mfe...@mikefedyk.com wrote:
 On Mon, Nov 29, 2010 at 12:41 PM, David Arendt ad...@prnet.org wrote:
 On 11/29/10 21:02, Mike Fedyk wrote:

 On Mon, Nov 29, 2010 at 12:02 AM, Li Zefanl...@cn.fujitsu.com  wrote:

 (Cc: Sage Weils...@newdream.net  for changes in async snapshots)

 This patchset adds readonly-snapshots support. You can create a
 readonly snapshot, and you can also set a snapshot readonly/writable
 on the fly.

 A few readonly checks are added in setattr, permission, remove_xattr
 and set_xattr callbacks, as well as in some ioctls.

 Great work!

 I have a suggestion on defaults when snapshots are created.  I think
 they should default to being read-only and if they are meant to be
 read-write a flag can be set at creation time (and changable at a
 later time as well of course).

 This way user/admin preconceptions of a snapshot being read-only can
 be enforced by default, and the exception when you want a read-write
 snapshot can be available with a switch at the cli level (and probably
 a flag at the ioctl level).

 It gives one more natural distinction between a snapshot and a
 subvolume at the user conceptual level.

 What do you think?

 I completely agree with you. I think lots of people use snapshots for backup
 purposes and these ones shouldn't be writable.

  by default.
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Default to read-only on snapshot creation and have a flag if snapshot should be writable (was: [PATCH 0/5] btrfs: Readonly snapshots)

2010-11-29 Thread Mike Fedyk
On Mon, Nov 29, 2010 at 1:31 PM, Andrey Kuzmin
andrey.v.kuz...@gmail.com wrote:
 This may sound excessive as any new concept introduction that late in
 development, but readonly/writable snapshots could be further
 differentiated by naming the latter clones. This way end-user would
 naturally perceive snapsot as read-only PIT fs image, while clone
 would naturally refer to (writable) head fork.


I'm not sure we want to take all of the terminology that zfs uses as
it may also bring the percieved drawbacks as well.  Isn't there some
additional overhead for a zfs clone compared to a snapshot?  I'm not
very familiar with zfs so that's why I ask.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Errors during defragmentation

2010-11-29 Thread Andrej Podzimek

Hello,

I decided to test the 'defragment' feature on my system (after a huge number of 
system updates and prelinking):

find /bin /sbin /lib /usr/lib /usr/bin /usr/sbin -type d -exec btrfs 
filesystem defragment '{}' '+'

I have already defragmented a couple of (very large) directories with no errors 
at all, so this was expected to work somehow. Surprisingly, this time there 
were thousands of messages like this:

ioctl failed ondirectory name  ret -1 errno 28


errno 28 is ENOSPC

You've run out of disk space. (Or at least, btrfs thinks so).


Pleased to hear that this is not a fatal error. :-)

The filesystem still has quite a lot of free space. New files can be created. I 
have just tried to add about 10 GB of data, which worked fine. The output from 
'df' indicates that only 71% of the partition (177 GB out of 250 GB) is used.

The built-in df shows similar numbers -- if I understand it well, there is 
plenty of free space left.

# btrfs filesystem df /
Data: total=175.01GB, used=169.57GB
Metadata: total=6.51GB, used=3.64GB
System: total=12.00MB, used=32.00KB

# btrfs filesystem show octopus
failed to read /dev/sdb
failed to read /dev/sr0
Label: 'octopus'  uuid: 8576b57b-b934-424e-9a8a-04abc780c963
Total devices 1 FS bytes used 173.21GB
devid1 size 249.50GB used 188.04GB path /dev/dm-2

Btrfs Btrfs v0.19

Does defragmentation have any unexpected (and not yet documented) free space 
requirements? (Most of the files I was attempting to defragment were smaller 
than 10 MB, as the directory names suggest.)

Is there a workaround for this issue? Or should I just leave the 
defragmentation feature alone for the time being?

Andrej



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [patch] fs: fix deadlocks in writeback_if_idle

2010-11-29 Thread Andrew Morton
On Thu, 25 Nov 2010 14:53:56 +1100
Nick Piggin npig...@kernel.dk wrote:

 On Wed, Nov 24, 2010 at 02:10:28PM +0100, Jan Kara wrote:
  On Wed 24-11-10 12:03:43, Nick Piggin wrote:
For the _nr variant that btrfs uses, it's worse for the filesystems
that don't have a 1:1 bdi-sb mapping.  It might not actually write any
of the pages from the SB that is out of space.
   
   That's true, but it might not write anything anyway (and after we
   check whether writeout is in progress, the writeout thread could go
   to sleep and not do anything anyway).
   
   So it's a pretty hacky interface anyway. If you want to do anything
   deterministic, you obviously need real coupling between producer and
   consumer. This should only be a performance tweak (or a workaround
   hack in worst case).
Yes, the current interface is a band aid for the problem and better
  interface is welcome. But it's not trivial to do better...
  
 It makes no further guarantees, and anyway
 the sb has to compete for normal writeback within this bdi.

 
 I think Christoph is right because filesystems should not really
 know about how bdi writeback queueing works. But I don't know if it's
 worth doing anything more complex for this functionality?

I think we should make a writeback_inodes_sb_unlocked() that doesn't
warn when the semaphore isn't held.  That should be enough given where
btrfs and ext4 are calling it from.
   
   It doesn't solve the bugs -- calling and waiting for writeback is
   still broken because completion requires i_mutex and it is called
   from under i_mutex.
Well, as I wrote in my previous email, only ext4 has the problem with
  i_mutex and I personally view it as a bug. But ultimately it's Ted's call
  to decide.
 
 Well, for now, the easiest and simplest fix is my patch, I think. The
 objection is that we may not write out anything for the specified sb,
 but the current implementation provides no such guarantees at all
 anyway, so I don't think it's a big issue.

Well yes.  We take something which will fail occasionally and with your
patch replace it with something which will fail a bit more often.  Why
don't we go all the way and do something which will fail *even more
often*.  Namely, just delete the damn function in the hope that the
resulting failures will provoke the ext4 crew into doing something sane
this time?

Guys, this delalloc thing *sucks*.  And here we are just sticking new
bandaids on top of the old bandaids.  And the btrfs approach isn't
exactly a thing of glory, either.

So...  nope.  I won't be applying Nick's patch.  Please fix this thing
properly - you have a whole month!
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] fs: fix deadlocks in writeback_if_idle

2010-11-29 Thread Nick Piggin
On Mon, Nov 29, 2010 at 02:26:03PM -0800, Andrew Morton wrote:
   On Thu, 25 Nov 2010 14:53:56 +1100
 Nick Piggin npig...@kernel.dk wrote:
  On Wed, Nov 24, 2010 at 02:10:28PM +0100, Jan Kara wrote:
  Well, for now, the easiest and simplest fix is my patch, I think. The
  objection is that we may not write out anything for the specified sb,
  but the current implementation provides no such guarantees at all
  anyway, so I don't think it's a big issue.
 
 Well yes.  We take something which will fail occasionally and with your
 patch replace it with something which will fail a bit more often.  Why
 don't we go all the way and do something which will fail *even more
 often*.  Namely, just delete the damn function in the hope that the
 resulting failures will provoke the ext4 crew into doing something sane
 this time?

I just need it fixed because the deadlocks are constantly hanging my
tests and/or switching off lockdep.

 
 Guys, this delalloc thing *sucks*.  And here we are just sticking new
 bandaids on top of the old bandaids.  And the btrfs approach isn't
 exactly a thing of glory, either.
 
 So...  nope.  I won't be applying Nick's patch.  Please fix this thing
 properly - you have a whole month!

Testers have less. It would be better to fix it now and rip it out at
the start of the next merge window if you're that way inclined :)

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Default to read-only on snapshot creation and have a flag if snapshot should be writable (was: [PATCH 0/5] btrfs: Readonly snapshots)

2010-11-29 Thread C Anthony Risinger
On Nov 29, 2010, at 3:48 PM, Andrey Kuzmin andrey.v.kuz...@gmail.com
wrote:

 I'm not sure why zfs came up, they don't own the term :). As to
 zfs/overhead topic, I doubt there's any difference between clone and
 writable shapshot (there should be none, of course, it's just two
 different names for the same concept).

 Regards,
 Andrey




 On Tue, Nov 30, 2010 at 12:43 AM, Mike Fedyk mfe...@mikefedyk.com
 wrote:
 On Mon, Nov 29, 2010 at 1:31 PM, Andrey Kuzmin
 andrey.v.kuz...@gmail.com wrote:
 This may sound excessive as any new concept introduction that late
 in
 development, but readonly/writable snapshots could be further
 differentiated by naming the latter clones. This way end-user would
 naturally perceive snapsot as read-only PIT fs image, while clone
 would naturally refer to (writable) head fork.


 I'm not sure we want to take all of the terminology that zfs uses as
 it may also bring the percieved drawbacks as well.  Isn't there some
 additional overhead for a zfs clone compared to a snapshot?  I'm not
 very familiar with zfs so that's why I ask.

 --
 To unsubscribe from this list: send the line unsubscribe linux-
 btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

I don't like the idea of readonly by default, or further changes to
terminology, for several reasons:

) readonly by default offers no real enhancement whatsoever other than
breaking _anything_ that's written right now
) btrfs readonly is not even really readonly; as superuser could
simply flip a flag to enable writes, readonly merely prevents
accidental writes or misbehaving apps... ie. protecting you from
yourself
) backups are the simple/obvious use case; I personally use btrfs
heavily for LXC containers, in which case nearly every single snapshot
is intended to be writable -- usually cloning a template into a new
domain
) I also use an initramfs hook to provide system rollbacks, also
writable; the hook also provides multiple versions of the branch...
all writable
) adding new terms is not a good idea imo; I've already spewed out
many sentences explaining the difference between subvolumes and
snapshots, ie. that there is none... adding another term only adds to
this problem; they each describe the same thing, but differentiate
based on origin or current state, neither of which actually describe
what it _is_-- a new named pointer to a tree, like a git branch -- a
subvolume.

I think a better solution/compromise would be to leave snapshots
writeable by default, since that's more true to what's happening
internally anyway, but maybe introduce a mount option controlling the
default action for that mount point.

C Anthony [mobile]
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] btrfs: Make async snapshot ioctl more generic

2010-11-29 Thread Li Zefan
Goffredo Baroncelli wrote:
 Hi Li,
 
 great work, but I have some suggestions:
 
 On Monday, 29 November, 2010, Li Zefan wrote:
 So we don't have to add new structures as we create more ioctls
 for snapshots.

 Now to create async snapshot, set BTRFS_SNAPSHOT_CREATE_ASYNC bit of
 vol_arg_v2-flags, and then call ioctl(BTRFS_IOCT_SNAP_CREATE_V2).

 Note: this changes the async snapshot ioctl ABI, which was merged
 in .37 merge window, so we have to make this change into .37.

 Signed-off-by: Li Zefan l...@cn.fujitsu.com
 ---
  fs/btrfs/ioctl.c |   34 +-
  fs/btrfs/ioctl.h |   12 
  2 files changed, 29 insertions(+), 17 deletions(-)

 diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
 index 463d91b..d3f1a60 100644
 --- a/fs/btrfs/ioctl.c
 +++ b/fs/btrfs/ioctl.c
 @@ -935,23 +935,31 @@ out:
  
  static noinline int btrfs_ioctl_snap_create(struct file *file,
  void __user *arg, int subvol,
 -int async)
 +bool v2)
 
 This is a aesthetic suggestion: instead of passing a flag called v2 I suggest 
 to add two wrapper functions, which extract the parameters and passes all 
 available parameter to the generic function. Something like:
 

Sure, as long as it won't result in code duplication and can
improve readability.

 static inline btrfs_ioctl_snap_create_v1(struct file *file, 
   void __user *arg, int subvol)
 {
 vol_args = memdup_user(arg, sizeof(*vol_args));
 if (IS_ERR(vol_args))
 return PTR_ERR(vol_args);
 name = vol_args-name;
 fd = vol_args-fd;
 vol_args-name[BTRFS_PATH_NAME_MAX] = '\0';
 
   btrfs_ioctl_snap_create_generic(file, subvol, name, fd, 0, 0);
 
 }
 
 static inline btrfs_ioctl_snap_create_v2(struct file *file, 
   void __user *arg, int subvol,
   )
 {
 vol_args_v2 = memdup_user(arg, sizeof(*vol_args_v2));
 if (IS_ERR(vol_args_v2))
 return PTR_ERR(vol_args_v2);
 
 ret = snap_check_flags(vol_args_v2-flags, true);
 if (ret)
 goto out;
 
 name = vol_args_v2-name;
 fd = vol_args_v2-fd;
 vol_args_v2-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
 if (vol_args_v2-flags  BTRFS_SNAPSHOT_CREATE_ASYNC)
 async = true;
 if (vol_args_v2-flags  BTRFS_SNAPSHOT_RDONLY)
 readonly = true;
 
   btrfs_ioctl_snap_create_generic(file, subvol, name, fd, async,
readonly);
 
 }
 
 and 
 
   case BTRFS_IOC_SNAP_CREATE:
   return btrfs_ioctl_snap_create_v1(file, argp, 0);
   case BTRFS_IOC_SNAP_CREATE_V2:
   return btrfs_ioctl_snap_create_v2(file, argp, 0);
 
 
 Frankly speaking, we could get rid of the subvol parameter adding another 
 wrapper function like btrfs_ioctl_subvol_create( ), but this would be 
 another story :)
 

I thought about that too.

  {
  struct btrfs_ioctl_vol_args *vol_args = NULL;
 -struct btrfs_ioctl_async_vol_args *async_vol_args = NULL;
 +struct btrfs_ioctl_vol_args_v2 *vol_args_v2 = NULL;
  char *name;
  u64 fd;
  u64 transid = 0;
 +bool async = false;
  int ret;
  
 -if (async) {
 -async_vol_args = memdup_user(arg, sizeof(*async_vol_args));
 -if (IS_ERR(async_vol_args))
 -return PTR_ERR(async_vol_args);
 +if (v2) {
 +vol_args_v2 = memdup_user(arg, sizeof(*vol_args_v2));
 +if (IS_ERR(vol_args_v2))
 +return PTR_ERR(vol_args_v2);
  
 -name = async_vol_args-name;
 -fd = async_vol_args-fd;
 -async_vol_args-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
 +if (vol_args_v2-flags  ~BTRFS_SNAPSHOT_CREATE_ASYNC) {
 +ret = -EINVAL;
 +goto out;
 +}
 +
 +name = vol_args_v2-name;
 +fd = vol_args_v2-fd;
 +vol_args_v2-name[BTRFS_SNAPSHOT_NAME_MAX] = '\0';
 +if (vol_args_v2-flags  BTRFS_SNAPSHOT_CREATE_ASYNC)
 +async = true;
  } else {
  vol_args = memdup_user(arg, sizeof(*vol_args));
  if (IS_ERR(vol_args))
 @@ -966,13 +974,13 @@ static noinline int btrfs_ioctl_snap_create(struct 
 file *file,
  
  if (!ret  async) {
  if (copy_to_user(arg +
 -offsetof(struct btrfs_ioctl_async_vol_args,
 +offsetof(struct btrfs_ioctl_vol_args_v2,
  transid), transid, sizeof(transid)))
  return -EFAULT;
  

Re: [RFC PATCH 0/4] Add readonly support to replace BUG_ON phrase

2010-11-29 Thread liubo
On 11/30/2010 04:10 AM, Josef Bacik wrote:
 On Thu, Nov 25, 2010 at 05:52:47PM +0800, Miao Xie wrote:
 Btrfs has a number of BUG_ON()s, which may lead btrfs to unpleasant panic.
 Meanwhile, they are very ugly and should be handled more propriately.

 There are mainly two ways to deal with these BUG_ON()s.

 1. For those errors which can be handled well by callers, we just return 
 their
 error number to callers.

 2. For others, We can force the filesystem readonly when it hits errors, 
 which
  is what this patchset has done. Replaced BUG_ON() with the interface 
 provided
  in this patchset, we will get error infomation via dmesg. Since btrfs is 
 now 
 readonly, we can save our data safely and umount it, then a btrfsck is 
 recommended.

 By these ways, we can protect our filesystem from panic caused by those 
 BUG_ONs.

 ---
  fs/btrfs/ctree.h   |   21 ++
  fs/btrfs/disk-io.c |   23 +++
  fs/btrfs/super.c   |  100 
 ++-
  fs/btrfs/transaction.c |7 +++
  4 files changed, 148 insertions(+), 3 deletions(-)

 
 Overall seems sane, but what about kernels that don't make these checks?  I'm 
 ok
 with well sucks for them as an answer, just want to make sure we've at least
 though about it.

You mean those code that does nothing on ret-checks?

IMO, if the code really needs ret-check, we should deal with them seriously, or 
just
leave it alone. And this is a step-by-step job.

 
 Also I'm not sure marking the fs as broken is the right move here.  Ext3/4 
 don't
 do this, they just mount read-only, as long as you can still unmount the
 filesystem everything comes out ok.  Think of the case where we just get a
 spurious EIO, the fs should be fine the next time around, there's reason to
 force the user to run fsck in this case.
 

Yes, I agree on this.
For spurious EIO, it mainly depends on coders, returning the errno to caller 
may work on 
bypassing fsck.

While I'm working on this readonly stuff, it is difficult to solve the 
potential 
deadlock when we write the super block to disk. 
Since btrfs supports multi-device, before write-super, we must get the device 
lock 
device_list_mutex first, and this has puzzled me a lot.

BTW, I've tried another way to bypass deadlock. I made the write-super stuff 
into umount, 
which can make us free from deadlock, however, while testing this, it seemes 
that umount 
cannot work due to a ext3/4 jbd oops, I'm digging on this oops...

So, any ideas about free from deadlock?

 Thanks,
 
 Josef
 

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Default to read-only on snapshot creation and have a flag if snapshot should be writable (was: [PATCH 0/5] btrfs: Readonly snapshots)

2010-11-29 Thread Li Zefan
C Anthony Risinger wrote:
 On Nov 29, 2010, at 3:48 PM, Andrey Kuzmin andrey.v.kuz...@gmail.com
 wrote:
 
 I'm not sure why zfs came up, they don't own the term :). As to
 zfs/overhead topic, I doubt there's any difference between clone and
 writable shapshot (there should be none, of course, it's just two
 different names for the same concept).

 Regards,
 Andrey




 On Tue, Nov 30, 2010 at 12:43 AM, Mike Fedyk mfe...@mikefedyk.com
 wrote:
 On Mon, Nov 29, 2010 at 1:31 PM, Andrey Kuzmin
 andrey.v.kuz...@gmail.com wrote:
 This may sound excessive as any new concept introduction that late
 in
 development, but readonly/writable snapshots could be further
 differentiated by naming the latter clones. This way end-user would
 naturally perceive snapsot as read-only PIT fs image, while clone
 would naturally refer to (writable) head fork.

 I'm not sure we want to take all of the terminology that zfs uses as
 it may also bring the percieved drawbacks as well.  Isn't there some
 additional overhead for a zfs clone compared to a snapshot?  I'm not
 very familiar with zfs so that's why I ask.

 --
 To unsubscribe from this list: send the line unsubscribe linux-
 btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 I don't like the idea of readonly by default, or further changes to
 terminology, for several reasons:
 

I quite agree with you. LVM2 also defaults to read/write for snapshots.

 ) readonly by default offers no real enhancement whatsoever other than
 breaking _anything_ that's written right now

This was the first thing that came to my mind.

 ) btrfs readonly is not even really readonly; as superuser could
 simply flip a flag to enable writes, readonly merely prevents
 accidental writes or misbehaving apps... ie. protecting you from
 yourself
 ) backups are the simple/obvious use case; I personally use btrfs
 heavily for LXC containers, in which case nearly every single snapshot
 is intended to be writable -- usually cloning a template into a new
 domain
 ) I also use an initramfs hook to provide system rollbacks, also
 writable; the hook also provides multiple versions of the branch...
 all writable
 ) adding new terms is not a good idea imo; I've already spewed out
 many sentences explaining the difference between subvolumes and
 snapshots, ie. that there is none... adding another term only adds to
 this problem; they each describe the same thing, but differentiate
 based on origin or current state, neither of which actually describe
 what it _is_-- a new named pointer to a tree, like a git branch -- a
 subvolume.
 
 I think a better solution/compromise would be to leave snapshots
 writeable by default, since that's more true to what's happening
 internally anyway, but maybe introduce a mount option controlling the
 default action for that mount point.
 
 C Anthony [mobile]
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 0/4] Add readonly support to replace BUG_ON phrase

2010-11-29 Thread Josef Bacik
On Tue, Nov 30, 2010 at 10:03:58AM +0800, liubo wrote:
 On 11/30/2010 04:10 AM, Josef Bacik wrote:
  On Thu, Nov 25, 2010 at 05:52:47PM +0800, Miao Xie wrote:
  Btrfs has a number of BUG_ON()s, which may lead btrfs to unpleasant panic.
  Meanwhile, they are very ugly and should be handled more propriately.
 
  There are mainly two ways to deal with these BUG_ON()s.
 
  1. For those errors which can be handled well by callers, we just return 
  their
  error number to callers.
 
  2. For others, We can force the filesystem readonly when it hits errors, 
  which
   is what this patchset has done. Replaced BUG_ON() with the interface 
  provided
   in this patchset, we will get error infomation via dmesg. Since btrfs is 
  now 
  readonly, we can save our data safely and umount it, then a btrfsck is 
  recommended.
 
  By these ways, we can protect our filesystem from panic caused by those 
  BUG_ONs.
 
  ---
   fs/btrfs/ctree.h   |   21 ++
   fs/btrfs/disk-io.c |   23 +++
   fs/btrfs/super.c   |  100 
  ++-
   fs/btrfs/transaction.c |7 +++
   4 files changed, 148 insertions(+), 3 deletions(-)
 
  
  Overall seems sane, but what about kernels that don't make these checks?  
  I'm ok
  with well sucks for them as an answer, just want to make sure we've at 
  least
  though about it.
 
 You mean those code that does nothing on ret-checks?
 
 IMO, if the code really needs ret-check, we should deal with them seriously, 
 or just
 leave it alone. And this is a step-by-step job.
 

Sorry I mean for older kernels that don't know about these hey your fs is
screwed flags.  It seems like they'll just get ignored, are we sure thats what
we want to happen?  I'm fine with that, but if we don't want that to happen it
may be good to have a incompat flag.

  
  Also I'm not sure marking the fs as broken is the right move here.  Ext3/4 
  don't
  do this, they just mount read-only, as long as you can still unmount the
  filesystem everything comes out ok.  Think of the case where we just get a
  spurious EIO, the fs should be fine the next time around, there's reason to
  force the user to run fsck in this case.
  
 
 Yes, I agree on this.
 For spurious EIO, it mainly depends on coders, returning the errno to caller 
 may work on 
 bypassing fsck.
 

Right I'm worried about the flipping read only stuff being kicked by EIO, which
happens with ext* and could happen with btrfs in the right cases.  I'm not
saying thats wrong, its what should happen, I'm just saying we need to be able
to unmount the filesystem and mount it back up without needing to run an fsck in
between.

 While I'm working on this readonly stuff, it is difficult to solve the 
 potential 
 deadlock when we write the super block to disk. 
 Since btrfs supports multi-device, before write-super, we must get the device 
 lock 
 device_list_mutex first, and this has puzzled me a lot.
 
 BTW, I've tried another way to bypass deadlock. I made the write-super stuff 
 into umount, 
 which can make us free from deadlock, however, while testing this, it seemes 
 that umount 
 cannot work due to a ext3/4 jbd oops, I'm digging on this oops...
 
 So, any ideas about free from deadlock?
 

None :).  The best thing I can think of is do like we're doing with the read
only stuff and only write out the super right before we flip read only, and then
make umount make sure that if we're mounted read only to not do anything.

Truth be told I hate this mark the fs as broken idea.  We don't know if the
error we got means the filesystem is broken (for example the EIO case).  If we
do hit actual corruption maybe it would be good, and in that case we should
write out the super at the point we find that corruption and then flip read only
and have that be the only time we have to worry about writing out the super.

So I guess that's 2 options

1) Ditch the the fs is broken flag.  This makes things nice and simple since
on-disk is already consistent, all we have to do is drop anything thats dirty
and we're home free.

2) Keep the flag, but only worry about writing it out on a case by case basis.
So we have a btrfs_corrupt_fs() function that writes out the super with the
appropriate flag, and then flips the fs read only.  Then we don't have to do
anything special in the common paths, just the normal hey is this fs read
only? things, so for all other cases we can just flip the fs read only and
everything works.

I hope that makes sense, if not feel free to ignore me and just keep doing what
you've been doing :).  Thanks,

Josef
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 0/4] Add readonly support to replace BUG_ON phrase

2010-11-29 Thread liubo
On 11/30/2010 10:30 AM, Josef Bacik wrote:
 On Tue, Nov 30, 2010 at 10:03:58AM +0800, liubo wrote:
 On 11/30/2010 04:10 AM, Josef Bacik wrote:
 On Thu, Nov 25, 2010 at 05:52:47PM +0800, Miao Xie wrote:
 Btrfs has a number of BUG_ON()s, which may lead btrfs to unpleasant panic.
 Meanwhile, they are very ugly and should be handled more propriately.

 There are mainly two ways to deal with these BUG_ON()s.

 1. For those errors which can be handled well by callers, we just return 
 their
 error number to callers.

 2. For others, We can force the filesystem readonly when it hits errors, 
 which
  is what this patchset has done. Replaced BUG_ON() with the interface 
 provided
  in this patchset, we will get error infomation via dmesg. Since btrfs is 
 now 
 readonly, we can save our data safely and umount it, then a btrfsck is 
 recommended.

 By these ways, we can protect our filesystem from panic caused by those 
 BUG_ONs.

 ---
  fs/btrfs/ctree.h   |   21 ++
  fs/btrfs/disk-io.c |   23 +++
  fs/btrfs/super.c   |  100 
 ++-
  fs/btrfs/transaction.c |7 +++
  4 files changed, 148 insertions(+), 3 deletions(-)

 Overall seems sane, but what about kernels that don't make these checks?  
 I'm ok
 with well sucks for them as an answer, just want to make sure we've at 
 least
 though about it.
 You mean those code that does nothing on ret-checks?

 IMO, if the code really needs ret-check, we should deal with them seriously, 
 or just
 leave it alone. And this is a step-by-step job.

 
 Sorry I mean for older kernels that don't know about these hey your fs is
 screwed flags.  It seems like they'll just get ignored, are we sure thats 
 what
 we want to happen?  I'm fine with that, but if we don't want that to happen it
 may be good to have a incompat flag.
 

Ohh, got it, thanks for pointing it out. Will do it later.

 Also I'm not sure marking the fs as broken is the right move here.  Ext3/4 
 don't
 do this, they just mount read-only, as long as you can still unmount the
 filesystem everything comes out ok.  Think of the case where we just get a
 spurious EIO, the fs should be fine the next time around, there's reason to
 force the user to run fsck in this case.

 Yes, I agree on this.
 For spurious EIO, it mainly depends on coders, returning the errno to caller 
 may work on 
 bypassing fsck.

 
 Right I'm worried about the flipping read only stuff being kicked by EIO, 
 which
 happens with ext* and could happen with btrfs in the right cases.  I'm not
 saying thats wrong, its what should happen, I'm just saying we need to be able
 to unmount the filesystem and mount it back up without needing to run an fsck 
 in
 between.
 

hm, this really makes sense. 

Since it is difficult to tell whether a fake corruption it is, what about just 
implementing readonly stuff like this and making it more friendly to EIO in 
future?

 While I'm working on this readonly stuff, it is difficult to solve the 
 potential 
 deadlock when we write the super block to disk. 
 Since btrfs supports multi-device, before write-super, we must get the 
 device lock 
 device_list_mutex first, and this has puzzled me a lot.

 BTW, I've tried another way to bypass deadlock. I made the write-super stuff 
 into umount, 
 which can make us free from deadlock, however, while testing this, it seemes 
 that umount 
 cannot work due to a ext3/4 jbd oops, I'm digging on this oops...

 So, any ideas about free from deadlock?

 
 None :).  The best thing I can think of is do like we're doing with the read
 only stuff and only write out the super right before we flip read only, and 
 then
 make umount make sure that if we're mounted read only to not do anything.
 
 Truth be told I hate this mark the fs as broken idea.  We don't know if the
 error we got means the filesystem is broken (for example the EIO case).  If we
 do hit actual corruption maybe it would be good, and in that case we should
 write out the super at the point we find that corruption and then flip read 
 only
 and have that be the only time we have to worry about writing out the super.
 
 So I guess that's 2 options
 
 1) Ditch the the fs is broken flag.  This makes things nice and simple since
 on-disk is already consistent, all we have to do is drop anything thats dirty
 and we're home free.
 
 2) Keep the flag, but only worry about writing it out on a case by case basis.
 So we have a btrfs_corrupt_fs() function that writes out the super with the
 appropriate flag, and then flips the fs read only.  Then we don't have to do
 anything special in the common paths, just the normal hey is this fs read
 only? things, so for all other cases we can just flip the fs read only and
 everything works.
 

The 2) is what I have just done. :)

 I hope that makes sense, if not feel free to ignore me and just keep doing 
 what
 you've been doing :).  Thanks,
 

They are very helpful.

Thanks,
Liu Bo

 Josef
 --
 To unsubscribe 

R: Re: [PATCH 5/5] btrfs: Add ioctl to set snapshot readonly/writable

2010-11-29 Thread Goffredo Baroncelli kreij...@libero.it
Hi Li,

Messaggio originale
Da: l...@cn.fujitsu.com
Data: 30/11/2010 8.03
A: kreij...@libero.it
Cc: linux-btrfs@vger.kernel.org
Ogg: Re: [PATCH 5/5] btrfs: Add ioctl to set snapshot readonly/writable

Goffredo Baroncelli wrote:
 Hi Li,
 
 On Monday, 29 November, 2010, Li Zefan wrote:
 This allows us to set a snapshot readonly or writable on the fly.

 Usage:

 Set BTRFS_SNAPSHOT_RDONLY/WRITABLE of btrfs_ioctl_vol_arg_v2-flags,
 and then call ioctl(BTRFS_IOCTL_SNAP_SETFLAGS);
 
 I really appreciate your work, but I have some doubt about this interface. 
In 
 particolar:

It's the interface that I would like to be discussed. Thanks!

 - how get the flags of a subvolume ? I suggest to implement a pair of 
ioctls:
  - subvolume_setflags  - get the flags
  - subvolume_getflags  - set the flags
 These ioctls would be more generic (there are a lot of flags which may be 
 interested to put in the root of a subvolume: think about 
 compress/nocompress, (no)datasum...)
 - For the reason abowe, I suggest to replace SNAPSHOT with SUBVOLUME
 - Finally, with a pair of  get/set_flags functions we can avoid the use of 
the 
 flags BTRFS_SNAPSHOT_WRITABLE.
 

There are some reasons that I created this interface:

- set/getflags should set/get root flags which reflect in struct
btrfs_root_item-flags.

- btrfs_root_item-flags was not used at all before this patch, so
(no)compress and (no)datasum is not reflect in -flags.

- _CREATE_ASYNC flag is to create snapshot asynchronously, so it's not
a flag of tree root.

Of course I never mind about  _CREATE_ASYNC to be set in btrfs_root_item-
flags. 
_CREATE_ASYNC is not a snapshot properties but a way of creating a subvolume.
But other flags may make sense to live in btrfs_root_item-flags.
 So I am suggesting to develop a more general 
interface for future improvement. These pair of functions (*_set/get) should 
be use to 
set the subvolume/snapshot properties. And the RDONLY is one of them.

In the detail to set an attributue an user should be:

- get the subvolume flags (ioctl(fd, BTRFS_IOC_SNAP_GETFLAGS) )
- compute the new flags ( flags |= BTRFS_FLAGS_ or flags = 
~BTRFS_FLAG_)
- set the subvolume flags (ioctl(fd, BTRFS_IOC_SNAP_SGETFLAGS) )


- It seems to me there's no user requirement for getflags ioctl to
return _RDONLY/_WRITABLE flags of a tree root?

And how an user/admin can understand that a snapshot/subvolume is readonly ? 
How 


- By suggesting BTRFS_SUBVOL_RDONLY, does it impliy not only snapshot
but also a subvolume can be made readonly?

IIRC a snapshot is a subvolume already filled from the beginning. Why doesn't 
share 
the capability of make a subvolume RO ?

Finally I have another suggestion: make sense to check that the file 
descriptor is referring 
to the root of a subvolume instead of a the tree. I highlight that because the 
other ioctls
suffer  the same problem and confused the user sometime. For example a lot of
people tough that was possible to snapshot a directory, because the ioctl 
doesn't return
any error. But instead of the directory the snapshot was of the full 
subvolume.

Goffredo

[...]

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html