[PATCH v2 1/2] btrfs-progs: refactor check_label()

2013-01-30 Thread Jeff Liu
Refactor check_label().

- Make it static. This is a preparation step: we will remove
btrfslabel.[c|h] and move its functions to utils.[c|h], where check_label()
can pre-check the input label string.
- Fix the label length check: the limit is BTRFS_LABEL_SIZE - 1, not
BTRFS_LABEL_SIZE.
- Drop the check for invalid characters in the label; see the commit below
for details:
  79e0e445fc2365e47fc7f060d5a4445d37e184b8
  btrfs-progs: kill check for /'s in labels.
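
The length rule described above can be sketched in isolation as follows. This is a minimal standalone sketch, not the patch itself: the value 256 for BTRFS_LABEL_SIZE is an assumption made so the snippet compiles on its own, and the assert() precondition is added for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: stand-in value so the sketch is self-contained. */
#define BTRFS_LABEL_SIZE 256

/*
 * Returns 0 if the label plus its terminating NUL fits in a buffer of
 * BTRFS_LABEL_SIZE bytes, -1 if the label is too long.
 * No character filtering is done, so '/' and '\\' are accepted.
 */
static int check_label(const char *input)
{
	size_t len;

	assert(input != NULL);
	len = strlen(input);

	if (len > (size_t)(BTRFS_LABEL_SIZE - 1)) {
		fprintf(stderr, "ERROR: Label %s is too long (max %d)\n",
			input, BTRFS_LABEL_SIZE - 1);
		return -1;
	}

	return 0;
}
```

A label of exactly BTRFS_LABEL_SIZE - 1 characters is accepted; one more character is rejected, because the last byte is needed for the terminating NUL.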

Signed-off-by: Jie Liu jeff@oracle.com
CC: David Sterba dste...@suse.cz
CC: Gene Czarcinski g...@czarc.net

---
 utils.c |   14 --
 utils.h |1 -
 2 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/utils.c b/utils.c
index d59bca3..9dc688a 100644
--- a/utils.c
+++ b/utils.c
@@ -1120,23 +1120,17 @@ char *pretty_sizes(u64 size)
  * Returns:
 	0	if everything is safe and usable
 	-1	if the label is too long
-	-2	if the label contains an invalid character
  */
-int check_label(char *input)
+static int check_label(const char *input)
 {
-	int i;
 	int len = strlen(input);
 
-	if (len > BTRFS_LABEL_SIZE) {
+	if (len > BTRFS_LABEL_SIZE - 1) {
+		fprintf(stderr, "ERROR: Label %s is too long (max %d)\n",
+			input, BTRFS_LABEL_SIZE - 1);
 		return -1;
 	}
 
-	for (i = 0; i < len; i++) {
-		if (input[i] == '/' || input[i] == '\\') {
-			return -2;
-		}
-	}
-
 	return 0;
 }
 
diff --git a/utils.h b/utils.h
index 8750f28..a0b782b 100644
--- a/utils.h
+++ b/utils.h
@@ -42,7 +42,6 @@ int check_mounted_where(int fd, const char *file, char *where, int size,
 int btrfs_device_already_in_root(struct btrfs_root *root, int fd,
 int super_offset);
 char *pretty_sizes(u64 size);
-int check_label(char *input);
 int get_mountpt(char *dev, char *mntpt, size_t size);
 
 int btrfs_scan_block_devices(int run_ioctl);
-- 
1.7.9.5
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/2] btrfs-progs: move btrfslabel.[c|h] stuff to utils.[c|h]

2013-01-30 Thread Jeff Liu
Clean btrfslabel.[c|h] out of the source tree and move those related
functions to utils.[c|h].

Signed-off-by: Jie Liu jeff@oracle.com
CC: David Sterba dste...@suse.cz
CC: Gene Czarcinski g...@czarc.net

---
 Makefile  |4 +-
 btrfslabel.c  |  178 -
 btrfslabel.h  |5 --
 cmds-filesystem.c |1 -
 utils.c   |  130 ++
 utils.h   |2 +
 6 files changed, 134 insertions(+), 186 deletions(-)
 delete mode 100644 btrfslabel.c
 delete mode 100644 btrfslabel.h

diff --git a/Makefile b/Makefile
index 4894903..e54b21e 100644
--- a/Makefile
+++ b/Makefile
@@ -4,8 +4,8 @@ CFLAGS = -g -O1
 objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
  root-tree.o dir-item.o file-item.o inode-item.o \
  inode-map.o crc32c.o rbtree.o extent-cache.o extent_io.o \
- volumes.o utils.o btrfs-list.o btrfslabel.o repair.o \
- send-stream.o send-utils.o qgroup.o
+ volumes.o utils.o btrfs-list.o repair.o send-stream.o \
+ send-utils.o qgroup.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
   cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
   cmds-quota.o cmds-qgroup.o
diff --git a/btrfslabel.c b/btrfslabel.c
deleted file mode 100644
index 2826050..000
--- a/btrfslabel.c
+++ /dev/null
@@ -1,178 +0,0 @@
-/*
- * Copyright (C) 2008 Morey Roof.   All rights reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License v2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this program; if not, write to the
- * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
- * Boston, MA 021110-1307, USA.
- */
-
-#define _GNU_SOURCE
-
-#ifndef __CHECKER__
-#include <sys/ioctl.h>
-#include <sys/mount.h>
-#include "ioctl.h"
-#endif /* __CHECKER__ */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <dirent.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include <linux/fs.h>
-#include <linux/limits.h>
-#include <ctype.h>
-#include "kerncompat.h"
-#include "ctree.h"
-#include "utils.h"
-#include "version.h"
-#include "disk-io.h"
-#include "transaction.h"
-
-#define MOUNTED	1
-#define UNMOUNTED	2
-#define GET_LABEL	3
-#define SET_LABEL	4
-
-static int set_label_unmounted(const char *dev, const char *label)
-{
-	struct btrfs_trans_handle *trans;
-	struct btrfs_root *root;
-	int ret;
-
-	ret = check_mounted(dev);
-	if (ret < 0) {
-		fprintf(stderr, "FATAL: error checking %s mount status\n", dev);
-		return -1;
-	}
-	if (ret > 0) {
-		fprintf(stderr, "ERROR: dev %s is mounted, use mount point\n",
-			dev);
-		return -1;
-	}
-
-	if (strlen(label) > BTRFS_LABEL_SIZE - 1) {
-		fprintf(stderr, "ERROR: Label %s is too long (max %d)\n",
-			label, BTRFS_LABEL_SIZE - 1);
-		return -1;
-	}
-
-	/* Open the super_block at the default location
-	 * and as read-write.
-	 */
-	root = open_ctree(dev, 0, 1);
-	if (!root) /* errors are printed by open_ctree() */
-		return -1;
-
-	trans = btrfs_start_transaction(root, 1);
-	snprintf(root->fs_info->super_copy.label, BTRFS_LABEL_SIZE, "%s",
-		 label);
-	btrfs_commit_transaction(trans, root);
-
-	/* Now we close it since we are done. */
-	close_ctree(root);
-	return 0;
-}
-
-static int set_label_mounted(const char *mount_path, const char *label)
-{
-	int fd;
-
-	fd = open(mount_path, O_RDONLY | O_NOATIME);
-	if (fd < 0) {
-		fprintf(stderr, "ERROR: unable access to '%s'\n", mount_path);
-		return -1;
-	}
-
-	if (ioctl(fd, BTRFS_IOC_SET_FSLABEL, label) < 0) {
-		fprintf(stderr, "ERROR: unable to set label %s\n",
-			strerror(errno));
-		close(fd);
-		return -1;
-	}
-
-	return 0;
-}
-
-static int get_label_unmounted(const char *dev)
-{
-	struct btrfs_root *root;
-	int ret;
-
-	ret = check_mounted(dev);
-	if (ret < 0) {
-		fprintf(stderr, "FATAL: error checking %s mount status\n", dev);
-		return -1;
-	}
-	if (ret > 0) {
-		fprintf(stderr, "ERROR: dev %s is mounted, use mount point\n",
-   

RAID 0 across SSD and HDD

2013-01-30 Thread Roger Binns

I've been unable to find anything definitive about what happens if I use
RAID0 to join an SSD and HDD together with respect to performance
(latency, throughput).  The future is obvious (hot data tracking, using
most appropriate device for the data, data migration).

In my specific case I have a 250GB SSD and a 500GB HDD, and about 250GB of
files (constantly growing).  One message I saw said that new blocks are
allocated on the device with the most free space which implies the SSD
would be virtually unused in my case, except for metadata which would only
be used half the time.
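
To make that arithmetic concrete, here is a toy model of the rule as quoted ("new blocks are allocated on the device with the most free space"). It is only a sketch under that stated assumption: every name is hypothetical, the 1GB chunk size is arbitrary, and it is not the btrfs chunk allocator, whose real behaviour for RAID0 is exactly the open question here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy device: sizes are tracked in whole gigabytes. */
struct toy_dev {
	unsigned long free_gb;
	unsigned long used_gb;
};

/*
 * Place total_gb of data one 1GB chunk at a time, always on the device
 * that currently has the most free space (ties go to the lower index).
 * Stops early if every device is full.
 */
static void toy_fill(struct toy_dev *devs, int ndevs, unsigned long total_gb)
{
	unsigned long placed;
	int i, best;

	assert(devs != NULL && ndevs > 0);

	for (placed = 0; placed < total_gb; placed++) {
		best = 0;
		for (i = 1; i < ndevs; i++)
			if (devs[i].free_gb > devs[best].free_gb)
				best = i;
		if (devs[best].free_gb == 0)
			break;
		devs[best].free_gb--;
		devs[best].used_gb++;
	}
}
```

With a 250GB device and a 500GB device and 250GB of data, this toy model puts every chunk on the larger device: its free space only drops to 250GB after the last chunk, so the smaller device is never chosen.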

At the moment I have two independent filesystems (one per device) and
manually move data files between them using symlinks to keep pathnames the
same.  This requires keeping lots of slop free space on the SSD as well as
administration whenever it runs out of space.

My hope would be overall performance between that of the two devices, and
closer to that of the SSD.

Roger



[PATCH 00/12 v5] Btrfs-progs: add show sub-command for btrfs subvol cli

2013-01-30 Thread Anand Jain
David,

 Please find this patch-set rebased with your integration-20130130 branch.

v4-v5:
Fix a memory leak in the original code ref Patch 12/12
Fix the compiler warning in the patch 9/12

Anand Jain (11):
  Btrfs-progs: move printing subvol list outside of btrfs_list_subvols
  Btrfs-progs: add parent uuid for snapshots
  Btrfs-progs: move struct root_info to btrfs-list.h
  Btrfs-progs: add function btrfs_get_subvol to get root_info of a
subvol
  Btrfs-progs: add method to filter snapshots by parent uuid
  Btrfs-progs: put find_mount_root() in commands.h
  Btrfs-progs: make printing subvol extensible to newer layouts
  Btrfs-progs: make get_subvol_name non cmds-send specific
  Btrfs-progs: add show subcommand to subvol cli
  Btrfs-progs: update btrfs_get_subvol to be inline with resolve_root
ret changes
  Btrfs-progs: Fix a small memory leak in managing the btrfs list filter

Wang Shilong (1):
  Btrfs-progs: filter the deleted subvolumes when listing snapshots

 btrfs-list.c | 239 ---
 btrfs-list.h |  58 +-
 cmds-send.c  |  12 +--
 cmds-subvolume.c | 235 +-
 commands.h   |   4 +
 man/btrfs.8.in   |   6 ++
 6 files changed, 443 insertions(+), 111 deletions(-)

-- 
1.8.1.227.g44fe835



[PATCH 01/12] Btrfs-progs: move printing subvol list outside of btrfs_list_subvols

2013-01-30 Thread Anand Jain
To improve code reuse, it is better to have btrfs_list_subvols
just return the list of subvols without printing it.
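
The split described above (one function that only collects, one wrapper that collects and then prints) can be sketched like this. All toy_* names and the fixed sample data are hypothetical stand-ins, not the btrfs-list.c code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for root_info and root_lookup. */
struct toy_subvol {
	unsigned long long id;
	const char *path;
};

struct toy_lookup {
	struct toy_subvol items[8];
	int count;
};

/* Collect only: fill 'out', print nothing. Returns 0 on success. */
static int toy_list_subvols(struct toy_lookup *out)
{
	assert(out != NULL);

	out->count = 0;
	out->items[out->count++] = (struct toy_subvol){ 256, "home" };
	out->items[out->count++] = (struct toy_subvol){ 257, "home/snap" };
	return 0;
}

/* Wrapper: collect via toy_list_subvols(), then print each entry. */
static int toy_list_subvols_print(FILE *fp)
{
	struct toy_lookup lookup;
	int i, ret;

	assert(fp != NULL);

	ret = toy_list_subvols(&lookup);
	if (ret)
		return ret;

	for (i = 0; i < lookup.count; i++)
		fprintf(fp, "ID %llu path %s\n",
			lookup.items[i].id, lookup.items[i].path);
	return 0;
}
```

Callers that only need the data, such as a later lookup of a single subvolume, can call the collector directly and never touch stdout.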

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c | 28 ++--
 btrfs-list.h |  2 +-
 cmds-subvolume.c |  4 ++--
 3 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index e09ee2d..ab42a33 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -1440,15 +1440,11 @@ static void print_all_volume_info(struct root_lookup *sorted_tree,
 	}
 }
 
-int btrfs_list_subvols(int fd, struct btrfs_list_filter_set *filter_set,
-		       struct btrfs_list_comparer_set *comp_set,
-		       int is_tab_result)
+int btrfs_list_subvols(int fd, struct root_lookup *root_lookup)
 {
-	struct root_lookup root_lookup;
-	struct root_lookup root_sort;
 	int ret;
 
-	ret = __list_subvol_search(fd, &root_lookup);
+	ret = __list_subvol_search(fd, root_lookup);
 	if (ret) {
 		fprintf(stderr, "ERROR: can't perform the search - %s\n",
 				strerror(errno));
@@ -1459,16 +1455,28 @@ int btrfs_list_subvols(int fd, struct btrfs_list_filter_set *filter_set,
 	 * now we have an rbtree full of root_info objects, but we need to fill
 	 * in their path names within the subvol that is referencing each one.
 	 */
-	ret = __list_subvol_fill_paths(fd, &root_lookup);
-	if (ret < 0)
-		return ret;
+	ret = __list_subvol_fill_paths(fd, root_lookup);
+	return ret;
+}
 
+int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
+		       struct btrfs_list_comparer_set *comp_set,
+		       int is_tab_result)
+{
+	struct root_lookup root_lookup;
+	struct root_lookup root_sort;
+	int ret;
+
+	ret = btrfs_list_subvols(fd, &root_lookup);
+	if (ret)
+		return ret;
 	__filter_and_sort_subvol(&root_lookup, &root_sort, filter_set,
 				 comp_set, fd);
 
 	print_all_volume_info(&root_sort, is_tab_result);
 	__free_all_subvolumn(&root_lookup);
-	return ret;
+
+	return 0;
 }
 
 static int print_one_extent(int fd, struct btrfs_ioctl_search_header *sh,
diff --git a/btrfs-list.h b/btrfs-list.h
index cde4b3c..71fe0f3 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -98,7 +98,7 @@ int btrfs_list_setup_comparer(struct btrfs_list_comparer_set **comp_set,
  enum btrfs_list_comp_enum comparer,
  int is_descending);
 
-int btrfs_list_subvols(int fd, struct btrfs_list_filter_set *filter_set,
+int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
   struct btrfs_list_comparer_set *comp_set,
int is_tab_result);
 int btrfs_list_find_updated_files(int fd, u64 root_id, u64 oldest_gen);
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index e3cdb1e..c35dff7 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -406,7 +406,7 @@ static int cmd_subvol_list(int argc, char **argv)
BTRFS_LIST_FILTER_TOPID_EQUAL,
top_id);
 
-   ret = btrfs_list_subvols(fd, filter_set, comparer_set,
+   ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
is_tab_result);
if (ret)
return 19;
@@ -613,7 +613,7 @@ static int cmd_subvol_get_default(int argc, char **argv)
 	btrfs_list_setup_filter(&filter_set, BTRFS_LIST_FILTER_ROOTID,
default_id);
 
-   ret = btrfs_list_subvols(fd, filter_set, NULL, 0);
+   ret = btrfs_list_subvols_print(fd, filter_set, NULL, 0);
if (ret)
return 19;
return 0;
-- 
1.8.1.227.g44fe835



[PATCH 03/12] Btrfs-progs: move struct root_info to btrfs-list.h

2013-01-30 Thread Anand Jain
As we will add more ways to list and manage subvols
and snapshots, it is better to have struct root_info
defined in the header file.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c | 47 ---
 btrfs-list.h | 47 ++-
 2 files changed, 46 insertions(+), 48 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 03a0d02..f41c008 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -46,53 +46,6 @@ struct root_lookup {
struct rb_root root;
 };
 
-/*
- * one of these for each root we find.
- */
-struct root_info {
-   struct rb_node rb_node;
-   struct rb_node sort_node;
-
-   /* this root's id */
-   u64 root_id;
-
-   /* equal the offset of the root's key */
-   u64 root_offset;
-
-   /* flags of the root */
-   u64 flags;
-
-   /* the id of the root that references this one */
-   u64 ref_tree;
-
-   /* the dir id we're in from ref_tree */
-   u64 dir_id;
-
-   u64 top_id;
-
-   /* generation when the root is created or last updated */
-   u64 gen;
-
-   /* creation generation of this root in sec*/
-   u64 ogen;
-
-   /* creation time of this root in sec*/
-   time_t otime;
-
-   u8 uuid[BTRFS_UUID_SIZE];
-   u8 puuid[BTRFS_UUID_SIZE];
-
-   /* path from the subvol we live in to this root, including the
-* root's name.  This is null until we do the extra lookup ioctl.
-*/
-   char *path;
-
-   /* the name of this root in the directory it lives in */
-   char *name;
-
-   char *full_path;
-};
-
 struct {
char*name;
char*column_name;
diff --git a/btrfs-list.h b/btrfs-list.h
index 855e73d..3b7b680 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -18,7 +18,52 @@
 
 #include "kerncompat.h"
 
-struct root_info;
+/*
+ * one of these for each root we find.
+ */
+struct root_info {
+   struct rb_node rb_node;
+   struct rb_node sort_node;
+
+   /* this root's id */
+   u64 root_id;
+
+   /* equal the offset of the root's key */
+   u64 root_offset;
+
+   /* flags of the root */
+   u64 flags;
+
+   /* the id of the root that references this one */
+   u64 ref_tree;
+
+   /* the dir id we're in from ref_tree */
+   u64 dir_id;
+
+   u64 top_id;
+
+   /* generation when the root is created or last updated */
+   u64 gen;
+
+   /* creation generation of this root in sec*/
+   u64 ogen;
+
+   /* creation time of this root in sec*/
+   time_t otime;
+
+   u8 uuid[BTRFS_UUID_SIZE];
+   u8 puuid[BTRFS_UUID_SIZE];
+
+   /* path from the subvol we live in to this root, including the
+* root's name.  This is null until we do the extra lookup ioctl.
+*/
+   char *path;
+
+   /* the name of this root in the directory it lives in */
+   char *name;
+
+   char *full_path;
+};
 
 typedef int (*btrfs_list_filter_func)(struct root_info *, u64);
 typedef int (*btrfs_list_comp_func)(struct root_info *, struct root_info *,
-- 
1.8.1.227.g44fe835



[PATCH 02/12] Btrfs-progs: add parent uuid for snapshots

2013-01-30 Thread Anand Jain
Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c | 34 --
 btrfs-list.h |  1 +
 cmds-subvolume.c |  6 +-
 3 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index ab42a33..03a0d02 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -80,6 +80,7 @@ struct root_info {
time_t otime;
 
u8 uuid[BTRFS_UUID_SIZE];
+   u8 puuid[BTRFS_UUID_SIZE];
 
/* path from the subvol we live in to this root, including the
 * root's name.  This is null until we do the extra lookup ioctl.
@@ -128,6 +129,11 @@ struct {
.need_print = 0,
},
{
+		.name		= "parent_uuid",
+		.column_name	= "Parent UUID",
+		.need_print	= 0,
+	},
+	{
 		.name		= "uuid",
 		.column_name	= "UUID",
 		.need_print	= 0,
@@ -435,7 +441,7 @@ static struct root_info *root_tree_search(struct root_lookup *root_tree,
 static int update_root(struct root_lookup *root_lookup,
   u64 root_id, u64 ref_tree, u64 root_offset, u64 flags,
   u64 dir_id, char *name, int name_len, u64 ogen, u64 gen,
-  time_t ot, void *uuid)
+  time_t ot, void *uuid, void *puuid)
 {
struct root_info *ri;
 
@@ -472,6 +478,8 @@ static int update_root(struct root_lookup *root_lookup,
 		ri->otime = ot;
 	if (uuid)
 		memcpy(ri->uuid, uuid, BTRFS_UUID_SIZE);
+	if (puuid)
+		memcpy(ri->puuid, puuid, BTRFS_UUID_SIZE);
 
return 0;
 }
@@ -489,17 +497,18 @@ static int update_root(struct root_lookup *root_lookup,
  * gen: the current generation of the root
  * ot: the original time(create time) of the root
  * uuid: uuid of the root
+ * puuid: uuid of the root parent if any
  */
 static int add_root(struct root_lookup *root_lookup,
u64 root_id, u64 ref_tree, u64 root_offset, u64 flags,
u64 dir_id, char *name, int name_len, u64 ogen, u64 gen,
-   time_t ot, void *uuid)
+   time_t ot, void *uuid, void *puuid)
 {
struct root_info *ri;
int ret;
 
ret = update_root(root_lookup, root_id, ref_tree, root_offset, flags,
- dir_id, name, name_len, ogen, gen, ot, uuid);
+ dir_id, name, name_len, ogen, gen, ot, uuid, puuid);
if (!ret)
return 0;
 
@@ -537,9 +546,12 @@ static int add_root(struct root_lookup *root_lookup,
 	if (ot)
 		ri->otime = ot;
 
-	if (uuid) 
+	if (uuid)
 		memcpy(ri->uuid, uuid, BTRFS_UUID_SIZE);
 
+	if (puuid)
+		memcpy(ri->puuid, puuid, BTRFS_UUID_SIZE);
+
 	ret = root_tree_insert(root_lookup, ri);
 	if (ret) {
 		printf("failed to insert tree %llu\n", (unsigned long long)root_id);
@@ -1022,6 +1034,7 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
int i;
time_t t;
u8 uuid[BTRFS_UUID_SIZE];
+   u8 puuid[BTRFS_UUID_SIZE];
 
root_lookup_init(root_lookup);
 	memset(&args, 0, sizeof(args));
@@ -1074,7 +1087,7 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 
 				add_root(root_lookup, sh.objectid, sh.offset,
 					 0, 0, dir_id, name, name_len, 0, 0, 0,
-					 NULL);
+					 NULL, NULL);
 			} else if (sh.type == BTRFS_ROOT_ITEM_KEY) {
 				ri = (struct btrfs_root_item *)(args.buf + off);
 				gen = btrfs_root_generation(ri);
@@ -1084,15 +1097,17 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 					t = ri->otime.sec;
 					ogen = btrfs_root_otransid(ri);
 					memcpy(uuid, ri->uuid, BTRFS_UUID_SIZE);
+					memcpy(puuid, ri->parent_uuid, BTRFS_UUID_SIZE);
} else {
t = 0;
ogen = 0;
memset(uuid, 0, BTRFS_UUID_SIZE);
+   memset(puuid, 0, BTRFS_UUID_SIZE);
}
 
add_root(root_lookup, sh.objectid, 0,
 sh.offset, flags, 0, NULL, 0, ogen,
-gen, t, uuid);
+gen, t, uuid, puuid);
}
 
off += sh.len;
@@ -1346,6 +1361,13 @@ static void print_subvolume_column(struct root_info 

[PATCH 05/12] Btrfs-progs: add method to filter snapshots by parent uuid

2013-01-30 Thread Anand Jain
Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c | 6 ++
 btrfs-list.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/btrfs-list.c b/btrfs-list.c
index 0e4b3eb..93d167e 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -1143,6 +1143,11 @@ static int filter_topid_equal(struct root_info *ri, u64 data)
 	return ri->top_id == data;
 }
 
+static int filter_by_parent(struct root_info *ri, u64 data)
+{
+	return !uuid_compare(ri->puuid, (u8 *)data);
+}
+
 static btrfs_list_filter_func all_filter_funcs[] = {
[BTRFS_LIST_FILTER_ROOTID]  = filter_by_rootid,
[BTRFS_LIST_FILTER_SNAPSHOT_ONLY]   = filter_snapshot,
@@ -1154,6 +1159,7 @@ static btrfs_list_filter_func all_filter_funcs[] = {
[BTRFS_LIST_FILTER_CGEN_LESS]   = filter_cgen_less,
[BTRFS_LIST_FILTER_CGEN_EQUAL]  = filter_cgen_equal,
[BTRFS_LIST_FILTER_TOPID_EQUAL] = filter_topid_equal,
+   [BTRFS_LIST_FILTER_BY_PARENT]   = filter_by_parent,
 };
 
 struct btrfs_list_filter_set *btrfs_list_alloc_filter_set(void)
diff --git a/btrfs-list.h b/btrfs-list.h
index 580d4d1..cde7a3f 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -117,6 +117,7 @@ enum btrfs_list_filter_enum {
BTRFS_LIST_FILTER_CGEN_LESS,
BTRFS_LIST_FILTER_CGEN_MORE,
BTRFS_LIST_FILTER_TOPID_EQUAL,
+   BTRFS_LIST_FILTER_BY_PARENT,
BTRFS_LIST_FILTER_MAX,
 };
 
-- 
1.8.1.227.g44fe835



[PATCH 06/12] Btrfs-progs: put find_mount_root() in commands.h

2013-01-30 Thread Anand Jain
find_mount_root() is a useful function, so declare it in a header file.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 commands.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/commands.h b/commands.h
index 61d74d7..1dd6180 100644
--- a/commands.h
+++ b/commands.h
@@ -105,3 +105,6 @@ int cmd_replace(int argc, char **argv);
 
 /* subvolume exported functions */
 int test_issubvolume(char *path);
+
+/* send.c */
+int find_mount_root(const char *path, char **mount_root);
-- 
1.8.1.227.g44fe835



[PATCH 07/12] Btrfs-progs: make printing subvol extensible to newer layouts

2013-01-30 Thread Anand Jain
Currently you can print subvol in a list or table format.
This patch will provide a way to extend this to other formats
like the upcoming raw format.
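
The idea of selecting the output format through a layout value, instead of a boolean, can be sketched as follows. The two layout constants mirror the ones this patch adds; toy_entry, toy_format_entry() and the exact output strings are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define BTRFS_LIST_LAYOUT_DEFAULT	0
#define BTRFS_LIST_LAYOUT_TABLE		1

/* Hypothetical stand-in for struct root_info. */
struct toy_entry {
	unsigned long long id;
	const char *path;
};

/*
 * Format one entry according to 'layout'. Returns the number of
 * characters written (as snprintf does) or -1 for an unknown layout.
 * Supporting a new layout only needs a new case here.
 */
static int toy_format_entry(char *buf, size_t size,
			    const struct toy_entry *e, int layout)
{
	assert(buf != NULL && e != NULL);

	switch (layout) {
	case BTRFS_LIST_LAYOUT_DEFAULT:
		return snprintf(buf, size, "ID %llu path %s", e->id, e->path);
	case BTRFS_LIST_LAYOUT_TABLE:
		return snprintf(buf, size, "%llu\t%s", e->id, e->path);
	default:
		return -1;
	}
}
```

Unlike an is_tab_result flag, an integer layout leaves room for further formats without changing the function signature again.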

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c | 26 +++---
 btrfs-list.h |  3 +++
 cmds-subvolume.c | 23 ---
 3 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 93d167e..20b84ab 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -54,12 +54,12 @@ struct {
 	{
 		.name		= "ID",
 		.column_name	= "ID",
-		.need_print	= 1,
+		.need_print	= 0,
 	},
 	{
 		.name		= "gen",
 		.column_name	= "Gen",
-		.need_print	= 1,
+		.need_print	= 0,
 	},
 	{
 		.name		= "cgen",
@@ -74,7 +74,7 @@ struct {
 	{
 		.name		= "top level",
 		.column_name	= "Top Level",
-		.need_print	= 1,
+		.need_print	= 0,
 	},
 	{
 		.name		= "otime",
@@ -94,7 +94,7 @@ struct {
 	{
 		.name		= "path",
 		.column_name	= "Path",
-		.need_print	= 1,
+		.need_print	= 0,
 	},
 	{
 		.name		= NULL,
@@ -1402,21 +1402,25 @@ static void print_all_volume_info_tab_head()
 }
 
 static void print_all_volume_info(struct root_lookup *sorted_tree,
- int is_tab_result)
+ int layout)
 {
struct rb_node *n;
struct root_info *entry;
 
-   if (is_tab_result)
+   if (layout == BTRFS_LIST_LAYOUT_TABLE)
print_all_volume_info_tab_head();
 
 	n = rb_first(&sorted_tree->root);
while (n) {
entry = rb_entry(n, struct root_info, sort_node);
-   if (is_tab_result)
-   print_single_volume_info_table(entry);
-   else
+   switch (layout) {
+   case BTRFS_LIST_LAYOUT_DEFAULT:
print_single_volume_info_default(entry);
+   break;
+   case BTRFS_LIST_LAYOUT_TABLE:
+   print_single_volume_info_table(entry);
+   break;
+   }
n = rb_next(n);
}
 }
@@ -1442,7 +1446,7 @@ int btrfs_list_subvols(int fd, struct root_lookup *root_lookup)
 
 int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
   struct btrfs_list_comparer_set *comp_set,
-  int is_tab_result)
+  int layout)
 {
struct root_lookup root_lookup;
struct root_lookup root_sort;
@@ -1454,7 +1458,7 @@ int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
 	__filter_and_sort_subvol(&root_lookup, &root_sort, filter_set,
 				 comp_set, fd);
 
-	print_all_volume_info(&root_sort, is_tab_result);
+	print_all_volume_info(&root_sort, layout);
 	__free_all_subvolumn(&root_lookup);
 
return 0;
diff --git a/btrfs-list.h b/btrfs-list.h
index cde7a3f..5b60068 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -18,6 +18,9 @@
 
 #include "kerncompat.h"
 
+#define BTRFS_LIST_LAYOUT_DEFAULT  0
+#define BTRFS_LIST_LAYOUT_TABLE	1
+
 /*
  * one of these for each root we find.
  */
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index a1e6893..bb9629f 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -410,8 +410,18 @@ static int cmd_subvol_list(int argc, char **argv)
BTRFS_LIST_FILTER_TOPID_EQUAL,
top_id);
 
-   ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
-   is_tab_result);
+   /* by default we shall print the following columns*/
+   btrfs_list_setup_print_column(BTRFS_LIST_OBJECTID);
+   btrfs_list_setup_print_column(BTRFS_LIST_GENERATION);
+   btrfs_list_setup_print_column(BTRFS_LIST_TOP_LEVEL);
+   btrfs_list_setup_print_column(BTRFS_LIST_PATH);
+
+   if (is_tab_result)
+   ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
+   BTRFS_LIST_LAYOUT_TABLE);
+   else
+   ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
+   BTRFS_LIST_LAYOUT_DEFAULT);
if (ret)
return 19;
return 0;
@@ -617,7 +627,14 @@ static int cmd_subvol_get_default(int argc, char **argv)
 	btrfs_list_setup_filter(&filter_set, BTRFS_LIST_FILTER_ROOTID,
default_id);
 
-   ret = btrfs_list_subvols_print(fd, filter_set, NULL, 0);
+   /* by default we shall print the following columns*/
+   

[PATCH 09/12] Btrfs-progs: add show subcommand to subvol cli

2013-01-30 Thread Anand Jain
This adds the show sub-command to the btrfs subvol cli
to display detailed information about the given subvol
or snapshot.
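
The raw layout used for this can be pictured as "print only the columns that were switched on, each preceded by an optional prefix". The sketch below illustrates that idea only; the toy_* names, the three-column table and the output format are hypothetical and are not the btrfs-list.c implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical column set, much smaller than the real one. */
enum toy_column { TOY_COL_ID, TOY_COL_GEN, TOY_COL_PATH, TOY_COL_ALL };

struct toy_subvol {
	unsigned long long id;
	unsigned long long gen;
	const char *path;
};

/* One flag per column; nothing is printed until a caller opts in. */
static int toy_need_print[TOY_COL_ALL];

static void toy_setup_print_column(enum toy_column c)
{
	assert(c < TOY_COL_ALL);
	toy_need_print[c] = 1;
}

/*
 * Print the enabled columns of one subvolume on a single line, each
 * preceded by raw_prefix when it is not NULL.
 * Returns the number of columns printed.
 */
static int toy_print_raw(FILE *fp, const struct toy_subvol *s,
			 const char *raw_prefix)
{
	int i, printed = 0;

	assert(fp != NULL && s != NULL);

	for (i = 0; i < TOY_COL_ALL; i++) {
		if (!toy_need_print[i])
			continue;
		if (raw_prefix)
			fprintf(fp, "%s", raw_prefix);
		switch (i) {
		case TOY_COL_ID:
			fprintf(fp, "%llu", s->id);
			break;
		case TOY_COL_GEN:
			fprintf(fp, "%llu", s->gen);
			break;
		case TOY_COL_PATH:
			fprintf(fp, "%s", s->path);
			break;
		}
		printed++;
	}
	fprintf(fp, "\n");
	return printed;
}
```

Because the column flags start out cleared, each command chooses its own columns before printing, which is why the list commands now enable their default columns explicitly.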

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c |  25 +++--
 btrfs-list.h |   3 +-
 cmds-subvolume.c | 155 +--
 man/btrfs.8.in   |   6 +++
 4 files changed, 182 insertions(+), 7 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 20b84ab..545aa15 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -1336,6 +1336,22 @@ static void print_subvolume_column(struct root_info *subv,
 	}
 }
 
+static void print_single_volume_info_raw(struct root_info *subv, char *raw_prefix)
+{
+	int i;
+
+	for (i = 0; i < BTRFS_LIST_ALL; i++) {
+		if (!btrfs_list_columns[i].need_print)
+			continue;
+
+		if (raw_prefix)
+			printf("%s",raw_prefix);
+
+		print_subvolume_column(subv, i);
+	}
+	printf("\n");
+}
+
 static void print_single_volume_info_table(struct root_info *subv)
 {
int i;
@@ -1402,7 +1418,7 @@ static void print_all_volume_info_tab_head()
 }
 
 static void print_all_volume_info(struct root_lookup *sorted_tree,
- int layout)
+ int layout, char *raw_prefix)
 {
struct rb_node *n;
struct root_info *entry;
@@ -1420,6 +1436,9 @@ static void print_all_volume_info(struct root_lookup *sorted_tree,
case BTRFS_LIST_LAYOUT_TABLE:
print_single_volume_info_table(entry);
break;
+   case BTRFS_LIST_LAYOUT_RAW:
+   print_single_volume_info_raw(entry, raw_prefix);
+   break;
}
n = rb_next(n);
}
@@ -1446,7 +1465,7 @@ int btrfs_list_subvols(int fd, struct root_lookup *root_lookup)
 
 int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
   struct btrfs_list_comparer_set *comp_set,
-  int layout)
+  int layout, char *raw_prefix)
 {
struct root_lookup root_lookup;
struct root_lookup root_sort;
@@ -1458,7 +1477,7 @@ int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
 	__filter_and_sort_subvol(&root_lookup, &root_sort, filter_set,
 				 comp_set, fd);
 
-	print_all_volume_info(&root_sort, layout);
+	print_all_volume_info(&root_sort, layout, raw_prefix);
 	__free_all_subvolumn(&root_lookup);
 
return 0;
diff --git a/btrfs-list.h b/btrfs-list.h
index 5b60068..09d35f7 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -20,6 +20,7 @@
 
 #define BTRFS_LIST_LAYOUT_DEFAULT  0
 #define BTRFS_LIST_LAYOUT_TABLE	1
+#define BTRFS_LIST_LAYOUT_RAW  2
 
 /*
  * one of these for each root we find.
@@ -150,7 +151,7 @@ int btrfs_list_setup_comparer(struct btrfs_list_comparer_set **comp_set,
 
 int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
   struct btrfs_list_comparer_set *comp_set,
-   int is_tab_result);
+   int layout, char *raw_prefix);
 int btrfs_list_find_updated_files(int fd, u64 root_id, u64 oldest_gen);
 int btrfs_list_get_default_subvolume(int fd, u64 *default_id);
 char *btrfs_list_path_for_root(int fd, u64 root);
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index bb9629f..9f1d2a4 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -24,6 +24,7 @@
 #include <libgen.h>
 #include <limits.h>
 #include <getopt.h>
+#include <uuid/uuid.h>
 
 #include "kerncompat.h"
 #include "ioctl.h"
@@ -418,10 +419,10 @@ static int cmd_subvol_list(int argc, char **argv)
 
if (is_tab_result)
ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
-   BTRFS_LIST_LAYOUT_TABLE);
+   BTRFS_LIST_LAYOUT_TABLE, NULL);
else
ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
-   BTRFS_LIST_LAYOUT_DEFAULT);
+   BTRFS_LIST_LAYOUT_DEFAULT, NULL);
if (ret)
return 19;
return 0;
@@ -634,7 +635,7 @@ static int cmd_subvol_get_default(int argc, char **argv)
btrfs_list_setup_print_column(BTRFS_LIST_PATH);
 
ret = btrfs_list_subvols_print(fd, filter_set, NULL,
-   BTRFS_LIST_LAYOUT_DEFAULT);
+   BTRFS_LIST_LAYOUT_DEFAULT, NULL);
if (ret)
return 19;
return 0;
@@ -721,6 +722,153 @@ static int cmd_find_new(int argc, char **argv)
return 0;
 }
 
+static const char * const cmd_subvol_show_usage[] = {
+	"btrfs subvolume show <subvol-path>",
+	"Show more information of the subvolume",
+   NULL
+};
+
+static int cmd_subvol_show(int argc, char **argv)
+{
+   struct 

[PATCH 11/12] Btrfs-progs: update btrfs_get_subvol to be inline with resolve_root ret changes

2013-01-30 Thread Anand Jain
Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 69ee3e7..eadfba4 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -1503,19 +1503,24 @@ int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
 
 int btrfs_get_subvol(int fd, struct root_info *the_ri)
 {
-	int ret = -1;
+	int ret = 1, rr;
 	struct root_lookup rl;
 	struct rb_node *rbn;
 	struct root_info *ri;
 	u64 root_id = btrfs_list_get_path_rootid(fd);
 
 	if (btrfs_list_subvols(fd, &rl))
-		return 1;
+		return ret;
 
 	rbn = rb_first(&rl.root);
 	while(rbn) {
 		ri = rb_entry(rbn, struct root_info, rb_node);
-		resolve_root(&rl, ri, root_id);
+		rr = resolve_root(&rl, ri, root_id);
+		if (rr == -ENOENT) {
+			ret = -ENOENT;
+			rbn = rb_next(rbn);
+			continue;
+		}
 		if (!comp_entry_with_rootid(the_ri, ri, 0)) {
 			memcpy(the_ri, ri, offsetof(struct root_info, path));
 			if (ri->path)
-- 
1.8.1.227.g44fe835



[PATCH 10/12] Btrfs-progs: filter the deleted subvolumes when listing snapshots

2013-01-30 Thread Anand Jain
From: Wang Shilong wangsl-f...@cn.fujistu.com

The btrfs snapshot list command stops when it encounters a deleted subvolume.

The problem can arise in two ways:
1. A subvolume deletion has not been committed yet, that is, ROOT_BACKREF has
   been deleted but ROOT_ITEM still exists. The command fails to fill in the
   path of the deleted subvolume because we cannot get the parent fs/file tree.
2. A subvolume may be deleted while we are filling in the paths. For example:
   Fs tree
     |-subv0
         |-subv1

   We may fill in the path of subv1 first; after that, some user deletes subv1
   and subv0, and then we try to fill in the path of subv0. The command fails
   to fill in the path of subv0 because that path can no longer be looked up.
   The command also fails to build the full path of subv1 because we don't
   have the path of subv0.

Since these subvolumes have been deleted, we should filter them out. This
patch does so as follows.

In the 1st case, ->ref_tree of the deleted subvolume is 0.
In the 2nd case, if the error number returned by ioctl() is ENOENT, we set
->ref_tree to 0.
Then, when we build the full path of a subvolume, we check ->ref_tree of the
subvolume and of its parent. If either is 0, we filter the subvolume out.
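
The filtering rule described above can be sketched in isolation. This is a
minimal illustration, not the btrfs-progs code: struct subvol, find_subvol()
and check_subvol_chain() are hypothetical names, a plain array stands in for
the rbtree of roots, and the stop at the caller's top_id that resolve_root()
also performs is left out. FS_TREE_ID is 5, the value of
BTRFS_FS_TREE_OBJECTID.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for btrfs-progs' struct root_info. */
struct subvol {
    uint64_t root_id;   /* id of this subvolume */
    uint64_t ref_tree;  /* id of the parent tree; 0 marks "deleted" */
};

#define FS_TREE_ID 5ULL /* same value as BTRFS_FS_TREE_OBJECTID */

/* Linear search standing in for the rbtree lookup. */
const struct subvol *find_subvol(const struct subvol *all, size_t n,
                                 uint64_t id)
{
    for (size_t i = 0; i < n; i++)
        if (all[i].root_id == id)
            return &all[i];
    return NULL;
}

/*
 * Walk from sv up through its parents. Returns 0 if the chain reaches the
 * top-level tree, or -ENOENT if sv or any ancestor is deleted or missing.
 * The step bound keeps the walk finite even if the input has a cycle.
 */
int check_subvol_chain(const struct subvol *all, size_t n,
                       const struct subvol *sv)
{
    const struct subvol *cur = sv;

    for (size_t steps = 0; cur && steps <= n; steps++) {
        if (!cur->ref_tree)
            return -ENOENT;     /* marked deleted (case 1 or case 2) */
        if (cur->ref_tree == FS_TREE_ID)
            return 0;           /* reached the top */
        cur = find_subvol(all, n, cur->ref_tree);
    }
    return -ENOENT;             /* parent not found in the set */
}
```

A caller would skip every entry for which check_subvol_chain() returns
-ENOENT, which mirrors how __filter_and_sort_subvol() treats the return
value of resolve_root() in this patch.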

Reported-by: Stefan Priebe s.pri...@profihost.ag
Signed-off-by: Wang Shilong wangsl-f...@cn.fujitsu.com
Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c | 42 --
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 545aa15..69ee3e7 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -564,6 +564,12 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri,
while (1) {
char *tmp;
u64 next;
+   /*
+   * ref_tree = 0 indicates the subvolumes
+   * has been deleted.
+   */
+   if (!found->ref_tree)
+   return -ENOENT;
int add_len = strlen(found->path);
 
/* room for / and for null */
@@ -592,6 +598,10 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri,
break;
}
 
+   /*
+   * if the ref_tree = BTRFS_FS_TREE_OBJECTID,
+   * we are at the top
+   */
if (next == BTRFS_FS_TREE_OBJECTID) {
char p[] = "<FS_TREE>";
add_len = strlen(p);
@@ -608,14 +618,12 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri,
}
 
/*
-* if the ref_tree wasn't in our tree of roots, we're
-* at the top
-*/
+   * if the ref_tree wasn't in our tree of roots, the
+   * subvolume was deleted.
+   */
found = root_tree_search(rl, next);
-   if (!found) {
-   ri->top_id = next;
-   break;
-   }
+   if (!found)
+   return -ENOENT;
}
 
ri->full_path = full_path;
@@ -638,6 +646,9 @@ static int lookup_ino_path(int fd, struct root_info *ri)
if (ri->path)
return 0;
 
+   if (!ri->ref_tree)
+   return -ENOENT;
+
memset(&args, 0, sizeof(args));
args.treeid = ri->ref_tree;
args.objectid = ri->dir_id;
@@ -645,6 +656,10 @@ static int lookup_ino_path(int fd, struct root_info *ri)
ret = ioctl(fd, BTRFS_IOC_INO_LOOKUP, &args);
e = errno;
if (ret) {
+   if (e == ENOENT) {
+   ri->ref_tree = 0;
+   return -ENOENT;
+   }
fprintf(stderr, "ERROR: Failed to lookup path for root %llu - %s\n",
(unsigned long long)ri->ref_tree,
strerror(e));
@@ -1255,10 +1270,13 @@ static void __filter_and_sort_subvol(struct root_lookup *all_subvols,
while (n) {
entry = rb_entry(n, struct root_info, rb_node);
 
-   resolve_root(all_subvols, entry, top_id);
+   ret = resolve_root(all_subvols, entry, top_id);
+   if (ret == -ENOENT)
+   goto skip;
ret = filter_root(entry, filter_set);
if (ret)
sort_tree_insert(sort_tree, entry, comp_set);
+skip:
n = rb_prev(n);
}
 }
@@ -1273,7 +1291,7 @@ static int __list_subvol_fill_paths(int fd, struct root_lookup *root_lookup)
int ret;
entry = rb_entry(n, struct root_info, rb_node);
ret = lookup_ino_path(fd, entry);
-   if(ret < 0)
+   if (ret && ret != -ENOENT)
return ret;
n = rb_next(n);
}
@@ -1721,7 +1739,11 @@ char 

[PATCH 12/12] Btrfs-progs: Fix a small memory leak in managing the btrfs list filter

2013-01-30 Thread Anand Jain
Signed-off-by: Anand Jain anand.j...@oracle.com
---
 cmds-subvolume.c | 57 +++-
 1 file changed, 40 insertions(+), 17 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 9f1d2a4..5e51a26 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -303,9 +303,9 @@ static int cmd_subvol_list(int argc, char **argv)
struct btrfs_list_filter_set *filter_set;
struct btrfs_list_comparer_set *comparer_set;
u64 flags = 0;
-   int fd;
+   int fd = -1;
u64 top_id;
-   int ret;
+   int ret = -1, uerr = 0;
int c;
char *subvol;
int is_tab_result = 0;
@@ -356,8 +356,10 @@ static int cmd_subvol_list(int argc, char **argv)
ret = btrfs_list_parse_filter_string(optarg,
filter_set,
BTRFS_LIST_FILTER_GEN);
-   if (ret)
-   usage(cmd_subvol_list_usage);
+   if (ret) {
+   uerr = 1;
+   goto out;
+   }
break;
 
case 'c':
@@ -365,18 +367,23 @@ static int cmd_subvol_list(int argc, char **argv)
ret = btrfs_list_parse_filter_string(optarg,
filter_set,
BTRFS_LIST_FILTER_CGEN);
-   if (ret)
-   usage(cmd_subvol_list_usage);
+   if (ret) {
+   uerr = 1;
+   goto out;
+   }
break;
case 'S':
ret = btrfs_list_parse_sort_string(optarg,
   comparer_set);
-   if (ret)
-   usage(cmd_subvol_list_usage);
+   if (ret) {
+   uerr = 1;
+   goto out;
+   }
break;
 
default:
-   usage(cmd_subvol_list_usage);
+   uerr = 1;
+   goto out;
}
}
 
@@ -384,25 +391,29 @@ static int cmd_subvol_list(int argc, char **argv)
btrfs_list_setup_filter(filter_set, BTRFS_LIST_FILTER_FLAGS,
flags);
 
-   if (check_argc_exact(argc - optind, 1))
-   usage(cmd_subvol_list_usage);
+   if (check_argc_exact(argc - optind, 1)) {
+   uerr = 1;
+   goto out;
+   }
 
subvol = argv[optind];
 
ret = test_issubvolume(subvol);
if (ret < 0) {
fprintf(stderr, "ERROR: error accessing '%s'\n", subvol);
-   return 12;
+   goto out;
}
if (!ret) {
fprintf(stderr, "ERROR: '%s' is not a subvolume\n", subvol);
-   return 13;
+   ret = -1;
+   goto out;
}
 
fd = open_file_or_dir(subvol);
if (fd < 0) {
+   ret = -1;
fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
-   return 12;
+   goto out;
}
 
top_id = btrfs_list_get_path_rootid(fd);
@@ -423,9 +434,16 @@ static int cmd_subvol_list(int argc, char **argv)
else
ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
BTRFS_LIST_LAYOUT_DEFAULT, NULL);
-   if (ret)
-   return 19;
-   return 0;
+
+out:
+   if (filter_set)
+   btrfs_list_free_filter_set(filter_set);
+   if (comparer_set)
+   btrfs_list_free_comparer_set(comparer_set);
+   if (uerr)
+   usage(cmd_subvol_list_usage);
+
+   return ret;
 }
 
 static const char * const cmd_snapshot_usage[] = {
@@ -636,6 +654,9 @@ static int cmd_subvol_get_default(int argc, char **argv)
 
ret = btrfs_list_subvols_print(fd, filter_set, NULL,
BTRFS_LIST_LAYOUT_DEFAULT, NULL);
+
+   if (filter_set)
+   btrfs_list_free_filter_set(filter_set);
if (ret)
return 19;
return 0;
@@ -855,6 +876,8 @@ static int cmd_subvol_show(int argc, char **argv)
free(get_ri.name);
if (get_ri.full_path)
free(get_ri.full_path);
+   if (filter_set)
+   btrfs_list_free_filter_set(filter_set);
 
 out:
if (mntfd >= 0)
-- 
1.8.1.227.g44fe835



[PATCH 08/12] Btrfs-progs: make get_subvol_name non cmds-send specific

2013-01-30 Thread Anand Jain
get_subvol_name is useful outside cmds-send.c as well, so this patch
makes it callable from other files without changing its original
behaviour.
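
How the helper trims the mount point from a path can be shown with a
standalone sketch. subvol_name_from_path() is a hypothetical name; its body
follows the logic of get_subvol_name() in the diff below, with const added
for the illustration. Nothing is validated, so the sketch assumes that
full_path really starts with mnt and is longer than it.

```c
#include <string.h>

/*
 * Return the part of full_path that follows the mount point mnt.
 * The result points into full_path; nothing is allocated.
 */
const char *subvol_name_from_path(const char *mnt, const char *full_path)
{
    size_t len = strlen(mnt);

    if (!len)
        return full_path;   /* no mount prefix to strip */
    if (mnt[len - 1] != '/')
        len += 1;           /* also skip the separating '/' */
    return full_path + len;
}
```

For example, subvol_name_from_path("/mnt/btrfs", "/mnt/btrfs/snap/a") points
at "snap/a", and a trailing slash on the mount point gives the same result.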

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 cmds-send.c | 12 ++--
 commands.h  |  1 +
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/cmds-send.c b/cmds-send.c
index 4a8478d..0ec63a0 100644
--- a/cmds-send.c
+++ b/cmds-send.c
@@ -334,12 +334,12 @@ out:
return ret;
 }
 
-static const char *get_subvol_name(struct btrfs_send *s, const char *full_path)
+char *get_subvol_name(char *mnt, char *full_path)
 {
-   int len = strlen(s->root_path);
+   int len = strlen(mnt);
if (!len)
return full_path;
-   if (s->root_path[len - 1] != '/')
+   if (mnt[len - 1] != '/')
len += 1;
 
return full_path + len;
@@ -454,7 +454,7 @@ int cmd_send_start(int argc, char **argv)
if (ret < 0)
goto out;
 
-   ret = get_root_id(&send, get_subvol_name(&send, subvol),
+   ret = get_root_id(&send, get_subvol_name(send.root_path, subvol),
&root_id);
if (ret < 0) {
fprintf(stderr, "ERROR: could not resolve "
@@ -524,7 +524,7 @@ int cmd_send_start(int argc, char **argv)
 
if (snapshot_parent != NULL) {
ret = get_root_id(&send,
-   get_subvol_name(&send, snapshot_parent),
+   get_subvol_name(send.root_path, snapshot_parent),
&parent_root_id);
if (ret < 0) {
fprintf(stderr, "ERROR: could not resolve root_id "
@@ -583,7 +583,7 @@ int cmd_send_start(int argc, char **argv)
goto out;
}
 
-   ret = get_root_id(&send, get_subvol_name(&send, subvol),
+   ret = get_root_id(&send, get_subvol_name(send.root_path, subvol),
&root_id);
if (ret < 0) {
fprintf(stderr, "ERROR: could not resolve root_id "
diff --git a/commands.h b/commands.h
index 1dd6180..ce0f3b9 100644
--- a/commands.h
+++ b/commands.h
@@ -108,3 +108,4 @@ int test_issubvolume(char *path);
 
 /* send.c */
 int find_mount_root(const char *path, char **mount_root);
+char *get_subvol_name(char *mnt, char *full_path);
-- 
1.8.1.227.g44fe835



[PATCH 04/12] Btrfs-progs: add function btrfs_get_subvol to get root_info of a subvol

2013-01-30 Thread Anand Jain
We need a function which can get the root_info of a given
subvol. This is in preparation to add support for the show
sub-cli.
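
Returning a root_info to the caller means copying a record that owns heap
strings. A minimal sketch of that pattern follows, assuming a cut-down
hypothetical struct info rather than the real struct root_info: the scalar
fields are copied with one memcpy() up to the first pointer member, then each
string is duplicated so that the copy stays valid after the lookup tree is
freed. The approach only works if every owned pointer sits after the plain
fields in the struct.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-down record: plain fields first, owned strings last. */
struct info {
    unsigned long long root_id;
    unsigned long long gen;
    char *path;     /* owned, may be NULL */
    char *name;     /* owned, may be NULL */
};

/* Duplicate s, or return NULL if s is NULL or allocation fails. */
static char *dup_or_null(const char *s)
{
    size_t n;
    char *copy;

    if (!s)
        return NULL;
    n = strlen(s) + 1;
    copy = malloc(n);
    if (copy)
        memcpy(copy, s, n);
    return copy;
}

/* Copy src into dst so that dst owns private copies of the strings. */
void info_copy(struct info *dst, const struct info *src)
{
    /* Plain fields: everything that precedes the first pointer member. */
    memcpy(dst, src, offsetof(struct info, path));
    dst->path = dup_or_null(src->path);
    dst->name = dup_or_null(src->name);
}
```

Every pointer member is assigned explicitly, either a fresh copy or NULL, so
the caller can free() each one unconditionally.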

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c | 38 ++
 btrfs-list.h |  1 +
 2 files changed, 39 insertions(+)

diff --git a/btrfs-list.c b/btrfs-list.c
index f41c008..0e4b3eb 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -1454,6 +1454,44 @@ int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
return 0;
 }
 
+int btrfs_get_subvol(int fd, struct root_info *the_ri)
+{
+   int ret = -1;
+   struct root_lookup rl;
+   struct rb_node *rbn;
+   struct root_info *ri;
+   u64 root_id = btrfs_list_get_path_rootid(fd);
+
+   if (btrfs_list_subvols(fd, &rl))
+   return 1;
+
+   rbn = rb_first(&rl.root);
+   while(rbn) {
+   ri = rb_entry(rbn, struct root_info, rb_node);
+   resolve_root(&rl, ri, root_id);
+   if (!comp_entry_with_rootid(the_ri, ri, 0)) {
+   memcpy(the_ri, ri, offsetof(struct root_info, path));
+   if (ri->path)
+   the_ri->path = strdup(ri->path);
+   else
+   the_ri->path = NULL;
+   if (ri->name)
+   the_ri->name = strdup(ri->name);
+   else
+   the_ri->name = NULL;
+   if (ri->full_path)
+   the_ri->full_path = strdup(ri->full_path);
+   else
+   the_ri->full_path = NULL;
+   ret = 0;
+   break;
+   }
+   rbn = rb_next(rbn);
+   }
+   __free_all_subvolumn(&rl);
+   return ret;
+}
+
 static int print_one_extent(int fd, struct btrfs_ioctl_search_header *sh,
struct btrfs_file_extent_item *item,
u64 found_gen, u64 *cache_dirid,
diff --git a/btrfs-list.h b/btrfs-list.h
index 3b7b680..580d4d1 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -151,3 +151,4 @@ int btrfs_list_find_updated_files(int fd, u64 root_id, u64 oldest_gen);
 int btrfs_list_get_default_subvolume(int fd, u64 *default_id);
 char *btrfs_list_path_for_root(int fd, u64 root);
 u64 btrfs_list_get_path_rootid(int fd);
+int btrfs_get_subvol(int fd, struct root_info *the_ri);
-- 
1.8.1.227.g44fe835



Re: [PATCH 02/12] Btrfs-progs: move printing subvol list outside of btrfs_list_subvols

2013-01-30 Thread Anand Jain



Thanks for the review. Comments accepted. V5 sent out.

Anand


On 01/30/2013 11:27 AM, Wang Shilong wrote:

Hi,

To improve code reuse, it's better to have btrfs_list_subvols
just return the list of subvols without printing it.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
  btrfs-list.c | 28 ++--
  btrfs-list.h |  2 +-
  cmds-subvolume.c |  4 ++--
  3 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index cb42fbc..b404e1d 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -1439,15 +1439,11 @@ static void print_all_volume_info(struct root_lookup 
*sorted_tree,
}
  }

-int btrfs_list_subvols(int fd, struct btrfs_list_filter_set *filter_set,
-  struct btrfs_list_comparer_set *comp_set,
-  int is_tab_result)
+int btrfs_list_subvols(int fd, struct root_lookup *root_lookup)
  {
-   struct root_lookup root_lookup;
-   struct root_lookup root_sort;
int ret;

-   ret = __list_subvol_search(fd, &root_lookup);
+   ret = __list_subvol_search(fd, root_lookup);
if (ret) {
fprintf(stderr, "ERROR: can't perform the search - %s\n",
strerror(errno));
@@ -1458,16 +1454,28 @@ int btrfs_list_subvols(int fd, struct 
btrfs_list_filter_set *filter_set,
 * now we have an rbtree full of root_info objects, but we need to fill
 * in their path names within the subvol that is referencing each one.
 */
-   ret = __list_subvol_fill_paths(fd, &root_lookup);
-   if (ret < 0)
-   return ret;
+   ret = __list_subvol_fill_paths(fd, root_lookup);
+   return ret;
+}

+int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
+  struct btrfs_list_comparer_set *comp_set,
+  int is_tab_result)
+{
+   struct root_lookup root_lookup;
+   struct root_lookup root_sort;
+   int ret;
+
+   ret = btrfs_list_subvols(fd, &root_lookup);
+   if (ret)
+   return ret;
__filter_and_sort_subvol(&root_lookup, &root_sort, filter_set,
 comp_set, fd);

print_all_volume_info(&root_sort, is_tab_result);
__free_all_subvolumn(&root_lookup);

 Here we forgot to free filter_set and comp_set. I hope you can add this
 to your patchset. Maybe you can make it patch 13...

 if (filter_set)
 btrfs_list_free_filter_set(filter_set);
 if (comp_set)
 btrfs_list_free_comparer_set(comp_set);

 Thanks,
 Wang

-   return ret;
+
+   return 0;
  }

  static int print_one_extent(int fd, struct btrfs_ioctl_search_header *sh,
diff --git a/btrfs-list.h b/btrfs-list.h
index cde4b3c..71fe0f3 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -98,7 +98,7 @@ int btrfs_list_setup_comparer(struct btrfs_list_comparer_set 
**comp_set,
  enum btrfs_list_comp_enum comparer,
  int is_descending);

-int btrfs_list_subvols(int fd, struct btrfs_list_filter_set *filter_set,
+int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set *filter_set,
   struct btrfs_list_comparer_set *comp_set,
int is_tab_result);
  int btrfs_list_find_updated_files(int fd, u64 root_id, u64 oldest_gen);
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index e3cdb1e..c35dff7 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -406,7 +406,7 @@ static int cmd_subvol_list(int argc, char **argv)
BTRFS_LIST_FILTER_TOPID_EQUAL,
top_id);

-   ret = btrfs_list_subvols(fd, filter_set, comparer_set,
+   ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
is_tab_result);
if (ret)
return 19;
@@ -613,7 +613,7 @@ static int cmd_subvol_get_default(int argc, char **argv)
btrfs_list_setup_filter(filter_set, BTRFS_LIST_FILTER_ROOTID,
default_id);

-   ret = btrfs_list_subvols(fd, filter_set, NULL, 0);
+   ret = btrfs_list_subvols_print(fd, filter_set, NULL, 0);
if (ret)
return 19;
return 0;




Re: RAID 0 across SSD and HDD

2013-01-30 Thread Hugo Mills
On Wed, Jan 30, 2013 at 01:27:37AM -0800, Roger Binns wrote:
 I've been unable to find anything definitive about what happens if I use
 RAID0 to join an SSD and HDD together with respect to performance
 (latency, throughput).  The future is obvious (hot data tracking, using
 most appropriate device for the data, data migration).
 
 In my specific case I have a 250GB SSD and a 500GB HDD, and about 250GB of
 files (constantly growing).  One message I saw said that new blocks are
 allocated on the device with the most free space which implies the SSD
 would be virtually unused in my case, except for metadata which would only
 be used half the time.

   That would be the case with single mode, not with RAID-0.

   With RAID-0, you'd get data striped equally across all (in this
case, both) the devices, up to the size of the second-largest one, at
which point it'll stop allocating space.

 At the moment I have two independent filesystems (one per device) and
 manually move data files between them using symlinks to keep pathnames the
 same.  This requires keeping lots of slop free space on the SSD as well as
 administration whenever it runs out of space.
 
 My hope would be overall performance between that of the two devices, and
 closer to that of the SSD.

   We don't have any kind of hot-data management yet, but it's on the
list of things we'd like to have at some point.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Some days,  it's just not worth gnawing through the straps. ---   




Re: RAID 0 across SSD and HDD

2013-01-30 Thread Roger Binns
On 30/01/13 02:02, Hugo Mills wrote:
 On Wed, Jan 30, 2013 at 01:27:37AM -0800, Roger Binns wrote:
 In my specific case I have a 250GB SSD and a 500GB HDD, and about
 250GB of files (constantly growing).  One message I saw said that new
 blocks are allocated on the device with the most free space which
 implies the SSD would be virtually unused in my case, except for
 metadata which would only be used half the time.
 
 That would be the case with single mode, not with RAID-0.

Ah, I hadn't realised there was a major difference.

 With RAID-0, you'd get data striped equally across all (in this case,
 both) the devices, up to the size of the second-largest one, at which
 point it'll stop allocating space.

By "stop allocating space" I assume you mean it will return out-of-space
errors, even though there is technically 250GB of unused space.  I presume
there is no way to say that RAID-0 should be used where possible and then
fall back to single for the remaining space.

It looks like my choices are:

* RAID 0 and getting 500GB of usable space, with 50% of the accesses
performing at HDD levels and 50% at SSD levels

* Single and getting 750GB of usable space, with performance and usage
mostly on the HDD
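
The two capacity figures above follow from a simple model, sketched here
under the assumptions stated in this thread: single can use every byte of
every device, and RAID-0 fills each device only up to the size of the
second-largest one. usable_single() and usable_raid0() are hypothetical
helpers for illustration; metadata overhead and chunk granularity are
ignored.

```c
#include <stddef.h>

/* "single" profile in this simplified model: the sum of all device sizes. */
unsigned long long usable_single(const unsigned long long *sizes, size_t n)
{
    unsigned long long total = 0;

    for (size_t i = 0; i < n; i++)
        total += sizes[i];
    return total;
}

/*
 * RAID-0 as described in the thread: each device contributes at most the
 * size of the second-largest device. At least two devices are required.
 */
unsigned long long usable_raid0(const unsigned long long *sizes, size_t n)
{
    unsigned long long largest = 0, second = 0, total = 0;

    if (n < 2)
        return 0;

    /* Find the largest and second-largest sizes in one pass. */
    for (size_t i = 0; i < n; i++) {
        if (sizes[i] > largest) {
            second = largest;
            largest = sizes[i];
        } else if (sizes[i] > second) {
            second = sizes[i];
        }
    }

    /* Cap every device at the second-largest size and add them up. */
    for (size_t i = 0; i < n; i++)
        total += sizes[i] < second ? sizes[i] : second;
    return total;
}
```

With sizes of 250 and 500 (in GB), usable_single() gives 750 and
usable_raid0() gives 500, matching the two options listed above.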

 We don't have any kind of hot-data management yet, but it's on the list
 of things we'd like to have at some point.

I'm happy to wait till it is available.  btrfs has been beneficial to me
in so many other respects (eg checksums, compression, online everything,
not having to deal with LVM and friends).  I was just hoping that joining
an SSD and HDD would be somewhat worthwhile now even if it isn't close to
what hot data will deliver in the future.

Roger


Re: [PATCH 10/12] Btrfs-progs: add show subcommand to subvol cli

2013-01-30 Thread Wang Shilong
Hi,
 This adds the show sub-command to the btrfs subvol cli
 to display detailed information about the given subvol
 or snapshot.

 Signed-off-by: Anand Jain anand.j...@oracle.com
 ---
  btrfs-list.c |  25 +++--
  btrfs-list.h |   3 +-
  cmds-subvolume.c | 155 
 +--
  man/btrfs.8.in   |   6 +++
  4 files changed, 182 insertions(+), 7 deletions(-)

 diff --git a/btrfs-list.c b/btrfs-list.c
 index 656de10..1915ece 100644
 --- a/btrfs-list.c
 +++ b/btrfs-list.c
 @@ -1335,6 +1335,22 @@ static void print_subvolume_column(struct root_info 
 *subv,
   }
  }
  
 +static void print_single_volume_info_raw(struct root_info *subv, char 
 *raw_prefix)
 +{
 + int i;
 +
 + for (i = 0; i  BTRFS_LIST_ALL; i++) {
 + if (!btrfs_list_columns[i].need_print)
 + continue;
 +
 + if (raw_prefix)
 + printf(%s,raw_prefix);
 +
 + print_subvolume_column(subv, i);
 + }
 + printf(\n);
 +}
 +
  static void print_single_volume_info_table(struct root_info *subv)
  {
   int i;
 @@ -1401,7 +1417,7 @@ static void print_all_volume_info_tab_head()
  }
  
  static void print_all_volume_info(struct root_lookup *sorted_tree,
 -   int layout)
 +   int layout, char *raw_prefix)
  {
   struct rb_node *n;
   struct root_info *entry;
 @@ -1419,6 +1435,9 @@ static void print_all_volume_info(struct root_lookup 
 *sorted_tree,
   case BTRFS_LIST_LAYOUT_TABLE:
   print_single_volume_info_table(entry);
   break;
 + case BTRFS_LIST_LAYOUT_RAW:
 + print_single_volume_info_raw(entry, raw_prefix);
 + break;
   }
   n = rb_next(n);
   }
 @@ -1445,7 +1464,7 @@ int btrfs_list_subvols(int fd, struct root_lookup 
 *root_lookup)
  
  int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set 
 *filter_set,
  struct btrfs_list_comparer_set *comp_set,
 -int layout)
 +int layout, char *raw_prefix)
  {
   struct root_lookup root_lookup;
   struct root_lookup root_sort;
 @@ -1457,7 +1476,7 @@ int btrfs_list_subvols_print(int fd, struct 
 btrfs_list_filter_set *filter_set,
   __filter_and_sort_subvol(root_lookup, root_sort, filter_set,
comp_set, fd);
  
 - print_all_volume_info(root_sort, layout);
 + print_all_volume_info(root_sort, layout, raw_prefix);
   __free_all_subvolumn(root_lookup);
  
   return 0;
 diff --git a/btrfs-list.h b/btrfs-list.h
 index 5b60068..09d35f7 100644
 --- a/btrfs-list.h
 +++ b/btrfs-list.h
 @@ -20,6 +20,7 @@
  
  #define BTRFS_LIST_LAYOUT_DEFAULT0
  #define BTRFS_LIST_LAYOUT_TABLE  1
 +#define BTRFS_LIST_LAYOUT_RAW2
  
  /*
   * one of these for each root we find.
 @@ -150,7 +151,7 @@ int btrfs_list_setup_comparer(struct 
 btrfs_list_comparer_set **comp_set,
  
  int btrfs_list_subvols_print(int fd, struct btrfs_list_filter_set 
 *filter_set,
  struct btrfs_list_comparer_set *comp_set,
 - int is_tab_result);
 + int layout, char *raw_prefix);
  int btrfs_list_find_updated_files(int fd, u64 root_id, u64 oldest_gen);
  int btrfs_list_get_default_subvolume(int fd, u64 *default_id);
  char *btrfs_list_path_for_root(int fd, u64 root);
 diff --git a/cmds-subvolume.c b/cmds-subvolume.c
 index bb9629f..6a14c4c 100644
 --- a/cmds-subvolume.c
 +++ b/cmds-subvolume.c
 @@ -24,6 +24,7 @@
  #include libgen.h
  #include limits.h
  #include getopt.h
 +#include uuid/uuid.h
  
  #include kerncompat.h
  #include ioctl.h
 @@ -418,10 +419,10 @@ static int cmd_subvol_list(int argc, char **argv)
  
   if (is_tab_result)
   ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
 - BTRFS_LIST_LAYOUT_TABLE);
 + BTRFS_LIST_LAYOUT_TABLE, NULL);
   else
   ret = btrfs_list_subvols_print(fd, filter_set, comparer_set,
 - BTRFS_LIST_LAYOUT_DEFAULT);
 + BTRFS_LIST_LAYOUT_DEFAULT, NULL);
   if (ret)
   return 19;
   return 0;
 @@ -634,7 +635,7 @@ static int cmd_subvol_get_default(int argc, char **argv)
   btrfs_list_setup_print_column(BTRFS_LIST_PATH);
  
   ret = btrfs_list_subvols_print(fd, filter_set, NULL,
 - BTRFS_LIST_LAYOUT_DEFAULT);
 + BTRFS_LIST_LAYOUT_DEFAULT, NULL);
   if (ret)
   return 19;
   return 0;
 @@ -721,6 +722,153 @@ static int cmd_find_new(int argc, char **argv)
   return 0;
  }
  
 +static const char * const cmd_subvol_show_usage[] = {
 + btrfs subvolume show subvol-path,
 + Show more information of the subvolume,
 + NULL
 +};
 +
 +static int cmd_subvol_show(int argc, 

Fwd: btrfsck and ctree version

2013-01-30 Thread polack christian
I know that the proposed ctree.c file is from the kernel source, whereas
btrfsck is user space only. Since btrfs-next is newer than btrfs-progs,
I was hoping this change would be committed to the user-space
version.

Since this filesystem was created before kernel 3.2, there is no
tree root backup.

I was hoping to use btrfsck to regenerate the csums that are failing
at mount time (Input/output error):

/var/log/messages: btrfs csum failed ino 1048522 off 5124096 csum
1219517398 private 836806197

I didn't find any way to deactivate the csum check with a mount option.

Or, as Chris says, is there a way to regenerate the cache on the block device?

Is there a solution?

Thanks for your responses.

olivier


2013/1/29 Chris Mason chris.ma...@fusionio.com

 On Mon, Jan 28, 2013 at 03:03:08PM -0700, David Sterba wrote:
  On Mon, Jan 28, 2013 at 03:07:13PM +0100, polack christian wrote:
   i did use btrfsck to recover it
   i got the tool from
  
   git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
  
   and i got this error message:
   ...
   Check tree block failed, want=294555648, have=0
   Check tree block failed, want=294559744, have=0
   Check tree block failed, want=294559744, have=0
   btrfsck: ctree.c:1690: leaf_space_used: Assertion `!(data_len  0)' 
   failed.
   Aborted (core dumped)
  
   looking at
  
   git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
 
  but this is a kernel source repository, not progs, I wonder
 
   this error in ctree.c have been corrected by this commit
  
   http://git.kernel.org/?p=linux/kernel/git/josef/btrfs-next.git;a=commit;h=41be1f3b40b87de33cd2e7463dce88596dbdccc4
 
  how this could happen. I have looked at whether the patch silently
  fixes a bug, and I can see nothing wrong now.  How did you verify that the
  patch fixes the fsck problem?

 It sounds much more like the reboot or remount cleared the cache on the
 block device.

 -chris



Re: RAID 0 across SSD and HDD

2013-01-30 Thread Sander
Roger Binns wrote (ao):
 I'm happy to wait till it is available. btrfs has been beneficial to
 me in so many other respects (eg checksums, compression, online
 everything, not having to deal with LVM and friends). I was just
 hoping that joining an SSD and HDD would be somewhat worthwhile now
 even if it isn't close to what hot data will deliver in the future.

Do you know about bcache and EnhanceIO ?

http://bcache.evilpiepirate.org/
and
https://github.com/stec-inc/EnhanceIO

Sander


Re: [PATCH 1/3] Btrfs-progs: move path modification to filters

2013-01-30 Thread Lukáš Czerner
On Thu, 10 Jan 2013, Lukáš Czerner wrote:

 Date: Thu, 10 Jan 2013 13:02:42 +0100 (CET)
 From: Lukáš Czerner lczer...@redhat.com
 To: Lukas Czerner lczer...@redhat.com
 Cc: linux-btrfs@vger.kernel.org, chris.ma...@fusionio.com, cwi...@cwillu.com
 Subject: Re: [PATCH 1/3] Btrfs-progs: move path modification to filters
 
 On Tue, 11 Dec 2012, Lukas Czerner wrote:
 
  Date: Tue, 11 Dec 2012 15:24:58 +0100
  From: Lukas Czerner lczer...@redhat.com
  To: linux-btrfs@vger.kernel.org
  Cc: chris.ma...@fusionio.com, cwi...@cwillu.com,
  Lukas Czerner lczer...@redhat.com
  Subject: [PATCH 1/3] Btrfs-progs: move path modification to filters
  
  Commit 8e8e019e910f20947fea7eff5da40753639d8870 introduces the -a option,
  which lists all subvolumes and distinguishes relative from absolute
  paths by prepending <FS_TREE> to absolute paths.
  
  This commit moves the path modification into the filter code rather than
  doing it during path construction in resolve_root(). This gives us more
  flexibility in formatting path output.
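
The path modification itself is small enough to show on its own. The sketch
below is an illustration rather than the patch's code: prefix_fs_tree() is a
hypothetical helper, and it assumes the prefix string is the literal
<FS_TREE> followed by a slash. It returns a new string instead of replacing
ri->full_path in place, and it copies the terminating NUL explicitly.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated "<FS_TREE>/" + path, or NULL if allocation
 * fails. The caller owns the result and must free() it.
 */
char *prefix_fs_tree(const char *path)
{
    static const char prefix[] = "<FS_TREE>";
    size_t plen = sizeof(prefix) - 1;       /* length without the NUL */
    size_t len = strlen(path);
    char *out = malloc(plen + 1 + len + 1); /* prefix + '/' + path + NUL */

    if (!out)
        return NULL;
    memcpy(out, prefix, plen);
    out[plen] = '/';
    memcpy(out + plen + 1, path, len + 1);  /* len + 1 copies the NUL too */
    return out;
}
```

A filter built on this would apply the prefix only to entries whose top_id
differs from the id of the directory being listed, as filter_full_path()
does in the diff below.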
 
 ping
 
 any comments on this ?
 
 -Lukas

ping

 
  
  Signed-off-by: Lukas Czerner lczer...@redhat.com
  ---
   btrfs-list.c |   32 +++-
   btrfs-list.h |1 +
   cmds-subvolume.c |   11 +--
   man/btrfs.8.in   |3 ++-
   4 files changed, 35 insertions(+), 12 deletions(-)
  
  diff --git a/btrfs-list.c b/btrfs-list.c
  index e5f0f96..77d99f8 100644
  --- a/btrfs-list.c
  +++ b/btrfs-list.c
  @@ -628,15 +628,6 @@ static int resolve_root(struct root_lookup *rl, struct 
  root_info *ri,
  }
   
  if (next == BTRFS_FS_TREE_OBJECTID) {
  -   char p[] = FS_TREE;
  -   add_len = strlen(p);
  -   len = strlen(full_path);
  -   tmp = malloc(len + add_len + 2);
  -   memcpy(tmp + add_len + 1, full_path, len);
  -   tmp[add_len] = '/';
  -   memcpy(tmp, p, add_len);
  -   free(full_path);
  -   full_path = tmp;
  ri-top_id = next;
  break;
  }
  @@ -1176,6 +1167,28 @@ static int filter_topid_equal(struct root_info *ri, 
  u64 data)
  return ri-top_id == data;
   }
   
  +static int filter_full_path(struct root_info *ri, u64 data)
  +{
  +   if (ri-full_path  ri-top_id != data) {
  +   char *tmp;
  +   char p[] = FS_TREE;
  +   int add_len = strlen(p);
  +   int len = strlen(ri-full_path);
  +
  +   tmp = malloc(len + add_len + 2);
  +   if (!tmp) {
  +   fprintf(stderr, memory allocation failed\n);
  +   exit(1);
  +   }
  +   memcpy(tmp + add_len + 1, ri-full_path, len);
  +   tmp[add_len] = '/';
  +   memcpy(tmp, p, add_len);
  +   free(ri-full_path);
  +   ri-full_path = tmp;
  +   }
  +   return 1;
  +}
  +
   static btrfs_list_filter_func all_filter_funcs[] = {
  [BTRFS_LIST_FILTER_ROOTID]  = filter_by_rootid,
  [BTRFS_LIST_FILTER_SNAPSHOT_ONLY]   = filter_snapshot,
  @@ -1187,6 +1200,7 @@ static btrfs_list_filter_func all_filter_funcs[] = {
  [BTRFS_LIST_FILTER_CGEN_LESS]   = filter_cgen_less,
  [BTRFS_LIST_FILTER_CGEN_EQUAL]  = filter_cgen_equal,
  [BTRFS_LIST_FILTER_TOPID_EQUAL] = filter_topid_equal,
  +   [BTRFS_LIST_FILTER_FULL_PATH]   = filter_full_path,
   };
   
   struct btrfs_list_filter_set *btrfs_list_alloc_filter_set(void)
  diff --git a/btrfs-list.h b/btrfs-list.h
  index cde4b3c..f7fbea6 100644
  --- a/btrfs-list.h
  +++ b/btrfs-list.h
  @@ -71,6 +71,7 @@ enum btrfs_list_filter_enum {
  BTRFS_LIST_FILTER_CGEN_LESS,
  BTRFS_LIST_FILTER_CGEN_MORE,
  BTRFS_LIST_FILTER_TOPID_EQUAL,
  +   BTRFS_LIST_FILTER_FULL_PATH,
  BTRFS_LIST_FILTER_MAX,
   };
   
  diff --git a/cmds-subvolume.c b/cmds-subvolume.c
  index ac39f7b..37cb8cc 100644
  --- a/cmds-subvolume.c
  +++ b/cmds-subvolume.c
  @@ -277,7 +277,9 @@ static const char * const cmd_subvol_list_usage[] = {
  List subvolumes (and snapshots),
  ,
  -p   print parent ID,
  -   -a   print all the subvolumes in the filesystem.,
  +   -a   print all the subvolumes in the filesystem and,
  +distinguish absolute and relative path with respect,
  +to the given path,
  -u   print the uuid of subvolumes (and snapshots),
  -t   print the result as a table,
  -s   list snapshots only in the filesystem,
  @@ -400,7 +402,12 @@ static int cmd_subvol_list(int argc, char **argv)
  }
   
  top_id = btrfs_list_get_path_rootid(fd);
  -   if (!is_list_all)
  +
  +   if (is_list_all)
  +   btrfs_list_setup_filter(filter_set,
  +   BTRFS_LIST_FILTER_FULL_PATH,
  +   top_id);
  +   else
  

Poor performance of btrfs. Suspected unidentified btrfs housekeeping process which writes a lot

2013-01-30 Thread Adam Ryczkowski

Welcome,

I've been using btrfs for over 3 months to store my personal data on 
my NAS server. Almost all interactions with files on the server are done 
using the unison synchronizer. After another run of bedup 
(https://github.com/g2p/bedup) on my btrfs volume I experienced a huge 
performance loss with synchronization. What used to take only 15 minutes 
now takes over 3 hours! File browsing is not affected, but it takes 
forever to read the contents of the files!


When I use `iotop -o -d 30` (which measures I/O activity over a 30-second 
interval) I can see:


Total DISK READ:  98.66 K/s | Total DISK WRITE: 826.55 K/s
  TID  PRIO  USER     DISK READ   DISK WRITE  SWAPIN       IO  COMMAND
 4296  be/4  root      3.99 K/s   408.59 K/s  0.00 %  98.64 %  [btrfs-transacti]
 6407  be/4  adam     94.14 K/s     0.00 B/s  0.00 %  85.24 %  unison -server
  311  be/4  root      0.00 B/s     0.00 B/s  0.00 %  58.20 %  [md1_raid6]
  354  be/3  root      0.00 B/s     2.26 K/s  0.00 %  24.29 %  [jbd2/md0-8]
  306  be/4  root      0.00 B/s     0.00 B/s  0.00 %   4.79 %  [md0_raid1]
 1229  be/4  syslog    0.00 B/s   136.15 B/s  0.00 %   0.00 %  rsyslogd -c5
 1744  be/4  root      0.00 B/s   136.15 B/s  0.00 %   0.00 %  console-kit-daemon --no-daemon


I expect no writes at all since the statistics were taken during the 
"Looking for changes" phase. Normally, the `unison -server` process should 
have at least 5 M/s disk read speed. (The block device the btrfs is 
built on has a measured capability of 50 M/s sequential throughput.)


When I pause the `unison -server` process (with htop), the disk activity 
persists for another 5-30 seconds, so I infer that btrfs is doing 
some housekeeping work, and this is the reason I decided to post to 
this list. I suspect that this housekeeping work has a time 
granularity of 5-30 seconds, and during this time access to the 
filesystem is delayed. The problem is not specific to unison. This 
background process is triggered by just reading the file contents. Once 
the system is through and the file is read, then all subsequent attempts 
to read it are fine, even if I drop the cache (i.e. echo 3 > 
/proc/sys/vm/drop_caches). But after a while (after reboot) the 
performance hit recurs.


The questions are:
1. What sort of work is btrfs doing?
What is it writing (and why is it writing 100x more bytes than it reads)?
2. Why does it take so long?
3. What can I do to speed up the process?
4. What can I do to prevent it from happening again?

Here are details about my system that might help you with the diagnosis. 
If it is not enough,


I suspect it has something to do with snapshots I make for backup. I 
have 35 of them, and I ask bedup to find duplicates across all 
subvolumes. But on the other hand it is supposed to work since kernel 
3.5, and the filesystem has never seen kernel older than 3.6.


My filesystem /dev/vg-adama-docs/lv-adama-docs is 372GB in size, and is 
quite a complex setup:
It is based on an LVM2 logical volume, which has a single physical volume 
backed by the dm-crypt device /dev/dm-1, which in turn sits on top of 
/dev/md1, a Linux RAID 6 array built from 4 identical 186GB GPT 
partitions on each of my SATA 3TB hard drives.


There are 272k files on the system (excluding 35 snapshots), 23k folders 
and 104 GB data.

$ df /mnt/adama-docs -h
Filesystem                                   Size  Used Avail Use% Mounted on
/dev/mapper/vg--adama--docs-lv--adama--docs  373G   85G  288G  23% /mnt/adama-docs


I have always used the latest kernel (it's 3.7.1-030701-generic at the 
moment) on my Ubuntu Quantal server.


--

Adam Ryczkowski
Skype: sisteczko

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: fix a deadlock on chunk mutex

2013-01-30 Thread Josef Bacik
On Tue, Jan 29, 2013 at 04:05:17PM -0700, Jim Schutt wrote:
 On 01/29/2013 01:04 PM, Josef Bacik wrote:
  On Tue, Jan 29, 2013 at 11:41:10AM -0700, Jim Schutt wrote:
   On 01/28/2013 02:23 PM, Josef Bacik wrote:
On Thu, Jan 03, 2013 at 11:44:46AM -0700, Jim Schutt wrote:
Hi Josef,
   
Thanks for the patch - sorry for the long delay in testing...
   

Jim,

I've been trying to reason out how this happens, could you do a btrfs fi df on
the filesystem that's giving you trouble so I can see if what I think is
happening is what's actually happening.  Thanks,
   
   Here's an example, using a slightly different kernel than
   my previous report.  It's your btrfs-next master branch
   (commit 8f139e59d5 "Btrfs: use bit operation for ->fs_state")
   with ceph 3.8 for-linus (commit 0fa6ebc600 from linus' tree).
   
   
   Here I'm finding the file system in question:
   
   # ls -l /dev/mapper | grep dm-93
   lrwxrwxrwx 1 root root   8 Jan 29 11:13 cs53s19p2 -> ../dm-93
   
   # df -h | grep -A 1 cs53s19p2
   /dev/mapper/cs53s19p2
 896G  1.1G  896G   1% /ram/mnt/ceph/data.osd.522
   
   
   Here's the info you asked for:
   
   # btrfs fi df /ram/mnt/ceph/data.osd.522
   Data: total=2.01GB, used=1.00GB
   System: total=4.00MB, used=64.00KB
   Metadata: total=8.00MB, used=7.56MB
   
  How big is the disk you are using, and what mount options?  I have a patch to
  keep the panic from happening and hopefully the abort, could you try this?  I
  still want to keep the underlying error from happening because it shouldn't be,
  but no reason I can't fix the error case while you can easily reproduce it :).
  Thanks,
  
  Josef
  
 From c50b725c74c7d39064e553ef85ac9753efbd8aec Mon Sep 17 00:00:00 2001
  From: Josef Bacik jba...@fusionio.com
  Date: Tue, 29 Jan 2013 15:03:37 -0500
  Subject: [PATCH] Btrfs: fix chunk allocation error handling
  
  If we error out allocating a dev extent we will have already created the
  block group and such which will cause problems since the allocator may have
  tried to allocate out of the block group that no longer exists.  This will
  cause BUG_ON()'s in the bio submission path.  This also makes a failure to
  allocate a dev extent a non-abort error, we will just clean up the dev
  extents we did allocate and exit.  Now if we fail to delete the dev extents
  we will abort since we can't have half of the dev extents hanging around,
  but this will make us much less likely to abort.  Thanks,
  
  Signed-off-by: Josef Bacik jba...@fusionio.com
  ---
 
 Interesting - with your patch applied I triggered the following, just
 bringing up a fresh Ceph filesystem - I didn't even get a chance to
 mount it on my Ceph clients:
 

Well that makes me a sad panda, but hey it didn't panic this time.  What
workload are you running on this fs/ceph cluster?  Thanks,

Josef


Re: [PATCH v2] Btrfs: fix race between snapshot deletion and getting inode

2013-01-30 Thread Mitch Harder
On Mon, Jan 28, 2013 at 9:52 PM, Chris Mason chris.ma...@fusionio.com wrote:
 On Mon, Jan 28, 2013 at 08:22:10PM -0700, Liu Bo wrote:
 While running snapshot testscript created by Mitch and David,
 the race between autodefrag and snapshot deletion can lead to
 corruption of dead_root list so that we can get crash on
 btrfs_clean_old_snapshots().

 Really nice.  Thanks to everyone that hashed this out.

 -chris

I've been testing "[PATCH v2] Btrfs: fix race between snapshot
deletion and getting inode" along with "[PATCH v6] Btrfs:
snapshot-aware defrag" using the same workflow that was reproducing
the dead_root list corruptions.

I've been unable to reproduce the error in ~24 hours of testing.

Normally, I'd hit the error within an hour of testing on a single run.
 I've made three separate runs, and let the last run proceed
overnight.

I'll keep using these patches, and let you know if anything turns up.

Thanks for all your work on this patch set.


Re: [PATCH] Btrfs: fix a deadlock on chunk mutex

2013-01-30 Thread Josef Bacik
On Tue, Jan 29, 2013 at 04:05:17PM -0700, Jim Schutt wrote:
 Interesting - with your patch applied I triggered the following, just
 bringing up a fresh Ceph filesystem - I didn't even get a chance to
 mount it on my Ceph clients:
 

Actually nevermind it looks like I figured out how to reproduce.  I'll let you
know when I have something to test.  Thanks,

Josef


Re: [PATCH] Btrfs: fix a deadlock on chunk mutex

2013-01-30 Thread Josef Bacik
On Tue, Jan 29, 2013 at 04:05:17PM -0700, Jim Schutt wrote:
 Interesting - with your patch applied I triggered the following, just
 bringing up a fresh Ceph filesystem - I didn't even get a chance to
 mount it on my Ceph clients:
 

Ok can you give this patch a whirl as well?  It seems to fix the problem for me.
Thanks,

Josef

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index dca5679..874bcf2 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3677,8 +3677,18 @@ static int can_overcommit(struct btrfs_root *root,
 	u64 used;
 
 	used = space_info->bytes_used + space_info->bytes_reserved +
-		space_info->bytes_pinned + space_info->bytes_readonly +
-		space_info->bytes_may_use;
+		space_info->bytes_pinned + space_info->bytes_readonly;
+
+	/*
+	 * We only want to allow over committing if we have lots of actual space
+	 * free, but if we've tied up more than 80% of the space with actual
+	 * space reservation (not including bytes we _might_ use) then don't
+	 * allow overcommitting as it will just make things go badly for us.
+	 */
+	if (used > div_factor(space_info->total_bytes, 8))
+		return 0;
+
+	used += space_info->bytes_may_use;
 
 	spin_lock(&root->fs_info->free_chunk_lock);
 	avail = root->fs_info->free_chunk_space;
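
For anyone following the hunk above outside the kernel tree, here is a
minimal user-space sketch of the threshold check it adds. struct
space_counters and hard_commit_over_limit() are hypothetical names made up
for illustration, not kernel symbols, and div_factor() is written on the
assumption that it scales by factor/10, which is what the "80%" in the
comment implies. Treat it as a sketch, not the kernel code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the counters the patch reads from space_info. */
struct space_counters {
	uint64_t total_bytes;
	uint64_t bytes_used;
	uint64_t bytes_reserved;
	uint64_t bytes_pinned;
	uint64_t bytes_readonly;
	uint64_t bytes_may_use;
};

/* Assumed behaviour: scale num by factor/10, so factor 8 means 80%. */
uint64_t div_factor(uint64_t num, int factor)
{
	if (factor == 10)
		return num;
	return num * (uint64_t)factor / 10;
}

/*
 * Returns 1 when the hard commitments alone (everything except
 * bytes_may_use) already exceed 80% of the space, which is the case
 * where the patch refuses to overcommit. Otherwise returns 0 and
 * stores the full usage figure, including bytes_may_use, in *used_out.
 */
int hard_commit_over_limit(const struct space_counters *s, uint64_t *used_out)
{
	uint64_t used = s->bytes_used + s->bytes_reserved +
			s->bytes_pinned + s->bytes_readonly;

	if (used > div_factor(s->total_bytes, 8))
		return 1;

	*used_out = used + s->bytes_may_use;
	return 0;
}
```

If the "btrfs fi df" metadata figures quoted earlier in this thread
(total=8.00MB, used=7.56MB) map onto total_bytes and bytes_used, used bytes
alone are roughly 94% of the total, so this check would return 1 for that
space_info.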


Re: [PATCH 1/3] Btrfs-progs: move path modification to filters

2013-01-30 Thread Gene Czarcinski
Ignoring for the moment whether these patches are a good idea or not, 
what is the base upon which these patches were built?  You might want to 
consider rebasing them onto David Sterba's integration-20130130.


Gene

On 01/30/2013 08:32 AM, Lukáš Czerner wrote:

On Thu, 10 Jan 2013, Lukáš Czerner wrote:


Date: Thu, 10 Jan 2013 13:02:42 +0100 (CET)
From: Lukáš Czerner lczer...@redhat.com
To: Lukas Czerner lczer...@redhat.com
Cc: linux-btrfs@vger.kernel.org, chris.ma...@fusionio.com, cwi...@cwillu.com
Subject: Re: [PATCH 1/3] Btrfs-progs: move path modification to filters

On Tue, 11 Dec 2012, Lukas Czerner wrote:


Date: Tue, 11 Dec 2012 15:24:58 +0100
From: Lukas Czerner lczer...@redhat.com
To: linux-btrfs@vger.kernel.org
Cc: chris.ma...@fusionio.com, cwi...@cwillu.com,
 Lukas Czerner lczer...@redhat.com
Subject: [PATCH 1/3] Btrfs-progs: move path modification to filters

Commit 8e8e019e910f20947fea7eff5da40753639d8870 introduces the -a option,
which lists all subvolumes and distinguishes between relative and
absolute paths by prepending <FS_TREE> to absolute paths.

This commit moves the path modification into filter code rather than
doing it during path construction in resolve_root(). This gives us more
flexibility in formatting path output.
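
To illustrate what the path modification does to a single path, here is a
minimal standalone sketch. The helper name prefix_fs_tree() is hypothetical
and not part of btrfs-progs, and the prefix string is assumed to be
"<FS_TREE>". The sketch copies the terminating NUL explicitly so the result
is always a valid C string.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative helper (not part of btrfs-progs): return a newly
 * allocated "<FS_TREE>/" + path string, or NULL if allocation fails.
 * The caller frees the result.
 */
char *prefix_fs_tree(const char *path)
{
	static const char prefix[] = "<FS_TREE>";
	size_t add_len = strlen(prefix);
	size_t len = strlen(path);
	/* prefix + '/' + path + terminating NUL */
	char *out = malloc(add_len + 1 + len + 1);

	if (!out)
		return NULL;

	memcpy(out, prefix, add_len);
	out[add_len] = '/';
	/* len + 1 so that the NUL terminator is copied as well */
	memcpy(out + add_len + 1, path, len + 1);
	return out;
}
```

Returning NULL instead of calling exit() is a choice made for the sketch
only, so that the caller decides how to handle allocation failure.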

ping

any comments on this ?

-Lukas

ping


Signed-off-by: Lukas Czerner lczer...@redhat.com
---
  btrfs-list.c |   32 +++-
  btrfs-list.h |1 +
  cmds-subvolume.c |   11 +--
  man/btrfs.8.in   |3 ++-
  4 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index e5f0f96..77d99f8 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -628,15 +628,6 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri,
 		}
 
 		if (next == BTRFS_FS_TREE_OBJECTID) {
-			char p[] = "<FS_TREE>";
-			add_len = strlen(p);
-			len = strlen(full_path);
-			tmp = malloc(len + add_len + 2);
-			memcpy(tmp + add_len + 1, full_path, len);
-			tmp[add_len] = '/';
-			memcpy(tmp, p, add_len);
-			free(full_path);
-			full_path = tmp;
 			ri->top_id = next;
 			break;
 		}
@@ -1176,6 +1167,28 @@ static int filter_topid_equal(struct root_info *ri, u64 data)
 	return ri->top_id == data;
 }
 
+static int filter_full_path(struct root_info *ri, u64 data)
+{
+	if (ri->full_path && ri->top_id != data) {
+		char *tmp;
+		char p[] = "<FS_TREE>";
+		int add_len = strlen(p);
+		int len = strlen(ri->full_path);
+
+		tmp = malloc(len + add_len + 2);
+		if (!tmp) {
+			fprintf(stderr, "memory allocation failed\n");
+			exit(1);
+		}
+		memcpy(tmp + add_len + 1, ri->full_path, len);
+		tmp[add_len] = '/';
+		memcpy(tmp, p, add_len);
+		free(ri->full_path);
+		ri->full_path = tmp;
+	}
+	return 1;
+}
+
  static btrfs_list_filter_func all_filter_funcs[] = {
[BTRFS_LIST_FILTER_ROOTID]  = filter_by_rootid,
[BTRFS_LIST_FILTER_SNAPSHOT_ONLY]   = filter_snapshot,
@@ -1187,6 +1200,7 @@ static btrfs_list_filter_func all_filter_funcs[] = {
[BTRFS_LIST_FILTER_CGEN_LESS]   = filter_cgen_less,
[BTRFS_LIST_FILTER_CGEN_EQUAL]  = filter_cgen_equal,
[BTRFS_LIST_FILTER_TOPID_EQUAL] = filter_topid_equal,
+   [BTRFS_LIST_FILTER_FULL_PATH]   = filter_full_path,
  };
  
  struct btrfs_list_filter_set *btrfs_list_alloc_filter_set(void)

diff --git a/btrfs-list.h b/btrfs-list.h
index cde4b3c..f7fbea6 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -71,6 +71,7 @@ enum btrfs_list_filter_enum {
BTRFS_LIST_FILTER_CGEN_LESS,
BTRFS_LIST_FILTER_CGEN_MORE,
BTRFS_LIST_FILTER_TOPID_EQUAL,
+   BTRFS_LIST_FILTER_FULL_PATH,
BTRFS_LIST_FILTER_MAX,
  };
  
diff --git a/cmds-subvolume.c b/cmds-subvolume.c

index ac39f7b..37cb8cc 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -277,7 +277,9 @@ static const char * const cmd_subvol_list_usage[] = {
 	"List subvolumes (and snapshots)",
 	"",
 	"-p   print parent ID",
-	"-a   print all the subvolumes in the filesystem.",
+	"-a   print all the subvolumes in the filesystem and",
+	"     distinguish absolute and relative path with respect",
+	"     to the given path",
 	"-u   print the uuid of subvolumes (and snapshots)",
 	"-t   print the result as a table",
 	"-s   list snapshots only in the filesystem",
@@ -400,7 +402,12 @@ static int cmd_subvol_list(int argc, char **argv)
}
  
  	top_id = btrfs_list_get_path_rootid(fd

Re: corrupted file size on inline extent conversion?

2013-01-30 Thread Mike Lowe
Well I found this, so I think it's likely:

root@gwboss2:~# dmesg |grep bitten
[ 3196.193238] this would have bitten us in the ass
[ 3196.193784] this would have bitten us in the ass

On Jan 29, 2013, at 9:54 AM, Josef Bacik jba...@fusionio.com wrote:

 On Mon, Jan 28, 2013 at 05:12:12PM -0700, Sage Weil wrote:
 A ceph user observed an incorrect i_size on btrfs.  The pattern looks like 
 this:
 
 - some writes at low file offsets
 - a write to 4185600 len 8704 (i_size should be 4MB)
 - more writes to low offsets
 - a write to 4181504 len 4096 (abuts the write above)
 - a bit of time goes by...
 - stat returns 4186112 (4MB - 8192)
 - that's a few bytes to the right of the top write above.
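
The offsets in the list above can be checked with a few lines of C. This is
only a sketch of the arithmetic: struct write_op and expected_i_size() are
hypothetical names for illustration, and the model simply assumes that the
file size after a set of writes is the highest end offset.

```c
#include <stddef.h>
#include <stdint.h>

/* One write, described by its starting offset and length in bytes. */
struct write_op {
	uint64_t offset;
	uint64_t len;
};

/* Expected file size after a set of writes: the highest end offset seen. */
uint64_t expected_i_size(const struct write_op *w, size_t n)
{
	uint64_t size = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t end = w[i].offset + w[i].len;

		if (end > size)
			size = end;
	}
	return size;
}
```

For the two high-offset writes in the report, {4185600, 8704} and
{4181504, 4096}, this gives 4194304 (4MB). The second write ends at
4185600, exactly where the first one starts. The size that stat reported,
4186112, is 8192 bytes short of 4MB and 512 bytes past the start of the
8704-byte write.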
 
 There are some logs showing the full read/write activity to the file at
 
  http://tracker.newdream.net/attachments/658/object_log.txt
 
 on issue
 
  http://tracker.newdream.net/issues/3810
 
 The kernel was 3.7.0-030700-generic (and probably also observed on 3.7.1).
 
 Is this a known bug?
 
 Not known but I took a long hard look at our ordered i size updating and I think
 I spotted the bug.  Could you run this patch and see if you get the printk?  If
 you do then that was the problem and you should be good to go.  It definitely
 needs to be fixed, hopefully it's also your bug.  Thanks,
 
 Josef
 
 
 diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
 index cbd4838..dbd4905 100644
 --- a/fs/btrfs/ordered-data.c
 +++ b/fs/btrfs/ordered-data.c
 @@ -895,8 +895,14 @@ int btrfs_ordered_update_i_size(struct inode *inode, u64 offset,
 	 * if the disk i_size is already at the inode->i_size, or
 	 * this ordered extent is inside the disk i_size, we're done
 	 */
 -	if (disk_i_size == i_size || offset <= disk_i_size) {
 +	if (disk_i_size == i_size)
 		goto out;
 +
 +	if (offset <= disk_i_size) {
 +		if (ordered && ordered->outstanding_isize > disk_i_size)
 +			printk(KERN_ERR "this would have bitten us in the ass\n");
 +		else
 +			goto out;
 	}
 
 	/*



Re: corrupted file size on inline extent conversion?

2013-01-30 Thread Josef Bacik
On Wed, Jan 30, 2013 at 11:17:25AM -0700, Mike Lowe wrote:
 Well I found this, so I think it's likely:
 
 root@gwboss2:~# dmesg |grep bitten
 [ 3196.193238] this would have bitten us in the ass
 [ 3196.193784] this would have bitten us in the ass
 

Well that makes me happy since I had almost talked myself out of this being a
possibility.  How long did it take you to hit this problem before and how long
have you been running with this patch?  Thanks,

Josef


btrfs error -5

2013-01-30 Thread David Merris
Hi,

I had an error the other day, and I either fail at Google, or Google
has nothing helpful to give me, so I thought I would ask here if
anyone had any information that will help me figure out what has gone
wrong and, hopefully, how to fix it. (Either a "go here and ask" or
answers through here.) I am totally willing to live with "Well, you're
hosed, hope you had a backup!" (It was working on that, but hadn't
finished yet. That'll teach me to not keep a closer eye on the
device.)

The other day I noticed my CrashPlan clients could no longer back up to
my fileserver. I didn't think too much of it, and attempted to restart
CrashPlan. That spit out an error that I didn't think to write down,
but being that it uses Java, I shrugged it off, and asked the machine
to reboot. A bit later, I noticed that the machine still was not
responding, fired up the console in vSphere and saw the error:
btrfs: could not do orphan cleanup -5
btrfs: open_ctree failed

There are a few other errors about the parent transid verification
failing before those, and (since I didn't have copy/paste) I have
transcribed them at http://pastebin.com/6smrqPkP

I ran btrfsck on the device, and put the output of that at
http://pastebin.com/HcQxy6a1
I also ran btrfs-debug-tree on the device, but it created a 9.4 gig
file, so, I haven't tried to drop that on pastebin, but I can throw
them somewhere if someone would like to see them. (I'm hoping there is
some specific bit I can search for on my end and put that somewhere
helpful.)

Some information about my system:
Operating System: Ubuntu 12.04.1 LTS
Kernel: Linux 3.2.0-29-generic
btrfsck tells me it is "Btrfs Btrfs v0.19"
The filesystem is on a 10 disk raid 6 array that does not appear to be unhappy.
All of my btrfs tools came from Ubuntu's default repos.
The system is a virtual machine running under a VMware ESXi 5.0.0
host, with the 10 drives passed through to the guest OS. I have
allocated 8GB of memory to the guest.


Re: [RFC PATCH] Btrfs: fix full backref problem when inserting shared block reference

2013-01-30 Thread Alex Lyakas
Hi Miao,
I was following this thread in the past, but I did not understand it
fully, maybe you can explain?

   # mkfs.btrfs partition
   # mount partition mnt
   # cd mnt
   # for ((i=0;i<2400;i++)); do touch long_name_to_make_tree_more_deep$i; done
   # for ((i=0; i<4; i++))
     do
         mkdir $i
         for ((j=0; j<200; j++))
         do
             btrfs sub snap . $i/$j
         done
     done
 
  snapshot creation has a critical section.  Once we copy a given root to
  its snapshot, we're not allowed to change it until the transaction
  is fully committed.

Is the limitation that if we are creating a snap B of root A, and
placing the root of B somewhere into the tree of A, then we can do
this only once per transaction? Does this limitation still exist, or
does your fix remove it?

Also, according to your reproducer, each "btrfs sub snap" will
start/join a transaction, but then it will call
btrfs_commit_transaction() and not btrfs_commit_transaction_async(),
so it will wait until the transaction commits. So how can it happen
that you create more than one snap in the same transaction with your
reproducer?

The reason I am asking is that I want to try to write code that
creates several snaps in one transaction and only then commits. Should
this be possible, or is there some limitation like the one I mentioned
above?

Thanks for your help,
Alex.


Re: corrupted file size on inline extent conversion?

2013-01-30 Thread Mike Lowe
I've been running rsync against an rbd device backed by btrfs filesystems that 
are about 11% full for about 45 minutes before I checked and noticed the printk 
message.  That was the first go with the patch.  Seems like I was able to get 
by without any problems until the btrfs filesystems got some use and filled up 
a little bit.

On Jan 30, 2013, at 1:22 PM, Josef Bacik jba...@fusionio.com wrote:

 On Wed, Jan 30, 2013 at 11:17:25AM -0700, Mike Lowe wrote:
 Well I found this, so I think it's likely:
 
 root@gwboss2:~# dmesg |grep bitten
 [ 3196.193238] this would have bitten us in the ass
 [ 3196.193784] this would have bitten us in the ass
 
 
 Well that makes me happy since I had almost talked myself out of this being a
 possiblity.  How long did it take you to hit this problem before and how long
 have you been running with this patch?  Thanks,
 
 Josef



Re: RAID 0 across SSD and HDD

2013-01-30 Thread Chris Murphy

On Jan 30, 2013, at 3:02 AM, Hugo Mills h...@carfax.org.uk wrote:

 
   That would be the case with single mode, not with RAID-0.
 
   With RAID-0, you'd get data striped equally across all (in this
 case, both) the devices, up to the size of the second-largest one, at
 which point it'll stop allocating space.
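
As a rough capacity model of the rule described above (my own reading of
it, ignoring chunk granularity and metadata overhead), the two profiles can
be compared with a short C sketch. The function names single_usable() and
raid0_usable() are hypothetical; sizes are in whatever unit the caller
picks, for example GB.

```c
#include <stddef.h>
#include <stdint.h>

/* "single": chunks go wherever there is room, so capacities add up. */
uint64_t single_usable(const uint64_t *sizes, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += sizes[i];
	return total;
}

/*
 * "raid0" under the simplified rule: striping needs at least two
 * devices with free space, so no device can contribute more than the
 * size of the second-largest device.
 */
uint64_t raid0_usable(const uint64_t *sizes, size_t n)
{
	uint64_t largest = 0, second = 0, total = 0;

	if (n < 2)
		return 0;

	/* Find the largest and second-largest device sizes. */
	for (size_t i = 0; i < n; i++) {
		if (sizes[i] > largest) {
			second = largest;
			largest = sizes[i];
		} else if (sizes[i] > second) {
			second = sizes[i];
		}
	}

	/* Each device contributes at most the second-largest size. */
	for (size_t i = 0; i < n; i++)
		total += sizes[i] < second ? sizes[i] : second;
	return total;
}
```

With a 250GB SSD and a 500GB HDD this model gives 750 for single and 500
for raid0, which matches the figures quoted elsewhere in this thread.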

This raises a question about the desirability/feasibility of changing this 
behavior. It's common to have odd sized disks. It's unfortunate that for most 
of the life of a 'single' pairing of disks, there is no performance improvement 
possible; and also unfortunate that 'raid0' doesn't fall back to 'single' 
behavior to fill up the remaining space, instead of ending allocation.

md raid0 will work on odd sized block devices, and it will fill up all the 
space. Presumably it does this by allocating chunks round robin, and at the 
point where a block device is full, it just starts allocating more chunks to 
the device(s) that have space. This means there's a distinction in behavior 
between md's level 'raid10' and separately creating a stripe of mirrors, i.e. 
first creating raid1 arrays, then striping them with raid0.

Chris Murphy


Re: corrupted file size on inline extent conversion?

2013-01-30 Thread Josef Bacik
On Wed, Jan 30, 2013 at 11:30:49AM -0700, Mike Lowe wrote:
 I've been running rsync against a rbd device backed by btrfs filesystems that 
 are about 11% full for about 45 minutes before I checked and noticed the 
 printk message.  That was the first go with the patch.  Seems like I was able 
 to get by without any problems until the btrfs filesystems got some use and 
 filled up a little bit.
 

Ok since you are seeing the message I'll go ahead and post the patch and get it
moving along, let me know if you still see the problem.  Thanks,

Josef


Re: RAID 0 across SSD and HDD

2013-01-30 Thread Filipe Brandenburger
Hi,

On Wed, Jan 30, 2013 at 2:49 AM, Roger Binns rog...@rogerbinns.com wrote:
 It looks like my choices are:

 * RAID 0 and getting 500GB of usable space, with performance 50% of the
 accesses at HDD levels and 50% at SSD levels

 * Single and getting 750GB of usable space with performance and usage
 mostly on the HDD

You could try something like -l=linear on md-raid or something
similar on LVM to build a 750GB volume where the first 250GB are the
SSD and the last 500GB are the HDD. But that would probably work best
(as in, use more blocks from the beginning of the disk before moving
to the end) with a non-COW filesystem like ext4 instead of Btrfs
(although I can be wrong about that, I never really tried something
similar.)

Cheers,
Filipe


[PATCH] Btrfs: fix missing i_size update

2013-01-30 Thread Josef Bacik
If we have an ordered extent before the ordered extent we are currently
completing that is after the current disk_i_size we will put our i_size
update into that ordered extent so that we do not expose stale data.  The
problem is that if our disk i_size is updated past the previous ordered
extent we won't update the i_size with the pending i_size update.  So check
the pending i_size update and if it's above the current disk i_size we need
to go ahead and try to update.  Thanks,

Signed-off-by: Josef Bacik jba...@fusionio.com
---
 fs/btrfs/ordered-data.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index cbd4838..c447e4c 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -895,9 +895,16 @@ int btrfs_ordered_update_i_size(struct inode *inode, u64 offset,
 	 * if the disk i_size is already at the inode->i_size, or
 	 * this ordered extent is inside the disk i_size, we're done
 	 */
-	if (disk_i_size == i_size || offset <= disk_i_size) {
+	if (disk_i_size == i_size)
+		goto out;
+
+	/*
+	 * We still need to update disk_i_size if outstanding_isize is greater
+	 * than disk_i_size.
+	 */
+	if (offset <= disk_i_size &&
+	    (!ordered || ordered->outstanding_isize < disk_i_size))
 		goto out;
-	}
 
 	/*
 	 * walk backward from this ordered extent to disk_i_size.
-- 
1.7.7.6
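
The early-exit condition in the hunk above can be tried out in user space
with the sketch below. It mirrors the test as I read the posted patch
(offset <= disk_i_size, outstanding_isize < disk_i_size). The function name
and the have_ordered flag, which stands in for a non-NULL "ordered"
pointer, are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the code could jump to "out" without touching the
 * on-disk i_size, following the two checks in the posted patch.
 */
bool can_skip_i_size_update(uint64_t disk_i_size, uint64_t i_size,
			    uint64_t offset, bool have_ordered,
			    uint64_t outstanding_isize)
{
	/* On-disk size already matches the in-memory size: nothing to do. */
	if (disk_i_size == i_size)
		return true;

	/*
	 * The extent lies inside the on-disk size, and there is no pending
	 * i_size update that reaches beyond the on-disk size.
	 */
	if (offset <= disk_i_size &&
	    (!have_ordered || outstanding_isize < disk_i_size))
		return true;

	return false;
}
```

The case the patch is after is the one where offset <= disk_i_size but a
pending outstanding_isize lies beyond disk_i_size: the old combined check
would have skipped the update, while this predicate returns false.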



Re: [PATCH] Btrfs: fix missing i_size update

2013-01-30 Thread Filipe Brandenburger
On Wed, Jan 30, 2013 at 11:28 AM, Josef Bacik jba...@fusionio.com wrote:
 +	/*
 +	 * We still need to update disk_i_size if outstanding_isize is greater
 +	 * than disk_i_size.
 +	 */
 +	if (offset <= disk_i_size &&
 +	    (!ordered || ordered->outstanding_isize < disk_i_size))

<= for the comparison? ordered->outstanding_isize <= disk_i_size

Cheers,
Filipe


Re: RAID 0 across SSD and HDD

2013-01-30 Thread Roger Binns
On 30/01/13 04:01, Sander wrote:
> Do you know about bcache and EnhanceIO ?

Yes, but there are two reasons I don't use them.  One is that the capacity
of your cache is not included in the filesystem - i.e. with a 250GB SSD and
a 500GB HDD the filesystem capacity will be 500GB, not 750GB.

The second is that I use btrfs for my root filesystem so I'd have to get
bcache/EnhanceIO integrated into the distributor's initramfs build
mechanism, as well as worry about livecd/network boots without it.  This
is a lot of unnecessary work and worry.

Roger



Re: [PATCH] Btrfs: fix missing i_size update

2013-01-30 Thread Josef Bacik
On Wed, Jan 30, 2013 at 12:36:22PM -0700, Filipe Brandenburger wrote:
> On Wed, Jan 30, 2013 at 11:28 AM, Josef Bacik jba...@fusionio.com wrote:
> > +	/*
> > +	 * We still need to update disk_i_size if outstanding_isize is greater
> > +	 * than disk_i_size.
> > +	 */
> > +	if (offset <= disk_i_size &&
> > +	    (!ordered || ordered->outstanding_isize < disk_i_size))
>
> <= for the comparison? ordered->outstanding_isize <= disk_i_size
>

Yeah good point.  Thanks,

Josef


Re: RAID 0 across SSD and HDD

2013-01-30 Thread Roger Binns

On 30/01/13 11:10, Filipe Brandenburger wrote:
> You could try something like -l=linear on md-raid or something
> similar on LVM to build a 750GB volume

That would also require wiping the filesystems and starting again(*).  One
of the joys of btrfs has been not dealing with LVM.  On my workstation I
have two 2GB disks, but on one there is a sizeable Windows partition.
Getting LVM to stripe across the common sized space and then just use the
rest took quite a while to work out, requires running several different
commands and was something I had to write down.  There was nothing
intuitive.  It was a happy day when I could wipe and replace with btrfs.

Contrast with btrfs where 'btrfs --help' is almost always sufficient and
adding/removing/resizing is trivial (and online).

(*) I realise I could do things like add an external disk, btrfs add that
and then btrfs delete the internals, redo the internal storage, btrfs add
those back and then btrfs delete the external.  It would take a long time,
and is a reminder as to why I would prefer to be all btrfs everywhere
rather than also dealing with LVM and similar.

Roger



[PATCH] Btrfs: fix possible stale data exposure

2013-01-30 Thread Josef Bacik
We specifically do not update the disk i_size if there are ordered extents
outstanding for any area between the current disk_i_size and our ordered
extent so that we do not expose stale data.  The problem is the check we
have only checks if the ordered extent starts at or after the current
disk_i_size, which doesn't take into account an ordered extent that starts
before the current disk_i_size and ends past the disk_i_size.  Fix this by
checking if the extent ends past the disk_i_size.  Thanks,

Signed-off-by: Josef Bacik jba...@fusionio.com
---
 fs/btrfs/ordered-data.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index f8b13e8..cd8f6e9 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -936,7 +936,7 @@ int btrfs_ordered_update_i_size(struct inode *inode, u64 offset,
 			break;
 		if (test->file_offset >= i_size)
 			break;
-		if (test->file_offset >= disk_i_size) {
+		if (entry_end(test) > disk_i_size) {
 			/*
 			 * we don't update disk_i_size now, so record this
 			 * undealt i_size. Or we will not know the real
-- 
1.7.7.6
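
The difference between the two checks can be seen in a small standalone sketch. The struct below is a simplified, hypothetical model of an ordered extent (just a start offset and a length), and the helper names other than entry_end() are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model: the extent covers [file_offset, file_offset + len). */
struct ordered_extent {
	uint64_t file_offset;
	uint64_t len;
};

/* Exclusive end offset, saturating instead of wrapping on overflow. */
static uint64_t entry_end(const struct ordered_extent *e)
{
	if (e->file_offset + e->len < e->file_offset)
		return UINT64_MAX;
	return e->file_offset + e->len;
}

/* Old check: only notices extents that start at or after disk_i_size. */
static bool blocks_update_old(const struct ordered_extent *e,
			      uint64_t disk_i_size)
{
	return e->file_offset >= disk_i_size;
}

/* New check: notices any extent that ends past disk_i_size. */
static bool blocks_update_new(const struct ordered_extent *e,
			      uint64_t disk_i_size)
{
	return entry_end(e) > disk_i_size;
}

/* An extent covering [4096, 12288) straddles a disk_i_size of 8192. */
static void straddling_example(void)
{
	struct ordered_extent e = { .file_offset = 4096, .len = 8192 };

	assert(!blocks_update_old(&e, 8192));	/* missed by the old check */
	assert(blocks_update_new(&e, 8192));	/* caught by the new check */
}
```

An extent that lies entirely below disk_i_size is ignored by both checks, so the new test only adds the straddling case.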



Re: [PATCH] [RFC] include btrfsck in btrfs - including name check

2013-01-30 Thread Filipe Brandenburger
Hi Ian,

On Tue, Jan 29, 2013 at 3:03 PM, Ian Kumlien po...@demius.net wrote:
> This patch includes fsck as a subcommand of btrfs, but if you rename
> the binary to btrfsck (or, preferably, use a symlink) it will act like
> the old btrfs command.

You can rename files in your git (there's git mv for that), only
thing is when you generate the patch with format-patch (or git show,
git diff etc.) pass it the -M option to detect moves and act
appropriately.

Regarding your patches, I really like the idea of btrfs fsck but I
think I'd prefer to keep the external commands as wrapper scripts
instead of adding busybox-style name detection to btrfs... But then,
that's just my opinion.
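
For reference, busybox-style name detection usually boils down to something like the sketch below. The helper name is made up for illustration and is not taken from the patch under discussion.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the subcommand implied by the name the
 * binary was invoked under, or NULL if argv[0] implies nothing.
 */
static const char *implied_subcommand(const char *argv0)
{
	const char *base = strrchr(argv0, '/');

	/* Strip any leading directory components. */
	base = base ? base + 1 : argv0;

	if (strcmp(base, "btrfsck") == 0)
		return "fsck";
	return NULL;
}

static void name_detection_example(void)
{
	assert(strcmp(implied_subcommand("/sbin/btrfsck"), "fsck") == 0);
	assert(implied_subcommand("/sbin/btrfs") == NULL);
}
```

The cost is that the main binary has to know about every legacy name, which is the part a wrapper script keeps out of btrfs itself.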

I guess I would have a btrfsck that would simply contain:

#! /bin/sh
exec btrfs fsck "$@"

Downside is that error reporting (e.g. invalid syntax, etc.) would
show btrfs fsck instead of the command the user actually typed...

Cheers,
Filipe


[PATCH] Btrfs: fix freeing delayed ref head while still holding its mutex

2013-01-30 Thread Josef Bacik
I hit this error when reproducing a bug that would end in a transaction
abort.  We take the delayed ref head's mutex to keep anybody from processing
it while we're destroying it, but we fail to drop the mutex before we carry
on and free the damned thing.  Fix this by doing the remove logic for the
head ourselves and unlock the mutex, that way we can avoid use after free's
or hung tasks waiting on that mutex to come back so they know the delayed
ref completed.  Thanks,

Signed-off-by: Josef Bacik jba...@fusionio.com
---
 fs/btrfs/disk-io.c |   11 ---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 12a9547..51bff86 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3640,10 +3640,15 @@ int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
 			if (list_empty(&head->cluster))
 				delayed_refs->num_heads_ready--;
 			list_del_init(&head->cluster);
+			ref->in_tree = 0;
+			rb_erase(&ref->rb_node, &delayed_refs->root);
+			delayed_refs->num_entries--;
+			mutex_unlock(&head->mutex);
+		} else {
+			ref->in_tree = 0;
+			rb_erase(&ref->rb_node, &delayed_refs->root);
+			delayed_refs->num_entries--;
 		}
-		ref->in_tree = 0;
-		rb_erase(&ref->rb_node, &delayed_refs->root);
-		delayed_refs->num_entries--;
 
 		spin_unlock(&delayed_refs->lock);
 		btrfs_put_delayed_ref(ref);
-- 
1.7.7.6
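
The ordering the fix is after (unlink, unlock, then free) can be shown with a userspace sketch. This is only an analogy under stated assumptions: it uses a pthread mutex instead of a kernel mutex, and struct ref_head and destroy_locked_head() are hypothetical names, not the btrfs structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a delayed ref head; not the kernel structure. */
struct ref_head {
	pthread_mutex_t mutex;
	int in_tree;
};

/*
 * The caller holds head->mutex. Unlink the head, then drop the mutex
 * before the memory that contains it goes away.
 */
static void destroy_locked_head(struct ref_head *head, int *num_entries)
{
	head->in_tree = 0;
	(*num_entries)--;
	pthread_mutex_unlock(&head->mutex);	/* must come before free() */
	pthread_mutex_destroy(&head->mutex);
	free(head);
}

static void destroy_example(void)
{
	int num_entries = 1;
	struct ref_head *head = malloc(sizeof(*head));

	assert(head != NULL);
	pthread_mutex_init(&head->mutex, NULL);
	head->in_tree = 1;
	pthread_mutex_lock(&head->mutex);

	destroy_locked_head(head, &num_entries);
	assert(num_entries == 0);
}
```

Freeing the structure while the mutex is still held leaves any waiter blocked on freed memory, which matches the hung tasks and use-after-free described above.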



Re: [PATCH] [RFC] include btrfsck in btrfs - including name check

2013-01-30 Thread Ian Kumlien
On Wed, Jan 30, 2013 at 12:33:42PM -0800, Filipe Brandenburger wrote:
> Hi Ian,
>
> On Tue, Jan 29, 2013 at 3:03 PM, Ian Kumlien po...@demius.net wrote:
> > This patch includes fsck as a subcommand of btrfs, but if you rename
> > the binary to btrfsck (or, preferably, use a symlink) it will act like
> > the old btrfs command.
>
> You can rename files in your git (there's git mv for that), only
> thing is when you generate the patch with format-patch (or git show,
> git diff etc.) pass it the -M option to detect moves and act
> appropriately.

git send-email seems to send the full diff, diffing against /dev/null =P
This is why i skipped that part.

> Regarding your patches, I really like the idea of btrfs fsck but I
> think I'd prefer to keep the external commands as wrapper scripts
> instead of adding busybox-style name detection to btrfs... But then,
> that's just my opinion.

Well, now both work.

> I guess I would have a btrfsck that would simply contain:
>
> #! /bin/sh
> exec btrfs fsck "$@"
>
> Downside is that error reporting (e.g. invalid syntax, etc.) would
> show btrfs fsck instead of the command the user actually typed...

Actually it still does, due to how btrfs handles things... It's a simple
enough fix and it will make rescue cd's or dracut images, or just about
anything.

I understand your point, but i think this is a simpler solution =)

> Cheers,
> Filipe


Re: [PATCH] Btrfs: fix a deadlock on chunk mutex

2013-01-30 Thread Jim Schutt
On 01/30/2013 09:38 AM, Josef Bacik wrote:
> On Tue, Jan 29, 2013 at 04:05:17PM -0700, Jim Schutt wrote:
> > On 01/29/2013 01:04 PM, Josef Bacik wrote:
> > > On Tue, Jan 29, 2013 at 11:41:10AM -0700, Jim Schutt wrote:
> > > > On 01/28/2013 02:23 PM, Josef Bacik wrote:
> > > > > On Thu, Jan 03, 2013 at 11:44:46AM -0700, Jim Schutt wrote:
> > > > > > Hi Josef,
> > > > > >
> > > > > > Thanks for the patch - sorry for the long delay in testing...
> > > > > >
> > > > >
> > > > > Jim,
> > > > >
> > > > > I've been trying to reason out how this happens, could you do a btrfs fi df on
> > > > > the filesystem thats giving you trouble so I can see if what I think is
> > > > > happening is what's actually happening.  Thanks,
> > > >
> > > > Here's an example, using a slightly different kernel than
> > > > my previous report.  It's your btrfs-next master branch
> > > > (commit 8f139e59d5 Btrfs: use bit operation for ->fs_state)
> > > > with ceph 3.8 for-linus (commit 0fa6ebc600 from linus' tree).
> > > >
> > > >
> > > > Here I'm finding the file system in question:
> > > >
> > > > # ls -l /dev/mapper | grep dm-93
> > > > lrwxrwxrwx 1 root root   8 Jan 29 11:13 cs53s19p2 -> ../dm-93
> > > >
> > > > # df -h | grep -A 1 cs53s19p2
> > > > /dev/mapper/cs53s19p2
> > > >                       896G  1.1G  896G   1% /ram/mnt/ceph/data.osd.522
> > > >
> > > >
> > > > Here's the info you asked for:
> > > >
> > > > # btrfs fi df /ram/mnt/ceph/data.osd.522
> > > > Data: total=2.01GB, used=1.00GB
> > > > System: total=4.00MB, used=64.00KB
> > > > Metadata: total=8.00MB, used=7.56MB
> > > >
> > > How big is the disk you are using, and what mount options?  I have a patch to
> > > keep the panic from happening and hopefully the abort, could you try this?  I
> > > still want to keep the underlying error from happening because it shouldn't be,
> > > but no reason I can't fix the error case while you can easily reproduce it :).
> > > Thanks,
> > >
> > > Josef
> > >
> > > From c50b725c74c7d39064e553ef85ac9753efbd8aec Mon Sep 17 00:00:00 2001
> > > From: Josef Bacik jba...@fusionio.com
> > > Date: Tue, 29 Jan 2013 15:03:37 -0500
> > > Subject: [PATCH] Btrfs: fix chunk allocation error handling
> > >
> > > If we error out allocating a dev extent we will have already created the
> > > block group and such which will cause problems since the allocator may have
> > > tried to allocate out of the block group that no longer exists.  This will
> > > cause BUG_ON()'s in the bio submission path.  This also makes a failure to
> > > allocate a dev extent a non-abort error, we will just clean up the dev
> > > extents we did allocate and exit.  Now if we fail to delete the dev extents
> > > we will abort since we can't have half of the dev extents hanging around,
> > > but this will make us much less likely to abort.  Thanks,
> > >
> > > Signed-off-by: Josef Bacik jba...@fusionio.com
> > > ---
> >
> > Interesting - with your patch applied I triggered the following, just
> > bringing up a fresh Ceph filesystem - I didn't even get a chance to
> > mount it on my Ceph clients:
> >
> Ok can you give this patch a whirl as well?  It seems to fix the problem for me.

With this patch on top of your previous patch, after several trials of
my test I am also unable to reproduce the issue.  Since I had been
having trouble first time, every time, I think it also seems to fix
the problem for me.

Thanks again!

-- Jim

> Thanks,
>
> Josef




Re: [PATCH] Btrfs: fix a deadlock on chunk mutex

2013-01-30 Thread Josef Bacik
On Wed, Jan 30, 2013 at 02:37:40PM -0700, Jim Schutt wrote:
> On 01/30/2013 09:38 AM, Josef Bacik wrote:
> > On Tue, Jan 29, 2013 at 04:05:17PM -0700, Jim Schutt wrote:
> > > On 01/29/2013 01:04 PM, Josef Bacik wrote:
> > > > On Tue, Jan 29, 2013 at 11:41:10AM -0700, Jim Schutt wrote:
> > > > > On 01/28/2013 02:23 PM, Josef Bacik wrote:
> > > > > > On Thu, Jan 03, 2013 at 11:44:46AM -0700, Jim Schutt wrote:
> > > > > > > Hi Josef,
> > > > > > >
> > > > > > > Thanks for the patch - sorry for the long delay in testing...
> > > > > > >
> > > > > >
> > > > > > Jim,
> > > > > >
> > > > > > I've been trying to reason out how this happens, could you do a btrfs fi df on
> > > > > > the filesystem thats giving you trouble so I can see if what I think is
> > > > > > happening is what's actually happening.  Thanks,
> > > > >
> > > > > Here's an example, using a slightly different kernel than
> > > > > my previous report.  It's your btrfs-next master branch
> > > > > (commit 8f139e59d5 Btrfs: use bit operation for ->fs_state)
> > > > > with ceph 3.8 for-linus (commit 0fa6ebc600 from linus' tree).
> > > > >
> > > > >
> > > > > Here I'm finding the file system in question:
> > > > >
> > > > > # ls -l /dev/mapper | grep dm-93
> > > > > lrwxrwxrwx 1 root root   8 Jan 29 11:13 cs53s19p2 -> ../dm-93
> > > > >
> > > > > # df -h | grep -A 1 cs53s19p2
> > > > > /dev/mapper/cs53s19p2
> > > > >                       896G  1.1G  896G   1% /ram/mnt/ceph/data.osd.522
> > > > >
> > > > >
> > > > > Here's the info you asked for:
> > > > >
> > > > > # btrfs fi df /ram/mnt/ceph/data.osd.522
> > > > > Data: total=2.01GB, used=1.00GB
> > > > > System: total=4.00MB, used=64.00KB
> > > > > Metadata: total=8.00MB, used=7.56MB
> > > > >
> > > > How big is the disk you are using, and what mount options?  I have a patch to
> > > > keep the panic from happening and hopefully the abort, could you try this?  I
> > > > still want to keep the underlying error from happening because it shouldn't be,
> > > > but no reason I can't fix the error case while you can easily reproduce it :).
> > > > Thanks,
> > > >
> > > > Josef
> > > >
> > > > From c50b725c74c7d39064e553ef85ac9753efbd8aec Mon Sep 17 00:00:00 2001
> > > > From: Josef Bacik jba...@fusionio.com
> > > > Date: Tue, 29 Jan 2013 15:03:37 -0500
> > > > Subject: [PATCH] Btrfs: fix chunk allocation error handling
> > > >
> > > > If we error out allocating a dev extent we will have already created the
> > > > block group and such which will cause problems since the allocator may have
> > > > tried to allocate out of the block group that no longer exists.  This will
> > > > cause BUG_ON()'s in the bio submission path.  This also makes a failure to
> > > > allocate a dev extent a non-abort error, we will just clean up the dev
> > > > extents we did allocate and exit.  Now if we fail to delete the dev extents
> > > > we will abort since we can't have half of the dev extents hanging around,
> > > > but this will make us much less likely to abort.  Thanks,
> > > >
> > > > Signed-off-by: Josef Bacik jba...@fusionio.com
> > > > ---
> > >
> > > Interesting - with your patch applied I triggered the following, just
> > > bringing up a fresh Ceph filesystem - I didn't even get a chance to
> > > mount it on my Ceph clients:
> > >
> > Ok can you give this patch a whirl as well?  It seems to fix the problem for me.
>
> With this patch on top of your previous patch, after several trials of
> my test I am also unable to reproduce the issue.  Since I had been
> having trouble first time, every time, I think it also seems to fix
> the problem for me.
>
> Thanks again!
>

Awesome thanks for testing!

Josef


Re: [PATCH] [RFC] include btrfsck in btrfs - including name check

2013-01-30 Thread Ilya Dryomov
On Wed, Jan 30, 2013 at 10:11:44PM +0100, Ian Kumlien wrote:
> On Wed, Jan 30, 2013 at 12:33:42PM -0800, Filipe Brandenburger wrote:
> > Hi Ian,
> >
> > On Tue, Jan 29, 2013 at 3:03 PM, Ian Kumlien po...@demius.net wrote:
> > > This patch includes fsck as a subcommand of btrfs, but if you rename
> > > the binary to btrfsck (or, preferably, use a symlink) it will act like
> > > the old btrfs command.
> >
> > You can rename files in your git (there's git mv for that), only
> > thing is when you generate the patch with format-patch (or git show,
> > git diff etc.) pass it the -M option to detect moves and act
> > appropriately.
>
> git send-email seems to send the full diff, diffing against /dev/null =P
> This is why i skipped that part.
>
> > Regarding your patches, I really like the idea of btrfs fsck but I
> > think I'd prefer to keep the external commands as wrapper scripts
> > instead of adding busybox-style name detection to btrfs... But then,
> > that's just my opinion.
>
> Well, now both works.
>
> > I guess I would have a btrfsck that would simply contain:
> >
> > #! /bin/sh
> > exec btrfs fsck "$@"
> >
> > Downside is that error reporting (e.g. invalid syntax, etc.) would
> > show btrfs fsck instead of the command the user actually typed...
>
> Actually it still does, due to how btrfs handles things... It's a simple
> enough fix and it will make rescue cd's or dracut images, or just about
> anything.
>
> I understand your point, but i think this is a simpler solution =)

FWIW I agree with Filipe, this name detection thing looks ugly to me.
The merge itself is a good idea, but I think we should stick with shell
wrappers for everything else.

Thanks,

Ilya


Re: [PATCH] Btrfs: fix freeing delayed ref head while still holding its mutex

2013-01-30 Thread Zach Brown
On Wed, Jan 30, 2013 at 04:06:18PM -0500, Josef Bacik wrote:
> I hit this error when reproducing a bug that would end in a transaction
> abort.  We take the delayed ref head's mutex to keep anybody from processing
> it while we're destroying it, but we fail to drop the mutex before we carry
> on and free the damned thing.  Fix this by doing the remove logic for the
> head ourselves and unlock the mutex, that way we can avoid use after free's
> or hung tasks waiting on that mutex to come back so they know the delayed
> ref completed.  Thanks,
>

> +			ref->in_tree = 0;
> +			rb_erase(&ref->rb_node, &delayed_refs->root);
> +			delayed_refs->num_entries--;
> +			mutex_unlock(&head->mutex);
> +		} else {
> +			ref->in_tree = 0;
> +			rb_erase(&ref->rb_node, &delayed_refs->root);
> +			delayed_refs->num_entries--;

Do you really need to duplicate the removal under the mutex?  Isn't all
that protected by the delayed_refs->lock?

Isn't it enough to just add the mutex_unlock()?

- z


[PATCH 0/5] btrfs-progs: better support for external users of send, V3

2013-01-30 Thread Mark Fasheh
Hi,

The following 5 patches make changes to btrfs-progs in order to
provide support for external software that wants to make use of the
excellent btrfs send ioctl.

The first patch introduces support for the BTRFS_SEND_FLAG_NO_FILE_DATA flag
which is introduced in my kernel patch titled:

btrfs: add no file data flag to btrfs send ioctl

which can be found on the btrfs list, and for convenience is also attached
at the end of this e-mail.

The 2nd patch creates a libbtrfs and links the rest of the build to it. The
functionality I chose to export as of right now centers on send support. 
With this library, an external program has a much easier time processing the
stream which a send ioctl provides. It's worth noting, btw, that this patch
can stand alone if need be.

The 3rd patch introduces send-test, a small piece of software (not built by
default) to allow for testing of the send ioctl (including our new flag). As
send-test is a client of libbtrfs it might also serve as example code for
developers looking to make use of send.

The 4th patch makes minor changes so that libbtrfs is usable from C++.

The 5th patch was contributed by Jeff Mahoney and it changes the build to
use autotools. This wound up making packaging of the resulting library far
less painful so I felt it would be advantageous for us to have it upstream
so all packagers can benefit from it.

The patches can also be viewed on github:

https://github.com/markfasheh/btrfs-progs-patches/tree/no-data-and-libify

Testing has been pretty straight-forward - I build the software, verify that
things work by making a file system or using send-test.

Please review. Thanks,
--Mark

Changelog:

- included patch by Jeff Mahoney to use autotools

- Fixed whitespace error in patch 3 (thanks to Anand Jain for reporting)

- make version; make install should work now (again, thanks to Anand Jain)

- included patch by Arvin Schnell to make it possible to use libbtrfs from C++
  - From this patch I removed some code from cmds-send.c that was added
by mistake.

- libbtrfs properly links to libuuid and libm (Reported by Arvin)

- library symlinks are now properly installed (Reported by Arvin)


From: Mark Fasheh mfas...@suse.de

btrfs: add no file data flag to btrfs send ioctl

This patch adds the flag, BTRFS_SEND_FLAG_NO_FILE_DATA to the btrfs send
ioctl code. When this flag is set, the btrfs send code will never write file
data into the stream (thus also avoiding expensive reads of that data in the
first place). BTRFS_SEND_C_UPDATE_EXTENT commands will be sent (instead of
BTRFS_SEND_C_WRITE) with an offset, length pair indicating the extent in
question.

This patch does not affect the operation of BTRFS_SEND_C_CLONE commands -
they will continue to be sent when a search finds an appropriate extent to
clone from.
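
As a rough illustration of the behaviour described above, the per-extent command choice can be sketched as below. The enum and function names are hypothetical and the command set is reduced to the three cases mentioned; only the flag name and value come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1

/* Hypothetical, reduced command set for illustration only. */
enum extent_cmd { CMD_WRITE, CMD_CLONE, CMD_UPDATE_EXTENT };

static enum extent_cmd pick_extent_cmd(uint64_t flags, bool clone_source_found)
{
	/* Clone commands are unaffected by the flag. */
	if (clone_source_found)
		return CMD_CLONE;

	/* No file data wanted: describe the extent instead of sending it. */
	if (flags & BTRFS_SEND_FLAG_NO_FILE_DATA)
		return CMD_UPDATE_EXTENT;

	return CMD_WRITE;
}

static void pick_example(void)
{
	assert(pick_extent_cmd(0, false) == CMD_WRITE);
	assert(pick_extent_cmd(BTRFS_SEND_FLAG_NO_FILE_DATA, false) ==
	       CMD_UPDATE_EXTENT);
	assert(pick_extent_cmd(BTRFS_SEND_FLAG_NO_FILE_DATA, true) == CMD_CLONE);
}
```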

Signed-off-by: Mark Fasheh mfas...@suse.de

diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h
index 731e287..1f6cfdd 100644
--- a/fs/btrfs/ioctl.h
+++ b/fs/btrfs/ioctl.h
@@ -363,6 +363,13 @@ struct btrfs_ioctl_received_subvol_args {
__u64   reserved[16];   /* in */
 };
 
+/*
+ * Caller doesn't want file data in the send stream, even if the
+ * search of clone sources doesn't find an extent. UPDATE_EXTENT
+ * commands will be sent instead of WRITE commands.
+ */
+#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1
+
 struct btrfs_ioctl_send_args {
__s64 send_fd;  /* in */
__u64 clone_sources_count;  /* in */
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index e78b297..8d0c6b4 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -85,6 +85,7 @@ struct send_ctx {
u32 send_max_size;
u64 total_send_size;
u64 cmd_send_size[BTRFS_SEND_C_MAX + 1];
+   u64 flags;  /* 'flags' member of btrfs_ioctl_send_args is u64 */
 
struct vfsmount *mnt;
 
@@ -3707,6 +3708,39 @@ out:
return ret;
 }
 
+/*
+ * Send an update extent command to user space.
+ */
+static int send_update_extent(struct send_ctx *sctx,
+			      u64 offset, u32 len)
+{
+	int ret = 0;
+	struct fs_path *p;
+
+	p = fs_path_alloc(sctx);
+	if (!p)
+		return -ENOMEM;
+
+	ret = begin_cmd(sctx, BTRFS_SEND_C_UPDATE_EXTENT);
+	if (ret < 0)
+		goto out;
+
+	ret = get_cur_path(sctx, sctx->cur_ino, sctx->cur_inode_gen, p);
+	if (ret < 0)
+		goto out;
+
+	TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p);
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset);
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, len);
+
+	ret = send_cmd(sctx);
+
+tlv_put_failure:
+out:
+	fs_path_free(sctx, p);
+	return ret;
+}
+
 static int send_write_or_clone(struct send_ctx *sctx,
   struct btrfs_path *path,
   struct btrfs_key *key,
@@ -3742,7 +3776,11 @@ static int send_write_or_clone(struct send_ctx *sctx,
goto out;
}
 

[PATCH 1/5] btrfs-progs: Add support for BTRFS_SEND_FLAG_NO_FILE_DATA

2013-01-30 Thread Mark Fasheh
The flag and command are synced from kernel to user. Also, this patch adds a
callback for the BTRFS_SEND_C_UPDATE_EXTENT in struct btrfs_send_ops.
read_and_process_cmd() is updated to decode BTRFS_SEND_C_UPDATE_EXTENT and
send the values through the right callback. I did not add a callback
definition to cmds-receive.c as that code never uses
BTRFS_SEND_FLAG_NO_FILE_DATA.

Signed-off-by: Mark Fasheh mfas...@suse.de
---
 ioctl.h   |7 +++
 send-stream.c |6 ++
 send-stream.h |1 +
 send.h|1 +
 4 files changed, 15 insertions(+)

diff --git a/ioctl.h b/ioctl.h
index 6fda3a1..b7f1ce3 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -321,6 +321,13 @@ struct btrfs_ioctl_received_subvol_args {
__u64   reserved[16];   /* in */
 };
 
+/*
+ * Caller doesn't want file data in the send stream, even if the
+ * search of clone sources doesn't find an extent. UPDATE_EXTENT
+ * commands will be sent instead of WRITE commands.
+ */
+#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1
+
 struct btrfs_ioctl_send_args {
__s64 send_fd;  /* in */
__u64 clone_sources_count;  /* in */
diff --git a/send-stream.c b/send-stream.c
index 55fa728..a3628e4 100644
--- a/send-stream.c
+++ b/send-stream.c
@@ -418,6 +418,12 @@ static int read_and_process_cmd(struct btrfs_send_stream *s)
 		TLV_GET_TIMESPEC(s, BTRFS_SEND_A_CTIME, &ct);
 		ret = s->ops->utimes(path, &at, &mt, &ct, s->user);
 		break;
+	case BTRFS_SEND_C_UPDATE_EXTENT:
+		TLV_GET_STRING(s, BTRFS_SEND_A_PATH, &path);
+		TLV_GET_U64(s, BTRFS_SEND_A_FILE_OFFSET, &offset);
+		TLV_GET_U64(s, BTRFS_SEND_A_SIZE, &tmp);
+		ret = s->ops->update_extent(path, offset, tmp, s->user);
+		break;
 	case BTRFS_SEND_C_END:
 		ret = 1;
 		break;
diff --git a/send-stream.h b/send-stream.h
index b69b7f1..9a17e32 100644
--- a/send-stream.h
+++ b/send-stream.h
@@ -49,6 +49,7 @@ struct btrfs_send_ops {
int (*utimes)(const char *path, struct timespec *at,
  struct timespec *mt, struct timespec *ct,
  void *user);
+   int (*update_extent)(const char *path, u64 offset, u64 len, void *user);
 };
 
 int btrfs_read_and_process_send_stream(int fd,
diff --git a/send.h b/send.h
index 9934e94..48d425a 100644
--- a/send.h
+++ b/send.h
@@ -86,6 +86,7 @@ enum btrfs_send_cmd {
BTRFS_SEND_C_UTIMES,
 
BTRFS_SEND_C_END,
+   BTRFS_SEND_C_UPDATE_EXTENT,
__BTRFS_SEND_C_MAX,
 };
 #define BTRFS_SEND_C_MAX (__BTRFS_SEND_C_MAX - 1)
-- 
1.7.10.4
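
A consumer of the new callback might look roughly like the sketch below. Only the callback signature is taken from the patch; the u64 typedef is a local stand-in for the btrfs-progs type, and struct extent_stats and count_update_extent() are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* local stand-in for the btrfs-progs typedef */

/* Hypothetical per-stream state handed through the "user" pointer. */
struct extent_stats {
	u64 extents;
	u64 bytes;
};

/* Same shape as the update_extent member added to struct btrfs_send_ops. */
static int count_update_extent(const char *path, u64 offset, u64 len,
			       void *user)
{
	struct extent_stats *stats = user;

	(void)path;	/* unused in this sketch */
	(void)offset;	/* unused in this sketch */
	stats->extents++;
	stats->bytes += len;
	return 0;	/* a negative value would signal an error */
}

static void callback_example(void)
{
	struct extent_stats stats = { 0, 0 };

	assert(count_update_extent("dir/file", 0, 4096, &stats) == 0);
	assert(count_update_extent("dir/file", 4096, 8192, &stats) == 0);
	assert(stats.extents == 2);
	assert(stats.bytes == 12288);
}
```

Because no file data travels in the stream, a callback like this sees only the path, offset and length of each changed extent.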



[PATCH 2/5] btrfs-progs: libify some parts of btrfs-progs

2013-01-30 Thread Mark Fasheh
External software wanting to use the functionality provided by the btrfs
send ioctl has a hard time doing so without replicating tons of work. Of
particular interest are functions like btrfs_read_and_process_send_stream()
and subvol_uuid_search(). As that functionality requires a bit more than
just send-stream.c and send-utils.c we have to pull in some other parts of
the progs package.

This patch adds code to the Makefile and headers to create a library,
libbtrfs which the btrfs command now links to.

Signed-off-by: Mark Fasheh mfas...@suse.de
---
 Makefile   |   83 
 btrfs-list.h   |4 +++
 crc32c.h   |4 +++
 ctree.h|9 ++
 extent-cache.h |6 
 extent_io.h|7 +
 radix-tree.h   |4 +++
 rbtree.h   |4 +++
 send-utils.h   |5 
 9 files changed, 96 insertions(+), 30 deletions(-)

diff --git a/Makefile b/Makefile
index 4894903..4962985 100644
--- a/Makefile
+++ b/Makefile
@@ -1,14 +1,17 @@
 CC = gcc
-AM_CFLAGS = -Wall -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2
+AM_CFLAGS = -Wall -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2 -DBTRFS_FLAT_INCLUDES -fPIC
 CFLAGS = -g -O1
 objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
- root-tree.o dir-item.o file-item.o inode-item.o \
- inode-map.o crc32c.o rbtree.o extent-cache.o extent_io.o \
- volumes.o utils.o btrfs-list.o btrfslabel.o repair.o \
- send-stream.o send-utils.o qgroup.o
+ root-tree.o dir-item.o file-item.o inode-item.o inode-map.o \
+ extent-cache.o extent_io.o volumes.o utils.o btrfslabel.o repair.o \
+ qgroup.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
   cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
   cmds-quota.o cmds-qgroup.o
+libbtrfs_objects = send-stream.o send-utils.o rbtree.o btrfs-list.o crc32c.o
+libbtrfs_headers = send-stream.h send-utils.h send.h rbtree.h btrfs-list.h \
+  crc32c.h list.h kerncompat.h radix-tree.h extent-cache.h \
+  extent_io.h ioctl.h ctree.h
 
 CHECKFLAGS= -D__linux__ -Dlinux -D__STDC__ -Dunix -D__unix__ -Wbitwise \
-Wuninitialized -Wshadow -Wundef
@@ -17,13 +20,20 @@ DEPFLAGS = -Wp,-MMD,$(@D)/.$(@F).d,-MT,$@
 INSTALL = install
 prefix ?= /usr/local
 bindir = $(prefix)/bin
-LIBS=-luuid -lm
+libdir = $(prefix)/lib
+incdir = $(prefix)/include/btrfs
+lib_LIBS=-luuid -lm -L.
+LIBS=$(lib_LIBS) -lbtrfs
 RESTORE_LIBS=-lz
 
 progs = btrfsctl mkfs.btrfs btrfs-debug-tree btrfs-show btrfs-vol btrfsck \
btrfs btrfs-map-logical btrfs-image btrfs-zero-log btrfs-convert \
btrfs-find-root btrfs-restore btrfstune
 
+libs = libbtrfs.so.1.0
+lib_links = libbtrfs.so.1 libbtrfs.so
+headers = $(libbtrfs_headers)
+
 # make C=1 to enable sparse
 ifdef C
check = sparse $(CHECKFLAGS)
@@ -36,70 +46,77 @@ endif
 	$(CC) $(DEPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c $<
 
 
-all: version $(progs) manpages
+all: version $(libs) $(progs) manpages
 
 version:
bash version.sh
 
-btrfs: $(objects) btrfs.o help.o common.o $(cmds_objects)
+$(libs): $(libbtrfs_objects) $(lib_links) send.h
+	$(CC) $(CFLAGS) $(libbtrfs_objects) $(lib_LIBS) -shared -Wl,-soname,libbtrfs.so.1 -o libbtrfs.so.1.0
+
+$(lib_links):
+   ln -sf libbtrfs.so.1.0 libbtrfs.so.1
+   ln -sf libbtrfs.so.1.0 libbtrfs.so
+
+btrfs: $(objects) btrfs.o help.o common.o $(cmds_objects) $(libs)
$(CC) $(CFLAGS) -o btrfs btrfs.o help.o common.o $(cmds_objects) \
$(objects) $(LDFLAGS) $(LIBS) -lpthread
 
-calc-size: $(objects) calc-size.o
+calc-size: $(objects) $(libs) calc-size.o
$(CC) $(CFLAGS) -o calc-size calc-size.o $(objects) $(LDFLAGS) $(LIBS)
 
-btrfs-find-root: $(objects) find-root.o
+btrfs-find-root: $(objects) $(libs) find-root.o
 	$(CC) $(CFLAGS) -o btrfs-find-root find-root.o $(objects) $(LDFLAGS) $(LIBS)
 
-btrfs-restore: $(objects) restore.o
+btrfs-restore: $(objects) $(libs) restore.o
 	$(CC) $(CFLAGS) -o btrfs-restore restore.o $(objects) $(LDFLAGS) $(LIBS) $(RESTORE_LIBS)
 
-btrfsctl: $(objects) btrfsctl.o
+btrfsctl: $(objects) $(libs) btrfsctl.o
$(CC) $(CFLAGS) -o btrfsctl btrfsctl.o $(objects) $(LDFLAGS) $(LIBS)
 
-btrfs-vol: $(objects) btrfs-vol.o
+btrfs-vol: $(objects) $(libs) btrfs-vol.o
$(CC) $(CFLAGS) -o btrfs-vol btrfs-vol.o $(objects) $(LDFLAGS) $(LIBS)
 
-btrfs-show: $(objects) btrfs-show.o
+btrfs-show: $(objects) $(libs) btrfs-show.o
$(CC) $(CFLAGS) -o btrfs-show btrfs-show.o $(objects) $(LDFLAGS) $(LIBS)
 
-btrfsck: $(objects) btrfsck.o
+btrfsck: $(objects) $(libs) btrfsck.o
$(CC) $(CFLAGS) -o btrfsck btrfsck.o $(objects) $(LDFLAGS) $(LIBS)
 
-mkfs.btrfs: $(objects) mkfs.o
+mkfs.btrfs: $(objects) $(libs) mkfs.o
$(CC) $(CFLAGS) -o mkfs.btrfs $(objects) mkfs.o $(LDFLAGS) $(LIBS)
 
-btrfs-debug-tree: $(objects) debug-tree.o
+btrfs-debug-tree: 

[PATCH 3/5] btrfs-progs: add send-test

2013-01-30 Thread Mark Fasheh
send-test.c links against libbtrfs and uses the send functionality provided
to decode and print a send stream to the console.

Signed-off-by: Mark Fasheh mfas...@suse.de
---
 Makefile|5 +-
 send-test.c |  458 +++
 2 files changed, 462 insertions(+), 1 deletion(-)
 create mode 100644 send-test.c

diff --git a/Makefile b/Makefile
index 4962985..27d62c0 100644
--- a/Makefile
+++ b/Makefile
@@ -119,6 +119,9 @@ btrfs-convert: $(objects) $(libs) convert.o
 ioctl-test: $(objects) $(libs) ioctl-test.o
$(CC) $(CFLAGS) -o ioctl-test $(objects) ioctl-test.o $(LDFLAGS) $(LIBS)
 
+send-test: $(objects) send-test.o
+   $(CC) $(CFLAGS) -o send-test send-test.o $(LDFLAGS) $(LIBS) -lpthread
+
 manpages:
cd man; $(MAKE)
 
@@ -128,7 +131,7 @@ install-man:
 clean :
rm -f $(progs) $(libs) cscope.out *.o .*.d btrfs-convert btrfs-image \
  btrfs-select-super btrfs-zero-log btrfstune dir-test ioctl-test \
- quick-test version.h
+ quick-test send-test version.h
cd man; $(MAKE) clean
 
 install: $(libs) $(progs) install-man
diff --git a/send-test.c b/send-test.c
new file mode 100644
index 000..8c14718
--- /dev/null
+++ b/send-test.c
@@ -0,0 +1,458 @@
+/*
+ * Copyright (C) 2013 SUSE.  All rights reserved.
+ *
+ * This code is adapted from cmds-send.c and cmds-receive.c,
+ * Both of which are:
+ *
+ * Copyright (C) 2012 Alexander Block.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#define _GNU_SOURCE
+
+#include <unistd.h>
+#include <stdint.h>
+#include <dirent.h>
+#include <pthread.h>
+#include <math.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <libgen.h>
+#include <mntent.h>
+#include <limits.h>
+#include <stdlib.h>
+#include <asm/types.h>
+#include <uuid/uuid.h>
+
+/*
+ * This should be compilable without the rest of the btrfs-progs
+ * source distribution.
+ */
+#if BTRFS_FLAT_INCLUDES
+#include "send-utils.h"
+#include "send-stream.h"
+#else
+#include <btrfs/send-utils.h>
+#include <btrfs/send-stream.h>
+#endif /* BTRFS_FLAT_INCLUDES */
+
+static int pipefd[2];
+struct btrfs_ioctl_send_args io_send = {0, };
+static char *subvol_path;
+static char *root_path;
+
+struct recv_args {
+   char *full_subvol_path;
+   char *root_path;
+};
+
+void usage(int error)
+{
+   printf("send-test <btrfs root> <subvol>\n");
+   if (error)
+   exit(error);
+}
+
+static int print_subvol(const char *path, const u8 *uuid, u64 ctransid,
+   void *user)
+{
+   struct recv_args *r = user;
+   char uuid_str[128];
+
+   r->full_subvol_path = path_cat(r->root_path, path);
+   uuid_unparse(uuid, uuid_str);
+
+   printf("subvol\t%s\t%llu\t%s\n", uuid_str,
+  (unsigned long long)ctransid, r->full_subvol_path);
+
+   return 0;
+}
+
+static int print_snapshot(const char *path, const u8 *uuid, u64 ctransid,
+ const u8 *parent_uuid, u64 parent_ctransid,
+ void *user)
+{
+   struct recv_args *r = user;
+   char uuid_str[128];
+   char parent_uuid_str[128];
+
+   r->full_subvol_path = path_cat(r->root_path, path);
+   uuid_unparse(uuid, uuid_str);
+   uuid_unparse(parent_uuid, parent_uuid_str);
+
+   printf("snapshot\t%s\t%llu\t%s\t%llu\t%s\n", uuid_str,
+  (unsigned long long)ctransid, parent_uuid_str,
+  (unsigned long long)parent_ctransid, r->full_subvol_path);
+
+   return 0;
+}
+
+static int print_mkfile(const char *path, void *user)
+{
+   struct recv_args *r = user;
+   char *full_path = path_cat(r->full_subvol_path, path);
+
+   printf("mkfile\t%s\n", full_path);
+
+   free(full_path);
+   return 0;
+}
+
+static int print_mkdir(const char *path, void *user)
+{
+   struct recv_args *r = user;
+   char *full_path = path_cat(r->full_subvol_path, path);
+
+   printf("mkdir\t%s\n", full_path);
+
+   free(full_path);
+   return 0;
+}
+
+static int print_mknod(const char *path, u64 mode, u64 dev, void *user)
+{
+   struct recv_args *r = user;
+   char *full_path = path_cat(r->full_subvol_path, path);
+
+   printf("mknod\t%llo\t0x%llx\t%s\n", (unsigned long long)mode,
+  (unsigned long long)dev, full_path);
+
+   free(full_path);
+   

[PATCH 4/5] btrfs-progs: make libbtrfs usable from C++

2013-01-30 Thread Mark Fasheh
From: Arvin Schnell aschn...@suse.de

Please find attached a patch to make the new libbtrfs usable from
C++ (at least for the parts snapper will likely need).

Signed-off-by: Arvin Schnell aschn...@suse.de
Signed-off-by: Mark Fasheh mfas...@suse.de
---
 extent_io.c   |6 +++---
 extent_io.h   |6 +++---
 ioctl.h   |9 +
 list.h|   40 
 rbtree.h  |2 +-
 send-stream.h |7 +++
 send-utils.h  |7 +++
 send.h|8 
 8 files changed, 58 insertions(+), 27 deletions(-)

diff --git a/extent_io.c b/extent_io.c
index ebb35b2..70ecc48 100644
--- a/extent_io.c
+++ b/extent_io.c
@@ -48,7 +48,7 @@ static struct extent_state *alloc_extent_state(void)
return NULL;
state->refs = 1;
state->state = 0;
-   state->private = 0;
+   state->xprivate = 0;
return state;
 }
 
@@ -509,7 +509,7 @@ int set_state_private(struct extent_io_tree *tree, u64 
start, u64 private)
ret = -ENOENT;
goto out;
}
-   state->private = private;
+   state->xprivate = private;
 out:
return ret;
 }
@@ -530,7 +530,7 @@ int get_state_private(struct extent_io_tree *tree, u64 
start, u64 *private)
ret = -ENOENT;
goto out;
}
-   *private = state->private;
+   *private = state->xprivate;
 out:
return ret;
 }
diff --git a/extent_io.h b/extent_io.h
index 4553859..6d8404d 100644
--- a/extent_io.h
+++ b/extent_io.h
@@ -54,7 +54,7 @@ struct extent_state {
u64 end;
int refs;
unsigned long state;
-   u64 private;
+   u64 xprivate;
 };
 
 struct extent_buffer {
@@ -93,8 +93,8 @@ int extent_buffer_uptodate(struct extent_buffer *eb);
 int set_extent_buffer_uptodate(struct extent_buffer *eb);
 int clear_extent_buffer_uptodate(struct extent_io_tree *tree,
struct extent_buffer *eb);
-int set_state_private(struct extent_io_tree *tree, u64 start, u64 private);
-int get_state_private(struct extent_io_tree *tree, u64 start, u64 *private);
+int set_state_private(struct extent_io_tree *tree, u64 start, u64 xprivate);
+int get_state_private(struct extent_io_tree *tree, u64 start, u64 *xprivate);
 struct extent_buffer *find_extent_buffer(struct extent_io_tree *tree,
 u64 bytenr, u32 blocksize);
 struct extent_buffer *find_first_extent_buffer(struct extent_io_tree *tree,
diff --git a/ioctl.h b/ioctl.h
index b7f1ce3..56de39f 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -22,6 +22,10 @@
 #include <linux/ioctl.h>
 #include <time.h>
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 #define BTRFS_IOCTL_MAGIC 0x94
 #define BTRFS_VOL_NAME_MAX 255
 
@@ -439,4 +443,9 @@ struct btrfs_ioctl_clone_range_args {
struct btrfs_ioctl_qgroup_create_args)
 #define BTRFS_IOC_QGROUP_LIMIT _IOR(BTRFS_IOCTL_MAGIC, 43, \
struct btrfs_ioctl_qgroup_limit_args)
+
+#ifdef __cplusplus
+}
+#endif
+
 #endif
diff --git a/list.h b/list.h
index d31090c..50f4619 100644
--- a/list.h
+++ b/list.h
@@ -19,8 +19,8 @@
 #ifndef _LINUX_LIST_H
 #define _LINUX_LIST_H
 
-#define LIST_POISON1  ((void *) 0x00100100)
-#define LIST_POISON2  ((void *) 0x00200200)
+#define LIST_POISON1  ((struct list_head *) 0x00100100)
+#define LIST_POISON2  ((struct list_head *) 0x00200200)
 
 /*
  * Simple doubly linked list implementation.
@@ -54,17 +54,17 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
  * the prev/next entries already!
  */
 #ifndef CONFIG_DEBUG_LIST
-static inline void __list_add(struct list_head *new,
+static inline void __list_add(struct list_head *xnew,
  struct list_head *prev,
  struct list_head *next)
 {
-   next->prev = new;
-   new->next = next;
-   new->prev = prev;
-   prev->next = new;
+   next->prev = xnew;
+   xnew->next = next;
+   xnew->prev = prev;
+   prev->next = xnew;
 }
 #else
-extern void __list_add(struct list_head *new,
+extern void __list_add(struct list_head *xnew,
  struct list_head *prev,
  struct list_head *next);
 #endif
@@ -78,12 +78,12 @@ extern void __list_add(struct list_head *new,
  * This is good for implementing stacks.
  */
 #ifndef CONFIG_DEBUG_LIST
-static inline void list_add(struct list_head *new, struct list_head *head)
+static inline void list_add(struct list_head *xnew, struct list_head *head)
 {
-   __list_add(new, head, head->next);
+   __list_add(xnew, head, head->next);
 }
 #else
-extern void list_add(struct list_head *new, struct list_head *head);
+extern void list_add(struct list_head *xnew, struct list_head *head);
 #endif
 
 
@@ -95,9 +95,9 @@ extern void list_add(struct list_head *new, struct list_head 
*head);
  * Insert a new entry before the specified head.
  * This is useful for implementing queues.

[PATCH 5/5] btrfs-progs: use autotools for building

2013-01-30 Thread Mark Fasheh
From: Jeff Mahoney je...@suse.com

Since we're building shared libraries now, let's not reinvent the wheel.
This also makes packaging libbtrfs much easier. The following (empty) files
are added to satisfy autoconf: AUTHORS ChangeLog NEWS INSTALL.

Changes by Mark Fasheh:
- Fixes to make this patch work with upstream
- I moved the previous contents of INSTALL to README (since that seems
  to make more sense now)

Signed-off-by: Jeff Mahoney je...@suse.com
Signed-off-by: Mark Fasheh mfas...@suse.de
---
 INSTALL  |   59 
 Makefile |  146 
 Makefile.am  |   53 
 README   |   59 
 autogen.sh   |5 +
 configure.ac |6 +
 man/Makefile |   37 -
 man/Makefile.am  |2 +
 man/btrfs-image.8|   34 +
 man/btrfs-image.8.in |   34 -
 man/btrfs-show.8 |   22 +++
 man/btrfs-show.8.in  |   22 ---
 man/btrfs.8  |  365 ++
 man/btrfs.8.in   |  365 --
 man/btrfsck.8|   17 +++
 man/btrfsck.8.in |   17 ---
 man/btrfsctl.8   |   48 +++
 man/btrfsctl.8.in|   48 ---
 man/mkfs.btrfs.8 |   79 +++
 man/mkfs.btrfs.8.in  |   79 ---
 20 files changed, 690 insertions(+), 807 deletions(-)
 create mode 100644 AUTHORS
 create mode 100644 ChangeLog
 delete mode 100644 Makefile
 create mode 100644 Makefile.am
 create mode 100644 NEWS
 create mode 100644 README
 create mode 100755 autogen.sh
 create mode 100644 configure.ac
 delete mode 100644 man/Makefile
 create mode 100644 man/Makefile.am
 create mode 100644 man/btrfs-image.8
 delete mode 100644 man/btrfs-image.8.in
 create mode 100644 man/btrfs-show.8
 delete mode 100644 man/btrfs-show.8.in
 create mode 100644 man/btrfs.8
 delete mode 100644 man/btrfs.8.in
 create mode 100644 man/btrfsck.8
 delete mode 100644 man/btrfsck.8.in
 create mode 100644 man/btrfsctl.8
 delete mode 100644 man/btrfsctl.8.in
 create mode 100644 man/mkfs.btrfs.8
 delete mode 100644 man/mkfs.btrfs.8.in

diff --git a/AUTHORS b/AUTHORS
new file mode 100644
index 000..e69de29
diff --git a/ChangeLog b/ChangeLog
new file mode 100644
index 000..e69de29
diff --git a/INSTALL b/INSTALL
index 6afbd90..e69de29 100644
--- a/INSTALL
+++ b/INSTALL
@@ -1,59 +0,0 @@
-Install Instructions
-
-Btrfs puts snapshots and subvolumes into the root directory of the FS.  This
-directory can only be changed by btrfsctl right now, and normal filesystem
-operations do not work on it.  The default subvolume is called 'default',
-and you can create files and directories in mount_point/default
-
-Btrfs uses libcrc32c in the kernel for file and metadata checksums.  You need
-to compile the kernel with:
-
-CONFIG_LIBCRC32C=m
-
-libcrc32c can be static as well.  Once your kernel is setup, typing make in the
-btrfs module sources will build against the running kernel.  When the build is
-complete:
-
-modprobe libcrc32c
-insmod btrfs.ko
-
-The Btrfs utility programs require libuuid to build.  This can be found
-in the e2fsprogs sources, and is usually available as libuuid or
-e2fsprogs-devel from various distros.
-
-Building the utilities is just make ; make install.  The programs go
-into /usr/local/bin.  The mains commands available are:
-
-mkfs.btrfs: create a filesystem
-
-btrfs: control program to create snapshots and subvolumes:
-   # mount a btrfs filesystem
-   mount /dev/sda2 /mnt
-
-   # create a subvolume
-   btrfs subvolume create /mnt/new_subvol_name
-
-   # snapshot of a subvolume
-   btrfs subvolume snapshot /mnt/default /mnt/snapshot_of_default 
-   btrfs subvolume snapshot /mnt/snapshot_of_default \
-   /mnt/snapshot_of_a_snapshot
-
-   # list of the subvolumes
-   ls /mnt
-   default snapshot_of_a_snapshot snapshot_of_new_subvol
-   new_subvol_name snapshot_of_default
-
-   # removal of a subvolume or a snapshot
-   btrfs subvolume delete /mn/snapshot_of_a_snapshot
-
-   # look a the btrfs man page for further information
-   man btrfs
-
-btrfsck: do a limited check of the FS extent trees.</li>
-
-btrfs-debug-tree: print all of the FS metadata in text form.  Example:
-
-   btrfs-debug-tree /dev/sda2 >& big_output_file
-
-
-
diff --git a/Makefile b/Makefile
deleted file mode 100644
index 27d62c0..000
--- a/Makefile
+++ /dev/null
@@ -1,146 +0,0 @@
-CC = gcc
-AM_CFLAGS = -Wall -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2 
-DBTRFS_FLAT_INCLUDES -fPIC
-CFLAGS = -g -O1
-objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
- root-tree.o dir-item.o file-item.o inode-item.o inode-map.o \
- extent-cache.o extent_io.o volumes.o utils.o btrfslabel.o repair.o \
- qgroup.o
-cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
-  cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
-  

Re: [PATCH] [RFC] include btrfsck in btrfs - including name check

2013-01-30 Thread Ian Kumlien
On Wed, Jan 30, 2013 at 11:59:05PM +0200, Ilya Dryomov wrote:
> On Wed, Jan 30, 2013 at 10:11:44PM +0100, Ian Kumlien wrote:
> > On Wed, Jan 30, 2013 at 12:33:42PM -0800, Filipe Brandenburger wrote:
> > > Hi Ian,
> > >
> > > On Tue, Jan 29, 2013 at 3:03 PM, Ian Kumlien po...@demius.net wrote:
> > > > This patch includes fsck as a subcommand of btrfs, but if you rename
> > > > the binary to btrfsck (or, preferably, use a symlink) it will act like
> > > > the old btrfs command.
> > >
> > > You can rename files in your git (there's git mv for that), only
> > > thing is when you generate the patch with format-patch (or git show,
> > > git diff etc.) pass it the -M option to detect moves and act
> > > appropriately.
> >
> > git send-email seems to send the full diff, diffing against /dev/null =P
> > This is why i skipped that part.
> >
> > > Regarding your patches, I really like the idea of btrfs fsck but I
> > > think I'd prefer to keep the external commands as wrapper scripts
> > > instead of adding busybox-style name detection to btrfs... But then,
> > > that's just my opinion.
> >
> > Well, now both works.
> >
> > > I guess I would have a btrfsck that would simply contain:
> > >
> > > #! /bin/sh
> > > exec btrfs fsck "$@"
> > >
> > > Downside is that error reporting (e.g. invalid syntax, etc.) would
> > > show btrfs fsck instead of the command the user actually typed...
> >
> > Actually it still does, due to how btrfs handles things... It's a simple
> > enough fix and it will make rescue cd's or dracut images, or just about
> > anything.
> >
> > I understand your point, but i think this is a simpler solution =)
>
> FWIW I agree with Filipe, this name detection thing looks ugly to me.
> The merge itself is a good idea, but I think we should stick with shell
> wrappers for everything else.

Which part of it?

	char *func = strrchr(argv[0], '/');
	if (func)
		argv[0] = ++func;

Would you prefer i rewrote it as:
...
	char *func = strrchr(argv[0], '/');
	if (func)
		++func;
	else
		func = argv[0];
...
	if (parse_one_exact_token(func, function_cmd_group, cmd) < 0)
---

Would that be better?

The only thing it actually does is remove any path if present
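
For illustration, the path-stripping step can be sketched as a standalone
helper. This is only a sketch: command_name() is a made-up name for this
example, not something in btrfs-progs, and it assumes argv[0] is a
NUL-terminated string.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the part of argv0 after the last '/',
 * or argv0 itself when it contains no '/'. The input is not modified.
 */
const char *command_name(const char *argv0)
{
	const char *slash;

	assert(argv0 != NULL);
	slash = strrchr(argv0, '/');
	return slash ? slash + 1 : argv0;
}
```

With this, both "/sbin/btrfsck" and "btrfsck" resolve to "btrfsck", so a
symlinked name could be matched against the command table without touching
argv[0] itself.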

I have now added:
btrfs rescue restore [options] device
Restore filesystem
btrfs rescue select-super -s number device
Select a superblock
btrfs rescue dump-super device
Dump a superblock to disk
btrfs rescue debug-tree [options] device
Debug the filesystem

And i'm waiting to rebase my patch series since i need to rewrite the
commit messages if this is wanted changes.

> Thanks,
>
>   Ilya
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Poor performance of btrfs. Suspected unidentified btrfs housekeeping process which writes a lot

2013-01-30 Thread Chris Murphy

On Jan 30, 2013, at 7:57 AM, Adam Ryczkowski adam.ryczkow...@statystyka.net 
wrote:
 
> I suspect it has something to do with snapshots I make for backup. I have 35
> of them, and I ask bedup to find duplicates across all subvolumes.


Assuming most files do have identical duplicates, implies the same file in all 
35 subvolumes is actually in the same physical location; it differs only in 
subvol reference. But it's not btrfs that determines the duplicate vs 
unique state of those 35 file instances, but unison. The fs still must send 
all 35x instances for the state to be determined, as if they were unique files.

Another thing, I'd expect this to scale very poorly if the 35 subvolumes 
contain any appreciable uniqueness, because searches can't be done in parallel. 
So the more subvolumes you add, the more disk contention you get, but also 
enormous amounts of latency as possibly 35 locations on the disk are being 
searched if they happen to be unique.

So in either case duplicate vs unique you have a problem, just different 
kinds. And as the storage grows, it increasingly encounters both problems at 
the same time. Small problem. What size are the files?

And that's on a bare drive before you went and did this:

> My filesystem /dev/vg-adama-docs/lv-adama-docs is 372GB in size, and is a
> quite complex setup:
> It is based on logical volume (LVM2), which has a single physical volume made
> by dm-crypt device /dev/dm-1, which subsequently sits on top of /dev/md1
> linux raid 6, which is built with 4 identical 186GB GPT partitions on each of
> my SATA 3TB hard drives.

Why are you using raid6 for four disks, instead of raid10?  What's the chunk 
size for the raid 6? What's the btrfs leaf size? What's the dedup chunk size?

Why are you using LVM at all, while the /dev/dm-1 is the same size as the LV? 
You say the btrfs volume on LV is on dm-1 which means they're all the same 
size, obviating the need for LVM in this case entirely.

Chris Murphy



[PATCH 01/11] btrfs: remove unused fd in btrfs_ioctl_send()

2013-01-30 Thread Eric Sandeen
All we do is set it to NULL and test it :)

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/send.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 321b7fb..614da0d 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -4536,7 +4536,6 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user 
*arg_)
struct btrfs_fs_info *fs_info;
struct btrfs_ioctl_send_args *arg = NULL;
struct btrfs_key key;
-   struct file *filp = NULL;
struct send_ctx *sctx = NULL;
u32 i;
u64 *clone_sources_tmp = NULL;
@@ -4673,8 +4672,6 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user 
*arg_)
goto out;
 
 out:
-   if (filp)
-   fput(filp);
kfree(arg);
vfree(clone_sources_tmp);
 
-- 
1.7.1



[PATCH 0/11] btrfs: misc fixes and cleanups

2013-01-30 Thread Eric Sandeen
A handful of fixes from looking at Coverity static checker
results, and the surrounding code.  Of varying severity,
but all worthwhile I hope.

Full disclosure: compile-tested only, but nothing too crazy
in here I think.

Thanks,
-Eric

[PATCH 01/11] btrfs: remove unused fd in btrfs_ioctl_send()
[PATCH 02/11] btrfs: list_entry can't return NULL
[PATCH 03/11] btrfs: remove unused fs_info from btrfs_decode_error()
[PATCH 04/11] btrfs: handle null fs_info in btrfs_panic()
[PATCH 05/11] btrfs: annotate intentional switch case fallthroughs
[PATCH 06/11] btrfs: add missing break in btrfs_print_leaf()
[PATCH 07/11] btrfs: fix varargs in __btrfs_std_error
[PATCH 08/11] btrfs: remove unused item in btrfs_insert_delayed_item()
[PATCH 09/11] btrfs: remove unnecessary DEFINE_WAIT() declarations
[PATCH 10/11] btrfs: ensure we don't overrun devices_info[] in __btrfs_alloc_chunk
[PATCH 11/11] btrfs: don't try to notify udev about missing devices


[PATCH 03/11] btrfs: remove unused fs_info from btrfs_decode_error()

2013-01-30 Thread Eric Sandeen
Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/super.c |9 -
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d8982e9..e933a5f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -63,8 +63,7 @@
 static const struct super_operations btrfs_super_ops;
 static struct file_system_type btrfs_fs_type;
 
-static const char *btrfs_decode_error(struct btrfs_fs_info *fs_info, int errno,
- char nbuf[16])
+static const char *btrfs_decode_error(int errno, char nbuf[16])
 {
char *errstr = NULL;
 
@@ -152,7 +151,7 @@ void __btrfs_std_error(struct btrfs_fs_info *fs_info, const 
char *function,
if (errno == -EROFS && (sb->s_flags & MS_RDONLY))
return;
 
-   errstr = btrfs_decode_error(fs_info, errno, nbuf);
+   errstr = btrfs_decode_error(errno, nbuf);
if (fmt) {
struct va_format vaf = {
.fmt = fmt,
@@ -261,7 +260,7 @@ void __btrfs_abort_transaction(struct btrfs_trans_handle 
*trans,
char nbuf[16];
const char *errstr;
 
-   errstr = btrfs_decode_error(root->fs_info, errno, nbuf);
+   errstr = btrfs_decode_error(errno, nbuf);
btrfs_printk(root->fs_info,
 "%s:%d: Aborting unused transaction(%s).\n",
 function, line, errstr);
@@ -289,7 +288,7 @@ void __btrfs_panic(struct btrfs_fs_info *fs_info, const 
char *function,
va_start(args, fmt);
vaf.va = args;
 
-   errstr = btrfs_decode_error(fs_info, errno, nbuf);
+   errstr = btrfs_decode_error(errno, nbuf);
if (fs_info->mount_opt & BTRFS_MOUNT_PANIC_ON_FATAL_ERROR)
panic(KERN_CRIT "BTRFS panic (device %s) in %s:%d: %pV (%s)\n",
s_id, function, line, &vaf, errstr);
-- 
1.7.1



[PATCH 04/11] btrfs: handle null fs_info in btrfs_panic()

2013-01-30 Thread Eric Sandeen
At least backref_tree_panic() can apparently pass
in a null fs_info, so handle that in __btrfs_panic
to get the message out on the console.

The btrfs_panic macro also uses fs_info, but that's
largely pointless; it's testing to see if
BTRFS_MOUNT_PANIC_ON_FATAL_ERROR is not set.
But if it *were* set, __btrfs_panic() would have,
well, paniced and we wouldn't be here, testing it!
So just BUG() at this point.

And since we only use fs_info once now, just use it
directly.

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/ctree.h |9 ++---
 fs/btrfs/super.c |2 +-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 547b7b0..57121fc 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3620,11 +3620,14 @@ __printf(5, 6)
 void __btrfs_panic(struct btrfs_fs_info *fs_info, const char *function,
   unsigned int line, int errno, const char *fmt, ...);
 
+/*
+ * If BTRFS_MOUNT_PANIC_ON_FATAL_ERROR is in mount_opt, __btrfs_panic
+ * will panic().  Otherwise we BUG() here.
+ */
 #define btrfs_panic(fs_info, errno, fmt, args...)  \
 do {   \
-   struct btrfs_fs_info *_i = (fs_info);   \
-   __btrfs_panic(_i, __func__, __LINE__, errno, fmt, ##args);  \
-   BUG_ON(!(_i->mount_opt & BTRFS_MOUNT_PANIC_ON_FATAL_ERROR));\
+   __btrfs_panic(fs_info, __func__, __LINE__, errno, fmt, ##args); \
+   BUG();  \
 } while (0)
 
 /* acl.c */
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index e933a5f..d5e7e18 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -289,7 +289,7 @@ void __btrfs_panic(struct btrfs_fs_info *fs_info, const 
char *function,
vaf.va = args;
 
errstr = btrfs_decode_error(errno, nbuf);
-   if (fs_info->mount_opt & BTRFS_MOUNT_PANIC_ON_FATAL_ERROR)
+   if (fs_info && (fs_info->mount_opt & BTRFS_MOUNT_PANIC_ON_FATAL_ERROR))
panic(KERN_CRIT "BTRFS panic (device %s) in %s:%d: %pV (%s)\n",
s_id, function, line, &vaf, errstr);
 
-- 
1.7.1



[PATCH 10/11] btrfs: ensure we don't overrun devices_info[] in __btrfs_alloc_chunk

2013-01-30 Thread Eric Sandeen
WARN_ON isn't enough, we need to stop the loop if for any reason
we would overrun the devices_info array.

I tried to track down the connection between the length of
the alloc_devices list and the rw_devices counter but
it wasn't immediately obvious, so be defensive about it.
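
As a rough userspace sketch of the pattern (not the kernel code): the
capacity check runs before the store, so the array cannot be overrun no
matter how the list length relates to the counter. collect_candidates()
and its parameters are made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the devices_info[] fill loop: copy usable
 * candidates into out[] and stop before writing past its capacity.
 * Returns the number of entries stored.
 */
size_t collect_candidates(const int *candidates, size_t ncand,
			  int *out, size_t capacity)
{
	size_t n = 0;
	size_t i;

	assert(candidates != NULL && out != NULL);
	for (i = 0; i < ncand; i++) {
		if (candidates[i] < 0)
			continue;	/* unusable candidate, skip it */
		if (n == capacity)
			break;		/* full: check before the store, not after */
		out[n++] = candidates[i];
	}
	return n;
}
```

A check placed after the increment would only report the overrun once the
out-of-bounds write had already happened.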

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/volumes.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 15f6efd..09c63ac 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3630,12 +3630,16 @@ static int __btrfs_alloc_chunk(struct 
btrfs_trans_handle *trans,
if (max_avail < BTRFS_STRIPE_LEN * dev_stripes)
continue;
 
+   if (ndevs == fs_devices->rw_devices) {
+   WARN(1, "%s: found more than %llu devices\n",
+__func__, fs_devices->rw_devices);
+   break;
+   }
devices_info[ndevs].dev_offset = dev_offset;
devices_info[ndevs].max_avail = max_avail;
devices_info[ndevs].total_avail = total_avail;
devices_info[ndevs].dev = device;
++ndevs;
-   WARN_ON(ndevs > fs_devices->rw_devices);
}
 
/*
-- 
1.7.1



[PATCH 08/11] btrfs: remove unused item in btrfs_insert_delayed_item()

2013-01-30 Thread Eric Sandeen
item was set but never used in this function.

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/delayed-inode.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 3483603..092c680 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -875,7 +875,6 @@ static int btrfs_insert_delayed_item(struct 
btrfs_trans_handle *trans,
 struct btrfs_delayed_item *delayed_item)
 {
struct extent_buffer *leaf;
-   struct btrfs_item *item;
char *ptr;
int ret;
 
@@ -886,7 +885,6 @@ static int btrfs_insert_delayed_item(struct 
btrfs_trans_handle *trans,
 
leaf = path->nodes[0];
 
-   item = btrfs_item_nr(leaf, path->slots[0]);
ptr = btrfs_item_ptr(leaf, path->slots[0], char);
 
write_extent_buffer(leaf, delayed_item->data, (unsigned long)ptr,
-- 
1.7.1



[PATCH 02/11] btrfs: list_entry can't return NULL

2013-01-30 Thread Eric Sandeen
No need to test the result, we can't get a
null pointer from list_entry()
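
To see why, here is a minimal userspace sketch. list_entry() is plain
pointer arithmetic on the member address, so for a node that really is
embedded in a struct it yields the address of that struct, never NULL.
struct transaction and transaction_of() are hypothetical names for this
example, and the macro is a simplified version of the kernel one without
its type checking.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Subtract the member's offset from the member's address. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical container type, used only for this example. */
struct transaction {
	int id;
	struct list_head list;
};

/* Return the transaction that embeds the given list node. */
struct transaction *transaction_of(struct list_head *node)
{
	assert(node != NULL);
	return list_entry(node, struct transaction, list);
}
```

The emptiness test therefore belongs to list_empty(), which the loop
already performs before calling list_entry().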

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/disk-io.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a8f652d..d89da40 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3823,8 +3823,6 @@ int btrfs_cleanup_transaction(struct btrfs_root *root)
 
while (!list_empty(&list)) {
t = list_entry(list.next, struct btrfs_transaction, list);
-   if (!t)
-   break;
 
btrfs_destroy_ordered_operations(root);
 
-- 
1.7.1



[PATCH 05/11] btrfs: annotate intentional switch case fallthroughs

2013-01-30 Thread Eric Sandeen
This keeps static checkers happy.

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/ctree.c |1 +
 fs/btrfs/super.c |1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index eea5da7..ac4a424 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1138,6 +1138,7 @@ __tree_mod_log_rewind(struct extent_buffer *eb, u64 
time_seq,
switch (tm->op) {
case MOD_LOG_KEY_REMOVE_WHILE_FREEING:
BUG_ON(tm->slot < n);
+   /* Fallthrough */
case MOD_LOG_KEY_REMOVE_WHILE_MOVING:
case MOD_LOG_KEY_REMOVE:
btrfs_set_node_key(eb, &tm->key, tm->slot);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d5e7e18..1dd2d86 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -437,6 +437,7 @@ int btrfs_parse_options(struct btrfs_root *root, char 
*options)
case Opt_compress_force:
case Opt_compress_force_type:
compress_force = true;
+   /* Fallthrough */
case Opt_compress:
case Opt_compress_type:
if (token == Opt_compress ||
-- 
1.7.1



[PATCH 06/11] btrfs: add missing break in btrfs_print_leaf()

2013-01-30 Thread Eric Sandeen
I don't think that BTRFS_DEV_EXTENT_KEY is supposed
to fall through to BTRFS_DEV_STATS_KEY ...

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/print-tree.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c
index 50d95fd..920957e 100644
--- a/fs/btrfs/print-tree.c
+++ b/fs/btrfs/print-tree.c
@@ -294,6 +294,7 @@ void btrfs_print_leaf(struct btrfs_root *root, struct 
extent_buffer *l)
   btrfs_dev_extent_chunk_offset(l, dev_extent),
   (unsigned long long)
   btrfs_dev_extent_length(l, dev_extent));
+   break;
case BTRFS_DEV_STATS_KEY:
printk(KERN_INFO "\t\tdevice stats\n");
break;
-- 
1.7.1



[PATCH 09/11] btrfs: remove unnecessary DEFINE_WAIT() declarations

2013-01-30 Thread Eric Sandeen
No point in DEFINE_WAIT(wait) if it's not used!

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/extent-tree.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a8b8adc..0bb3424 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5478,7 +5478,6 @@ wait_block_group_cache_progress(struct 
btrfs_block_group_cache *cache,
u64 num_bytes)
 {
struct btrfs_caching_control *caching_ctl;
-   DEFINE_WAIT(wait);
 
caching_ctl = get_caching_control(cache);
if (!caching_ctl)
@@ -5495,7 +5494,6 @@ static noinline int
 wait_block_group_cache_done(struct btrfs_block_group_cache *cache)
 {
struct btrfs_caching_control *caching_ctl;
-   DEFINE_WAIT(wait);
 
caching_ctl = get_caching_control(cache);
if (!caching_ctl)
-- 
1.7.1



[PATCH 07/11] btrfs: fix varargs in __btrfs_std_error

2013-01-30 Thread Eric Sandeen
__btrfs_std_error didn't always properly call va_end,
and might call va_start even if fmt was NULL.

Move all the varargs handling into the block where we
have fmt.
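
The resulting shape, sketched as a userspace analogue rather than the
kernel function: va_start() and va_end() are paired inside the branch
that actually has a format string. format_error() and its buffer sizes
are assumptions made for this example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical analogue of the reworked error path: write "errstr" or
 * "errstr (detail)" into buf and return what snprintf() reports.
 * The va_list only exists inside the fmt branch.
 */
int format_error(char *buf, size_t len, const char *errstr,
		 const char *fmt, ...)
{
	assert(buf != NULL && len > 0 && errstr != NULL);

	if (fmt) {
		char detail[128];
		va_list args;

		va_start(args, fmt);
		vsnprintf(detail, sizeof(detail), fmt, args);
		va_end(args);	/* always reached once va_start() has run */
		return snprintf(buf, len, "%s (%s)", errstr, detail);
	}
	return snprintf(buf, len, "%s", errstr);
}
```

Keeping the va_list local to that block means no return path can leave
it started but not ended.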

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/super.c |   14 +++---
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 1dd2d86..fe3c799 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -141,8 +141,6 @@ void __btrfs_std_error(struct btrfs_fs_info *fs_info, const 
char *function,
struct super_block *sb = fs_info->sb;
char nbuf[16];
const char *errstr;
-   va_list args;
-   va_start(args, fmt);
 
/*
 * Special case: if the error is EROFS, and we're already
@@ -153,13 +151,16 @@ void __btrfs_std_error(struct btrfs_fs_info *fs_info, 
const char *function,
 
errstr = btrfs_decode_error(errno, nbuf);
if (fmt) {
-   struct va_format vaf = {
-   .fmt = fmt,
-   .va = args,
-   };
+   struct va_format vaf;
+   va_list args;
+
+   va_start(args, fmt);
+   vaf.fmt = fmt;
+   vaf.va = args;
 
printk(KERN_CRIT BTRFS error (device %s) in %s:%d: %s (%pV)\n,
sb-s_id, function, line, errstr, vaf);
+   va_end(args);
} else {
printk(KERN_CRIT BTRFS error (device %s) in %s:%d: %s\n,
sb-s_id, function, line, errstr);
@@ -170,7 +171,6 @@ void __btrfs_std_error(struct btrfs_fs_info *fs_info, const 
char *function,
save_error_info(fs_info);
btrfs_handle_error(fs_info);
}
-   va_end(args);
 }
 
 static const char * const logtypes[] = {
-- 
1.7.1
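
The scoping this patch moves to, with va_start and va_end paired inside the
only branch that reads the arguments, can be sketched in userspace C. This is
a minimal sketch and not the kernel code: format_error() and its buffer
parameters are hypothetical, and vsnprintf() stands in for printk's %pV
handling.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for an error reporter that takes an
 * optional printf-style format. The va_list lives entirely inside the
 * branch that consumes it, so every va_start has a matching va_end and
 * neither runs when fmt is NULL.
 */
static int format_error(char *buf, size_t len, const char *errstr,
			const char *fmt, ...)
{
	assert(buf != NULL && len > 0 && errstr != NULL);

	if (fmt) {
		char detail[128];
		va_list args;

		va_start(args, fmt);
		vsnprintf(detail, sizeof(detail), fmt, args);
		va_end(args);

		return snprintf(buf, len, "error: %s (%s)", errstr, detail);
	}

	return snprintf(buf, len, "error: %s", errstr);
}
```

A caller would pass NULL as fmt when there is no extra detail to report, for
example format_error(buf, sizeof(buf), "Readonly filesystem", NULL).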



[PATCH 11/11] btrfs: don't try to notify udev about missing devices

2013-01-30 Thread Eric Sandeen
If we remove a missing device, bdev is null, and if we
send that off to btrfs_kobject_uevent we'll panic.

Signed-off-by: Eric Sandeen sand...@redhat.com
---
 fs/btrfs/volumes.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 09c63ac..bdd6962 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1556,7 +1556,8 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path)
 	ret = 0;
 
 	/* Notify udev that device has changed */
-	btrfs_kobject_uevent(bdev, KOBJ_CHANGE);
+	if (bdev)
+		btrfs_kobject_uevent(bdev, KOBJ_CHANGE);
 
 error_brelse:
 	brelse(bh);
-- 
1.7.1



Re: [PATCH 0/11] btrfs: misc fixes and cleanups

2013-01-30 Thread Zach Brown
On Wed, Jan 30, 2013 at 06:54:51PM -0600, Eric Sandeen wrote:
> A handful of fixes from looking at Coverity static checker
> results, and the surrounding code.  Of varying severity,
> but all worthwhile I hope.
>
> Full disclosure: compile-tested only, but nothing too crazy
> in here I think.

Yeah, this all seems pretty reasonable to me.

Signed-off-by: Zach Brown z...@redhat.com

- z


Re: Poor performance of btrfs. Suspected unidentified btrfs housekeeping process which writes a lot

2013-01-30 Thread Adam Ryczkowski

Thank you, Chris, for your time.


On 2013-01-31 00:58, Chris Murphy wrote:

> On Jan 30, 2013, at 7:57 AM, Adam Ryczkowski <adam.ryczkow...@statystyka.net> wrote:
>
>> I suspect it has something to do with snapshots I make for backup. I have 35 of
>> them, and I ask bedup to find duplicates across all subvolumes.
>
> Assuming most files do have identical duplicates, implies the same file in all 35 subvolumes is
> actually in the same physical location; it differs only in subvol reference. But it's not btrfs
> that determines the duplicate vs unique state of those 35 file instances,
> but unison. The fs still must send all 35x instances for the state to be determined, as if they
> were unique files.
I'm sorry I didn't put my question more clearly. What I tried to say is
that the problem is not specific to unison; I can reproduce it using
other means of reading file contents. I tried 'cat' on many small files,
and previewing some large ones under Midnight Commander. I didn't take
precise measurements, but I can tell that reading 500 50-byte files
(ca. 25kB of data) took way longer than reading one 3MB file, so I
suspect the problem is with metadata access times rather than with data.

I am aware that reading 1MB distributed in small files takes longer
than 1MB of sequential reading. The problem is that _suddenly_ this
got at least 20 times slower than usual. And from what iotop and
systat told me, the hard drives were busy _writing_ something, not
_reading_! The time I wait for unison to scan the whole hard drive
is comparable to the time a full balance takes.

Anyway, I synchronize only the working copy part of my file system.
All the backup subvolumes sit in a separate path, not seen by unison.
Moreover, once I wait long enough for the system to finish scanning the
file system, file access speeds are back to normal, even after I drop
the read cache or even reboot the system. It is only after making another
snapshot that the problem recurs.

> Another thing, I'd expect this to scale very poorly if the 35 subvolumes
> contain any appreciable uniqueness, because searches can't be done in parallel.
> So the more subvolumes you add, the more disk contention you get, but also
> enormous amounts of latency as possibly 35 locations on the disk are being
> searched if they happen to be unique.


*The severity of my problem is proportional to time*. It happens
immediately after making a snapshot, and persists for each file until I try
to read its contents. Then, even after a reboot, timing is back to
normal. With my limited knowledge of the internals of btrfs, I
suspect that bedup has messed up my metadata somehow. Maybe I should
balance only the metadata part (if that is possible at all)?

> So in either case duplicate vs unique you have a problem, just different
> kinds. And as the storage grows, it increasingly encounters both problems at the same time. Small
> problem. What size are the files?
>
> And that's on a bare drive before you went and did this:
>
>> My filesystem /dev/vg-adama-docs/lv-adama-docs is 372GB in size, and is a quite
>> complex setup:
>> It is based on logical volume (LVM2), which has a single physical volume made
>> by dm-crypt device /dev/dm-1, which subsequently sits on top of /dev/md1 linux
>> raid 6, which is built with 4 identical 186GB GPT partitions on each of my SATA
>> 3TB hard drives.
>
> Why are you using raid6 for four disks, instead of raid10?
Because I plan to add another 4 in the future. It's way easier to add 
another disk to the array, than to change the RAID layout.

> What's the chunk size for the raid 6? What's the btrfs leaf size? What's the
> dedup chunk size?
I'll tell you tomorrow, but I hardly think that the misalignment could 
be any problem here. As I said, everything was fine and the problem 
didn't appear in gradual fashion.

> Why are you using LVM at all, while the /dev/dm-1 is the same size as the LV?
> You say the btrfs volume on LV is on dm-1 which means they're all the same
> size, obviating the need for LVM in this case entirely.
Yes, I agree that at the moment I don't need it. But when the partition
sits on a logical volume I keep the option to extend the filesystem when
the need comes.
My current needs are more complex: I don't keep all the data at the same
redundancy and security level. It is also hard to tell in advance the
relative sizes of each combination of redundancy and security levels. So
I allocate only as much space on the GPT partitions as I immediately
need, and in the future, when the need comes, I can relatively easily make
more partitions, arrange them in the appropriate raid/mdcrypt
combination, and expand the filesystem that ran out of space.

I am aware that this setup is very complex. I can say that my
application is not life-critical, and this complexity serves me well on
another Linux server, which I have been using for over 5 years (without
btrfs, of course).




> Chris Murphy




--

Adam Ryczkowski

Re: Poor performance of btrfs. Suspected unidentified btrfs housekeeping process which writes a lot

2013-01-30 Thread Chris Murphy

On Jan 30, 2013, at 6:02 PM, Adam Ryczkowski adam.ryczkow...@statystyka.net 
wrote:

> I didn't take precise measurements, but I can tell, that reading 500 50-byte
> files (ca. 25kB of data) took way longer than reading one 3MB file, so I
> suspect the problem is with metadata access times rather than with data.

For 50 byte files, btrfs writes the data with metadata. Depending on their 
location relative to each other, this could mean 250MB of reads because of the 
large raid6 chunk size, yet only ~ 2MB is needed by btrfs.
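
One way to arrive at figures of this size, as a rough sketch only: assume
each of the 500 small files costs one whole I/O unit in the worst case. The
unit sizes used here are assumptions, since neither is given in this thread:
a 512 KiB md chunk and a 4 KiB btrfs leaf. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case bytes transferred when each of nr_files small files costs
 * one whole I/O unit (for example an md chunk or a btrfs leaf).
 */
static uint64_t worst_case_read_bytes(uint64_t nr_files, uint64_t unit_bytes)
{
	/* Guard against overflow of the product. */
	assert(unit_bytes == 0 || nr_files <= UINT64_MAX / unit_bytes);
	return nr_files * unit_bytes;
}
```

Under those assumptions, 500 files at 512 KiB each come to 250 MiB, and 500
files at 4 KiB each come to 2,048,000 bytes, or roughly 2 MB.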


> I am aware, that reading 1MB distributed in small files takes longer than 1MB
> of sequential reading. The problem is that _suddenly_ this got at
> least 20 times slower than usual.

How does dedup work on 50 byte files? How does it contribute to fragmentation? 
And then how does that fragmentation turn into gross read inefficiencies at the 
md chunk level?


> And from what iotop and systat told me, the harddrives were busy _writing_
> something, not _reading_!

Seems like you need to find out what's being written, how many and how big the
requests are. Small writes mean a huge RMW (read-modify-write) penalty on raid6,
especially a 4 disk raid 6 where you're practically guaranteed to have either
data or metadata request halted for a parity rewrite.

 
> Anyway, I synchronize only the working copy part of my file system. All the
> backup subvolumes sit in a separate path, not seen by the unison.

You're syncing what to what, in physical terms? I know one of the whats is a
btrfs volume on top of LVM, on top of LUKS, on top of md raid6, on top of
partitions located on four 3TB drives. You said there are other partitions on
these drives so are there other read/writes occurring on those drives at the
same time? It doesn't look like that's the case from iotop, the md0


> Moreover, once I wait long enough for the system to finish scanning the file
> system, file access speeds are back to normal, even after I drop read cache
> or even reboot the system. It is only after making another snapshot that the
> problem recurs.
>> Another thing, I'd expect this to scale very poorly if the 35 subvolumes
>> contain any appreciable uniqueness, because searches can't be done in
>> parallel. So the more subvolumes you add, the more disk contention you get,
>> but also enormous amounts of latency as possibly 35 locations on the disk
>> are being searched if they happen to be unique.
>
> *The severity of my problem is proportional to time*. It happens immediately
> after making a snapshot, and persists for each file until I try to read its
> contents. Then, even after the reboot, timing is back to normal. With my
> limited knowledge about the internals of btrfs I suspect, that the bedup has
> messed my metadata somehow. Maybe I should balance only the metadata part (if
> that is possible at all)?

It's possible to balance just metadata chunks. But I think this is a spaghetti 
on the wall approach, rather than understanding how all of these layers are 
interacting with each other.
https://btrfs.wiki.kernel.org/index.php/Balance_Filters

 
>> Why are you using raid6 for four disks, instead of raid10?
> Because I plan to add another 4 in the future. It's way easier to add another
> disk to the array, than to change the RAID layout.

If this is happening imminently perhaps, in the meantime you have a terribly 
inefficient raid setup.

>> What's the chunk size for the raid 6? What's the btrfs leaf size? What's the
>> dedup chunk size?
> I'll tell you tomorrow, but I hardly think that the misalignment could be any
> problem here. As I said, everything was fine and the problem didn't appear in
> gradual fashion.

It also depends on what mysterious stuff is being written during what's 
ostensibly a read only event.


>> Why are you using LVM at all, while the /dev/dm-1 is the same size as the
>> LV? You say the btrfs volume on LV is on dm-1 which means they're all the
>> same size, obviating the need for LVM in this case entirely.
> Yes, I agree, that at the moment I don't need it. But when partition sits on
> logical volume I keep the option to extend the filesystem, when the need
> comes.

This is not an ideal way to extend a btrfs file system however. You're adding
unnecessary layers and complexity while also not taking advantage of what LVM
can do that btrfs cannot when it comes to logical volume management.


> My current needs are more complex, I don't keep all the data in the same
> redundancy and security level. It is also hard to tell in advance the
> relative sizes of each combination of redundancy and security levels. So I
> allocate only as much space on the GPT partitions as I immediately need, and
> in the future, when need comes, I can relatively easily make more partitions,
> arrange them in the appropriate raid/mdcrypt combination, and expand the
> filesystem that ran out of space.

It sounds unnecessarily complex, but what do I know. Hopefully you have 
everything backed up to something that is comparatively simple. 

Re: [PATCH 10/12] Btrfs-progs: add show subcommand to subvol cli

2013-01-30 Thread Anand Jain


Wang,


+   ret = 0;
+   /* print the info */

 I think it will be better if you can move the following
 printing to a function..it will make the code more clear and
 readable..


 Thanks for looking into this. However IMO there is no need
 as of now.  This can be taken when there is more reasonable
 need. I hope you would agree with me.

-Anand


[BUG] kernel BUG at fs/btrfs/async-thread.c:605!

2013-01-30 Thread Tsutomu Itoh
Hi,

In kernel 3.8-rc5, the following panic occurred when mounting with the
degraded option.

# btrfs fi sh /dev/sdc8
Label: none  uuid: fc63cd80-5ae2-4fbe-8795-2d526c937a56
        Total devices 3 FS bytes used 20.98GB
        devid    1 size 9.31GB used 9.31GB path /dev/sdd8
        devid    2 size 9.31GB used 9.31GB path /dev/sdc8
        *** Some devices missing

Btrfs v0.20-rc1-37-g91d9eec
# mount -o degraded /dev/sdc8 /test1

 564 static struct btrfs_worker_thread *find_worker(struct btrfs_workers *workers)
 565 {
...
...
 595 fallback:
 596         fallback = NULL;
 597         /*
 598          * we have failed to find any workers, just
 599          * return the first one we can find.
 600          */
 601         if (!list_empty(&workers->worker_list))
 602                 fallback = workers->worker_list.next;
 603         if (!list_empty(&workers->idle_list))
 604                 fallback = workers->idle_list.next;
 605         BUG_ON(!fallback);  <-- this !
 606         worker = list_entry(fallback,
 607                   struct btrfs_worker_thread, worker_list);

-Tsutomu

===

[ 7913.075890] btrfs: allowing degraded mounts
[ 7913.075893] btrfs: disk space caching is enabled
[ 7913.092031] Btrfs: too many missing devices, writeable mount is not allowed
[ 7913.092297] ------------[ cut here ]------------
[ 7913.092313] kernel BUG at fs/btrfs/async-thread.c:605!
[ 7913.092326] invalid opcode:  [#1] SMP
[ 7913.092342] Modules linked in: btrfs zlib_deflate crc32c libcrc32c nfsd 
lockd nfs_acl auth_rpcgss sunrpc 8021q garp stp llc cpufreq_ondemand cachefiles 
fscache ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod uinput ppdev 
iTCO_wdt iTCO_vendor_support parport_pc parport sg acpi_cpufreq freq_table 
mperf coretemp kvm pcspkr i2c_i801 i2c_core lpc_ich mfd_core tg3 ptp pps_core 
shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sr_mod cdrom 
sd_mod crc_t10dif pata_acpi ata_piix libata megaraid_sas scsi_mod floppy [last 
unloaded: microcode]
[ 7913.092575] CPU 0
[ 7913.092584] Pid: 3673, comm: btrfs-endio-wri Not tainted 3.8.0-rc5 #1 
FUJITSU-SV  PRIMERGY/D2399
[ 7913.092608] RIP: 0010:[a04670ef]  [a04670ef] 
btrfs_queue_worker+0x10e/0x236 [btrfs]
[ 7913.092663] RSP: 0018:88019fc03c10  EFLAGS: 00010046
[ 7913.092676] RAX:  RBX: 8801967b8a58 RCX: 
[ 7913.092894] RDX:  RSI: 8801961239b8 RDI: 8801967b8ab8
[ 7913.093116] RBP: 88019fc03c50 R08:  R09: 880198801180
[ 7913.093247] R10: a045fda7 R11: 0003 R12: 
[ 7913.093247] R13: 8801961239b8 R14: 8801967b8ab8 R15: 0246
[ 7913.093247] FS:  () GS:88019fc0() 
knlGS:
[ 7913.093247] CS:  0010 DS:  ES:  CR0: 8005003b
[ 7913.093247] CR2: ff600400 CR3: 00019575d000 CR4: 07f0
[ 7913.093247] DR0:  DR1:  DR2: 
[ 7913.093247] DR3:  DR6: 0ff0 DR7: 0400
[ 7913.093247] Process btrfs-endio-wri (pid: 3673, threadinfo 8801939ca000, 
task 880195795b00)
[ 7913.093247] Stack:
[ 7913.093247]  8801967b8a88 8801967b8a78 88003fa0a600 
8801965ad0c0
[ 7913.093247]  88003fa0a600   

[ 7913.096183]  88019fc03c60 a043e357 88019fc03c70 
811526aa
[ 7913.096183] Call Trace:
[ 7913.096183]  IRQ
[ 7913.096183]
[ 7913.096183]  [a043e357] end_workqueue_bio+0x79/0x7b [btrfs]
[ 7913.096183]  [811526aa] bio_endio+0x2d/0x2f
[ 7913.096183]  [a045fdb2] btrfs_end_bio+0x10b/0x122 [btrfs]
[ 7913.096183]  [811526aa] bio_endio+0x2d/0x2f
[ 7913.096183]  [811c5e3f] req_bio_endio+0x96/0x9f
[ 7913.096183]  [811c601d] blk_update_request+0x1d5/0x3a4
[ 7913.096183]  [811c620c] blk_update_bidi_request+0x20/0x6f
[ 7913.096183]  [811c7a59] blk_end_bidi_request+0x1f/0x5d
[ 7913.096183]  [811c7ad3] blk_end_request+0x10/0x12
[ 7913.096183]  [a001db50] scsi_io_completion+0x207/0x4f3 [scsi_mod]
[ 7913.096183]  [a0016df9] scsi_finish_command+0xec/0xf5 [scsi_mod]
[ 7913.096183]  [a001df50] scsi_softirq_done+0xff/0x108 [scsi_mod]
[ 7913.096183]  [811ccb3a] blk_done_softirq+0x7a/0x8e
[ 7913.096183]  [810475c3] __do_softirq+0xd7/0x1ed
[ 7913.096183]  [813ead9c] call_softirq+0x1c/0x30
[ 7913.096183]  [81010ab6] do_softirq+0x46/0x83
[ 7913.096183]  [81047363] irq_exit+0x49/0xb7
[ 7913.096183]  [813eafd5] do_IRQ+0x9d/0xb4
[ 7913.096183]  [a0467217] ? btrfs_queue_worker+0x236/0x236 [btrfs]
[ 7913.096183]  [813e2a2d] common_interrupt+0x6d/0x6d
[ 7913.096183]  EOI
[ 7913.096183]
[ 

Re: [RFC] Move btrfsck in to the btrfs command

2013-01-30 Thread Roman Mamedov
On Wed, 30 Jan 2013 00:03:52 +0100
Ian Kumlien po...@demius.net wrote:

> This patch moves btrfsck in to btrfs fsck.

Does the ...fs fs.. combination look less than ideally beautiful to anyone
else? That's Filesystem abbreviated two times right there.
Who can use an ATM Machine? Someone who knows a PIN Number.

Also it's a complete clstrfck of nine consonants, with no vowels.

How about btrfs check?

-- 
With respect,
Roman

~~~
Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free.




Re: [RFC] Move btrfsck in to the btrfs command

2013-01-30 Thread Chris Murphy

On Jan 30, 2013, at 9:10 PM, Roman Mamedov r...@romanrm.ru wrote:
 
> How about btrfs check?

For that matter, separate out check and repair. Is there a potential for a 
btrfsck repair to make things worse? And if so, could this be determined with a 
check?


Chris Murphy


[PATCH] Btrfs-progs: man btrfs: group subcommands together

2013-01-30 Thread Anand Jain
David,

   Kindly accept this patch for your Btrfs-progs
   integration-20130130 branch. This patch updates
   only man btrfs and is a very low risk category.

Thanks, Anand


Anand Jain (1):
  Btrfs-progs: man btrfs: subcommands must be grouped together

 man/btrfs.8.in | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

-- 
1.8.1.227.g44fe835



[PATCH] Btrfs-progs: man btrfs: subcommands must be grouped together

2013-01-30 Thread Anand Jain

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 man/btrfs.8.in | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index c8fdc15..8d3ecba 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -19,6 +19,8 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBsubvolume show\fP\fI path\fP
 .PP
+\fBbtrfs\fP \fBsubvolume find-new\fP\fI subvolume last_gen\fP
+.PP
 \fBbtrfs\fP \fBfilesystem defragment\fP -c[zlib|lzo] [-l \fIlen\fR] \
 [-s \fIstart\fR] [-t \fIsize\fR] -[vf] \fIfile\fR|\fIdir\fR \
 [\fIfile\fR|\fIdir\fR...]
@@ -29,8 +31,6 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBfilesystem label\fP\fI dev [newlabel]\fP
 .PP
-\fBbtrfs\fP \fBsubvolume find-new\fP\fI subvolume last_gen\fP
-.PP
 \fBbtrfs\fP \fBfilesystem balance\fP\fI path \fP
 .PP
 \fBbtrfs\fP \fBdevice scan\fP\fI [--all-devices|device [device...]]\fP
@@ -172,6 +172,10 @@ is similar to \fBsubvolume list\fR command.
 Show information of a given subvolume in the \fIpath\fR.
 .TP
 
+\fBsubvolume find-new\fR\fI subvolume last_gen\fR
+List the recently modified files in a subvolume, after \fIlast_gen\fR ID.
+.TP
+
 \fBfilesystem defragment\fP -c[zlib|lzo] [-l \fIlen\fR] [-s \fIstart\fR] \
 [-t \fIsize\fR] -[vf] \fIfile\fR|\fIdir\fR [\fIfile\fR|\fIdir\fR...]
 
@@ -201,10 +205,6 @@ don't use it if you use snapshots, have de-duplicated your 
data or made
 copies with \fBcp --reflink\fP.
 .TP
 
-\fBsubvolume find-new\fR\fI subvolume last_gen\fR
-List the recently modified files in a subvolume, after \fIlast_gen\fR ID.
-.TP
-
 \fBfilesystem sync\fR\fI path \fR
 Force a sync for the filesystem identified by \fIpath\fR.
 .TP
-- 
1.8.1.227.g44fe835



Re: [PATCH 5/5] Btrfs: fix remount vs autodefrag

2013-01-30 Thread Miao Xie
Any comments about this patch?

Thanks
Miao

On mon, 26 Nov 2012 17:28:13 +0800, Miao Xie wrote:
> If we remount the fs to close the auto defragment or make the fs R/O, we should
> stop the auto defragment.
>
> Signed-off-by: Miao Xie mi...@cn.fujitsu.com
> ---
>  fs/btrfs/ctree.h |  1 +
>  fs/btrfs/file.c  | 13 +
>  fs/btrfs/super.c | 29 +
>  3 files changed, 43 insertions(+)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 4ce24ce..01d671c 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -1759,6 +1759,7 @@ struct btrfs_ioctl_defrag_range_args {
>  
>  #define btrfs_clear_opt(o, opt)		((o) &= ~BTRFS_MOUNT_##opt)
>  #define btrfs_set_opt(o, opt)		((o) |= BTRFS_MOUNT_##opt)
> +#define btrfs_raw_test_opt(o, opt)	((o) & BTRFS_MOUNT_##opt)
>  #define btrfs_test_opt(root, opt)	((root)->fs_info->mount_opt & \
>  					 BTRFS_MOUNT_##opt)
>  /*
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 40b17d0..7aaae56 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -320,8 +320,21 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
>  	range.start = defrag->last_offset;
>  
>  	sb_start_write(fs_info->sb);
> +
> +	/* Avoid defraging files on R/O fs */
> +	if (!down_write_trylock(&fs_info->sb->s_umount)) {
> +		sb_end_write(fs_info->sb);
> +		btrfs_requeue_inode_defrag(inode, defrag);
> +		iput(inode);
> +		return -EBUSY;
> +	}
> +
> +	BUG_ON(fs_info->sb->s_flags & MS_RDONLY);
> +
>  	num_defrag = btrfs_defrag_file(inode, NULL, &range, defrag->transid,
>  				       BTRFS_DEFRAG_BATCH);
> +
> +	up_write(&fs_info->sb->s_umount);
>  	sb_end_write(fs_info->sb);
>  	/*
>  	 * if we filled the whole defrag batch, there
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index b3b041a..2e7beee 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1189,6 +1189,32 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
>  	btrfs_set_max_workers(&fs_info->scrub_workers, new_pool_size);
>  }
>  
> +static inline void btrfs_remount_prepare(struct btrfs_fs_info *fs_info,
> +					 unsigned long old_opts, int flags)
> +{
> +	if (btrfs_raw_test_opt(old_opts, AUTO_DEFRAG) &&
> +	    (!btrfs_raw_test_opt(fs_info->mount_opt, AUTO_DEFRAG) ||
> +	     (flags & MS_RDONLY))) {
> +		/* wait for any defraggers to finish */
> +		wait_event(fs_info->transaction_wait,
> +			   (atomic_read(&fs_info->defrag_running) == 0));
> +	}
> +}
> +
> +static inline void btrfs_remount_cleanup(struct btrfs_fs_info *fs_info,
> +					 unsigned long old_opts, int flags)
> +{
> +	/*
> +	 * We remount the fs successfully, then we need cleanup all defragable
> +	 * inodes if the autodefragment is close or the fs is R/O.
> +	 */
> +	if (btrfs_raw_test_opt(old_opts, AUTO_DEFRAG) &&
> +	    (!btrfs_raw_test_opt(fs_info->mount_opt, AUTO_DEFRAG) ||
> +	     (flags & MS_RDONLY)))
> +		btrfs_cleanup_defrag_inodes(fs_info);
> +
> +}
> +
>  static int btrfs_remount(struct super_block *sb, int *flags, char *data)
>  {
>  	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
> @@ -1214,6 +1240,8 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
>  	if ((*flags & MS_RDONLY) == (sb->s_flags & MS_RDONLY))
>  		return 0;
>  
> +	btrfs_remount_prepare(fs_info, old_opts, *flags);
> +
>  	if (*flags & MS_RDONLY) {
>  		sb->s_flags |= MS_RDONLY;
>  
> @@ -1247,6 +1275,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
>  		sb->s_flags &= ~MS_RDONLY;
>  	}
>  
> +	btrfs_remount_cleanup(fs_info, old_opts, *flags);
>  	return 0;
>  
>  restore:
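
The condition that both btrfs_remount_prepare() and btrfs_remount_cleanup()
test in the quoted patch can be read as a small truth table. The sketch below
models it in userspace C; the bit values and function names are illustrative
assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real constants live in kernel headers. */
#define DEMO_OPT_AUTO_DEFRAG	(1UL << 0)
#define DEMO_FLAG_RDONLY	(1UL << 0)

/*
 * Auto defrag has to be stopped when it was enabled before the remount
 * and the remount either turns it off or makes the filesystem read-only.
 */
static bool must_stop_autodefrag(unsigned long old_opts,
				 unsigned long new_opts,
				 unsigned long flags)
{
	bool was_on = (old_opts & DEMO_OPT_AUTO_DEFRAG) != 0;
	bool still_on = (new_opts & DEMO_OPT_AUTO_DEFRAG) != 0;
	bool going_ro = (flags & DEMO_FLAG_RDONLY) != 0;

	return was_on && (!still_on || going_ro);
}

/* Walk the interesting combinations of old options, new options and flags. */
void remount_cases(void)
{
	const unsigned long on = DEMO_OPT_AUTO_DEFRAG;
	const unsigned long ro = DEMO_FLAG_RDONLY;

	assert(must_stop_autodefrag(on, 0, 0));		/* option dropped */
	assert(must_stop_autodefrag(on, on, ro));	/* remounted R/O */
	assert(!must_stop_autodefrag(on, on, 0));	/* nothing changed */
	assert(!must_stop_autodefrag(0, 0, ro));	/* was never on */
}
```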
 



Re: [BUG] kernel BUG at fs/btrfs/async-thread.c:605!

2013-01-30 Thread Eric Sandeen
On 1/30/13 9:37 PM, Tsutomu Itoh wrote:
> Hi,
>
> In kernel 3.8-rc5, the following panics occurred when the mount was done
> by the degraded option.
>
> # btrfs fi sh /dev/sdc8
> Label: none  uuid: fc63cd80-5ae2-4fbe-8795-2d526c937a56
>         Total devices 3 FS bytes used 20.98GB
>         devid    1 size 9.31GB used 9.31GB path /dev/sdd8
>         devid    2 size 9.31GB used 9.31GB path /dev/sdc8
>         *** Some devices missing
>
> Btrfs v0.20-rc1-37-g91d9eec
> # mount -o degraded /dev/sdc8 /test1
>
>  564 static struct btrfs_worker_thread *find_worker(struct btrfs_workers *workers)
>  565 {
> ...

I'm new at this so just taking a guess, but maybe a patch below.  :)

Hm, so we can't get here unless:

worker = next_worker(workers);

returned NULL.  And it can't return NULL unless idle_list is empty,
and we are not at the maximum nr. of threads, or the current worker
list is empty.

So it's possible to return NULL from next_worker() if 
idle_list and worker_list are both empty, I think.

> ...
>  595 fallback:
>  596         fallback = NULL;
>  597         /*
>  598          * we have failed to find any workers, just
>  599          * return the first one we can find.
>  600          */
>  601         if (!list_empty(&workers->worker_list))
>  602                 fallback = workers->worker_list.next;

it's possible that we got here *because* the worker_list was
empty...

>  603         if (!list_empty(&workers->idle_list))

... and that when we were called, this list was empty too.

>  604                 fallback = workers->idle_list.next;
>  605         BUG_ON(!fallback);  <-- this !


Seems quite possible that there are no worker threads at all at this point.
How could that happen...

>  606         worker = list_entry(fallback,
>  607                   struct btrfs_worker_thread, worker_list);
>
> -Tsutomu
>
> ===
>
> [ 7913.075890] btrfs: allowing degraded mounts
> [ 7913.075893] btrfs: disk space caching is enabled
> [ 7913.092031] Btrfs: too many missing devices, writeable mount is not allowed

so this was supposed to fail the mount in open_ctree; it jumps to shutting down
the worker threads.  Which might result in no threads available.

> [ 7913.092297] ------------[ cut here ]------------
> [ 7913.092313] kernel BUG at fs/btrfs/async-thread.c:605!
> [ 7913.092326] invalid opcode:  [#1] SMP
> [ 7913.092342] Modules linked in: btrfs zlib_deflate crc32c libcrc32c nfsd
> lockd nfs_acl auth_rpcgss sunrpc 8021q garp stp llc cpufreq_ondemand
> cachefiles fscache ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod
> uinput ppdev iTCO_wdt iTCO_vendor_support parport_pc parport sg acpi_cpufreq
> freq_table mperf coretemp kvm pcspkr i2c_i801 i2c_core lpc_ich mfd_core tg3
> ptp pps_core shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16
> sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_piix libata megaraid_sas
> scsi_mod floppy [last unloaded: microcode]
> [ 7913.092575] CPU 0
> [ 7913.092584] Pid: 3673, comm: btrfs-endio-wri Not tainted 3.8.0-rc5 #1
> FUJITSU-SV  PRIMERGY/D2399
> [ 7913.092608] RIP: 0010:[a04670ef]  [a04670ef]
> btrfs_queue_worker+0x10e/0x236 [btrfs]

but this is already trying to do work, and has no workers to handle it.

The place we jump to is fail_block_groups, and before it is this comment:

	/*
	 * make sure we're done with the btree inode before we stop our
	 * kthreads
	 */
	filemap_write_and_wait(fs_info->btree_inode->i_mapping);
	invalidate_inode_pages2(fs_info->btree_inode->i_mapping);

fail_block_groups:
	btrfs_free_block_groups(fs_info);

if you move the fail_block_groups: target above the comment, does that fix it?
(although I don't know yet what started IO . . . )

like this:

From: Eric Sandeen sand...@redhat.com

Make sure that we are always done with the btree_inode's mapping
before we shut down the worker threads in open_ctree() error
cases.

Signed-off-by: Eric Sandeen sand...@redhat.com 

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d89da40..1e2abda 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2689,6 +2689,7 @@ fail_trans_kthread:
 fail_cleaner:
 	kthread_stop(fs_info->cleaner_kthread);
 
+fail_block_groups:
 	/*
 	 * make sure we're done with the btree inode before we stop our
 	 * kthreads
@@ -2696,7 +2697,6 @@ fail_cleaner:
 	filemap_write_and_wait(fs_info->btree_inode->i_mapping);
 	invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
 
-fail_block_groups:
 	btrfs_free_block_groups(fs_info);
 
 fail_tree_roots:

Just a guess; but I don't know what would have started writes already...

-Eric
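
The effect of moving a label in a goto-based unwind path can be modeled in a
few lines of userspace C. This is a hypothetical model and not open_ctree():
each cleanup step only appends a letter to a trace ('F' for flushing the
mapping, 'G' for freeing block groups, 'W' for stopping workers), and all
function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append one step letter to the NUL-terminated trace. */
static void record(char *trace, size_t len, char step)
{
	size_t used = strlen(trace);

	assert(used + 1 < len);
	trace[used] = step;
	trace[used + 1] = '\0';
}

/* Label placed after the flush: a block-groups failure skips 'F'. */
static void unwind_label_after_flush(int fail_in_block_groups,
				     char *trace, size_t len)
{
	assert(len > 0);
	trace[0] = '\0';

	if (fail_in_block_groups)
		goto fail_block_groups;

	/* A later failure enters the unwind path here. */
	record(trace, len, 'F');	/* flush the mapping */

fail_block_groups:
	record(trace, len, 'G');	/* free block groups */
	record(trace, len, 'W');	/* stop worker threads */
}

/* Label placed before the flush: every failure path runs 'F' first. */
static void unwind_label_before_flush(int fail_in_block_groups,
				      char *trace, size_t len)
{
	assert(len > 0);
	trace[0] = '\0';

	if (fail_in_block_groups)
		goto fail_block_groups;

	/* A later failure enters the unwind path here. */

fail_block_groups:
	record(trace, len, 'F');
	record(trace, len, 'G');
	record(trace, len, 'W');
}
```

In this model a block-groups failure yields the trace "GW" with the label
after the flush and "FGW" with the label before it, which is the ordering
difference the patch above is guessing at.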


Re: [BUG] kernel BUG at fs/btrfs/async-thread.c:605!

2013-01-30 Thread Miao Xie
On thu, 31 Jan 2013 12:37:49 +0900, Tsutomu Itoh wrote:
> Hi,
>
> In kernel 3.8-rc5, the following panics occurred when the mount was done
> by the degraded option.
>
> # btrfs fi sh /dev/sdc8
> Label: none  uuid: fc63cd80-5ae2-4fbe-8795-2d526c937a56
>         Total devices 3 FS bytes used 20.98GB
>         devid    1 size 9.31GB used 9.31GB path /dev/sdd8
>         devid    2 size 9.31GB used 9.31GB path /dev/sdc8
>         *** Some devices missing
>
> Btrfs v0.20-rc1-37-g91d9eec
> # mount -o degraded /dev/sdc8 /test1
>
>  564 static struct btrfs_worker_thread *find_worker(struct btrfs_workers *workers)
>  565 {
> ...
> ...
>  595 fallback:
>  596         fallback = NULL;
>  597         /*
>  598          * we have failed to find any workers, just
>  599          * return the first one we can find.
>  600          */
>  601         if (!list_empty(&workers->worker_list))
>  602                 fallback = workers->worker_list.next;
>  603         if (!list_empty(&workers->idle_list))
>  604                 fallback = workers->idle_list.next;
>  605         BUG_ON(!fallback);  <-- this !
>  606         worker = list_entry(fallback,
>  607                   struct btrfs_worker_thread, worker_list);


If worker_list is not empty, we get a worker from this list; if worker_list
is empty, it means all the workers are in idle_list, and we get the worker
from idle_list.

So the above bug is introduced by the second "if" statement; it should be
"else if".
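
To compare the quoted two-"if" logic with the "else if" variant case by case,
here is a small userspace model. It is only a sketch: a NULL pointer stands
in for an empty list, and struct worker plus both function names are
hypothetical, not the kernel's types.

```c
#include <assert.h>
#include <stddef.h>

struct worker { int id; };

/* Logic as quoted: two independent ifs, so the idle list is checked last. */
static const struct worker *pick_two_ifs(const struct worker *first_worker,
					 const struct worker *first_idle)
{
	const struct worker *fallback = NULL;

	if (first_worker)
		fallback = first_worker;
	if (first_idle)
		fallback = first_idle;
	return fallback;
}

/* Variant with "else if": the worker list takes precedence. */
static const struct worker *pick_else_if(const struct worker *first_worker,
					 const struct worker *first_idle)
{
	const struct worker *fallback = NULL;

	if (first_worker)
		fallback = first_worker;
	else if (first_idle)
		fallback = first_idle;
	return fallback;
}

/* Walk the four list states and check what each variant returns. */
void compare_variants(void)
{
	const struct worker busy = { 1 }, idle = { 2 };

	/* Both lists empty: neither variant finds a worker. */
	assert(pick_two_ifs(NULL, NULL) == NULL);
	assert(pick_else_if(NULL, NULL) == NULL);

	/* Exactly one list non-empty: the variants agree. */
	assert(pick_two_ifs(&busy, NULL) == &busy);
	assert(pick_else_if(&busy, NULL) == &busy);
	assert(pick_two_ifs(NULL, &idle) == &idle);
	assert(pick_else_if(NULL, &idle) == &idle);

	/* Both non-empty: the only case where they differ. */
	assert(pick_two_ifs(&busy, &idle) == &idle);
	assert(pick_else_if(&busy, &idle) == &busy);
}
```

In this model the variants differ only in which list wins when both are
non-empty; when both are empty, each returns NULL, which is the state that
BUG_ON(!fallback) trips on.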

Thanks
Miao

 

Re: [BUG] kernel BUG at fs/btrfs/async-thread.c:605!

2013-01-30 Thread Eric Sandeen
On Jan 31, 2013, at 12:13 AM, Miao Xie mi...@cn.fujitsu.com wrote:

 On thu, 31 Jan 2013 12:37:49 +0900, Tsutomu Itoh wrote:
 Hi,
 
 In kernel 3.8-rc5, the following panics occurred when the mount was done
 by the degraded option.
 
 # btrfs fi sh /dev/sdc8
 Label: none  uuid: fc63cd80-5ae2-4fbe-8795-2d526c937a56
Total devices 3 FS bytes used 20.98GB
devid1 size 9.31GB used 9.31GB path /dev/sdd8
devid2 size 9.31GB used 9.31GB path /dev/sdc8
*** Some devices missing
 
 Btrfs v0.20-rc1-37-g91d9eec
 # mount -o degraded /dev/sdc8 /test1
 
 564 static struct btrfs_worker_thread *find_worker(struct btrfs_workers *workers)
 565 {
 ...
 ...
 595 fallback:
 596         fallback = NULL;
 597         /*
 598          * we have failed to find any workers, just
 599          * return the first one we can find.
 600          */
 601         if (!list_empty(&workers->worker_list))
 602                 fallback = workers->worker_list.next;
 603         if (!list_empty(&workers->idle_list))
 604                 fallback = workers->idle_list.next;
 605         BUG_ON(!fallback);  <-- this !
 606         worker = list_entry(fallback,
 607                   struct btrfs_worker_thread, worker_list);
 
 
 If worker_list is not empty, we get a worker from this list; if worker_list is empty,
 it means all the workers in idle_list, we get the worker from idle_list.
 
 So the above bug is introduced by the second if sentence. it should be else if.

else if makes sense, but we cannot reach the BUG_ON unless both lists are 
empty, correct?
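
For illustration only, here is a minimal userspace sketch of the fallback
selection at lines 596-604. It is not the kernel code: struct workers_model,
pick_fallback() and the small list helpers are hypothetical stand-ins, and only
the shape of the two if statements is taken from the quoted source. The function
returns NULL exactly where the kernel would hit BUG_ON(!fallback).

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular struct list_head. */
struct list_head {
        struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
        head->next = head;
        head->prev = head;
}

static int list_empty(const struct list_head *head)
{
        return head->next == head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
        assert(entry != head);  /* a list head must not be added to itself */
        entry->prev = head->prev;
        entry->next = head;
        head->prev->next = entry;
        head->prev = entry;
}

/* Hypothetical reduced model of struct btrfs_workers: only the two lists. */
struct workers_model {
        struct list_head worker_list;
        struct list_head idle_list;
};

/*
 * Same shape as the quoted fallback code: two independent "if" statements.
 * A NULL return marks the state in which BUG_ON(!fallback) would fire.
 */
static struct list_head *pick_fallback(struct workers_model *w)
{
        struct list_head *fallback = NULL;

        if (!list_empty(&w->worker_list))
                fallback = w->worker_list.next;
        if (!list_empty(&w->idle_list))
                fallback = w->idle_list.next;
        return fallback;
}
```

In this model pick_fallback() returns NULL only when both lists are empty. When
both lists hold entries, the second if overrides the first, so an else if would
change which list is preferred but not whether the BUG_ON can trigger.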

-Eric


Re: [BUG] kernel BUG at fs/btrfs/async-thread.c:605!

2013-01-30 Thread Miao Xie
On Thu, 31 Jan 2013 01:19:41 -0500 (est), Eric Sandeen wrote:
 else if makes sense, but we cannot reach the BUG_ON unless both lists are 
 empty, correct?

You are right, I misread the code.

Thanks
Miao

 

Re: [BUG] kernel BUG at fs/btrfs/async-thread.c:605!

2013-01-30 Thread Miao Xie
On wed, 30 Jan 2013 23:55:34 -0600, Eric Sandeen wrote:
 ===

 [ 7913.075890] btrfs: allowing degraded mounts
 [ 7913.075893] btrfs: disk space caching is enabled
 [ 7913.092031] Btrfs: too many missing devices, writeable mount is not allowed
 
 so this was supposed to fail the mount in open_ctree; it jumps to shutting down
 the worker threads.  Which might result in no threads available.
 
 [ 7913.092297] [ cut here ]
 [ 7913.092313] kernel BUG at fs/btrfs/async-thread.c:605!
 [ 7913.092326] invalid opcode:  [#1] SMP
 [ 7913.092342] Modules linked in: btrfs zlib_deflate crc32c libcrc32c nfsd 
 lockd nfs_acl auth_rpcgss sunrpc 8021q garp stp llc cpufreq_ondemand 
 cachefiles fscache ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod 
 uinput ppdev iTCO_wdt iTCO_vendor_support parport_pc parport sg acpi_cpufreq 
 freq_table mperf coretemp kvm pcspkr i2c_i801 i2c_core lpc_ich mfd_core tg3 
 ptp pps_core shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 
 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_piix libata megaraid_sas 
 scsi_mod floppy [last unloaded: microcode]
 [ 7913.092575] CPU 0
 [ 7913.092584] Pid: 3673, comm: btrfs-endio-wri Not tainted 3.8.0-rc5 #1 
 FUJITSU-SV  PRIMERGY/D2399
 [ 7913.092608] RIP: 0010:[a04670ef]  [a04670ef] 
 btrfs_queue_worker+0x10e/0x236 [btrfs]
 
 but this is already trying to do work, and has no workers to handle it.
 
 The place we jump to is fail_block_groups, and before it is this comment:
 
         /*
          * make sure we're done with the btree inode before we stop our
          * kthreads
          */
         filemap_write_and_wait(fs_info->btree_inode->i_mapping);
         invalidate_inode_pages2(fs_info->btree_inode->i_mapping);

 fail_block_groups:
         btrfs_free_block_groups(fs_info);
 
 if you move the fail_block_groups: target above the comment, does that fix it?
 (although I don't know yet what started IO . . . )

Reading the metadata of the tree root and reading the block group information
started the IO, so I think this patch can fix the problem.

 like this:
 
 From: Eric Sandeen sand...@redhat.com
 
 Make sure that we are always done with the btree_inode's mapping
 before we shut down the worker threads in open_ctree() error
 cases.
 
 Signed-off-by: Eric Sandeen sand...@redhat.com 
 
 diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
 index d89da40..1e2abda 100644
 --- a/fs/btrfs/disk-io.c
 +++ b/fs/btrfs/disk-io.c
 @@ -2689,6 +2689,7 @@ fail_trans_kthread:
  fail_cleaner:
          kthread_stop(fs_info->cleaner_kthread);
  
 +fail_block_groups:
          /*
           * make sure we're done with the btree inode before we stop our
           * kthreads
 @@ -2696,7 +2697,6 @@ fail_cleaner:
          filemap_write_and_wait(fs_info->btree_inode->i_mapping);
          invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
  
 -fail_block_groups:
          btrfs_free_block_groups(fs_info);
  
  fail_tree_roots:
 
 Just a guess; but I don't know what would have started writes already...

I don't think it was write IO. It was just a soft interrupt caused by a metadata
read IO, and this soft interrupt happened while the btrfs-endio-write workers were
being stopped.
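
As a rough illustration of why the label position matters, the sketch below
models a goto-based error unwind in userspace. It is not the open_ctree() code:
unwind(), step() and the step names are hypothetical, and the model only records
the order in which cleanup steps run. It says nothing about whether draining the
btree inode is sufficient in the kernel.

```c
#include <string.h>

/* Hypothetical trace buffer recording the order of the cleanup steps. */
static char trace[128];

static void step(const char *name)
{
        strcat(trace, name);
        strcat(trace, ";");
}

/*
 * Model of a failure while reading block groups.
 * label_above_drain == 0: the label sits below the drain step (original).
 * label_above_drain != 0: the label sits above the drain step (proposed).
 */
static const char *unwind(int label_above_drain)
{
        trace[0] = '\0';

        if (label_above_drain)
                goto fail_block_groups_moved;
        goto fail_block_groups_orig;

fail_block_groups_moved:
        /* stands in for filemap_write_and_wait() + invalidate_inode_pages2() */
        step("drain_btree_inode");
fail_block_groups_orig:
        step("free_block_groups");
        step("stop_workers");
        return trace;
}
```

In this model unwind(0) records "free_block_groups;stop_workers;", so the
workers are stopped without the drain step ever running, while unwind(1) records
"drain_btree_inode;free_block_groups;stop_workers;".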

Thanks
Miao


Re: [BUG] kernel BUG at fs/btrfs/async-thread.c:605!

2013-01-30 Thread Miao Xie
On wed, 30 Jan 2013 23:55:34 -0600, Eric Sandeen wrote:
 if you move the fail_block_groups: target above the comment, does that fix it?
 (although I don't know yet what started IO . . . )
 
 like this:
 
 From: Eric Sandeen sand...@redhat.com
 
 Make sure that we are always done with the btree_inode's mapping
 before we shut down the worker threads in open_ctree() error
 cases.


I reviewed your patch again and found that it fixes only the above problem; there
are still similar problems that it does not fix.

How about this one?

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 0c31d07..d8fd711 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2728,13 +2728,13 @@ fail_cleaner:
          * kthreads
          */
         filemap_write_and_wait(fs_info->btree_inode->i_mapping);
-        invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
 
 fail_block_groups:
         btrfs_free_block_groups(fs_info);
 
 fail_tree_roots:
         free_root_pointers(fs_info, 1);
+        invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
 
 fail_sb_buffer:
         btrfs_stop_workers(&fs_info->generic_worker);
@@ -2755,7 +2755,6 @@ fail_alloc:
 fail_iput:
         btrfs_mapping_tree_free(&fs_info->mapping_tree);
 
-        invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
         iput(fs_info->btree_inode);
 fail_bdi:
         bdi_destroy(&fs_info->bdi);
