Re: [PATCH 2/4] btrfs-progs: Integrate error message output into find_mount_root().

2014-07-10 Thread Satoru Takeuchi
Hi Qu,

(2014/07/10 12:05), Qu Wenruo wrote:
 Before this patch, find_mount_root() and its callers both output error
 messages, which sometimes makes the output duplicated and hard to judge
 what the problem is.
 
 This patch will integrate all the error message output into
 find_mount_root() to give more meaningful error prompts and remove the
 unneeded caller error messages.
 
 Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
 ---
   cmds-receive.c   |  2 --
   cmds-send.c  |  8 +---
   cmds-subvolume.c |  5 +
   utils.c  | 15 ---
   4 files changed, 14 insertions(+), 16 deletions(-)
 
 diff --git a/cmds-receive.c b/cmds-receive.c
 index 48380a5..084d97d 100644
 --- a/cmds-receive.c
 +++ b/cmds-receive.c
 @@ -981,8 +981,6 @@ static int do_receive(struct btrfs_receive *r, const char *tomnt, int r_fd,
 	ret = find_mount_root(dest_dir_full_path, &r->root_path);
 	if (ret < 0) {
 		ret = -EINVAL;
 -		fprintf(stderr, "ERROR: failed to determine mount point "
 -			"for %s\n", dest_dir_full_path);
 		goto out;
 	}
 	r->mnt_fd = open(r->root_path, O_RDONLY | O_NOATIME);
 diff --git a/cmds-send.c b/cmds-send.c
 index 9a73b32..091f32b 100644
 --- a/cmds-send.c
 +++ b/cmds-send.c
 @@ -357,8 +357,6 @@ static int init_root_path(struct btrfs_send *s, const char *subvol)
 	ret = find_mount_root(subvol, &s->root_path);
 	if (ret < 0) {
 		ret = -EINVAL;
 -		fprintf(stderr, "ERROR: failed to determine mount point "
 -			"for %s\n", subvol);
 		goto out;
 	}
   
 @@ -622,12 +620,8 @@ int cmd_send(int argc, char **argv)
 	}
 
 	ret = find_mount_root(subvol, &mount_root);
 -	if (ret < 0) {
 -		fprintf(stderr, "ERROR: find_mount_root failed on %s: "
 -			"%s\n", subvol,
 -			strerror(-ret));
 +	if (ret < 0)
 		goto out;
 -	}
 	if (strcmp(send.root_path, mount_root) != 0) {
 		ret = -EINVAL;
 		fprintf(stderr, "ERROR: all subvols must be from the "
 diff --git a/cmds-subvolume.c b/cmds-subvolume.c
 index 639fb10..b252eab 100644
 --- a/cmds-subvolume.c
 +++ b/cmds-subvolume.c
 @@ -981,11 +981,8 @@ static int cmd_subvol_show(int argc, char **argv)
 	}
 
 	ret = find_mount_root(fullpath, &mnt);
 -	if (ret < 0) {
 -		fprintf(stderr, "ERROR: find_mount_root failed on %s: "
 -			"%s\n", fullpath, strerror(-ret));
 +	if (ret < 0)
 		goto out;
 -	}
 	ret = 1;
 	svpath = get_subvol_name(mnt, fullpath);
   
 diff --git a/utils.c b/utils.c
 index 507ec6c..07173ee 100644
 --- a/utils.c
 +++ b/utils.c
 @@ -2417,13 +2417,19 @@ int find_mount_root(const char *path, char **mount_root)
 	char *longest_match = NULL;
 
 	fd = open(path, O_RDONLY | O_NOATIME);
 -	if (fd < 0)
 +	if (fd < 0) {
 +		fprintf(stderr, "ERROR: Failed to open %s: %s\n",
 +			path, strerror(errno));

It drops part of the original messages: it no longer shows that this error
comes from find_mount_root(). I would keep the original meaning as is.
What do you think?

Thanks,
Satoru

 		return -errno;
 +	}
 	close(fd);
 
 	mnttab = setmntent("/proc/self/mounts", "r");
 -	if (!mnttab)
 +	if (!mnttab) {
 +		fprintf(stderr, "ERROR: Failed to setmntent: %s\n",
 +			strerror(errno));
 		return -errno;
 +	}
 
 	while ((ent = getmntent(mnttab))) {
 		len = strlen(ent->mnt_dir);
 @@ -2457,8 +2463,11 @@ int find_mount_root(const char *path, char **mount_root)
 
 	ret = 0;
 	*mount_root = realpath(longest_match, NULL);
 -	if (!*mount_root)
 +	if (!*mount_root) {
 +		fprintf(stderr, "Failed to resolve path %s: %s\n",
 +			longest_match, strerror(errno));
 		ret = -errno;
 +	}
 
 	free(longest_match);
 	return ret;
 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] btrfs-progs: Fix wrong indent in btrfs-progs.

2014-07-10 Thread Satoru Takeuchi
(2014/07/10 12:05), Qu Wenruo wrote:
 When editing cmds-filesystem.c, I found cmd_filesystem_df() uses 7
 spaces as indent instead of 1 tab (or 8 spaces), which makes the indent
 quite embarrassing.
 Such a problem is especially hard to detect when reviewing patches,
 since the leading '+' makes a tab only 7 spaces long, making 7 spaces
 look the same as a tab.
 
 This patch fixes all the 7-space indents.
 
 Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com

Reviewed-by: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com

 ---
   cmds-filesystem.c | 79 
 +++
   ctree.h   | 15 ++-
   utils.c   | 10 +++
   3 files changed, 52 insertions(+), 52 deletions(-)
 
 diff --git a/cmds-filesystem.c b/cmds-filesystem.c
 index 4b2d27e..0a9b62a 100644
 --- a/cmds-filesystem.c
 +++ b/cmds-filesystem.c
 @@ -114,23 +114,23 @@ static const char * const filesystem_cmd_group_usage[] = {
 };
 
 static const char * const cmd_filesystem_df_usage[] = {
-       "btrfs filesystem df <path>",
-       "Show space usage information for a mount point",
-       NULL
+	"btrfs filesystem df <path>",
+	"Show space usage information for a mount point",
+	NULL
 };
 
 static void print_df(struct btrfs_ioctl_space_args *sargs)
 {
-       u64 i;
-       struct btrfs_ioctl_space_info *sp = sargs->spaces;
-
-       for (i = 0; i < sargs->total_spaces; i++, sp++) {
-               printf("%s, %s: total=%s, used=%s\n",
-                       group_type_str(sp->flags),
-                       group_profile_str(sp->flags),
-                       pretty_size(sp->total_bytes),
-                       pretty_size(sp->used_bytes));
-       }
+	u64 i;
+	struct btrfs_ioctl_space_info *sp = sargs->spaces;
+
+	for (i = 0; i < sargs->total_spaces; i++, sp++) {
+		printf("%s, %s: total=%s, used=%s\n",
+		       group_type_str(sp->flags),
+		       group_profile_str(sp->flags),
+		       pretty_size(sp->total_bytes),
+		       pretty_size(sp->used_bytes));
+	}
 }
   
 static int get_df(int fd, struct btrfs_ioctl_space_args **sargs_ret)
 @@ -183,33 +183,32 @@ static int get_df(int fd, struct btrfs_ioctl_space_args **sargs_ret)
 
 static int cmd_filesystem_df(int argc, char **argv)
 {
-       struct btrfs_ioctl_space_args *sargs = NULL;
-       int ret;
-       int fd;
-       char *path;
-       DIR *dirstream = NULL;
-
-       if (check_argc_exact(argc, 2))
-               usage(cmd_filesystem_df_usage);
-
-       path = argv[1];
-
-       fd = open_file_or_dir(path, &dirstream);
-       if (fd < 0) {
-               fprintf(stderr, "ERROR: can't access '%s'\n", path);
-               return 1;
-       }
-       ret = get_df(fd, &sargs);
-
-       if (!ret && sargs) {
-               print_df(sargs);
-               free(sargs);
-       } else {
-               fprintf(stderr, "ERROR: get_df failed %s\n", strerror(-ret));
-       }
-
-       close_file_or_dir(fd, dirstream);
-       return !!ret;
+	struct btrfs_ioctl_space_args *sargs = NULL;
+	int ret;
+	int fd;
+	char *path;
+	DIR *dirstream = NULL;
+
+	if (check_argc_exact(argc, 2))
+		usage(cmd_filesystem_df_usage);
+
+	path = argv[1];
+
+	fd = open_file_or_dir(path, &dirstream);
+	if (fd < 0) {
+		fprintf(stderr, "ERROR: can't access '%s'\n", path);
+		return 1;
+	}
+	ret = get_df(fd, &sargs);
+	if (!ret && sargs) {
+		print_df(sargs);
+		free(sargs);
+	} else {
+		fprintf(stderr, "ERROR: get_df failed %s\n", strerror(-ret));
+	}
+
+	close_file_or_dir(fd, dirstream);
+	return !!ret;
 }
   
   static int match_search_item_kernel(__u8 *fsid, char *mnt, char *label,
 diff --git a/ctree.h b/ctree.h
 index 35d3633..83d85b3 100644
 --- a/ctree.h
 +++ b/ctree.h
 @@ -939,10 +939,10 @@ struct btrfs_block_group_cache {
 };
 
 struct btrfs_extent_ops {
-       int (*alloc_extent)(struct btrfs_root *root, u64 num_bytes,
-                           u64 hint_byte, struct btrfs_key *ins);
-       int (*free_extent)(struct btrfs_root *root, u64 bytenr,
-                          u64 num_bytes);
+	int (*alloc_extent)(struct btrfs_root *root, u64 num_bytes,
+			    u64 hint_byte, struct btrfs_key *ins);
+	int (*free_extent)(struct btrfs_root *root, u64 bytenr,
+			   u64 num_bytes);
 };
 
 struct btrfs_device;
 @@ -2117,9 +2117,10 @@ BTRFS_SETGET_STACK_FUNCS(stack_qgroup_limit_rsv_exclusive,
 static inline u32 btrfs_file_extent_inline_item_len(struct extent_buffer *eb,
 						    struct btrfs_item *e)
 {
-       unsigned long offset;
-       offset = offsetof(struct btrfs_file_extent_item, disk_bytenr);
-       return btrfs_item_size(eb, e) - offset;
+	unsigned long 

Re: [PATCH 2/4] btrfs-progs: Integrate error message output into find_mount_root().

2014-07-10 Thread Miao Xie
Takeuchi-san

On Thu, 10 Jul 2014 16:33:23 +0900, Satoru Takeuchi wrote:
 (2014/07/10 12:05), Qu Wenruo wrote:
 Before this patch, find_mount_root() and its callers both output error
 messages, which sometimes makes the output duplicated and hard to judge
 what the problem is.
 
 This patch will integrate all the error message output into
 find_mount_root() to give more meaningful error prompts and remove the
 unneeded caller error messages.

 Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
 ---
 [...]
 
 	fd = open(path, O_RDONLY | O_NOATIME);
 -	if (fd < 0)
 +	if (fd < 0) {
 +		fprintf(stderr, "ERROR: Failed to open %s: %s\n",
 +			path, strerror(errno));
 
 It drops part of the original messages: it no longer shows that this error
 comes from find_mount_root(). I would keep the original meaning as is.
 What do you think?

I think it is strange to show the name of an internal function to common
users.
Maybe we should introduce two kinds of messages: one for common users,
the other for developers to debug with.

Thanks
Miao

 Thanks,
 Satoru
 
 [...]



Re: [PATCH 2/4] btrfs-progs: Integrate error message output into find_mount_root().

2014-07-10 Thread Qu Wenruo


 Original Message 
Subject: Re: [PATCH 2/4] btrfs-progs: Integrate error message output 
into find_mount_root().

From: Miao Xie mi...@cn.fujitsu.com
To: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com, Qu Wenruo 
quwen...@cn.fujitsu.com, linux-btrfs@vger.kernel.org

Date: 2014-07-10 16:10

Takeuchi-san

On Thu, 10 Jul 2014 16:33:23 +0900, Satoru Takeuchi wrote:

(2014/07/10 12:05), Qu Wenruo wrote:

[...]

	fd = open(path, O_RDONLY | O_NOATIME);
-	if (fd < 0)
+	if (fd < 0) {
+		fprintf(stderr, "ERROR: Failed to open %s: %s\n",
+			path, strerror(errno));

It drops part of the original messages: it no longer shows that this error
comes from find_mount_root(). I would keep the original meaning as is.
What do you think?

I think it is strange to show the name of an internal function to common
users.
Maybe we should introduce two kinds of messages: one for common users,
the other for developers to debug with.

Thanks
Miao

I agree with Miao's idea.
It's true that some developers need to get info from the output,
but IMO the error messages are mostly used to indicate what *users* did
wrong, since most problems are caused by wrong parameters given by users.

For example, I always forget to run 'btrfs fi df /mnt' and the
'Operation not permitted' message makes me realize the permission
problem. The function name or other messages are less important than
that.

On the other hand, if developers encounter problems, they will gdb the
program or grep the source to find out the problem. So a function name
in the error message seems not so necessary to me.

It would also be a great idea to add a new framework for showing debug
messages, but I'd prefer to build that framework some time later (maybe
when btrfs-progs becomes more complicated than it is now?)

Thanks,
Qu






Re: btrfs RAID with enterprise SATA or SAS drives

2014-07-10 Thread Martin Steigerwald
On Thursday, 10 July 2014 at 12:10:46, Russell Coker wrote:
 On Wed, 9 Jul 2014 16:48:05 Martin Steigerwald wrote:
   - for someone using SAS or enterprise SATA drives with Linux, I
   understand btrfs gives the extra benefit of checksums, are there any
   other specific benefits over using mdadm or dmraid?
  
  I think I can answer this one.
  
  Most important advantage I think is BTRFS is aware of which blocks of
  the RAID are in use and need to be synced:
  
  - Instant initialization of RAID regardless of size (unless at some
  capacity mkfs.btrfs needs more time)
 
 From mdadm(8):
 
 --assume-clean
       Tell mdadm that the array pre-existed and is known to be clean.
       It can be useful when trying to recover from a major failure as
       you can be sure that no data will be affected unless you actually
       write to the array. It can also be used when creating a RAID1 or
       RAID10 if you want to avoid the initial resync, however this
       practice — while normally safe — is not recommended. Use this
       only if you really know what you are doing.
 
       When the devices that will be part of a new array were filled
       with zeros before creation the operator knows the array is
       actually clean. If that is the case, such as after running
       badblocks, this argument can be used to tell mdadm the facts the
       operator knows.
 
 While it might be regarded as a hack, it is possible to do a fairly
 instant initialisation of a Linux software RAID-1.

It is not the same.

BTRFS doesn't care if the data of the unused blocks differs.

The RAID is on the *filesystem* level, not on the raw block level. The data
on both disks doesn't even have to be located in the exact same sectors.


  - Rebuild after disk failure or disk replace will only copy *used*
  blocks
 Have you done any benchmarks on this?  The down-side of copying used
 blocks is that you first need to discover which blocks are used.  Given
 that seek time is a major bottleneck at some portion of space used it
 will be faster to just copy the entire disk.

As BTRFS operates the RAID at the filesystem level, it already knows which
blocks are in use. I have not yet had a disk replace or a faulty disk in my
two RAID-1 arrays, so I have no measurements. It may depend on free space
fragmentation.

  Scrubbing can repair from good disk if RAID with redundancy, but
  SoftRAID should be able to do this as well. But also for scrubbing:
  BTRFS only check and repairs used blocks.
 
 When you scrub Linux Software RAID (and in fact pretty much every RAID)
 it will only correct errors that the disks flag.  If a disk returns bad
 data and says that it's good then the RAID scrub will happily copy the
 bad data over the good data (for a RAID-1) or generate new valid parity
 blocks for bad data (for RAID-5/6).
 
 http://research.cs.wisc.edu/adsl/Publications/corruption-fast08.html
 
 Page 12 of the above document says that nearline disks (IE the ones
 people like me can afford for home use) have a 0.466% incidence of
 returning bad data and claiming it's good in a year.  Currently I run
 about 20 such disks in a variety of servers, workstations, and laptops.
  Therefore the probability of having no such errors on all those disks
 would be .99534^20=.91081.  The probability of having no such errors
 over a period of 10 years would be (.99534^20)^10=.39290 which means
 that over 10 years I should expect to have such errors, which is why
 BTRFS RAID-1 and DUP metadata on single disks are necessary features.

Yeah, the checksums come in handy here.

(excuse the long signature, it's added by the server)

Ciao,

-- 
Martin Steigerwald
Consultant / Trainer

teamix GmbH
Südwestpark 43
90449 Nürnberg

fon:  +49 911 30999 55
fax:  +49 911 30999 99
mail: martin.steigerw...@teamix.de
web:  http://www.teamix.de
blog: http://blog.teamix.de

Amtsgericht Nürnberg, HRB 18320
Geschäftsführer: Oliver Kügow, Richard Müller

** JETZT ANMELDEN – teamix TechDemo - 23.07.2014 - 
http://www.teamix.de/techdemo **



Re: btrfs RAID with enterprise SATA or SAS drives

2014-07-10 Thread Austin S Hemmelgarn
On 2014-07-09 22:10, Russell Coker wrote:
 On Wed, 9 Jul 2014 16:48:05 Martin Steigerwald wrote:
 - for someone using SAS or enterprise SATA drives with Linux, I
 understand btrfs gives the extra benefit of checksums, are there any
 other specific benefits over using mdadm or dmraid?

 I think I can answer this one.

 Most important advantage I think is BTRFS is aware of which blocks of the
 RAID are in use and need to be synced:

 - Instant initialization of RAID regardless of size (unless at some
 capacity mkfs.btrfs needs more time)
 
 From mdadm(8):
 
 --assume-clean
       Tell mdadm that the array pre-existed and is known to be clean.
       It can be useful when trying to recover from a major failure as
       you can be sure that no data will be affected unless you actually
       write to the array. It can also be used when creating a RAID1 or
       RAID10 if you want to avoid the initial resync, however this
       practice — while normally safe — is not recommended. Use this
       only if you really know what you are doing.
 
       When the devices that will be part of a new array were filled
       with zeros before creation the operator knows the array is
       actually clean. If that is the case, such as after running
       badblocks, this argument can be used to tell mdadm the facts the
       operator knows.
 
 While it might be regarded as a hack, it is possible to do a fairly instant 
 initialisation of a Linux software RAID-1.

This has the notable disadvantage, however, that the first scrub you run
will essentially perform a full resync if you didn't make sure that the
disks had identical data to begin with.
 - Rebuild after disk failure or disk replace will only copy *used* blocks
 
 Have you done any benchmarks on this?  The down-side of copying used blocks 
 is 
 that you first need to discover which blocks are used.  Given that seek time 
 is 
 a major bottleneck at some portion of space used it will be faster to just 
 copy the entire disk.
 
 I haven't done any tests on BTRFS in this regard, but I've seen a disk 
 replacement on ZFS run significantly slower than a dd of the block device 
 would.
 
First of all, this isn't really a good comparison for two reasons:
1. EVERYTHING on ZFS (or any filesystem that tries to do that much work)
is slower than a dd of the raw block device.
2. Even if the throughput is lower, this is only really an issue if the
disk is more than half full, because you don't copy the unused blocks

Also, while it isn't really a recovery situation, I recently upgraded
from a 2 1TB disk BTRFS RAID1 setup to a 4 1TB disk BTRFS RAID10 setup,
and the performance of the re-balance really wasn't all that bad.  I
have maybe 100GB of actual data, so the array started out roughly 10%
full, and the re-balance only took about 2 minutes.  Of course, it
probably helps that I make a point to keep my filesystems de-fragmented,
scrub and balance regularly, and don't use a lot of sub-volumes or
snapshots, so the filesystem in question is not too different from what
it would have looked like if I had just wiped the FS and restored from a
backup.
 Scrubbing can repair from good disk if RAID with redundancy, but SoftRAID
 should be able to do this as well. But also for scrubbing: BTRFS only
 check and repairs used blocks.
 
 When you scrub Linux Software RAID (and in fact pretty much every RAID) it 
 will only correct errors that the disks flag.  If a disk returns bad data and 
 says that it's good then the RAID scrub will happily copy the bad data over 
 the good data (for a RAID-1) or generate new valid parity blocks for bad data 
 (for RAID-5/6).
 
 http://research.cs.wisc.edu/adsl/Publications/corruption-fast08.html
 
 Page 12 of the above document says that nearline disks (IE the ones people 
 like me can afford for home use) have a 0.466% incidence of returning bad 
 data 
 and claiming it's good in a year.  Currently I run about 20 such disks in a 
 variety of servers, workstations, and laptops.  Therefore the probability of 
 having no such errors on all those disks would be .99534^20=.91081.  The 
 probability of having no such errors over a period of 10 years would be 
 (.99534^20)^10=.39290 which means that over 10 years I should expect to have 
 such errors, which is why BTRFS RAID-1 and DUP metadata on single disks are 
 necessary features.
 






Re: [PATCH RESEND 1/4] btrfs-progs: Check fstype in find_mount_root()

2014-07-10 Thread Martin Steigerwald
On Thursday, 10 July 2014 at 11:05:10, Qu Wenruo wrote:
 When calling find_mount_root(), caller in fact wants to find the mount
 point of *BTRFS*.
 
 So also check ent->fstype in find_mount_root() and output proper error
 messages if needed.
 This will suppress a lot of "Inappropriate ioctl for device" error
 messages.
 
 Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
 ---
  utils.c | 11 +++
  1 file changed, 11 insertions(+)
 
 diff --git a/utils.c b/utils.c
 index 993d085..507ec6c 100644
 --- a/utils.c
 +++ b/utils.c
 @@ -2412,6 +2412,7 @@ int find_mount_root(const char *path, char **mount_root)
 	struct mntent *ent;
 	int len;
 	int ret;
 +	int not_btrfs;
 	int longest_matchlen = 0;
 	char *longest_match = NULL;
 
 @@ -2432,6 +2433,10 @@ int find_mount_root(const char *path, char **mount_root)
 				free(longest_match);
 				longest_matchlen = len;
 				longest_match = strdup(ent->mnt_dir);
 +				if (strcmp(ent->mnt_type, "btrfs"))
 +					not_btrfs = 1;
 +				else
 +					not_btrfs = 0;
 			}
 		}
 	}
 @@ -2443,6 +2448,12 @@ int find_mount_root(const char *path, char **mount_root)
 			path);
 		return -ENOENT;
 	}
 +	if (not_btrfs) {
 +		fprintf(stderr,
 +			"ERROR: %s does not belong to a btrfs mount points.\n",

Just a typo: mount point

 +			path);
 +		return -EINVAL;
 +	}
 
 	ret = 0;
 	*mount_root = realpath(longest_match, NULL);

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: File server structure suggestion

2014-07-10 Thread Tamas Papp


On 07/10/2014 04:41 PM, Andrew Flerchinger wrote:

what was going on. That sold me on the idea of data checksums, but I'd
rather stay in linux than BSD, and I previously made use of online
capacity expansion as needed, which ZFS doesn't support.


What do you mean by that?
What zfs doesn't support is reducing a pool.


tamas


Re: File server structure suggestion

2014-07-10 Thread Andrew Flerchinger
I want to increase the size of the vdev, not just the zpool. I want to
make a 3-drive array into a 4-drive array by adding a single drive
while still having one parity stripe across all data. Adding more
vdevs to a zpool isn't quite the same thing as online capacity
expansion. It's not something most businesses would do, which is why
the feature never made it into ZFS, but consumers are cheap and mdadm
supported it. From what I've read, btrfs supports it, too.

On Thu, Jul 10, 2014 at 10:52 AM, Tamas Papp tom...@martos.bme.hu wrote:

 On 07/10/2014 04:41 PM, Andrew Flerchinger wrote:

 what was going on. That sold me on the idea of data checksums, but I'd
 rather stay in linux than BSD, and I previously made use of online
 capacity expansion as needed, which ZFS doesn't support.


 What do you mean by that?
 What zfs doesn't support is reducing a pool.


 tamas


Re: [PATCH] xfstests/btrfs: add test for quota groups and drop snapshot

2014-07-10 Thread Mark Fasheh
Hey Dave, thanks for the patch review! Pretty much all of what you wrote
sounds good to me, there's just one or two items I wanted to clarify - those
comments are inline. Thanks again,

On Thu, Jul 10, 2014 at 10:43:30AM +1000, Dave Chinner wrote:
 On Wed, Jul 09, 2014 at 03:41:50PM -0700, Mark Fasheh wrote:
  +
  +# Enable qgroups now that we have our filesystem prepared. This
  +# will kick off a scan which we will have to wait for below.
  +$BTRFS_UTIL_PROG qu en $SCRATCH_MNT
  +sleep 30
 
 That seems rather arbitrary. The sleeps you are adding add well over
 a minute to the runtime, and a quota scan of a filesystem with 200
 files should be almost instantaneous.

Yeah I'll bring that back down to 5 seconds? It's 30 from my testing because
I was being paranoid and neglected to update it for the rest of the world.


  +_scratch_unmount
  +_scratch_mount
 
 What is the purpose of this?

This is kind of 'maximum paranoia' again from my own test script. The idea
was to make _absolutely_ certain that all metadata found its way to disk
and won't be optimized in layout any more. There's a decent chance it
doesn't do anything, but it doesn't seem a huge deal. I wasn't clear though:
do you want it removed, or can I comment it for clarity?


  +# Ok, delete the snapshot we made previously. Since btrfs drop
  +# snapshot is a delayed action with no way to force it, we have to
  +# impose another sleep here.
  +$BTRFS_UTIL_PROG su de $SCRATCH_MNT/snap1
  +sleep 45
 
 That's indicative of a bug, yes?

No, that's just how it happens. In fact, if you unmount while a snapshot is
being dropped, progress of the drop will be recorded and it will be
continued on next mount. However, since we *must* have the drop_snapshot
complete for this test I have the large sleep. Unlike the previous sleep I
don't think this can be reduced by much :(
--Mark

--
Mark Fasheh


[Question] disk_bytenr with multiple devices

2014-07-10 Thread Zhe Zhang
When a btrfs has multiple devices (e.g. /dev/sdb, /dev/sdc), how
should I interpret disk_bytenr in btrfs_file_extent_item?

Does it depend on the striping config? Say I used raid0, then
disk_bytenr 0~64K will be on /dev/sdb, and 64K~128K on /dev/sdc?

Thanks,
Zhe


Re: [PATCH] xfstests/btrfs: add test for quota groups and drop snapshot

2014-07-10 Thread Zach Brown
On Thu, Jul 10, 2014 at 10:36:14AM -0700, Mark Fasheh wrote:
 On Thu, Jul 10, 2014 at 10:43:30AM +1000, Dave Chinner wrote:
  On Wed, Jul 09, 2014 at 03:41:50PM -0700, Mark Fasheh wrote:
   +
   +# Enable qgroups now that we have our filesystem prepared. This
   +# will kick off a scan which we will have to wait for below.
   +$BTRFS_UTIL_PROG qu en $SCRATCH_MNT
   +sleep 30
  
  That seems rather arbitrary. The sleeps you are adding add well over
  a minute to the runtime, and a quota scan of a filesystem with 200
  files should be almost instantaneous.
 
 Yeah I'll bring that back down to 5 seconds?

How long does it usually take?

What interfaces would be needed for this to work precisely so we don't
have to play this game ever again?

- z


Re: [PATCH] xfstests/btrfs: add test for quota groups and drop snapshot

2014-07-10 Thread Mark Fasheh
On Thu, Jul 10, 2014 at 11:32:28AM -0700, Zach Brown wrote:
 On Thu, Jul 10, 2014 at 10:36:14AM -0700, Mark Fasheh wrote:
  On Thu, Jul 10, 2014 at 10:43:30AM +1000, Dave Chinner wrote:
   On Wed, Jul 09, 2014 at 03:41:50PM -0700, Mark Fasheh wrote:
+
+# Enable qgroups now that we have our filesystem prepared. This
+# will kick off a scan which we will have to wait for below.
+$BTRFS_UTIL_PROG qu en $SCRATCH_MNT
+sleep 30
   
   That seems rather arbitrary. The sleeps you are adding add well over
   a minute to the runtime, and a quota scan of a filesystem with 200
   files should be almost instantaneous.
  
  Yeah I'll bring that back down to 5 seconds?
 
 How long does it usually take?
 
 What interfaces would be needed for this to work precisely so we don't
 have to play this game ever again?

Well there's also the 'sleep 45' below because we need to be certain that
btrfs_drop_snapshot gets run. This was all a bit of a pain during debugging
to be honest.

So in my experience, an interface to make debugging easier would involve
running every delayed action in the file system to completion, including a
sync of dirty blocks to disk. In theory, this would include any delayed
actions that were kicked off as a result of the actions you are syncing.
You'd do it all from a point in time of course so that we don't spin forever
on a busy filesystem. I do not know whether this is feasible.

Given something like that, you'd just replace the calls to sleep with 'btrfs
fi synctheworldandwait' and know that on return, the actions you just queued
up completed.
--Mark

--
Mark Fasheh


Re: [PATCH] xfstests/btrfs: add test for quota groups and drop snapshot

2014-07-10 Thread Zach Brown
On Thu, Jul 10, 2014 at 12:00:55PM -0700, Mark Fasheh wrote:
 On Thu, Jul 10, 2014 at 11:32:28AM -0700, Zach Brown wrote:
  On Thu, Jul 10, 2014 at 10:36:14AM -0700, Mark Fasheh wrote:
   On Thu, Jul 10, 2014 at 10:43:30AM +1000, Dave Chinner wrote:
On Wed, Jul 09, 2014 at 03:41:50PM -0700, Mark Fasheh wrote:
 +
 +# Enable qgroups now that we have our filesystem prepared. This
 +# will kick off a scan which we will have to wait for below.
 +$BTRFS_UTIL_PROG qu en $SCRATCH_MNT
 +sleep 30

That seems rather arbitrary. The sleeps you are adding add well over
a minute to the runtime, and a quota scan of a filesystem with 200
files should be almost instantaneous.
   
   Yeah I'll bring that back down to 5 seconds?
  
  How long does it usually take?
  
  What interfaces would be needed for this to work precisely so we don't
  have to play this game ever again?
 
 Well there's also the 'sleep 45' below because we need to be certain that
 btrfs_drop_snapshot gets run. This was all a bit of a pain during debugging
 to be honest.

Yeah.  It seems like there's an opportunity for sync flags in the
commands.

- z


Re: [PATCH] xfstests/btrfs: add test for quota groups and drop snapshot

2014-07-10 Thread Mark Fasheh
On Thu, Jul 10, 2014 at 12:05:05PM -0700, Zach Brown wrote:
 On Thu, Jul 10, 2014 at 12:00:55PM -0700, Mark Fasheh wrote:
  On Thu, Jul 10, 2014 at 11:32:28AM -0700, Zach Brown wrote:
   On Thu, Jul 10, 2014 at 10:36:14AM -0700, Mark Fasheh wrote:
On Thu, Jul 10, 2014 at 10:43:30AM +1000, Dave Chinner wrote:
 On Wed, Jul 09, 2014 at 03:41:50PM -0700, Mark Fasheh wrote:
  +
  +# Enable qgroups now that we have our filesystem prepared. This
  +# will kick off a scan which we will have to wait for below.
  +$BTRFS_UTIL_PROG qu en $SCRATCH_MNT
  +sleep 30
 
 That seems rather arbitrary. The sleeps you are adding add well over
 a minute to the runtime, and a quota scan of a filesystem with 200
 files should be almost instantaneous.

Yeah I'll bring that back down to 5 seconds?
   
   How long does it usually take?
   
   What interfaces would be needed for this to work precisely so we don't
   have to play this game ever again?
  
  Well there's also the 'sleep 45' below because we need to be certain that
  btrfs_drop_snapshot gets run. This was all a bit of a pain during debugging
  to be honest.
 
 Yeah.  It seems like there's an opportunity for sync flags in the
 commands.

Yep, that would've helped.
--Mark

--
Mark Fasheh


Re: [PATCH] xfstests/btrfs: add test for quota groups and drop snapshot

2014-07-10 Thread Dave Chinner
On Thu, Jul 10, 2014 at 10:36:14AM -0700, Mark Fasheh wrote:
 Hey Dave, thanks for the patch review! Pretty much all of what you wrote
 sounds good to me, there's just one or two items I wanted to clarify - those
 comments are inline. Thanks again,
 
 On Thu, Jul 10, 2014 at 10:43:30AM +1000, Dave Chinner wrote:
  On Wed, Jul 09, 2014 at 03:41:50PM -0700, Mark Fasheh wrote:
   +
   +# Enable qgroups now that we have our filesystem prepared. This
   +# will kick off a scan which we will have to wait for below.
   +$BTRFS_UTIL_PROG qu en $SCRATCH_MNT
   +sleep 30
  
  That seems rather arbitrary. The sleeps you are adding add well over
  a minute to the runtime, and a quota scan of a filesystem with 200
  files should be almost instantaneous.
 
 Yeah I'll bring that back down to 5 seconds? It's 30 from my testing because
 I was being paranoid and neglected to update it for the rest of the world.

Be nice to have the btrfs command wait for it to complete. Not being
able to query the status of background work or wait for it is
somewhat user unfriendly. If you could poll, then a 1s sleep in a
poll loop would be fine. Short of that, then I guess sleep 5 is the
best we can do.
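The poll-loop pattern suggested above can be sketched as a small C helper. This is a sketch under the assumption that some status predicate exists to call; btrfs offered no qgroup scan-status query at the time, so `done` here is a placeholder:

```c
#include <unistd.h>

/* Poll a status predicate once a second instead of a single long,
 * arbitrary sleep; give up after timeout_s seconds. */
static int wait_for(int (*done)(void *), void *arg, int timeout_s)
{
	int i;

	for (i = 0; i < timeout_s; i++) {
		if (done(arg))
			return 0;	/* condition met */
		sleep(1);
	}
	return -1;			/* timed out */
}
```

A test would then call wait_for() with a predicate that checks whether the qgroup scan has finished, replacing the fixed 'sleep 30' with at most a one-second overshoot.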

 
   +_scratch_unmount
   +_scratch_mount
  
  What is the purpose of this?
 
 This is kind of 'maximum paranoia' again from my own test script. The idea
 was to make _absolutely_ certain that all metadata found its way to disk
 and won't be optimized in layout any more. There's a decent chance it
 doesn't do anything but it doesn't seem a huge deal. I wasn't clear though -
 do you want it removed or can I comment it for clarity?

Comment. If someone reads the test in 2 years time they won't
have to ask wtf?...

   +# Ok, delete the snapshot we made previously. Since btrfs drop
   +# snapshot is a delayed action with no way to force it, we have to
   +# impose another sleep here.
   +$BTRFS_UTIL_PROG su de $SCRATCH_MNT/snap1
   +sleep 45
  
  That's indicative of a bug, yes?
 
 No, that's just how it happens. In fact, if you unmount while a snapshot is
 being dropped, progress of the drop will be recorded and it will be
 continued on next mount. However, since we *must* have the drop_snapshot
 complete for this test I have the large sleep. Unlike the previous sleep I
 don't think this can be reduced by much :(

Right, again the "can't wait or poll for status of background work"
issue comes up.  That's the bug in the UI I was referring to. I guess
that we'll just have to wait for a long time here. Pretty hacky,
though...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: File server structure suggestion

2014-07-10 Thread Duncan
Andrew Flerchinger posted on Thu, 10 Jul 2014 10:41:02 -0400 as excerpted:

 Enter btrfs. Unfortunately, it's newer than ZFS and isn't as robust, but
 it does support online capacity expansion, and the on-disk format is
 expected to be stable. It has data checksums and COW, which are the
 primary things I'm after. RAID10 seems pretty stable, but RAID56 isn't.
 
 So I'm looking for a suggestion. My end goal is RAID6 and expand it a
 drive at a time as needed. For right now, I can either:
 
 1) Run RAID6, but be aware of its limitations. I can manually remove and
 add drives in separate steps if needed. Keep the server on a UPS to
 limit unexpected shutdowns and any corruption there. The whole array
 can't be scrubbed, but if there is a checksum problem when reading
 individual data, will that still be corrected and/or logged? This will
 be a temporary situation, as over time, more features will be built out,
 and the existing file system will be better supported.
 
 2) Run RAID10, and convert the file system to RAID6 later once it is
 stable. Since RAID10 is far more stable and feature complete than RAID56
 right now, all features will work okay, I'm just buying more
 drives/running at lower capacity for the moment. If I have to grow the
 array, I'd have to buy two drives. In the future, once RAID6 is better
 supported, I can convert in-place to RAID6.

I'd personally consider btrfs raid5/6 to be in practice a slow and 
lower-capacity raid0, at this point, except that you'll get raid5/6 for free 
when that's fully supported, since it has been doing the writing for that 
all along, it just couldn't properly restore.  IOW, I wouldn't consider 
it trustworthy at all against loss of a device, which, based on your 
description, isn't appropriate for your usage.

That leaves either raid10 or raid1.  It's worth noting that btrfs raid1 
is at this point paired mirrors only, so no matter how many devices, you 
still have exactly two mirrors of all (meta)data.  N-way-mirroring is 
planned for after raid5/6 completion.  Which could put raid1 in the 
running for you, and as the simplest redundant raid, it might be easier 
to convert to raid5/6 later.

Then there's raid10, which takes more drives and is faster, but is still 
limited to two mirrors.  But while I haven't actually used raid10 myself, 
I do /not/ believe it's limited to pair-at-a-time additions.  I believe 
it'll take, for instance, five devices just fine, staggering chunk 
allocation as necessary to fill all at about the same rate.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: [PATCH] xfstests/btrfs: add test for quota groups and drop snapshot

2014-07-10 Thread Duncan
Mark Fasheh posted on Thu, 10 Jul 2014 12:00:55 -0700 as excerpted:

 Given something like that, you'd just replace the calls to sleep with
 'btrfs fi synctheworldandwait' and know that on return, the actions you
 just queued up completed.

I'll admit to not really knowing what I'm talking about here, but on 
first intuition, what about either 'btrfs filesystem sync' calls, or 
mounting with the synctime (don't have time to look up the specific 
option ATM) mount option?  Normal sync is 30 seconds, but the mount 
option can be used to make that 5 seconds or whatever.  And I don't know 
whether btrfs filesystem sync is synchronous or not.

But that might help reduce it below 30 and 45 seconds, anyway, even if 
some sleep is still required.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



RE: Using serialized BTRFS snapshots as a backup

2014-07-10 Thread David Player
I'm trying to use serialized BTRFS snapshots as a backup system. The problem is 
that I don't know how to avoid sending duplicate data and also have the ability 
to prune old backups.

Specifically I've considered the following:

#snapshot
btrfs subvolume snapshot -r <live-volume> <volume-date>

#serialize snapshot for transmission to remote machine
btrfs send -f backup.date <volume-date> -p <volume-yesterday>

However this means I have to keep every serialized snapshot forever. I've tried 
unpacking these incremental snapshots, deleting intermediate volumes, and 
repacking the latest version. Unfortunately, deleting an intermediate snapshot 
appears to change the IDs for later snapshots, and future serialized snapshots 
can't be unpacked. In other words, I have incremental snapshots 1-4, I unpack 
1-3, erase 2, and now I can't unpack 4 due to: ERROR: could not find parent 
subvolume.

I've also considered keeping a chain of incremental monthly backups, and basing 
daily backups on both the monthly and previous daily. This would allow me to 
delete daily backups in the future, but now I have to send twice as much data 
to the backup machine.

What bothers me is that subvolume 3 (from the example above) really has all the 
same data before and after I delete subvolume 2. The IDs of the volume change, 
preventing me from unpacking incremental 4, and that's causing my problems.

Are there any better ideas I haven't thought of?

Currently running BTRFS 3.12 on kernel 3.13 (Ubuntu 14.04).

Thanks,
David Player



Re: [PATCH 2/4] btrfs-progs: Integrate error message output into find_mount_root().

2014-07-10 Thread Satoru Takeuchi

(2014/07/10 17:26), Qu Wenruo wrote:


 Original Message 
Subject: Re: [PATCH 2/4] btrfs-progs: Integrate error message output into 
find_mount_root().
From: Miao Xie mi...@cn.fujitsu.com
To: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com, Qu Wenruo 
quwen...@cn.fujitsu.com, linux-btrfs@vger.kernel.org
Date: 2014-07-10 16:10

Takeuchi-san

On Thu, 10 Jul 2014 16:33:23 +0900, Satoru Takeuchi wrote:

(2014/07/10 12:05), Qu Wenruo wrote:

Before this patch, find_mount_root() and the caller both output error
messages, which sometimes makes the output duplicated and hard to judge
what the problem is.

This patch will integrate all the error message output into
find_mount_root() to give more meaningful error prompts and remove the
unneeded caller error messages.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
   cmds-receive.c   |  2 --
   cmds-send.c  |  8 +---
   cmds-subvolume.c |  5 +
   utils.c  | 15 ---
   4 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index 48380a5..084d97d 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -981,8 +981,6 @@ static int do_receive(struct btrfs_receive *r, const char *tomnt, int r_fd,
 	ret = find_mount_root(dest_dir_full_path, &r->root_path);
 	if (ret < 0) {
 		ret = -EINVAL;
-		fprintf(stderr, "ERROR: failed to determine mount point "
-			"for %s\n", dest_dir_full_path);
 		goto out;
 	}
 	r->mnt_fd = open(r->root_path, O_RDONLY | O_NOATIME);
diff --git a/cmds-send.c b/cmds-send.c
index 9a73b32..091f32b 100644
--- a/cmds-send.c
+++ b/cmds-send.c
@@ -357,8 +357,6 @@ static int init_root_path(struct btrfs_send *s, const char *subvol)
 	ret = find_mount_root(subvol, &s->root_path);
 	if (ret < 0) {
 		ret = -EINVAL;
-		fprintf(stderr, "ERROR: failed to determine mount point "
-			"for %s\n", subvol);
 		goto out;
 	}
@@ -622,12 +620,8 @@ int cmd_send(int argc, char **argv)
 	}
 	ret = find_mount_root(subvol, &mount_root);
-	if (ret < 0) {
-		fprintf(stderr, "ERROR: find_mount_root failed on %s: "
-			"%s\n", subvol,
-			strerror(-ret));
+	if (ret < 0)
 		goto out;
-	}
 	if (strcmp(send.root_path, mount_root) != 0) {
 		ret = -EINVAL;
 		fprintf(stderr, "ERROR: all subvols must be from the "
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 639fb10..b252eab 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -981,11 +981,8 @@ static int cmd_subvol_show(int argc, char **argv)
 	}
 	ret = find_mount_root(fullpath, &mnt);
-	if (ret < 0) {
-		fprintf(stderr, "ERROR: find_mount_root failed on %s: "
-			"%s\n", fullpath, strerror(-ret));
+	if (ret < 0)
 		goto out;
-	}
 	ret = 1;
 	svpath = get_subvol_name(mnt, fullpath);
diff --git a/utils.c b/utils.c
index 507ec6c..07173ee 100644
--- a/utils.c
+++ b/utils.c
@@ -2417,13 +2417,19 @@ int find_mount_root(const char *path, char **mount_root)
 	char *longest_match = NULL;
 	fd = open(path, O_RDONLY | O_NOATIME);
-	if (fd < 0)
+	if (fd < 0) {
+		fprintf(stderr, "ERROR: Failed to open %s: %s\n",
+			path, strerror(errno));

It drops part of the original message: it no longer shows that this error
is from find_mount_root(). I think the original meaning should be kept as is.
What do you think?

I think it is strange to show common users the name of an internal
function.
Maybe we should introduce two kinds of messages: one for common users,
the other for developers to debug.

Thanks
Miao

I agree with Miao's idea.
It's true that some developers need to get info from the output,
but IMO the error messages are often used to indicate what *users* did wrong,
since most problems are caused by wrong parameters given by users.

For example, I always forget to run 'btrfs fi df /mnt' and the 'Operation not
permitted' message makes me realize the permission problem. And the function
name or other messages are less important than that.

On the other hand, if developers encounter problems, they will gdb the program
or grep the source to find out the problem. So a function name in the error
message seems not so necessary to me.


OK, I got it.

Reviewed-by: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com



It would also be a great idea to add a new framework for showing debug messages,
but I'd prefer to build that framework sometime later (maybe when btrfs-progs
becomes more complicated than it is now?)


That would be nice. I think the messages of btrfs-progs are a bit messy.

Satoru



Thanks,
Qu





Thanks,
Satoru


 		return -errno;
+	}
 	close(fd);
 	mnttab = setmntent("/proc/self/mounts", "r");
-	if (!mnttab)
+	if (!mnttab) {
+		fprintf(stderr, "ERROR: Failed to setmntent: %s\n",
+			strerror(errno));
 		return -errno;
+	}
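The shape of the change under discussion — the helper reports its own failure at the point it occurs, so callers only propagate the negative errno — can be sketched in plain user-space C. This is a simplified stand-in, not the actual btrfs-progs code:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Helper prints the error itself, at the point of failure. */
static int open_and_check(const char *path)
{
	FILE *f = fopen(path, "r");

	if (!f) {
		fprintf(stderr, "ERROR: Failed to open %s: %s\n",
			path, strerror(errno));
		return -errno;
	}
	fclose(f);
	return 0;
}

/* Caller no longer duplicates the message; it only propagates. */
static int caller(const char *path)
{
	int ret = open_and_check(path);

	if (ret < 0)
		return ret;	/* message already printed by the helper */
	return 0;
}
```

With this split, there is exactly one message per failure, and every caller's error path collapses to `if (ret < 0) goto out;`.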
   

Btrfs transaction checksum corruption, losing root of the tree, bizarre UUID change.

2014-07-10 Thread Tomasz Kusmierz
Hi all !

So it has been some time with btrfs, and so far I was very pleased, but
since I've upgraded to ubuntu from 13.10 to 14.04 problems started to
occur (YES I know this might be unrelated).

So in the past I've had problems with btrfs which turned out to be a
problem caused by static from printer generating some corruption in
ram causing checksum failures on the file system - so I'm not going to
assume that there is something wrong with btrfs from the start.

Anyway:
On my server I'm running 6 x 2TB disk in raid 10 for general storage
and 2 x ~0.5 TB raid 1 for system. Might be unrelated, but after
upgrading to 14.04 I've started using ownCloud, which uses Apache &
MySQL for backing store - all data stored on storage array, MySQL was
on system array.

All started with csum errors showing up in mysql data files and in
some transactions!!! Generally the system immediately was switching to
all btrfs read only mode due to being forced by kernel (don't have
dmesg / syslog now). Removed offending files, problem seemed to go
away and started from scratch. After 5 days the problem reappeared and now
was located around same mysql files and in files managed by apache as
cloud. At this point since these files are rather dear to me I've
decided to pull all stops and try to rescue as much as I can.

As an exercise in btrfs management I've run btrfsck --repair - did not
help. Repeated with --init-csum-tree - turned out that this left me
with blank system array. Nice ! could use some warning here.

I've moved all drives and move those to my main rig which got a nice
16GB of ecc ram, so errors of ram, cpu, controller should be kept
theoretically eliminated. I've used system array drives and spare
drive to extract all dear to me files to newly created array (1tb +
500GB + 640GB). Ran a scrub on it and everything seemed OK. At this
point I've deleted dear to me files from storage array and ran  a
scrub. Scrub now showed even more csum errors in transactions and one
large file that was not touched FOR VERY LONG TIME (size ~1GB).
Deleted file. Ran scrub - no errors. Copied dear to me files back to
storage array. Ran scrub - no issues. Deleted files from my backup
array and decided to call a day. Next day I've decided to run a scrub
once more just to be sure this time it discovered a myriad of errors
in files and transactions. Since I've had no time to continue decided
to postpone on next day - next day I've started my rig and noticed
that both backup array and storage array does not mount anymore. I was
attempting to rescue situation without any luck. Power cycled PC and
on next startup both arrays failed to mount, when I tried to mount
backup array mount told me that this specific uuid DOES NOT EXIST
!?!?!

my fstab uuid:
fcf23e83-f165-4af0-8d1c-cd6f8d2788f4
new uuid:
771a4ed0-5859-4e10-b916-07aec4b1a60b


tried to mount by /dev/sdb1 and it did mount. Tried by new uuid and it
did mount as well. Scrub passes with flying colours on backup array
while storage array still fails to mount with:

root@ubuntu-pc:~# mount /dev/sdd1 /arrays/@storage/
mount: wrong fs type, bad option, bad superblock on /dev/sdd1,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

for any device in the array.

Honestly this is a question to more senior guys - what should I do now ?

Chris Mason - have you got any updates to your old friend 'stress.sh'?
If not I can try using the previous version that you provided to stress
test my system - but this is a second system that exposes this
erratic behaviour.

Anyone - what can I do to rescue my beloved files (no sarcasm with
zfs / ext4 / tapes / DVDs)

ps. needless to say: SMART - no SATA CRC errors, no relocated sectors,
no errors what so ever (as much as I can see).


[PATCH v3] btrfs: remove unnecessary error check

2014-07-10 Thread Satoru Takeuchi

Hi Eric,

(2014/07/10 22:27), Eric Sandeen wrote:

On 7/10/14, 1:44 AM, Satoru Takeuchi wrote:

(2014/07/10 12:26), Eric Sandeen wrote:




On Jul 9, 2014, at 10:20 PM, Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com 
wrote:

From: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com

If (!IS_ERR(trans) || PTR_ERR(trans) != -ENOSPC) is false,
obviously trans is ERR_PTR(-ENOSPC). So we can safely remove the redundant
(PTR_ERR(trans) == -ENOSPC) check.



True, but now a comment like:

/* Handle ENOSPC */

might still be nice...


Eric, thank you for your comment. I fixed my patch.
How about it?


One other thing I missed the first time, I'm sorry, notes below:



===
From: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com

If (!IS_ERR(trans) || PTR_ERR(trans) != -ENOSPC) is false,
obviously trans is ERR_PTR(-ENOSPC). So we can safely remove the redundant
(PTR_ERR(trans) == -ENOSPC) check.

Signed-off-by: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com

---
  fs/btrfs/inode.c | 29 +++--
  1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3668048..115aac3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3803,22 +3803,23 @@ static struct btrfs_trans_handle *__unlink_start_trans(struct inode *dir)
 	if (!IS_ERR(trans) || PTR_ERR(trans) != -ENOSPC)
 		return trans;
 
-	if (PTR_ERR(trans) == -ENOSPC) {
-		u64 num_bytes = btrfs_calc_trans_metadata_size(root, 5);
+	/* Handle ENOSPC */
 
-		trans = btrfs_start_transaction(root, 0);
-		if (IS_ERR(trans))
-			return trans;
-		ret = btrfs_cond_migrate_bytes(root->fs_info,
-					       &root->fs_info->trans_block_rsv,
-					       num_bytes, 5);
-		if (ret) {
-			btrfs_end_transaction(trans, root);
-			return ERR_PTR(ret);
-		}
-		trans->block_rsv = &root->fs_info->trans_block_rsv;
-		trans->bytes_reserved = num_bytes;
+	u64 num_bytes = btrfs_calc_trans_metadata_size(root, 5);


This variable should be declared at the beginning of the function,
not in the middle, because it's no longer in a separate code block.


OK, moved.



Also, somehow by the time the patch got here, tabs turned into
4 spaces, so this one wouldn't apply for me.


I didn't realize that. Thank you.


Sorry for missing the variable declaration problem the first time!


No problem, more review is welcome. Thank you very much :-)

This is the v3 patch.

===
From: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com

If (!IS_ERR(trans) || PTR_ERR(trans) != -ENOSPC) is false,
obviously trans is ERR_PTR(-ENOSPC). So we can safely remove the redundant
(PTR_ERR(trans) == -ENOSPC) check.

Signed-off-by: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com

---
 fs/btrfs/inode.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3668048..e7ac779 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3790,6 +3790,7 @@ static struct btrfs_trans_handle *__unlink_start_trans(struct inode *dir)
 {
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
+	u64 num_bytes = btrfs_calc_trans_metadata_size(root, 5);
 	int ret;
 
 	/*
@@ -3803,22 +3804,21 @@ static struct btrfs_trans_handle *__unlink_start_trans(struct inode *dir)
 	if (!IS_ERR(trans) || PTR_ERR(trans) != -ENOSPC)
 		return trans;
 
-	if (PTR_ERR(trans) == -ENOSPC) {
-		u64 num_bytes = btrfs_calc_trans_metadata_size(root, 5);
+	/* Handle ENOSPC */
 
-		trans = btrfs_start_transaction(root, 0);
-		if (IS_ERR(trans))
-			return trans;
-		ret = btrfs_cond_migrate_bytes(root->fs_info,
-					       &root->fs_info->trans_block_rsv,
-					       num_bytes, 5);
-		if (ret) {
-			btrfs_end_transaction(trans, root);
-			return ERR_PTR(ret);
-		}
-		trans->block_rsv = &root->fs_info->trans_block_rsv;
-		trans->bytes_reserved = num_bytes;
+	trans = btrfs_start_transaction(root, 0);
+	if (IS_ERR(trans))
+		return trans;
+	ret = btrfs_cond_migrate_bytes(root->fs_info,
+				       &root->fs_info->trans_block_rsv,
+				       num_bytes, 5);
+	if (ret) {
+		btrfs_end_transaction(trans, root);
+		return ERR_PTR(ret);
+	}
+	trans->block_rsv = &root->fs_info->trans_block_rsv;
+	trans->bytes_reserved = num_bytes;
+
 	return trans;
 }
 
--
1.9.3
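The redundancy claim in the commit message can be checked mechanically with a tiny user-space model of the kernel's ERR_PTR helpers. These are simplified stand-ins, not the include/linux/err.h definitions:

```c
#define MAX_ERRNO	4095
#define ENOSPC_ERR	28	/* stand-in for the ENOSPC errno value */

/* Simplified user-space models of the kernel's err.h helpers. */
static void *ERR_PTR(long error) { return (void *)error; }
static long PTR_ERR(const void *ptr) { return (long)ptr; }
static int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Returns 1 only if control can get past the early-return guard with
 * trans not equal to ERR_PTR(-ENOSPC), i.e. only if the inner check
 * removed by the patch could ever matter. */
static int inner_check_matters(void *trans)
{
	if (!IS_ERR(trans) || PTR_ERR(trans) != -ENOSPC_ERR)
		return 0;	/* early return taken */
	return PTR_ERR(trans) != -ENOSPC_ERR;	/* always 0 here */
}
```

For every possible trans — a valid pointer, ERR_PTR(-ENOSPC), or any other error pointer — inner_check_matters() returns 0, which is exactly the argument the commit message makes.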



Re: Btrfs transaction checksum corruption, losing root of the tree, bizarre UUID change.

2014-07-10 Thread Austin S Hemmelgarn
On 07/10/2014 07:32 PM, Tomasz Kusmierz wrote:
 Hi all !
 
 So it has been some time with btrfs, and so far I was very pleased, but
 since I've upgraded to ubuntu from 13.10 to 14.04 problems started to
 occur (YES I know this might be unrelated).
 
 So in the past I've had problems with btrfs which turned out to be a
 problem caused by static from printer generating some corruption in
 ram causing checksum failures on the file system - so I'm not going to
 assume that there is something wrong with btrfs from the start.
 
 Anyway:
 On my server I'm running 6 x 2TB disk in raid 10 for general storage
 and 2 x ~0.5 TB raid 1 for system. Might be unrelated, but after
 upgrading to 14.04 I've started using ownCloud, which uses Apache &
 MySQL for backing store - all data stored on storage array, MySQL was
 on system array.
 
 All started with csum errors showing up in mysql data files and in
 some transactions!!! Generally the system immediately was switching to
 all btrfs read only mode due to being forced by kernel (don't have
 dmesg / syslog now). Removed offending files, problem seemed to go
 away and started from scratch. After 5 days the problem reappeared and now
 was located around same mysql files and in files managed by apache as
 cloud. At this point since these files are rather dear to me I've
 decided to pull all stops and try to rescue as much as I can.
 
 As an exercise in btrfs management I've run btrfsck --repair - did not
 help. Repeated with --init-csum-tree - turned out that this left me
 with blank system array. Nice ! could use some warning here.
 
I know that this will eventually be pointed out by somebody, so I'm
going to save them the trouble and mention that it does say on both the
wiki and in the manpages that btrfsck should be a last resort (i.e., after
you have made sure you have backups of anything on the FS).
 I've moved all drives and move those to my main rig which got a nice
 16GB of ecc ram, so errors of ram, cpu, controller should be kept
 theoretically eliminated. I've used system array drives and spare
 drive to extract all dear to me files to newly created array (1tb +
 500GB + 640GB). Ran a scrub on it and everything seemed OK. At this
 point I've deleted dear to me files from storage array and ran  a
 scrub. Scrub now showed even more csum errors in transactions and one
 large file that was not touched FOR VERY LONG TIME (size ~1GB).
 Deleted file. Ran scrub - no errors. Copied dear to me files back to
 storage array. Ran scrub - no issues. Deleted files from my backup
 array and decided to call a day. Next day I've decided to run a scrub
 once more just to be sure this time it discovered a myriad of errors
 in files and transactions. Since I've had no time to continue decided
 to postpone on next day - next day I've started my rig and noticed
 that both backup array and storage array does not mount anymore. I was
 attempting to rescue situation without any luck. Power cycled PC and
 on next startup both arrays failed to mount, when I tried to mount
 backup array mount told me that this specific uuid DOES NOT EXIST
 !?!?!
 
 my fstab uuid:
 fcf23e83-f165-4af0-8d1c-cd6f8d2788f4
 new uuid:
 771a4ed0-5859-4e10-b916-07aec4b1a60b
 
 
 tried to mount by /dev/sdb1 and it did mount. Tried by new uuid and it
 did mount as well. Scrub passes with flying colours on backup array
 while storage array still fails to mount with:
 
 root@ubuntu-pc:~# mount /dev/sdd1 /arrays/@storage/
 mount: wrong fs type, bad option, bad superblock on /dev/sdd1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail  or so
 
 for any device in the array.
 
 Honestly this is a question to more senior guys - what should I do now ?
 
 Chris Mason - have you got any updates to your old friend 'stress.sh'?
 If not I can try using the previous version that you provided to stress
 test my system - but this is a second system that exposes this
 erratic behaviour.
 
 Anyone - what can I do to rescue my beloved files (no sarcasm with
 zfs / ext4 / tapes / DVDs)
 
 ps. needless to say: SMART - no SATA CRC errors, no relocated sectors,
 no errors what so ever (as much as I can see).
First thing that I would do is some very heavy testing with tools like
iozone and fio.  I would use the verify mode from iozone to further
check data integrity.  My guess based on what you have said is that it
is probably issues with either the storage controller (I've had issues
with almost every brand of SATA controller other than Intel, AMD, Via,
and Nvidia, and it almost always manifested as data corruption under
heavy load), or something in the disk's firmware.  I would still suggest
double-checking your RAM with Memtest, and check the cables on the
drives.  The one other thing that I can think of is potential voltage
sags from the PSU (either because the PSU is overloaded at times, or
because of really noisy/poorly-conditioned line power).  Of course, I
may be 

Re: [Question] disk_bytenr with multiple devices

2014-07-10 Thread Qu Wenruo


 Original Message 
Subject: [Question] disk_bytenr with multiple devices
From: Zhe Zhang zhe.zhang.resea...@gmail.com
To: linux-btrfs@vger.kernel.org
Date: 2014-07-11 02:21

When a btrfs has multiple devices (e.g. /dev/sdb, /dev/sdc), how
should I interpret disk_bytenr in btrfs_file_extent_item?

Does it depend on the striping config? Say I used raid0, then
disk_bytenr 0~64K will be on /dev/sdb, and 64K~128K on /dev/sdc?

Thanks,
Zhe

https://btrfs.wiki.kernel.org/index.php/Data_Structures#btrfs_file_extent_item
As you can see in the btrfs wiki,
disk_bytenr is a *logical* address in the btrfs linear address space, not
really an on-disk address.


If you really want the address on a device, you need to find the chunk
containing the address; the stripes in the chunk item will show the RAID
profile and the address on each device.
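As a rough illustration of that lookup for a two-device RAID0 chunk — the device numbering and physical offsets below are made up for the example; real code reads them from the chunk item's stripe array:

```c
#define STRIPE_LEN	(64 * 1024ULL)	/* btrfs RAID0 stripe length, 64 KiB */

struct stripe {
	int dev;			/* device index: 0 = /dev/sdb, 1 = /dev/sdc */
	unsigned long long physical;	/* where this chunk starts on that device */
};

/* Map a logical byte address inside a RAID0 chunk to (device, physical).
 * Simplified sketch of the arithmetic; the kernel walks the chunk tree. */
static struct stripe map_raid0(unsigned long long logical,
			       unsigned long long chunk_logical,
			       const struct stripe *stripes, int nr_stripes)
{
	unsigned long long off = logical - chunk_logical;
	unsigned long long stripe_nr = off / STRIPE_LEN;
	struct stripe out = stripes[stripe_nr % nr_stripes];

	out.physical += (stripe_nr / nr_stripes) * STRIPE_LEN
			+ off % STRIPE_LEN;
	return out;
}
```

With two devices, the first 64 KiB of the chunk's logical range resolves to device 0 and the next 64 KiB to device 1, matching the striping intuition in the question — but only after the chunk-tree translation, not directly from disk_bytenr.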


Thanks,
Qu