Re: [PATCH] Btrfs: fix race condition between writting and scrubing supers

2013-10-24 Thread Chris Mason
Quoting Stefan Behrens (2013-10-23 13:21:34)
 On Tue, 22 Oct 2013 18:55:59 +0200, Bob Marley wrote:
  On 22/10/2013 10:37, Stefan Behrens wrote:
  I don't believe that this issue can ever happen. I don't believe that
  somewhere on the path to the flash memory, to the magnetic disc or to
  the drive's cache memory, someone interrupts a 4KB write in the middle
  of operation to read from this 4KB area. This is not an issue IMHO.
  
  I think I have read that unfortunately it can happen.
  SAS and SATA specs for disks do not mandate that if a write is in-flight
  but still not completed, reads from the same sector should return the
  value it is being written; they can return the old value.
  I also think that Linux does not check either.
 
 If the _old_ 4KB block is returned, that's fine and won't cause a
 checksum error.
 
 The patch in question addresses the case that Btrfs submits a write
 request for a 4KB block, and a concurrent read request for that 4KB
 block reads partially the old block and partially the new block,
 resulting in a checksum error reported in the scrub statistic counters.

Concurrent reads and writes to the device are completely undefined, and
any combination of old, new, or random memory corruption wouldn't
surprise me... I'd rather avoid them ;)

Doing the transaction join during the super read is probably the least
complex choice.

-chris
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
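
The torn-read case described above can be illustrated in user space. This is
only a minimal sketch, not Btrfs code: the 4KB "blocks" are ordinary files,
cksum stands in for the super block checksum, and the half-and-half split
point and the helper names (make_block, sum_of) are assumptions made for the
illustration.

```shell
#!/bin/sh
# Illustration only: simulate a read that returns half new, half old data.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

# make_block CHAR FILE: write a 4096-byte block filled with one character.
make_block() {
	head -c 4096 /dev/zero | tr '\0' "$1" > "$2"
}

# sum_of FILE: print only the CRC field of cksum's output.
sum_of() {
	cksum < "$1" | cut -d' ' -f1
}

make_block A "$dir/old"
make_block B "$dir/new"

# A "torn" read: first 2048 bytes from the new block, the rest from the old.
{ head -c 2048 "$dir/new"; tail -c 2048 "$dir/old"; } > "$dir/torn"

old_sum=$(sum_of "$dir/old")
new_sum=$(sum_of "$dir/new")
torn_sum=$(sum_of "$dir/torn")

echo "old=$old_sum new=$new_sum torn=$torn_sum"
if [ "$torn_sum" != "$old_sum" ] && [ "$torn_sum" != "$new_sum" ]; then
	echo "torn read matches neither checksum"
fi
```

A reader that gets back the complete old block or the complete new block
verifies cleanly; only the mixed block fails, which is the case the patch is
concerned with.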


Re: [PATCH v2] xfstests btrfs/020: test device replace on RO btrfs

2013-10-24 Thread David Sterba
On Thu, Oct 24, 2013 at 12:44:43AM +0800, Eryu Guan wrote:
 +_cleanup()
 +{
 + cd /

Using root as temporary directory?

 + rm -f $tmp.*
 + $UMOUNT_PROG $loop_mnt
 + _destroy_loop_device $loop_dev1
 + losetup -d $loop_dev2 >/dev/null 2>&1
 + _destroy_loop_device $loop_dev3
 + rm -rf $loop_mnt
 + rm -f $fs_img1 $fs_img2 $fs_img3
 +}


Re: [PATCH] Btrfs: fix race condition between writting and scrubing supers

2013-10-24 Thread Miao Xie
On Thu, 24 Oct 2013 06:08:42 -0400, Chris Mason wrote:
 Quoting Stefan Behrens (2013-10-23 13:21:34)
 On Tue, 22 Oct 2013 18:55:59 +0200, Bob Marley wrote:
 On 22/10/2013 10:37, Stefan Behrens wrote:
 I don't believe that this issue can ever happen. I don't believe that
 somewhere on the path to the flash memory, to the magnetic disc or to
 the drive's cache memory, someone interrupts a 4KB write in the middle
 of operation to read from this 4KB area. This is not an issue IMHO.

 I think I have read that unfortunately it can happen.
 SAS and SATA specs for disks do not mandate that if a write is in-flight
 but still not completed, reads from the same sector should return the
 value it is being written; they can return the old value.
 I also think that Linux does not check either.

 If the _old_ 4KB block is returned, that's fine and won't cause a
 checksum error.

 The patch in question addresses the case that Btrfs submits a write
 request for a 4KB block, and a concurrent read request for that 4KB
 block reads partially the old block and partially the new block,
 resulting in a checksum error reported in the scrub statistic counters.
 
 Concurrent reads and writes to the device are completely undefined, and
 Any combination of old, new, random memory corruption wouldn't
 surprise me...I'd rather avoid them ;)
 
 Doing the transaction join during the super read is probably the least
 complex choice.

But that cannot block the log tree sync. I think using device_list_mutex is
better, since we must acquire this mutex when writing the super blocks, and
once we unlock it we can be sure that the super blocks have reached
non-volatile media.

Thanks
Miao

 
 -chris


Re: btrfs: stat(2) and /proc/pid/maps returns different devices

2013-10-24 Thread Pavel Emelyanov
On 07/20/2013 12:51 AM, Mark Fasheh wrote:
 On Thu, Jul 11, 2013 at 12:26:50AM +0200, David Sterba wrote:
 On Wed, Jul 10, 2013 at 10:45:45AM -0700, Mark Fasheh wrote:
 Well, what do I get when I pretend I don't care any more? The little voice
 in my head says keep plugging away. Here's another attempt at fixing this
 problem in a sane manner. Basically, this time we're adding a flag to
 s_flags which btrfs sets. Proc will see the flag and call ->getattr().

 This compiles, but it needs testing (which I will get to soon). It still has
 a bunch of problems in my honest opinion but maybe if we get something
 acceptable upstream we can work from there.

 Also, as Andrew pointed out, there's more than one place which returns a
 different device than stat(2) does, so I probably need to update more sites
 to deal with this.

 Does anyone see a problem with this approach?

 The approach looks ok to me, the implementation is internal to vfs and
 fairly minimal. The bit that bothers me is the name of the flag, it's
 completely unobvious what it means.
 
 I'll come up with something better for my next revision :)

Mark, David,

What are your plans for the next version? Any chance we can see it in the
3.13 merge window? (Unless I've missed that it's already there.)

I'd really love to see it, as this thing is a blocker for checkpoint-restore
on btrfs.

Thanks,
Pavel


Re: [PATCH] Btrfs: fix race condition between writting and scrubing supers

2013-10-24 Thread Wang Shilong

On 10/24/2013 06:08 PM, Chris Mason wrote:
 Quoting Stefan Behrens (2013-10-23 13:21:34)
 On Tue, 22 Oct 2013 18:55:59 +0200, Bob Marley wrote:
 On 22/10/2013 10:37, Stefan Behrens wrote:
 I don't believe that this issue can ever happen. I don't believe that
 somewhere on the path to the flash memory, to the magnetic disc or to
 the drive's cache memory, someone interrupts a 4KB write in the middle
 of operation to read from this 4KB area. This is not an issue IMHO.

 I think I have read that unfortunately it can happen.
 SAS and SATA specs for disks do not mandate that if a write is in-flight
 but still not completed, reads from the same sector should return the
 value it is being written; they can return the old value.
 I also think that Linux does not check either.

 If the _old_ 4KB block is returned, that's fine and won't cause a
 checksum error.

 The patch in question addresses the case that Btrfs submits a write
 request for a 4KB block, and a concurrent read request for that 4KB
 block reads partially the old block and partially the new block,
 resulting in a checksum error reported in the scrub statistic counters.

 Concurrent reads and writes to the device are completely undefined, and
 Any combination of old, new, random memory corruption wouldn't
 surprise me...I'd rather avoid them ;)

 Doing the transaction join during the super read is probably the least
 complex choice.

Yeah, joining a transaction solves this problem, but it is a little
confusing, because scrubbing the supers does not involve any writing.

And the only race condition happens while committing a transaction. Miao
also pointed out that the best way may be to move btrfs_scrub_continue()
after write_ctree_super().


Thanks,
Wang

-chris


Re: [PATCH] Btrfs: relocate csums properly with prealloc extents - for 3.12-rc

2013-10-24 Thread David Sterba
Hi Chris,

this needs to go to 3.12, the patch is only in btrfs-next. The bug can
happen with systemd journal + balance, the fix helps quite a lot of
users out there. (https://bugzilla.kernel.org/show_bug.cgi?id=63411)

I have cherry-picked the patch to current master, applies cleanly and
the test btrfs/013 passes, here's my

Tested-by: David Sterba dste...@suse.cz

david


btrfs raid5 bug task mkfs.btrfs:3695 blocked for more than 120 seconds

2013-10-24 Thread lilofile
When I create a raid5 filesystem with btrfs, using a command like this:

./mkfs.btrfs -d raid5 /dev/sdb /dev/sdc /dev/sdd /dev/sde  /dev/sdf \
/dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm -f

WARNING! - Btrfs v0.20-rc1-358-g194aa4a-dirty IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

it then stops responding.


There is an error in the kernel log:

Oct 24 21:25:36 host1 kernel: [ 3000.809503] INFO: task mkfs.btrfs:3695 blocked for more than 120 seconds.
Oct 24 21:25:36 host1 kernel: [ 3000.809506] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 24 21:25:36 host1 kernel: [ 3000.809508] mkfs.btrfs  D 0001 0  3695   3519 0x
Oct 24 21:25:36 host1 kernel: [ 3000.809513]  8807f4441c68 0082 8133677d 88080e049590
Oct 24 21:25:36 host1 kernel: [ 3000.809518]  8807f4441fd8 8807f4441fd8 8807f4441fd8 000139c0
Oct 24 21:25:36 host1 kernel: [ 3000.809522]  88080e9ddc00 8808115a8000 88080f7bc000 7fff
Oct 24 21:25:36 host1 kernel: [ 3000.809527] Call Trace:
Oct 24 21:25:36 host1 kernel: [ 3000.809534]  [8133677d] ? rb_insert_color+0xad/0x150
Oct 24 21:25:36 host1 kernel: [ 3000.809539]  [8169d8b9] schedule+0x29/0x70
Oct 24 21:25:36 host1 kernel: [ 3000.809543]  [8169bfd5] schedule_timeout+0x2a5/0x320
Oct 24 21:25:36 host1 kernel: [ 3000.809547]  [8131048c] ? blk_queue_bio+0x1cc/0x3a0
Oct 24 21:25:36 host1 kernel: [ 3000.809551]  [8169d70f] wait_for_common+0xdf/0x180
Oct 24 21:25:36 host1 kernel: [ 3000.809555]  [8108a360] ? try_to_wake_up+0x200/0x200
Oct 24 21:25:36 host1 kernel: [ 3000.809559]  [8169d88d] wait_for_completion+0x1d/0x20
Oct 24 21:25:36 host1 kernel: [ 3000.809563]  [81315c14] blkdev_issue_discard+0x1b4/0x1c0
Oct 24 21:25:36 host1 kernel: [ 3000.809567]  [81316341] blkdev_ioctl+0x461/0x7a0
Oct 24 21:25:36 host1 kernel: [ 3000.809572]  [811beb70] block_ioctl+0x40/0x50
Oct 24 21:25:36 host1 kernel: [ 3000.809576]  [811996fa] do_vfs_ioctl+0x8a/0x340
Oct 24 21:25:36 host1 kernel: [ 3000.809579]  [8118c72a] ? sys_newfstat+0x2a/0x40
Oct 24 21:25:36 host1 kernel: [ 3000.809583]  [81199a41] sys_ioctl+0x91/0xa0
Oct 24 21:25:36 host1 kernel: [ 3000.809588]  [816a7029] system_call_fastpath+0x16/0x1b


Re: btrfs raid5 bug task mkfs.btrfs:3695 blocked for more than 120 seconds

2013-10-24 Thread David Sterba
On Thu, Oct 24, 2013 at 10:22:28PM +0800, lilofile wrote:
 Oct 24 21:25:36 host1 kernel: [ 3000.809563]  [81315c14] blkdev_issue_discard+0x1b4/0x1c0

There's a discard/TRIM operation being done on all of the devices;
current progs do not report that, and it's really confusing. Fixed in
the integration branch.

If you don't want to do the trim, use the -K switch.

david
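
As a sketch of the -K suggestion: mkfs.btrfs accepts -K (long form
--nodiscard) to skip the initial discard pass. The helper below only prints
the command line it would run, so it can be tried without touching any
device; the function name mkfs_nodiscard_cmd and the device names are
placeholders made up for this illustration.

```shell
#!/bin/sh
# Print (do not run) an mkfs.btrfs command line that skips the TRIM pass.
# mkfs_nodiscard_cmd and the /dev/sdX names are hypothetical.
mkfs_nodiscard_cmd() {
	profile=$1
	shift
	# -K asks mkfs.btrfs not to discard the devices; -f forces overwrite.
	echo "mkfs.btrfs -K -f -d $profile $*"
}

mkfs_nodiscard_cmd raid5 /dev/sdb /dev/sdc /dev/sdd
```

Whether a device advertises discard at all can be read from
/sys/block/DEV/queue/discard_max_bytes; a value of 0 means the device does
not support it.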


[PATCH] btrfs-progs: make filesystem show by label work

2013-10-24 Thread Anand Jain
With the design revamp around filesystem show, filtering by label
wasn't planned, but apparently it is necessary. This patch adds it.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 cmds-filesystem.c |  120 -
 1 files changed, 73 insertions(+), 47 deletions(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index d08007e..d2cad81 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -179,6 +179,26 @@ static int cmd_df(int argc, char **argv)
 	return !!ret;
 }
 
+static int match_search_item_kernel(__u8 *fsid, char *mnt, char *label,
+					char *search)
+{
+	char uuidbuf[37];
+	int search_len = strlen(search);
+
+	search_len = min(search_len, 37);
+	uuid_unparse(fsid, uuidbuf);
+	if (!strncmp(uuidbuf, search, search_len))
+		return 1;
+
+	if (strlen(label) && strcmp(label, search) == 0)
+		return 1;
+
+	if (strcmp(mnt, search) == 0)
+		return 1;
+
+	return 0;
+}
+
 static int uuid_search(struct btrfs_fs_devices *fs_devices, char *search)
 {
 	char uuidbuf[37];
@@ -275,16 +295,18 @@ static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
 	struct btrfs_ioctl_dev_info_args *tmp_dev_info;
 
 	uuid_unparse(fs_info->fsid, uuidbuf);
-	printf("Label: %s  uuid: %s\n",
-		strlen(label) ? label : "none", uuidbuf);
+	if (label && strlen(label))
+		printf("Label: '%s' ", label);
+	else
+		printf("Label: none ");
 
-	printf("\tTotal devices %llu FS bytes used %s\n",
-				fs_info->num_devices,
+	printf(" uuid: %s\n\tTotal devices %llu FS bytes used %s\n", uuidbuf,
+			fs_info->num_devices,
 			pretty_size(calc_used_bytes(space_info)));
 
 	for (i = 0; i < fs_info->num_devices; i++) {
 		tmp_dev_info = (struct btrfs_ioctl_dev_info_args *)&dev_info[i];
-		printf("\tdevid%llu size %s used %s path %s\n",
+		printf("\tdevid %4llu size %s used %s path %s\n",
 			tmp_dev_info->devid,
 			pretty_size(tmp_dev_info->total_bytes),
 			pretty_size(tmp_dev_info->bytes_used),
@@ -308,7 +330,7 @@ static int check_arg_type(char *input)
 	char path[PATH_MAX];
 
 	if (!input)
-		return BTRFS_ARG_UNKNOWN;
+		return -EINVAL;
 
 	if (realpath(input, path)) {
 		if (is_block_device(input) == 1)
@@ -320,7 +342,7 @@ static int check_arg_type(char *input)
 		return BTRFS_ARG_UNKNOWN;
 	}
 
-	if (!uuid_parse(input, out))
+	if (strlen(input) == 36 && !uuid_parse(input, out))
 		return BTRFS_ARG_UUID;
 
 	return BTRFS_ARG_UNKNOWN;
@@ -328,23 +350,19 @@ static int check_arg_type(char *input)
 
 static int btrfs_scan_kernel(void *search)
 {
-	int ret = 0, fd, type;
+	int ret = 0, fd;
 	FILE *f;
 	struct mntent *mnt;
 	struct btrfs_ioctl_fs_info_args fs_info_arg;
 	struct btrfs_ioctl_dev_info_args *dev_info_arg = NULL;
 	struct btrfs_ioctl_space_args *space_info_arg;
 	char label[BTRFS_LABEL_SIZE];
-	uuid_t uuid;
 
 	f = setmntent("/proc/self/mounts", "r");
 	if (f == NULL)
 		return 1;
 
-	type = check_arg_type(search);
-	if (type == BTRFS_ARG_BLKDEV)
-		return 1;
-
+	memset(label, 0, sizeof(label));
 	while ((mnt = getmntent(f)) != NULL) {
 		if (strcmp(mnt->mnt_type, "btrfs"))
 			continue;
@@ -353,38 +371,36 @@ static int btrfs_scan_kernel(void *search)
 		if (ret)
 			return ret;
 
-		switch (type) {
-		case BTRFS_ARG_UUID:
-			ret = uuid_parse(search, uuid);
-			if (ret)
-				return 1;
-			if (uuid_compare(fs_info_arg.fsid, uuid))
-				continue;
-			break;
-		case BTRFS_ARG_MNTPOINT:
-			if (strcmp(search, mnt->mnt_dir))
-				continue;
-			break;
-		case BTRFS_ARG_UNKNOWN:
-			break;
+		if (get_label_mounted(mnt->mnt_dir, label)) {
+			kfree(dev_info_arg);
+			return 1;
+		}
+		if (search && !match_search_item_kernel(fs_info_arg.fsid,
+					mnt->mnt_dir, label, search)) {
+			kfree(dev_info_arg);
+			continue;
 		}
 
 		fd = open(mnt->mnt_dir, O_RDONLY);
 		if (fd > 0 && !get_df(fd, &space_info_arg)) {
-			get_label_mounted(mnt->mnt_dir, label);
  

Re: [PATCH] Btrfs: fix negative qgroup tracking from owner accounting (bug #61951)

2013-10-24 Thread Wang Shilong
Hello Jan,

 btrfs_dec_ref() queued a delayed ref for owner of a tree block. The qgroup
 tracking is based on delayed refs. The owner of a tree block is set when a
 tree block is allocated, it is never updated.
 
 When you allocate a tree block and then remove the subvolume that did the
 allocation, the qgroup accounting for that removal is correct. However, the
 removal was accounted again for each subvolume deletion that also referenced
 the tree block, because accounting was erroneously based on the owner.
 
 Instead of queueing delayed refs for the non-existent owner, we now
 queue delayed refs for the root being removed. This fixes the qgroup
 accounting.

Thanks for tracking this. I applied your patch (the one quoted below) and
found that the problem still exists. The test script is as follows:

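#!/bin/sh

# Create 1000 small (10K) files on the mounted filesystem.
for i in $(seq 1000)
do
	dd if=/dev/zero of=mnt/${i}aaa bs=10K count=1
done

# Take a chain of snapshots, each one a snapshot of the previous.
btrfs sub snapshot mnt mnt/1
for i in $(seq 100)
do
	btrfs sub snapshot mnt/$i mnt/$(($i+1))
done

# Delete all 101 snapshots again.
for i in $(seq 101)
do
	btrfs sub delete mnt/$i
done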


Thanks,
Wang
 
 Signed-off-by: Jan Schmidt list.bt...@jan-o-sch.net
 Tested-by: dustym...@gmail.com
 ---
 fs/btrfs/extent-tree.c |   14 +-
 1 files changed, 9 insertions(+), 5 deletions(-)
 
 diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
 index d58bef1..7846cae 100644
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -3004,12 +3004,11 @@ out:
 static int __btrfs_mod_ref(struct btrfs_trans_handle *trans,
  struct btrfs_root *root,
  struct extent_buffer *buf,
 -int full_backref, int inc, int for_cow)
 +int full_backref, u64 ref_root, int inc, int for_cow)
 {
   u64 bytenr;
   u64 num_bytes;
   u64 parent;
 - u64 ref_root;
   u32 nritems;
   struct btrfs_key key;
   struct btrfs_file_extent_item *fi;
 @@ -3019,7 +3018,6 @@ static int __btrfs_mod_ref(struct btrfs_trans_handle *trans,
   int (*process_func)(struct btrfs_trans_handle *, struct btrfs_root *,
   u64, u64, u64, u64, u64, u64, int);
 
 - ref_root = btrfs_header_owner(buf);
   nritems = btrfs_header_nritems(buf);
   level = btrfs_header_level(buf);
 
 @@ -3075,13 +3073,19 @@ fail:
 int btrfs_inc_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 struct extent_buffer *buf, int full_backref, int for_cow)
 {
 - return __btrfs_mod_ref(trans, root, buf, full_backref, 1, for_cow);
 + u64 ref_root;
 +
 + ref_root = btrfs_header_owner(buf);
 +
 + return __btrfs_mod_ref(trans, root, buf, full_backref, ref_root,
 +1, for_cow);
 }
 
 int btrfs_dec_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 struct extent_buffer *buf, int full_backref, int for_cow)
 {
 - return __btrfs_mod_ref(trans, root, buf, full_backref, 0, for_cow);
 + return __btrfs_mod_ref(trans, root, buf, full_backref, root->objectid,
 +0, for_cow);
 }
 
 static int write_one_cache_group(struct btrfs_trans_handle *trans,
 -- 
 1.7.2.2
 


Re: [PATCH 2/2] btrfs-progs: filesystem show of specified mounted disk should work

2013-10-24 Thread David Sterba
On Wed, Oct 23, 2013 at 10:08:16AM +0800, Anand Jain wrote:
 On 10/22/13 10:33 PM, David Sterba wrote:
 On Tue, Oct 22, 2013 at 01:53:22PM +0800, Anand Jain wrote:
 @@ -386,7 +395,7 @@ static int btrfs_scan_kernel(void *search)
   static const char * const cmd_show_usage[] = {
 -   btrfs filesystem show [options] [path|uuid],
 +   btrfs filesystem show [options|path|uuid],
 
 Options should stay separate from the path/uuid, you're extending the
 syntax to accept a device:
 
  btrfs filesystem show [options] [path|uuid|device],
 
 I'm fixing it locally, let me know if this doesn't match what you've
 intended.
 
  I am confused, on how the options should be represented,
  but the internal design is as below.

Hm right, it is a bit confusing, I think because of the syntax that
allows either options or the path/uuid/device specifier, not both, which
is not so common.

I still prefer to keep them separate, because it's something that can be
clarified in the help text or documentation.

Besides that, we may want to add more options that affect
path/uuid/device output. The argument description looks consistent with
other commands, and if some combination is not allowed, then an error
message will say why. I really don't expect an average user to think too
hard about it.

david


Re: [PATCH 2/2] btrfs-progs: filesystem show of specified mounted disk should work

2013-10-24 Thread Hugo Mills
On Thu, Oct 24, 2013 at 04:51:00PM +0200, David Sterba wrote:
 On Wed, Oct 23, 2013 at 10:08:16AM +0800, Anand Jain wrote:
  On 10/22/13 10:33 PM, David Sterba wrote:
  On Tue, Oct 22, 2013 at 01:53:22PM +0800, Anand Jain wrote:
  @@ -386,7 +395,7 @@ static int btrfs_scan_kernel(void *search)
static const char * const cmd_show_usage[] = {
  - btrfs filesystem show [options] [path|uuid],
  + btrfs filesystem show [options|path|uuid],
  
  Options should stay separate from the path/uuid, you're extending the
  syntax to accept a device:
  
 btrfs filesystem show [options] [path|uuid|device],
  
  I'm fixing it locally, let me know if this doesn't match what you've
  intended.
  
   I am confused, on how the options should be represented,
   but the internal design is as below.
 
 Hm right, it is a bit confusing, I think because of the syntax that
 allows either options or the path/uuid/device specifier, not both, which
 is not so common.

   Typically, that ends up written as something like:

btrfs filesystem show [options]
btrfs filesystem show [path|uuid|device]

or just as

btrfs filesystem show [options] [path|uuid|device]

with a comment in the man page that the second parameter can't be
supplied with any of the options (if that's the case). Of the two, I
prefer the former, but with the acknowledgement that when the command
grows some options that can be used with the p/u/d, you'll end up
having to change the help text.

   Hugo.

 I still prefer to keep them separate, because it's something that can be
 clarified in the help text or documentation.
 
 Besides, that we may want to add more options that affect
 path/uuid/device output, the argument description looks consistent with
 other commands and if some combination is not allowed, then an error
 message will say why. I really don't expect an average user to think too
 hard about it.
 
 david

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- I believe that it's closely correlated with ---   
   the aeroswine coefficient.




Why cannot I move a read-only snapshot around?

2013-10-24 Thread Karl Kiniger
Dear list, (newbie alert)

After successfully sending and receiving a dozen related snapshots
I want to move them all to the readonly folder but I cannot:

ls -l
.
drwxr-xr-x. 1 root root 682 Oct 24 16:01 @20131001
drwxr-xr-x. 1 root root 682 Oct 24 16:07 @20131004
drwxr-xr-x. 1 root root 682 Oct 24 16:10 @20131008
drwxr-xr-x. 1 root root 682 Oct 24 16:16 @20131010
drwxr-xr-x. 1 root root 682 Oct 24 16:23 @20131014
drwxr-xr-x. 1 root root 706 Oct 24 16:24 @20131018
drwxr-xr-x. 1 root root 706 Oct 24 16:31 @20131021
drwxr-xr-x. 1 root root 734 Oct 24 16:36 @20131023
drwxr-xr-x. 1 root root 734 Oct 24 16:41 @20131024
drwxr-xr-x. 1 root root 734 Oct 24 16:41 F19
drwxr-xr-x. 1 root root   0 Oct 24 17:21 readonly


mv \@20131024 readonly

mv: cannot move ‘@20131024’ to ‘readonly/@20131024’: Read-only file system

I know I can create new ro snapshots within the readonly directory and
then delete those above, but in the future I want to send/receive based on
these snapshots (send -p  -c  -c ), so I want to move them
to a more convenient place.

How can I move them without re-sending all?

Karl



Re: [PATCH] Btrfs: fix negative qgroup tracking from owner accounting (bug #61951)

2013-10-24 Thread Jan Schmidt
On Thu, October 24, 2013 at 16:49 (+0200), Wang Shilong wrote:
 Hello Jan,
 
 btrfs_dec_ref() queued a delayed ref for owner of a tree block. The qgroup
 tracking is based on delayed refs. The owner of a tree block is set when a
 tree block is allocated, it is never updated.

 When you allocate a tree block and then remove the subvolume that did the
 allocation, the qgroup accounting for that removal is correct. However, the
 removal was accounted again for each subvolume deletion that also referenced
 the tree block, because accounting was erroneously based on the owner.

 Instead of queueing delayed refs for the non-existent owner, we now
 queue delayed refs for the root being removed. This fixes the qgroup
 accounting.
 
 Thanks for tracking this, i apply your patch, and using the flowing patch,
 found the problem still exist, the test script like the following:

Reproduced. Gives more negative numbers due to accounting triggered by the
cleaner thread, that's the common part here. I still believe that the fix I sent
is correct, it's probably not complete. Looking into it.

Thanks,
-Jan

 #!/bin/sh
 
 for i in $(seq 1000)
 do
   dd if=/dev/zero of=mnt/${i}aaa bs=10K count=1
 done
 
 btrfs sub snapshot mnt mnt/1
 for i in $(seq 100)
 do
   btrfs sub snapshot mnt/$i mnt/$(($i+1))
 done
 
 for i in $(seq 101)
 do
   btrfs sub delete mnt/$i
 done
 
 
 Thanks,
 Wang


Re: Why cannot I move a read-only snapshot around?

2013-10-24 Thread Karl Kiniger
arrgh,  forgot to mention:

pc2:~ btrfs --version
Btrfs v0.20-rc1

Fedora 19 x86_64

Karl




Re: [PATCH] Btrfs: relocate csums properly with prealloc extents

2013-10-24 Thread Hans-Kristian Bakke
The result of the scrubbing came back today and it was not pretty:
...
scrub done for b64daec7-6c14-4996-94b3-80c6abfa26ce
scrub started at Wed Oct 23 23:01:22 2013 and finished after 34990 seconds
total bytes scrubbed: 12.55TB with 3859542 errors
error details: csum=3859542
corrected errors: 0, uncorrectable errors: 3859542, unverified errors: 0
---

Still only two folder structures are affected, but they are seemingly
unrecoverable. I noticed the mail asking for the fix to be included in
3.12. Jippi!
Until it is included I will have to postpone rebalancing over the
four new drives.


Mvh

Hans-Kristian Bakke


On 23 October 2013 23:49, Hans-Kristian Bakke hkba...@gmail.com wrote:
 OK. btrfs scrub and dmesg are hitting me with lots of unfixable errors.
 All in the same file. Example:

 [13313.441091] btrfs: unable to fixup (regular) error at logical
 560107954176 on dev /dev/sdn
 [13321.532223] scrub_handle_errored_block: 1510 callbacks suppressed
 [13321.532309] btrfs_dev_stat_print_on_error: 1510 callbacks suppressed
 [13321.532314] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40016, gen 0
 [13321.532420] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40017, gen 0
 [13321.532545] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40018, gen 0
 [13321.532605] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40019, gen 0
 [13321.533039] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40020, gen 0
 [13321.537519] scrub_handle_errored_block: 1508 callbacks suppressed
 [13321.537525] btrfs: unable to fixup (regular) error at logical
 560630136832 on dev /dev/sdq
 [13321.537821] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40021, gen 0
 [13321.538081] btrfs: unable to fixup (regular) error at logical
 560630140928 on dev /dev/sdq
 [13321.538438] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40022, gen 0
 [13321.538715] btrfs: unable to fixup (regular) error at logical
 560630145024 on dev /dev/sdq
 [13321.539016] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40023, gen 0
 [13321.539234] btrfs: unable to fixup (regular) error at logical
 560630149120 on dev /dev/sdq
 [13321.539522] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40024, gen 0
 [13321.539739] btrfs: unable to fixup (regular) error at logical
 560630153216 on dev /dev/sdq
 [13321.540027] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt
 40025, gen 0
 [13321.540242] btrfs: unable to fixup (regular) error at logical
 560630157312 on dev /dev/sdq
 [13321.540620] btrfs: unable to fixup (regular) error at logical
 560630161408 on dev /dev/sdq
 [13321.541140] btrfs: unable to fixup (regular) error at logical
 560630165504 on dev /dev/sdq
 [13321.541571] btrfs: unable to fixup (regular) error at logical
 560630169600 on dev /dev/sdq
 [13321.541931] btrfs: unable to fixup (regular) error at logical
 560630173696 on dev /dev/sdq

 Luckily all the corruption seems to be in a single very large file,
 but in different parts of it on different disks. The file was written
 by rtorrent, which has the option system.file_allocate.set = yes
 configured.
 I also have samba configured with strict allocate = yes because it
 is recommended for best performance on extent-based filesystems. Does
 that mean samba files are vulnerable to this corruption too?
 If so this could become very ugly very fast on certain systems.

 Mvh

 Hans-Kristian Bakke


 On 23 October 2013 23:24, Hans-Kristian Bakke hkba...@gmail.com wrote:
 I was hit by this when trying to rebalance a 16TB RAID10 to 32TB
 RAID10 going from 4 to 8 WD SE 4TB drives today. I cannot finish a
 rebalance because of failed csum.

 [10228.850910] BTRFS info (device sdq): csum failed ino 487 off 65536
 csum 2566472073 private 151366068
 [10228.850967] BTRFS info (device sdq): csum failed ino 487 off 69632
 csum 2566472073 private 3056924305
 [10228.850973] BTRFS info (device sdq): csum failed ino 487 off 593920
 csum 2566472073 private 906093395
 [10228.851004] BTRFS info (device sdq): csum failed ino 487 off 73728
 csum 2566472073 private 2680502892
 [10228.851014] BTRFS info (device sdq): csum failed ino 487 off 598016
 csum 2566472073 private 1940162924
 [10228.851029] BTRFS info (device sdq): csum failed ino 487 off 77824
 csum 2566472073 private 2939385278
 [10228.851051] BTRFS info (device sdq): csum failed ino 487 off 602112
 csum 2566472073 private 645310077
 [10228.851055] BTRFS info (device sdq): csum failed ino 487 off 81920
 csum 2566472073 private 3600741549
 [10228.851078] BTRFS info (device sdq): csum failed ino 487 off 86016
 csum 2566472073 private 200201951
 [10228.851091] BTRFS info (device sdq): csum failed ino 487 off 606208
 csum 2566472073 private 1002916440

 The system is running a scrub now and I will return with some more
 details later. I do not think systemd is logging to this volume, but
 the scrub will probably show which files are affected.

 As this is a very serious issue for those hit by the 

Re: Why cannot I move a read-only snapshot around?

2013-10-24 Thread Chris Murphy

On Oct 24, 2013, at 9:29 AM, Karl Kiniger karl.kini...@med.ge.com wrote:

 Dear list, (newbie alert)
 
 After successfully sending and receiving a dozen related snapshots
 I want to move them all to the readonly folder but I cannot:
 
 ls -l
 .
 drwxr-xr-x. 1 root root 682 Oct 24 16:01 @20131001
 drwxr-xr-x. 1 root root 682 Oct 24 16:07 @20131004
 drwxr-xr-x. 1 root root 682 Oct 24 16:10 @20131008
 drwxr-xr-x. 1 root root 682 Oct 24 16:16 @20131010
 drwxr-xr-x. 1 root root 682 Oct 24 16:23 @20131014
 drwxr-xr-x. 1 root root 706 Oct 24 16:24 @20131018
 drwxr-xr-x. 1 root root 706 Oct 24 16:31 @20131021
 drwxr-xr-x. 1 root root 734 Oct 24 16:36 @20131023
 drwxr-xr-x. 1 root root 734 Oct 24 16:41 @20131024
 drwxr-xr-x. 1 root root 734 Oct 24 16:41 F19
 drwxr-xr-x. 1 root root   0 Oct 24 17:21 readonly
 
 
 mv \@20131024 readonly
 
 mv: cannot move ‘@20131024’ to ‘readonly/@20131024’: Read-only file system

Are the @ snapshots read-only snapshots? And is readonly just a regular 
directory?

I don't know that this is a bug, it seems like it could be intentional because 
a read only file system wouldn't let you move it out of one tree into another. 
But there was a bug that prevented moving of subvolumes into subvolumes 
(untested if moving subvolumes into folders worked) that was fixed in kernel 
3.11.6 so that might be worth a shot.


Chris Murphy
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: swapfile on btrfs, temporary solution for wiki

2013-10-24 Thread Timofey Titovets
Hello, I suggest a temporary solution for using a swap file on btrfs.
I tested it, and it works well.

I came up with a simple way to create and use a swap file; see the
following sh code:

loopdev=$(losetup -f)     # first free loop device
truncate -s 8G /swap      # create an 8G sparse file to back the swap
losetup "$loopdev" /swap  # attach the file to the loop device
mkswap "$loopdev"
swapon "$loopdev"

I just added this to rc.local and it works well.
Maybe add it to the btrfs wiki as a temporary solution for using a swap file?
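
Undoing this setup is the reverse sequence: disable the swap, then detach the loop device. A minimal sketch, not part of the original post: swap_loop_teardown and the DRY_RUN variable are hypothetical names, and with DRY_RUN=1 the commands are only printed instead of executed.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set (hypothetical switch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Disable swap on a loop device and detach it from its backing file.
# $1: loop device, for example the one reported by "losetup -j /swap"
swap_loop_teardown() {
    loopdev=$1
    run swapoff "$loopdev" &&   # stop swapping to the device first
    run losetup -d "$loopdev"   # then release the loop device
}
```

A call such as `DRY_RUN=1 swap_loop_teardown /dev/loop0` shows what would be run; the device name here is only an example.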

Timofey

2013/10/21 Тимофей Титовец nefelim...@gmail.com:
 Hello list, I know that btrfs doesn't support swap files.
 I read the Arch wiki, and while reading about the systemd add-on that
 automatically creates a swap file on btrfs, I came up with a way to
 create and use a swap file; see the following sh code:

 loopdev=$(losetup -f)     # first free loop device
 truncate -s 8G /swap      # create an 8G sparse file to back the swap
 losetup "$loopdev" /swap  # attach the file to the loop device
 mkswap "$loopdev"
 swapon "$loopdev"

 I just added this to rc.local and it just works.
 Maybe add it to the wiki as a temporary solution for using a swap file?
 (sorry for my bad English)


Re: Why cannot I move a read-only snapshot around?

2013-10-24 Thread Karl Kiniger
Hi, 

On Thu 131024, Chris Murphy wrote:
 
 On Oct 24, 2013, at 9:29 AM, Karl Kiniger karl.kini...@med.ge.com wrote:
 
  Dear list, (newbie alert)
  
  After successfully sending and receiving a dozen related snapshots
  I want to move them all to the readonly folder but I cannot:
  
  ls -l
  .
..
  drwxr-xr-x. 1 root root 734 Oct 24 16:36 @20131023
  drwxr-xr-x. 1 root root 734 Oct 24 16:41 @20131024
  drwxr-xr-x. 1 root root 734 Oct 24 16:41 F19
  drwxr-xr-x. 1 root root   0 Oct 24 17:21 readonly
  
  
  mv \@20131024 readonly
  
  mv: cannot move ‘@20131024’ to ‘readonly/@20131024’: Read-only file system
 
 Are the @ snapshots read-only snapshots? And is readonly just a regular 
 directory?

Yes they are read only snapshots (just received by btrfs receive) and
readonly is a regular directory. I deliberately did not try to move
those snapshots into other snapshots.

I can move r/w snapshots  around without problems
(into some regular directory), just the r/o snapshots refuse moving.

cat /proc/version
Linux version 3.11.6-200.fc19.x86_64

Still curious,

Karl

 
 I don't know that this is a bug, it seems like it could be intentional 
 because a read only file system wouldn't let you move it out of one tree into 
 another. But there was a bug that prevented moving of subvolumes into 
 subvolumes (untested if moving subvolumes into folders worked) that was fixed 
 in kernel 3.11.6 so that might be worth a shot.
 
 
 Chris Murphy


Re: Why cannot I move a read-only snapshot around?

2013-10-24 Thread Karl Kiniger
Hi

(pls see also my other reply in this thread)

On Thu 131024, Duncan wrote:
 Karl Kiniger posted on Thu, 24 Oct 2013 17:29:56 +0200 as excerpted:
 
  Dear list, (newbie alert)
  
  After successfully sending and receiving a dozen related snapshots I
  want to move them all to the readonly folder but I cannot:
 
 I see you mention fedora 19 in a followup, but for those not on fedora, 
 that's not much help figuring out which kernel you're running.  It's 
 likely that the following is your problem, tho there's not enough 
 information in your post to be sure.

I promise to include more info in the future but just received
snapshots should be read-only if I read the docs correctly.

 
 There was a recent regression with nested subvolumes that may be what 
 you're running into.  Kernel 3.11 was affected as well as early 3.12-rcs 
 and I believe 3.10 also but I'm not sure how far back, except that 
 someone mentioned trying an old kernel (3.8 or 3.6-ish) and moving 
 subvolumes into subvolumes worked there (tho doing anything involving 
 writing into read-only snapshots shouldn't work, by design, but that 
 doesn't appear to be what you're doing, you're just trying to move read-
 only snapshots to a different location on a read/write base or parent 
 subvolume, this post assuming it's a parent subvolume, thus triggering 
 the nested subvolumes bug).

No nested subvolumes are involved. (Is this true? This is all inside the
top-level volume, or whatever it is called in btrfs.)

 A fix is available but I'm not sure whether it got into 3.12 (which is 
 just about to be released) or will now have to wait for 3.13.  So either 
 try latest 3.12 git and see if its there, or find and cherry-pick the 
 patch, applying it against 3.11 or 3.12.  (Given that btrfs is still an 
 experimental filesystem with fixes applied every kernel, while reverting 
 to an old enough kernel should unregress this particular problem, I can't 
 recommend it except possibly for testing against data you don't care 
 about, since by doing so you're exposing yourself to other known and now 
 fixed bugs.)
Agreed, I don't want to go back to older kernels - too risky. The data is
backed up anyway (on ZFS if you are curious), but the time invested in
my current btrfs setup would be gone.

I can live with the current situation; it's just not nice to have the
snapshots lying around in a place where they do not belong.

If it were possible to temporarily make the r/o snapshots r/w just for
the purpose of moving them (being aware that caution is needed), I would
not hesitate to try that.
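
Later btrfs-progs releases than the ones discussed in this thread provide `btrfs property`, which can toggle a subvolume's read-only flag. The following is only a sketch under that assumption: move_ro_snapshot and DRY_RUN are hypothetical names, and with DRY_RUN=1 the commands are printed rather than run. Clearing the flag on a snapshot created by btrfs receive may make it unusable as a parent for later incremental send/receive, and some btrfs-progs releases refuse the change on received snapshots unless it is explicitly forced.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set (hypothetical switch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Move a read-only snapshot: clear its ro flag, move it, set the flag again.
# $1: snapshot path, $2: destination directory
move_ro_snapshot() {
    snap=$1
    dest=$2
    name=$(basename "$snap")
    run btrfs property set -ts "$snap" ro false || return 1
    if run mv "$snap" "$dest/"; then
        # the snapshot now lives under the destination; make it read-only again
        run btrfs property set -ts "$dest/$name" ro true
    else
        # the move failed, so restore the flag at the original location
        run btrfs property set -ts "$snap" ro true
        return 1
    fi
}
```

With the names from this thread, `DRY_RUN=1 move_ro_snapshot @20131024 readonly` prints the three commands without touching the filesystem.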

Karl


 
 -- 
 Duncan - List replies preferred.   No HTML msgs.
 Every nonfree program has a lord, a master --
 and if you use the program, he is your master.  Richard Stallman



Re: Why cannot I move a read-only snapshot around?

2013-10-24 Thread Karl Kiniger
On Thu 131024, Chris Murphy wrote:
 dr. 1 chris chris   0 Oct 24 16:15 donotmove
 
 [chris@f20s ~]$ mv donotmove/ Videos/
 mv: cannot move ‘donotmove/’ to ‘Videos/donotmove’: Permission denied
 
 I own that directory. But because it's read only, I can't move it because 
 moving it changes it. Of course if I become root, that overrides posix 
 permissions, but the readonly status of a subvolume isn't like posix 
 permissions and I see no reason why root should be able to modify it. And 
 moving it does modify it.

I tried all of this as root.

drwxr-xr-x. 1 root root 734 Oct 24 16:41 @20131024  (this is a r/o snap)

It looks to me similar to a read-only mounted filesystem:

pc2:/u2/F19/@20131024# touch foo
touch: cannot touch ‘foo’: Read-only file system

In what way would a r/o snapshot be modified by moving its
mount point? Nobody is ever doing anything inside it.

Karl

 Chris Murphy



Re: Why cannot I move a read-only snapshot around?

2013-10-24 Thread Chris Murphy

On Oct 24, 2013, at 4:46 PM, Karl Kiniger karl.kini...@med.ge.com wrote:

 On Thu 131024, Chris Murphy wrote:
 dr. 1 chris chris   0 Oct 24 16:15 donotmove
 
 [chris@f20s ~]$ mv donotmove/ Videos/
 mv: cannot move ‘donotmove/’ to ‘Videos/donotmove’: Permission denied
 
 I own that directory. But because it's read only, I can't move it because 
 moving it changes it. Of course if I become root, that overrides posix 
 permissions, but the readonly status of a subvolume isn't like posix 
 permissions and I see no reason why root should be able to modify it. And 
 moving it does modify it.
 
 I tried all of this as root.
 
 drwxr-xr-x. 1 root root 734 Oct 24 16:41 @20131024  (this is a r/o snap)
 
 It looks to me similar to a read-only mounted filesystem:
 
 pc2:/u2/F19/@20131024# touch foo
 touch: cannot touch ‘foo’: Read-only file system
 
 In what way would a r/o snapshot be modified by moving its
 mount point? Nobody is ever doing anything inside it.

For the same reason I can't move or rename a read-only directory even though 
I'm not doing anything inside it.
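
The directory case can be reproduced without btrfs. On Linux, moving a directory to a different parent has to rewrite the directory's own ".." entry, so rename(2) requires write permission on the directory being moved. A small sketch that works in a scratch directory; demo_ro_dir_move is a hypothetical name, and the directory names are borrowed from the example quoted above.

```shell
#!/bin/sh
# Try to move a directory that lacks write permission into another parent.
# Prints "moved" or "blocked"; demo_ro_dir_move is a hypothetical name.
demo_ro_dir_move() {
    tmp=$(mktemp -d) || return 1
    mkdir "$tmp/donotmove" "$tmp/Videos"
    chmod a-w "$tmp/donotmove"     # read-only directory, still owned by us
    if mv "$tmp/donotmove" "$tmp/Videos/" 2>/dev/null; then
        echo moved
    else
        echo blocked
    fi
    chmod -R u+w "$tmp"            # restore write access so cleanup works
    rm -rf "$tmp"
}
```

Run as an ordinary user this is expected to print blocked. Root normally bypasses the permission check and gets moved, which is the difference from a read-only subvolume described in the quoted message.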



Chris Murphy


Re: [PATCH] Btrfs: fix negative qgroup tracking from owner accounting (bug #61951)

2013-10-24 Thread Wang Shilong

Hello Jan,

On 10/24/2013 11:36 PM, Jan Schmidt wrote:

On Thu, October 24, 2013 at 16:49 (+0200), Wang Shilong wrote:

Hello Jan,


btrfs_dec_ref() queued a delayed ref for owner of a tree block. The qgroup
tracking is based on delayed refs. The owner of a tree block is set when a
tree block is allocated, it is never updated.

When you allocate a tree block and then remove the subvolume that did the
allocation, the qgroup accounting for that removal is correct. However, the
removal was accounted again for each subvolume deletion that also referenced
the tree block, because accounting was erroneously based on the owner.

Instead of queueing delayed refs for the non-existent owner, we now
queue delayed refs for the root being removed. This fixes the qgroup
accounting.

Thanks for tracking this down. I applied your patch and, using the following
test script, found that the problem still exists:

Reproduced. Gives more negative numbers due to accounting triggered by the
cleaner thread, that's the common part here. I still believe that the fix I sent
is correct, it's probably not complete. Looking into it.
I really did wait for the cleaner thread to finish its work, and I used
btrfs-debug-tree to confirm that all the fs trees have been deleted.

But using btrfs qgroup show, I still get negative numbers, and the root
subvolume's exclusive value is also wrong. The statistics are as follows:

0/5 13090816 471040
0/257 13078528 0
0/259 13078528 0
0/260 13078528 0
0/261 13078528 0
.

...
0/350 13078528 0
0/351 13078528 0
0/352 13078528 0
0/353 13078528 0
0/354 13078528 0
0/355 13078528 0
0/356 13078528 0
0/357 13078528 0
0/358 12619776 -155648
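
Negative counters like the one in the last row are easy to miss in a long listing. A minimal sketch for flagging them, assuming the three-column layout shown above (qgroup id, referenced bytes, exclusive bytes); negative_qgroups is a hypothetical name, and lines whose first field is not of the form level/id, such as headers printed by other btrfs-progs versions, are skipped.

```shell
#!/bin/sh
# Print qgroup rows whose referenced or exclusive counter is negative.
# Reads "btrfs qgroup show"-style text on stdin; hypothetical helper name.
negative_qgroups() {
    awk '$1 ~ /^[0-9]+\/[0-9]+$/ && ($2 < 0 || $3 < 0) { print $1, $2, $3 }'
}
```

For the test setup in this message it could be used as `btrfs qgroup show mnt | negative_qgroups`.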

Thanks,
Wang


Thanks,
-Jan


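#!/bin/sh

# Create 1000 small files in the top-level subvolume.
for i in $(seq 1000)
do
	dd if=/dev/zero of="mnt/${i}aaa" bs=10K count=1
done

# Build a chain of 101 snapshots: mnt/1 from mnt, mnt/2 from mnt/1, and so on.
btrfs sub snapshot mnt mnt/1
for i in $(seq 100)
do
	btrfs sub snapshot "mnt/$i" "mnt/$((i+1))"
done

# Delete all 101 snapshots again.
for i in $(seq 101)
do
	btrfs sub delete "mnt/$i"
done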


Thanks,
Wang
