Re: [PATCH] xfstests: make BTRFS_UTIL_PROG filesystem defragment work

2015-04-14 Thread Filipe David Manana
On Tue, Apr 14, 2015 at 10:01 AM, Liu Bo bo.li@oracle.com wrote:
 _require_defrag() needs to check if the command is executable, but btrfs has
 its subcommand filesystem defragment, which makes this checking fail.

 This works around it, and now we can run cases generic/324, generic/018,
 and btrfs/005.

There's already a patch from Zhao to fix the regression:

https://patchwork.kernel.org/patch/6205031/

thanks


 Signed-off-by: Liu Bo bo.li@oracle.com
 ---
  common/defrag | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

 diff --git a/common/defrag b/common/defrag
 index f923dc0..f36a68b 100644
 --- a/common/defrag
 +++ b/common/defrag
 @@ -37,7 +37,11 @@ _require_defrag()
 ;;
  esac

 -_require_command $DEFRAG_PROG defragment
 +if [ $FSTYP == btrfs ]; then
 +   _require_command $BTRFS_UTIL_PROG defragment
 +else
 +   _require_command $DEFRAG_PROG defragment
 +fi
  _require_xfs_io_command fiemap
  }

 --
 1.8.2.1

 --
 To unsubscribe from this list: send the line unsubscribe fstests in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Filipe David Manana,

Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


directory defrag

2015-04-14 Thread Russell Coker
The current defragmentation options seem to only support defragmenting named 
files/directories or a recursive defragmentation of files and directories.

I'd like to recursively defragment directories.  One of my systems has a large 
number of large files, the files are write-once and read performance is good 
enough.  However performance of ls -al is often very poor, presumably due to 
metadata fragmentation.

The other thing I'd still like is the ability to force all metadata allocation 
to be from specified disks.  I'd like to have a pair of SSDs for RAID-1 storage 
of metadata and a set of hard drives for RAID-1 storage of data.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/


[PATCH] Btrfs receive hardening patches

2015-04-14 Thread Lauri Võsandi
---
 cmds-receive.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index 44ef27e..6800401 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -867,15 +867,19 @@ static int do_receive(struct btrfs_receive *r, const char *tomnt, int r_fd,
 		goto out;
 	}
 
-	/*
-	 * find_mount_root returns a root_path that is a subpath of
-	 * dest_dir_full_path. Now get the other part of root_path,
-	 * which is the destination dir relative to root_path.
+	/**
+	 * Enforce chroot before parsing btrfs stream
 	 */
-	r->dest_dir_path = dest_dir_full_path + strlen(r->root_path);
-	while (r->dest_dir_path[0] == '/')
-		r->dest_dir_path++;
+	if (chroot(dest_dir_full_path)) {
+		fprintf(stderr,
+			"ERROR: failed to chroot to %s\n",
+			dest_dir_full_path);
+		ret = -errno;
+		goto out;
+	}
 
+	r->root_path = r->dest_dir_path = strdup("/");
+	
 	ret = subvol_uuid_search_init(r->mnt_fd, &r->sus);
 	if (ret < 0)
 		goto out;
-- 
1.9.1



Re: [PATCH v2] Btrfs: incremental send, don't rename a directory too soon

2015-04-14 Thread Robbie Ko
Hi,

After applying the patch, I got WARN_ON.
btrfs progs finished without any error message,
but received subvolume is not the same as send subvolume.

Here's the related information.
thanks,
robbieko

uname -a
Linux ubuntu 4.0.0-rc4-custom #2 SMP Tue Apr 14 11:43:00 CST 2015
x86_64 x86_64 x86_64 GNU/Linux
btrfs --version
Btrfs v3.14.1

Steps to reproduce:

 $ mkfs.btrfs -f /dev/sdb
 $ mount /dev/sdb /mnt
 $ mkfs.btrfs -f /dev/sdc
 $ mount /dev/sdc /mnt2

$ mkdir -p /mnt/data
$ mkdir -p /mnt/data/n1/n2
$ mkdir -p /mnt/data/n4
$ mkdir -p /mnt/data/n1/n2/p1
$ mkdir -p /mnt/data/t4
$ mkdir -p /mnt/data/p1
$ mkdir -p /mnt/data/p1/2

  $ btrfs subvolume snapshot -r /mnt /mnt/snap1

$ mv /mnt/data/n1/n2 /mnt/data/t4
$ mv /mnt/data/n4 /mnt/data/t4/n2
$ mv /mnt/data/t4/n2/p1 /mnt/data/t4/p1
$ mv /mnt/data/p1 /mnt/data/t4/n2

  $ btrfs subvolume snapshot -r /mnt /mnt/snap2

  $ btrfs send /mnt/snap1 | btrfs receive /mnt2
  $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2
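For reference, the mkdir/mv sequence above can be replayed on an ordinary scratch directory (no btrfs involved) to see the layout that snap2, and therefore the received subvolume, should end up with. This is only a sketch: the function name replay_renames and the use of mktemp are illustrative and not part of the original report.

```shell
#!/bin/sh
# replay_renames is a hypothetical helper: it repeats the directory
# creation and rename steps from the reproducer under a scratch root
# and prints the resulting tree, sorted.
replay_renames() {
	root=$1
	# Same directories, created in the same order as in the report.
	for d in data data/n1/n2 data/n4 data/n1/n2/p1 data/t4 data/p1 data/p1/2; do
		mkdir -p "$root/$d"
	done
	# The tree at this point corresponds to snap1.
	mv "$root/data/n1/n2" "$root/data/t4"
	mv "$root/data/n4" "$root/data/t4/n2"
	mv "$root/data/t4/n2/p1" "$root/data/t4/p1"
	mv "$root/data/p1" "$root/data/t4/n2"
	# The tree at this point corresponds to snap2.
	(cd "$root" && find data | LC_ALL=C sort)
}

scratch=$(mktemp -d)
replay_renames "$scratch"
rm -rf "$scratch"
```

After the four renames, the directory that started as data/p1 (with its child 2) sits at data/t4/n2/p1, while the directory that started as data/n1/n2/p1 sits at data/t4/p1. Comparing this listing with the received snap2 shows where the two diverge.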

Call trace message

[  135.498533] [ cut here ]

[  135.498557] WARNING: CPU: 1 PID: 2346 at fs/btrfs/send.c:5934
btrfs_ioctl_send+0xc4c/0x11e0 [btrfs]()

[  135.498560] Modules linked in: nf_conntrack_ipv4(E)
nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E)
nf_reject_ipv4(E) xt_tcpudp(E) iptable_filter(E) ip_tables(E)
x_tables(E) bridge(E) stp(E) llc(E) snd_intel8x0(E) snd_ac97_codec(E)
ac97_bus(E) snd_pcm(E) snd_timer(E) snd(E) iosf_mbi(E) soundcore(E)
ppdev(E) joydev(E) lp(E) serio_raw(E) parport_pc(E) i2c_piix4(E)
mac_hid(E) parport(E) btrfs(E) xor(E) raid6_pq(E) hid_generic(E)
usbhid(E) hid(E) ahci(E) psmouse(E) libahci(E) e1000(E) pata_acpi(E)

[  135.498578] CPU: 1 PID: 2346 Comm: btrfs Tainted: G E
4.0.0-rc4-custom #3

[  135.498580] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006

[  135.498583]  c0509016 88007a233c08 817b62f3
0007

[  135.498586]   88007a233c48 8107452a
c9567000

[  135.498590]  8800799f1400 8800799f1418 88007b3f82d0
880079f6

[  135.498593] Call Trace:

[  135.498602]  [817b62f3] dump_stack+0x45/0x57

[  135.498609]  [8107452a] warn_slowpath_common+0x8a/0xc0

[  135.498614]  [8107461a] warn_slowpath_null+0x1a/0x20

[  135.498626]  [c04f5c6c] btrfs_ioctl_send+0xc4c/0x11e0 [btrfs]

[  135.498632]  [8118020e] ? __alloc_pages_nodemask+0x1ae/0xab0

[  135.498638]  [810a4775] ? sched_clock_local+0x25/0x90

[  135.498643]  [8109182e] ? alloc_pid+0x2e/0x530

[  135.498655]  [c04bc4f6] btrfs_ioctl+0x286/0x27e0 [btrfs]

[  135.498660]  [810a6368] ? __enqueue_entity+0x78/0x80

[  135.498665]  [810add70] ? enqueue_entity+0x400/0xc20

[  135.498679]  [8101dc3a] ? native_sched_clock+0x2a/0x90

[  135.498686]  [810ae708] ? enqueue_task_fair+0x178/0x730

[  135.498698]  [81047c1d] ? native_smp_send_reschedule+0x4d/0x70

[  135.498703]  [8109d5f0] ? resched_curr+0x70/0xc0

[  135.498710]  [8109e12a] ? check_preempt_curr+0x5a/0xa0

[  135.498715]  [810a152f] ? wake_up_new_task+0x12f/0x1b0

[  135.498722]  [81204010] do_vfs_ioctl+0x2e0/0x4e0

[  135.498728]  [8107368c] ? do_fork+0x13c/0x370

[  135.498733]  [81204291] SyS_ioctl+0x81/0xa0

[  135.498738]  [81073946] ? SyS_clone+0x16/0x20

[  135.498839]  [817bd80d] ? stub_clone+0x6d/0x90

[  135.498845]  [817bd50d] system_call_fastpath+0x16/0x1b

[  135.498848] ---[ end trace e1dd916182de3a9d ]---

[  135.498851] [ cut here ]

[  135.498871] WARNING: CPU: 1 PID: 2346 at fs/btrfs/send.c:5951
btrfs_ioctl_send+0x28f/0x11e0 [btrfs]()

[  135.498894] Modules linked in: nf_conntrack_ipv4(E)
nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E)
nf_reject_ipv4(E) xt_tcpudp(E) iptable_filter(E) ip_tables(E)
x_tables(E) bridge(E) stp(E) llc(E) snd_intel8x0(E) snd_ac97_codec(E)
ac97_bus(E) snd_pcm(E) snd_timer(E) snd(E) iosf_mbi(E) soundcore(E)
ppdev(E) joydev(E) lp(E) serio_raw(E) parport_pc(E) i2c_piix4(E)
mac_hid(E) parport(E) btrfs(E) xor(E) raid6_pq(E) hid_generic(E)
usbhid(E) hid(E) ahci(E) psmouse(E) libahci(E) e1000(E) pata_acpi(E)

[  135.498900] CPU: 1 PID: 2346 Comm: btrfs Tainted: G W   E
4.0.0-rc4-custom #3

[  135.498903] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006

[  135.498910]  c0509016 88007a233c08 817b62f3
4c724c72

[  135.498918]   88007a233c48 8107452a
88007a233c38

[  135.498923]  8800799f1400 880078d52cc0 880078d52cd8
8800799f15d8

[  135.498927] Call Trace:

[  135.498937]  [817b62f3] dump_stack+0x45/0x57

[  135.498945]  [8107452a] warn_slowpath_common+0x8a/0xc0

[  135.498951]  [8107461a] warn_slowpath_null+0x1a/0x20

[  135.498975]  [c04f52af] 

Re: [PATCH v2] Btrfs: incremental send, don't rename a directory too soon

2015-04-14 Thread Filipe David Manana
On Tue, Apr 14, 2015 at 8:33 AM, Robbie Ko robbi...@synology.com wrote:
 Hi,

 After applying the patch, I got WARN_ON.
 btrfs progs finished without any error message,
 but received subvolume is not the same as send subvolume.

 Here's the related information.
 thanks,
 robbieko

 uname -a
 Linux ubuntu 4.0.0-rc4-custom #2 SMP Tue Apr 14 11:43:00 CST 2015
 x86_64 x86_64 x86_64 GNU/Linux
 btrfs --version
 Btrfs v3.14.1

 Steps to reproduce:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt
  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt2

 $ mkdir -p /mnt/data
 $ mkdir -p /mnt/data/n1/n2
 $ mkdir -p /mnt/data/n4
 $ mkdir -p /mnt/data/n1/n2/p1
 $ mkdir -p /mnt/data/t4
 $ mkdir -p /mnt/data/p1
 $ mkdir -p /mnt/data/p1/2

   $ btrfs subvolume snapshot -r /mnt /mnt/snap1

 $ mv /mnt/data/n1/n2 /mnt/data/t4
 $ mv /mnt/data/n4 /mnt/data/t4/n2
 $ mv /mnt/data/t4/n2/p1 /mnt/data/t4/p1
 $ mv /mnt/data/p1 /mnt/data/t4/n2

   $ btrfs subvolume snapshot -r /mnt /mnt/snap2

   $ btrfs send /mnt/snap1 | btrfs receive /mnt2
   $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2

So this is a new case, different from the ones you've sent before, isn't it?

You should have all previous patches applied too, not just this one
you're replying to.
Also, it isn't clear, are you saying this happens only with this
particular patch applied but doesn't happen without it (and all other
recent ones)?

thanks



 Call trace message

 [  135.498533] [ cut here ]

 [  135.498557] WARNING: CPU: 1 PID: 2346 at fs/btrfs/send.c:5934
 btrfs_ioctl_send+0xc4c/0x11e0 [btrfs]()

 [  135.498560] Modules linked in: nf_conntrack_ipv4(E)
 nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E)
 nf_reject_ipv4(E) xt_tcpudp(E) iptable_filter(E) ip_tables(E)
 x_tables(E) bridge(E) stp(E) llc(E) snd_intel8x0(E) snd_ac97_codec(E)
 ac97_bus(E) snd_pcm(E) snd_timer(E) snd(E) iosf_mbi(E) soundcore(E)
 ppdev(E) joydev(E) lp(E) serio_raw(E) parport_pc(E) i2c_piix4(E)
 mac_hid(E) parport(E) btrfs(E) xor(E) raid6_pq(E) hid_generic(E)
 usbhid(E) hid(E) ahci(E) psmouse(E) libahci(E) e1000(E) pata_acpi(E)

 [  135.498578] CPU: 1 PID: 2346 Comm: btrfs Tainted: G E
 4.0.0-rc4-custom #3

 [  135.498580] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
 VirtualBox 12/01/2006

 [  135.498583]  c0509016 88007a233c08 817b62f3
 0007

 [  135.498586]   88007a233c48 8107452a
 c9567000

 [  135.498590]  8800799f1400 8800799f1418 88007b3f82d0
 880079f6

 [  135.498593] Call Trace:

 [  135.498602]  [817b62f3] dump_stack+0x45/0x57

 [  135.498609]  [8107452a] warn_slowpath_common+0x8a/0xc0

 [  135.498614]  [8107461a] warn_slowpath_null+0x1a/0x20

 [  135.498626]  [c04f5c6c] btrfs_ioctl_send+0xc4c/0x11e0 [btrfs]

 [  135.498632]  [8118020e] ? __alloc_pages_nodemask+0x1ae/0xab0

 [  135.498638]  [810a4775] ? sched_clock_local+0x25/0x90

 [  135.498643]  [8109182e] ? alloc_pid+0x2e/0x530

 [  135.498655]  [c04bc4f6] btrfs_ioctl+0x286/0x27e0 [btrfs]

 [  135.498660]  [810a6368] ? __enqueue_entity+0x78/0x80

 [  135.498665]  [810add70] ? enqueue_entity+0x400/0xc20

 [  135.498679]  [8101dc3a] ? native_sched_clock+0x2a/0x90

 [  135.498686]  [810ae708] ? enqueue_task_fair+0x178/0x730

 [  135.498698]  [81047c1d] ? native_smp_send_reschedule+0x4d/0x70

 [  135.498703]  [8109d5f0] ? resched_curr+0x70/0xc0

 [  135.498710]  [8109e12a] ? check_preempt_curr+0x5a/0xa0

 [  135.498715]  [810a152f] ? wake_up_new_task+0x12f/0x1b0

 [  135.498722]  [81204010] do_vfs_ioctl+0x2e0/0x4e0

 [  135.498728]  [8107368c] ? do_fork+0x13c/0x370

 [  135.498733]  [81204291] SyS_ioctl+0x81/0xa0

 [  135.498738]  [81073946] ? SyS_clone+0x16/0x20

 [  135.498839]  [817bd80d] ? stub_clone+0x6d/0x90

 [  135.498845]  [817bd50d] system_call_fastpath+0x16/0x1b

 [  135.498848] ---[ end trace e1dd916182de3a9d ]---

 [  135.498851] [ cut here ]

 [  135.498871] WARNING: CPU: 1 PID: 2346 at fs/btrfs/send.c:5951
 btrfs_ioctl_send+0x28f/0x11e0 [btrfs]()

 [  135.498894] Modules linked in: nf_conntrack_ipv4(E)
 nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E)
 nf_reject_ipv4(E) xt_tcpudp(E) iptable_filter(E) ip_tables(E)
 x_tables(E) bridge(E) stp(E) llc(E) snd_intel8x0(E) snd_ac97_codec(E)
 ac97_bus(E) snd_pcm(E) snd_timer(E) snd(E) iosf_mbi(E) soundcore(E)
 ppdev(E) joydev(E) lp(E) serio_raw(E) parport_pc(E) i2c_piix4(E)
 mac_hid(E) parport(E) btrfs(E) xor(E) raid6_pq(E) hid_generic(E)
 usbhid(E) hid(E) ahci(E) psmouse(E) libahci(E) e1000(E) pata_acpi(E)

 [  135.498900] CPU: 1 PID: 2346 Comm: btrfs Tainted: G W   E
 4.0.0-rc4-custom #3

 [  135.498903] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
 VirtualBox 12/01/2006

 [  135.498910]  

[PATCH] xfstests: make BTRFS_UTIL_PROG filesystem defragment work

2015-04-14 Thread Liu Bo
_require_defrag() needs to check if the command is executable, but btrfs has
its subcommand filesystem defragment, which makes this checking fail.

This works around it, and now we can run cases generic/324, generic/018, and btrfs/005.

Signed-off-by: Liu Bo bo.li@oracle.com
---
 common/defrag | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/common/defrag b/common/defrag
index f923dc0..f36a68b 100644
--- a/common/defrag
+++ b/common/defrag
@@ -37,7 +37,11 @@ _require_defrag()
;;
 esac
 
-_require_command $DEFRAG_PROG defragment
+if [ $FSTYP == btrfs ]; then
+   _require_command $BTRFS_UTIL_PROG defragment
+else
+   _require_command $DEFRAG_PROG defragment
+fi
 _require_xfs_io_command fiemap
 }
 
-- 
1.8.2.1
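
To illustrate the failure mode described in the commit message: a check that tests whether its argument is an executable file cannot succeed when that argument is a program path followed by subcommand words. The sketch below uses a hypothetical is_executable helper, not the real _require_command from xfstests, and assumes that DEFRAG_PROG for btrfs expands to something like "$BTRFS_UTIL_PROG filesystem defragment". /bin/sh stands in for the btrfs binary so the example runs anywhere.

```shell
#!/bin/sh
# Hypothetical stand-in for an "is this command executable" check.
is_executable() {
	[ -n "$1" ] && [ -x "$1" ]
}

# Any executable that exists on the system; stands in for $BTRFS_UTIL_PROG.
prog=/bin/sh
# Program path plus subcommand words, as assumed for DEFRAG_PROG on btrfs.
with_subcmd="$prog filesystem defragment"

is_executable "$prog" && echo "plain path: ok"
is_executable "$with_subcmd" || echo "path with subcommand: not executable"

# Testing only the first word (the binary itself) sidesteps the problem.
first_word=${with_subcmd%% *}
is_executable "$first_word" && echo "first word: ok"
```

The patch above takes the same route for btrfs by passing $BTRFS_UTIL_PROG, the bare binary, to the check instead of the full command string.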



Re: [PATCH] xfstests: make BTRFS_UTIL_PROG filesystem defragment work

2015-04-14 Thread Liu Bo
On Tue, Apr 14, 2015 at 10:14:59AM +0100, Filipe David Manana wrote:
 On Tue, Apr 14, 2015 at 10:01 AM, Liu Bo bo.li@oracle.com wrote:
  _require_defrag() needs to check if the command is executable, but btrfs has
  its subcommand filesystem defragment, which makes this checking fail.
 
  This works around it, and now we can run cases generic/324, generic/018,
  and btrfs/005.
 
 There's already a patch from Zhao to fix the regression:
 
 https://patchwork.kernel.org/patch/6205031/

Got it, thanks for pointing it out.

Thanks,

-liubo

 
 thanks
 
 
  Signed-off-by: Liu Bo bo.li@oracle.com
  ---
   common/defrag | 6 +-
   1 file changed, 5 insertions(+), 1 deletion(-)
 
  diff --git a/common/defrag b/common/defrag
  index f923dc0..f36a68b 100644
  --- a/common/defrag
  +++ b/common/defrag
  @@ -37,7 +37,11 @@ _require_defrag()
  ;;
   esac
 
  -_require_command $DEFRAG_PROG defragment
  +if [ $FSTYP == btrfs ]; then
  +   _require_command $BTRFS_UTIL_PROG defragment
  +else
  +   _require_command $DEFRAG_PROG defragment
  +fi
   _require_xfs_io_command fiemap
   }
 
  --
  1.8.2.1
 
  --
  To unsubscribe from this list: send the line unsubscribe fstests in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 
 
 -- 
 Filipe David Manana,
 
 Reasonable men adapt themselves to the world.
  Unreasonable men adapt the world to themselves.
  That's why all progress depends on unreasonable men.


Re: directory defrag

2015-04-14 Thread Sander
Russell Coker wrote (ao):
 The current defragmentation options seem to only support defragmenting
 named files/directories or a recursive defragmentation of files and
 directories.
 
 I'd like to recursively defragment directories.

find / -xdev -type d -execdir btrfs filesystem defrag -c {} +

Would that work for you?
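
The same idea as a small script, as a sketch only: it selects directories, stays on one filesystem, and hands them to btrfs filesystem defragment in batches. The function name defrag_dirs and the DEFRAG_CMD override are assumptions made for illustration. The sketch uses -exec instead of -execdir and leaves out the -c (compress) flag from the one-liner above. As far as the btrfs-filesystem manual page describes it, a directory argument without -r does not recurse into the files below it.

```shell
#!/bin/sh
# defrag_dirs is a hypothetical helper: run a command on every directory
# at or below "$1" without crossing filesystem boundaries.
# DEFRAG_CMD is an assumed override so the call can be dry-run with echo;
# it is left unquoted on purpose so it splits into command words.
defrag_dirs() {
	top=$1
	find "$top" -xdev -type d -exec ${DEFRAG_CMD:-btrfs filesystem defragment} {} +
}

# Dry run: print the directories that would be passed, without running btrfs.
# DEFRAG_CMD=echo defrag_dirs /mnt
```

Running it first with DEFRAG_CMD=echo is a cheap way to confirm that only directories are selected before touching a real filesystem.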

Sander


Re: btrfs balance {,meta}data to raid5 not working?

2015-04-14 Thread Piotr Szymaniak
On Sat, Apr 11, 2015 at 02:14:44AM +, Duncan wrote:
 But you could try the latest 4.0-rc7+ kernel and see if it works with 
 that, yet.

Will try that.


 2b) If instead your intention was to convert it to raid5 before upgrading 
 it to three devices, just add the third device first, then do the balance-
 conversion.  It'll save quite some time over effectively doing the 
 balance-conversion twice.

I want to grow it later to 3 or 4 devices. For now it is still a test setup,
and I want to try converting to raid5 and also check roughly how long it
takes (and whether it finishes OK).


Piotr Szymaniak.
-- 
 - Zamówiłeś baby-sitterkę? - chciała wiedzieć Maggie.
  -- Graham Masterton, Zaklęci (przełożył Juliusz Garztecki)


signature.asc
Description: Digital signature


Re: [PATCH] btrfs-progs: enforce chroot for btrfs receive

2015-04-14 Thread David Sterba
On Tue, Apr 14, 2015 at 01:44:32PM +0300, Lauri Võsandi wrote:
 This patch forces btrfs receive to issue chroot before
 parsing the btrfs stream to confine the process and
 minimize damage that could be done via malicious
 btrfs stream.

Thanks.

As we've discussed, there are possibly some things to resolve:

* chdir(/) after chroot
* commandline options to enable/disable chroot, choose the default

Receive should work for a non-root user so chroot should be conditional,
but I'm not sure if this should be guessed from the UID or if this would
be better to specify only by the commandline options.

I'll put the patch into a separate branch for now.
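
The open question about making chroot conditional can be sketched as a small decision function. This is a sketch under stated assumptions, not a proposal for the actual interface: should_chroot and the mode values on/off/auto are hypothetical, and "auto" guesses from the UID, which is only one of the two alternatives mentioned above.

```shell
#!/bin/sh
# should_chroot MODE UID   (hypothetical helper)
# MODE "on"  : always attempt chroot (explicit command-line request)
# MODE "off" : never attempt chroot (explicit command-line request)
# any other  : guess from the UID, i.e. only root attempts chroot
should_chroot() {
	case "$1" in
	on) return 0 ;;
	off) return 1 ;;
	*) [ "$2" -eq 0 ] ;;
	esac
}

if should_chroot auto "$(id -u)"; then
	echo "would chroot"
else
	echo "would skip chroot"
fi
```

An explicit option always wins in this sketch, so a non-root user never hits a failing chroot unless it was requested on purpose.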


Re: [PATCH v2] Btrfs: incremental send, don't rename a directory too soon

2015-04-14 Thread Robbie Ko
Hi,

Sorry for not making it clear.

2015-04-14 16:16 GMT+08:00 Filipe David Manana fdman...@gmail.com:
 On Tue, Apr 14, 2015 at 8:33 AM, Robbie Ko robbi...@synology.com wrote:
 Hi,

 After applying the patch, I got WARN_ON.
 btrfs progs finished without any error message,
 but received subvolume is not the same as send subvolume.

 Here's the related information.
 thanks,
 robbieko

 uname -a
 Linux ubuntu 4.0.0-rc4-custom #2 SMP Tue Apr 14 11:43:00 CST 2015
 x86_64 x86_64 x86_64 GNU/Linux
 btrfs --version
 Btrfs v3.14.1

 Steps to reproduce:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt
  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt2

 $ mkdir -p /mnt/data
 $ mkdir -p /mnt/data/n1/n2
 $ mkdir -p /mnt/data/n4
 $ mkdir -p /mnt/data/n1/n2/p1
 $ mkdir -p /mnt/data/t4
 $ mkdir -p /mnt/data/p1
 $ mkdir -p /mnt/data/p1/2

   $ btrfs subvolume snapshot -r /mnt /mnt/snap1

 $ mv /mnt/data/n1/n2 /mnt/data/t4
 $ mv /mnt/data/n4 /mnt/data/t4/n2
 $ mv /mnt/data/t4/n2/p1 /mnt/data/t4/p1
 $ mv /mnt/data/p1 /mnt/data/t4/n2

   $ btrfs subvolume snapshot -r /mnt /mnt/snap2

   $ btrfs send /mnt/snap1 | btrfs receive /mnt2
   $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2

 So this is a new case, different from the ones you've sent before, isn't it?

 You should have all previous patches applied too, not just this one
 you're replying to.

Hi,

I have applied all the patches fixed recently.
Then WARN_ON happened with steps mentioned above.
I tested it without these patches, no WARN_ON but the following error
appeared instead.
ERROR: rename data/t4/n2/p1 -> data/t4/n2/p1/p1 failed. Invalid argument

I started to revert these patches and found that this patch causes the
WARN_ON problem.

I'm not sure whether it's a new case.

thanks,
robbieko

 Also, it isn't clear, are you saying this happens only with this
 particular patch applied but doesn't happen without it (and all other
 recent ones)?

 thanks



 Call trace message

 [  135.498533] [ cut here ]

 [  135.498557] WARNING: CPU: 1 PID: 2346 at fs/btrfs/send.c:5934
 btrfs_ioctl_send+0xc4c/0x11e0 [btrfs]()

 [  135.498560] Modules linked in: nf_conntrack_ipv4(E)
 nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E)
 nf_reject_ipv4(E) xt_tcpudp(E) iptable_filter(E) ip_tables(E)
 x_tables(E) bridge(E) stp(E) llc(E) snd_intel8x0(E) snd_ac97_codec(E)
 ac97_bus(E) snd_pcm(E) snd_timer(E) snd(E) iosf_mbi(E) soundcore(E)
 ppdev(E) joydev(E) lp(E) serio_raw(E) parport_pc(E) i2c_piix4(E)
 mac_hid(E) parport(E) btrfs(E) xor(E) raid6_pq(E) hid_generic(E)
 usbhid(E) hid(E) ahci(E) psmouse(E) libahci(E) e1000(E) pata_acpi(E)

 [  135.498578] CPU: 1 PID: 2346 Comm: btrfs Tainted: G E
 4.0.0-rc4-custom #3

 [  135.498580] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
 VirtualBox 12/01/2006

 [  135.498583]  c0509016 88007a233c08 817b62f3
 0007

 [  135.498586]   88007a233c48 8107452a
 c9567000

 [  135.498590]  8800799f1400 8800799f1418 88007b3f82d0
 880079f6

 [  135.498593] Call Trace:

 [  135.498602]  [817b62f3] dump_stack+0x45/0x57

 [  135.498609]  [8107452a] warn_slowpath_common+0x8a/0xc0

 [  135.498614]  [8107461a] warn_slowpath_null+0x1a/0x20

 [  135.498626]  [c04f5c6c] btrfs_ioctl_send+0xc4c/0x11e0 [btrfs]

 [  135.498632]  [8118020e] ? __alloc_pages_nodemask+0x1ae/0xab0

 [  135.498638]  [810a4775] ? sched_clock_local+0x25/0x90

 [  135.498643]  [8109182e] ? alloc_pid+0x2e/0x530

 [  135.498655]  [c04bc4f6] btrfs_ioctl+0x286/0x27e0 [btrfs]

 [  135.498660]  [810a6368] ? __enqueue_entity+0x78/0x80

 [  135.498665]  [810add70] ? enqueue_entity+0x400/0xc20

 [  135.498679]  [8101dc3a] ? native_sched_clock+0x2a/0x90

 [  135.498686]  [810ae708] ? enqueue_task_fair+0x178/0x730

 [  135.498698]  [81047c1d] ? native_smp_send_reschedule+0x4d/0x70

 [  135.498703]  [8109d5f0] ? resched_curr+0x70/0xc0

 [  135.498710]  [8109e12a] ? check_preempt_curr+0x5a/0xa0

 [  135.498715]  [810a152f] ? wake_up_new_task+0x12f/0x1b0

 [  135.498722]  [81204010] do_vfs_ioctl+0x2e0/0x4e0

 [  135.498728]  [8107368c] ? do_fork+0x13c/0x370

 [  135.498733]  [81204291] SyS_ioctl+0x81/0xa0

 [  135.498738]  [81073946] ? SyS_clone+0x16/0x20

 [  135.498839]  [817bd80d] ? stub_clone+0x6d/0x90

 [  135.498845]  [817bd50d] system_call_fastpath+0x16/0x1b

 [  135.498848] ---[ end trace e1dd916182de3a9d ]---

 [  135.498851] [ cut here ]

 [  135.498871] WARNING: CPU: 1 PID: 2346 at fs/btrfs/send.c:5951
 btrfs_ioctl_send+0x28f/0x11e0 [btrfs]()

 [  135.498894] Modules linked in: nf_conntrack_ipv4(E)
 nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E)
 nf_reject_ipv4(E) xt_tcpudp(E) iptable_filter(E) ip_tables(E)
 x_tables(E) bridge(E) 

[PATCH] btrfs-progs: enforce chroot for btrfs receive

2015-04-14 Thread Lauri Võsandi
This patch forces btrfs receive to issue chroot before
parsing the btrfs stream to confine the process and
minimize damage that could be done via malicious
btrfs stream.

Signed-off-by: Lauri Võsandi lauri.vosa...@gmail.com
---
 cmds-receive.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index 44ef27e..8be92ea 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -867,15 +867,17 @@ static int do_receive(struct btrfs_receive *r, const char *tomnt, int r_fd,
 		goto out;
 	}
 
-	/*
-	 * find_mount_root returns a root_path that is a subpath of
-	 * dest_dir_full_path. Now get the other part of root_path,
-	 * which is the destination dir relative to root_path.
-	 */
-	r->dest_dir_path = dest_dir_full_path + strlen(r->root_path);
-	while (r->dest_dir_path[0] == '/')
-		r->dest_dir_path++;
+	if (chroot(dest_dir_full_path)) {
+		ret = -errno;
+		fprintf(stderr,
+			"ERROR: failed to chroot to %s, %s\n",
+			dest_dir_full_path,
+			strerror(-ret));
+		goto out;
+	}
 
+	r->root_path = r->dest_dir_path = strdup("/");
+	
 	ret = subvol_uuid_search_init(r->mnt_fd, &r->sus);
 	if (ret < 0)
 		goto out;
-- 
1.9.1



Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper

2015-04-14 Thread Christoph Hellwig
On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
 Yuck! How the heck do you clean up the mess if that happens? I guess
 you're just stuck redoing the copy with normal READ/WRITE?
 
 Maybe we need to have the interface return a hard error in that
 case and not try to give back any sort of offset?

The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
expect us to simply ignore it and only implement my new CLONE operation
with sane semantics.  That is unless someone can show some real life
use case for the inter server copy, in which case we'll have to deal
with that mess.  But getting that one right at the VFS level will
be a nightmare anyway.

Make this a vote from me to not support partial copies and just return
an error in that case.


Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper

2015-04-14 Thread Anna Schumaker
On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
 On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
 Yuck! How the heck do you clean up the mess if that happens? I guess
 you're just stuck redoing the copy with normal READ/WRITE?

 Maybe we need to have the interface return a hard error in that
 case and not try to give back any sort of offset?
 
 The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
 expect us to simply ignore it and only implement my new CLONE operation
 with sane semantics.  That is unless someone can show some real life
 use case for the inter server copy, in which case we'll have to deal
 with that mess.  But getting that one right at the VFS level will
 be a nightmare anyway.
 
 Make this a vote from me to not support partial copies and just return
 an error in that case.

Agreed.  Looking at the v4.2 spec, COPY does take the ca_consecutive and
ca_synchronous flags that let the client state if the copy should be done
consecutively or synchronously.  I expected to always set consecutive to true 
for the Linux client.

Anna

 --
 To unsubscribe from this list: send the line unsubscribe linux-nfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 



Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper

2015-04-14 Thread Christoph Hellwig
On Tue, Apr 14, 2015 at 09:53:44AM -0700, Christoph Hellwig wrote:
 On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
  Yuck! How the heck do you clean up the mess if that happens? I guess
  you're just stuck redoing the copy with normal READ/WRITE?
  
  Maybe we need to have the interface return a hard error in that
  case and not try to give back any sort of offset?
 
 The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
 expect us to simply ignore it and only implement my new CLONE operation
 with sane semantics.  That is unless someone can show some real life
 use case for the inter server copy, in which case we'll have to deal
 with that mess.  But getting that one right at the VFS level will
 be a nightmare anyway.

Btw, in case someone cares about the NFS CLONE implementation here is
my prototype based on Anna's older COPY prototype.  It's simple enough that
it might be worth adding to the copy_file_range patch set.

http://git.infradead.org/users/hch/pnfs.git/shortlog/refs/heads/clone


Re: [PATCH RFC 3/3] btrfs: add .copy_file_range file operation

2015-04-14 Thread Chris Mason
On 04/10/2015 06:00 PM, Zach Brown wrote:
 This rearranges the existing COPY_RANGE ioctl implementation so that the
 .copy_file_range file operation can call the core loop that copies file
 data extent items.
 
 The extent copying loop is lifted up into its own function.  It retains
 the core btrfs error checks that should be shared.
 

Thanks Zach, the btrfs bits look reasonable

Signed-off-by: Chris Mason c...@fb.com

-chris



Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper

2015-04-14 Thread J. Bruce Fields
On Tue, Apr 14, 2015 at 11:22:41AM -0700, Zach Brown wrote:
 On Tue, Apr 14, 2015 at 02:19:11PM -0400, J. Bruce Fields wrote:
  On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
   On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
Yuck! How the heck do you clean up the mess if that happens? I
guess you're just stuck redoing the copy with normal READ/WRITE?
   
Maybe we need to have the interface return a hard error in that
case and not try to give back any sort of offset?

The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
expect us to simply ignore it and only implement my new CLONE
operation with sane semantics.  That is unless someone can show some
real life use case for the inter server copy, in which case we'll
have to deal with that mess.  But getting that one right at the VFS
level will be a nightmare anyway.

Make this a vote from me to not support partial copies and just
return an error in that case.
   
   Agreed.  Looking at the v4.2 spec, COPY does take the ca_consecutive and
   ca_synchronous flags that let the client state if the copy should be
   done consecutively or synchronously.  I expected to always set
   consecutive to true for the Linux client.
  
  That's supposed to mean results are well-defined in the partial-copy
  case, but I think Christoph's suggesting eliminating the partial-copy
  case entirely?
  
  Which would be fine with me.
  
  It might actually have been me advocating for partial copies.  But that
  was only because a partial-copy-handling-loop seemed simpler to me than
  progress callbacks if we were going to support long-running copies.
  
  I'm happy enough not to have it at all.
 
 Ah, OK, that's great news.
 
 I thought at one point we were worried about very long running RPCs on
 the server.  Are we not worried about that now?
 
 Is the client expected to cut the work up into arbitrarily manageable
 chunks?  Is the server expected to fail COPY/CLONE requests that it
 thinks would take way too long?  Something else?

Christoph is proposing a CLONE rpc that's required to be atomic:


https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-35#section-15.13
The CLONE operation is atomic, that is either all changes or no
changes are seen by the client or other clients.

So that couldn't be really long-running (or the server is nuts).

So that'd mean Anna would rip out the server-side copy loop and we'd
initially just support btrfs or whatever.

I mean the server-side copy loop may also be useful but I'm all for
wiring up the obvious case first.

--b.


Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper

2015-04-14 Thread J. Bruce Fields
On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
 On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
  On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
  Yuck! How the heck do you clean up the mess if that happens? I
  guess you're just stuck redoing the copy with normal READ/WRITE?
 
  Maybe we need to have the interface return a hard error in that
  case and not try to give back any sort of offset?
  
  The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
  expect us to simply ignore it and only implement my new CLONE
  operation with sane semantics.  That is unless someone can show some
  real life use case for the inter server copy, in which case we'll
  have to deal with that mess.  But getting that one right at the VFS
  level will be a nightmare anyway.
  
  Make this a vote from me to not support partial copies and just
  return an error in that case.
 
 Agreed.  Looking at the v4.2 spec, COPY does take ca_consecutive and
 ca_synchronous flags that let the client state if the copy should be
 done consecutively or synchronously.  I expected to always set
 consecutive to true for the Linux client.

That's supposed to mean results are well-defined in the partial-copy
case, but I think Christoph's suggesting eliminating the partial-copy
case entirely?

Which would be fine with me.

It might actually have been me advocating for partial copies.  But that
was only because a partial-copy-handling-loop seemed simpler to me than
progress callbacks if we were going to support long-running copies.

I'm happy enough not to have it at all.

--b.


Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper

2015-04-14 Thread Zach Brown
On Tue, Apr 14, 2015 at 02:19:11PM -0400, J. Bruce Fields wrote:
 On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
  On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
   On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
   Yuck! How the heck do you clean up the mess if that happens? I
   guess you're just stuck redoing the copy with normal READ/WRITE?
  
   Maybe we need to have the interface return a hard error in that
   case and not try to give back any sort of offset?
   
   The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
   expect us to simply ignore it and only implement my new CLONE
   operation with sane semantics.  That is unless someone can show some
   real life use case for the inter server copy, in which case we'll
   have to deal with that mess.  But getting that one right at the VFS
   level will be a nightmare anyway.
   
   Make this a vote from me to not support partial copies and just
   return an error in that case.
  
  Agreed.  Looking at the v4.2 spec, COPY does take ca_consecutive and
  ca_synchronous flags that let the client state if the copy should be
  done consecutively or synchronously.  I expected to always set
  consecutive to true for the Linux client.
 
 That's supposed to mean results are well-defined in the partial-copy
 case, but I think Christoph's suggesting eliminating the partial-copy
 case entirely?
 
 Which would be fine with me.
 
 It might actually have been me advocating for partial copies.  But that
 was only because a partial-copy-handling-loop seemed simpler to me than
 progress callbacks if we were going to support long-running copies.
 
 I'm happy enough not to have it at all.

Ah, OK, that's great news.

I thought at one point we were worried about very long running RPCs on
the server.  Are we not worried about that now?

Is the client expected to cut the work up into arbitrarily manageable
chunks?  Is the server expected to fail COPY/CLONE requests that it
thinks would take way too long?  Something else?

- z


Re: [PATCH] btrfs-progs: enforce chroot for btrfs receive

2015-04-14 Thread Austin S Hemmelgarn

On 2015-04-14 08:28, David Sterba wrote:

On Tue, Apr 14, 2015 at 01:44:32PM +0300, Lauri Võsandi wrote:

This patch forces btrfs receive to issue chroot before
parsing the btrfs stream to confine the process and
minimize damage that could be done via malicious
btrfs stream.


Thanks.

As we've discussed, there are possibly some things to resolve:

* chdir("/") after chroot
* commandline options to enable/disable chroot, choose the default

Receive should work for a non-root user so chroot should be conditional,
but I'm not sure if this should be guessed from the UID or if this would
be better to specify only by the commandline options.

I'll put the patch into a separate branch for now.


Personally, I would expect it to default to not using chroot(), provide 
a commandline option to tell it to do so, and then just catch the error 
from trying to chroot as a non-root user.
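
A minimal sketch of that behaviour, assuming a hypothetical helper confine_to_dir() whose use_chroot argument would be set from a command-line option. None of these names come from btrfs-progs; only chroot(2), chdir(2) and strerror(3) are real calls.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from btrfs-progs.
 * use_chroot defaults to 0 and would be switched on by an option.
 * Returns 0 on success or a negative errno value on failure.
 */
static int confine_to_dir(const char *dir, int use_chroot)
{
	int err;

	assert(dir != NULL);

	if (!use_chroot)
		return 0;	/* default: do not confine the process */

	if (chroot(dir) < 0) {
		err = errno;	/* EPERM when the caller is not privileged */
		fprintf(stderr, "ERROR: chroot to '%s' failed: %s\n",
			dir, strerror(err));
		return -err;
	}

	/* chroot() leaves the working directory alone, so move into the jail. */
	if (chdir("/") < 0) {
		err = errno;
		fprintf(stderr, "ERROR: chdir to / failed: %s\n", strerror(err));
		return -err;
	}
	return 0;
}
```

With this shape a non-root user who asks for chroot gets an explicit error instead of a silent fallback, and nothing has to be guessed from the UID.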







Re: [PATCH v2 4/6] Btrfs: fail on mismatched subvol and subvolid mount options

2015-04-14 Thread David Sterba
On Fri, Apr 10, 2015 at 08:39:53AM +0800, Qu Wenruo wrote:
  There's nothing to stop a user from passing both subvol= and subvolid=
  to mount, but if they don't refer to the same subvolume, someone is
  going to be surprised at some point. Error out on this case, but allow
  users to pass in both if they do match (which they could, for example,
  get out of /proc/mounts).
 Not sure whether we should do this extra check, as later mount options
 override previous ones.
 
 I previously tried to do such a thing for mount options like inode/noinode,
 but it was rejected for that reason.

Do you have a link to the discussion?

 So I'm not sure whether such error-out behavior is OK.
 Maybe only taking the latest subvol/subvolid is a better choice?

If not sure, follow the principle of least surprise. If both subvolid
and subvol are passed and match then it's IMHO ok, no matter if the
options match by accident or intentionally. Eg. copypaste from
/proc/mounts should work.

If the options do not match we can't decide which one is the right one.
The surprise would come if the user wants one (eg. subvolid) but the
other one would be applied in the end (subvol).
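
The rule above can be sketched as follows. This is only an illustration: resolve_subvol_opts(), the lookup callback and demo_lookup() are hypothetical names, and treating an ID of 0 as "subvolid= not given" is an assumption of the sketch, not a statement about the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for resolving a subvolume name to its ID. */
typedef int (*subvol_lookup_fn)(const char *name, uint64_t *id_out);

/* Demo lookup used for illustration only: knows a single subvolume. */
static int demo_lookup(const char *name, uint64_t *id_out)
{
	if (strcmp(name, "home") == 0) {
		*id_out = 257;
		return 0;
	}
	return -ENOENT;
}

/*
 * name == NULL means subvol= was not passed, id == 0 means subvolid=
 * was not passed. Stores the chosen ID in *out and returns 0, or
 * returns a negative errno value.
 */
static int resolve_subvol_opts(const char *name, uint64_t id,
			       subvol_lookup_fn lookup, uint64_t *out)
{
	uint64_t name_id;
	int ret;

	assert(lookup != NULL && out != NULL);

	if (!name) {
		*out = id;
		return 0;
	}

	ret = lookup(name, &name_id);
	if (ret < 0)
		return ret;

	/* Both passed but they disagree: refuse instead of guessing. */
	if (id != 0 && id != name_id)
		return -EINVAL;

	*out = name_id;
	return 0;
}
```

Matching options (for example copied from /proc/mounts) pass through unchanged, while a mismatch is reported to the user rather than resolved by option order.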


Re: [GSoC 2015] Btrfs content based storage

2015-04-14 Thread harshad shirwadkar
Thanks David and that's true: there will be a large overlap with dedup
work from Liu Bo. In fact, a few days after I sent the first mail, I
got in touch with Liu and he agreed upon mentoring. So, I am all set
now and will start working on the project at the end of May.

Best,
Harshad.

On Mon, Apr 13, 2015 at 10:47 AM, David Sterba dste...@suse.cz wrote:
 On Fri, Mar 27, 2015 at 10:58:42AM -0400, harshad shirwadkar wrote:
 I am a CS graduate student from Carnegie Mellon University. I am
 hoping to build the feature - Content based storage mode under
 Google Summer of Code 2015. This project has also been listed as an
 idea on BTRFS ideas page. However, I have not found a mentor yet, and
 without a mentor I can not participate in the program. Please let me
 know if anybody is interested in mentoring this project. Here is a
 link to my proposal:

 http://harshadjs.github.io/2015/03/27/Fedora-BTRFS-Content-Storage-Mode/

 This probably has a significant overlap with the in-band dedup work from
 Liu Bo [1]. Your proposal expects an interface to look up the data by
 hash which hasn't been implemented afaik.

 [1] http://thread.gmane.org/gmane.comp.file-systems.btrfs/34097 (v10)


Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper

2015-04-14 Thread Zach Brown
On Tue, Apr 14, 2015 at 02:29:06PM -0400, J. Bruce Fields wrote:
 On Tue, Apr 14, 2015 at 11:22:41AM -0700, Zach Brown wrote:
  On Tue, Apr 14, 2015 at 02:19:11PM -0400, J. Bruce Fields wrote:
   On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
 On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
 Yuck! How the heck do you clean up the mess if that happens? I
 guess you're just stuck redoing the copy with normal READ/WRITE?

 Maybe we need to have the interface return a hard error in that
 case and not try to give back any sort of offset?
 
 The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
 expect us to simply ignore it and only implement my new CLONE
 operation with sane semantics.  That is unless someone can show some
 real life use case for the inter server copy, in which case we'll
 have to deal with that mess.  But getting that one right at the VFS
 level will be a nightmare anyway.
 
 Make this a vote from me to not support partial copies and just
 return an error in that case.

Agreed.  Looking at the v4.2 spec, COPY does take ca_consecutive and
ca_synchronous flags that let the client state if the copy should be
done consecutively or synchronously.  I expected to always set
consecutive to true for the Linux client.
   
   That's supposed to mean results are well-defined in the partial-copy
   case, but I think Christoph's suggesting eliminating the partial-copy
   case entirely?
   
   Which would be fine with me.
   
   It might actually have been me advocating for partial copies.  But that
   was only because a partial-copy-handling-loop seemed simpler to me than
   progress callbacks if we were going to support long-running copies.
   
   I'm happy enough not to have it at all.
  
  Ah, OK, that's great news.
  
  I thought at one point we were worried about very long running RPCs on
  the server.  Are we not worried about that now?
  
  Is the client expected to cut the work up into arbitrarily manageable
  chunks?  Is the server expected to fail COPY/CLONE requests that it
  thinks would take way too long?  Something else?
 
 Christoph is proposing a CLONE rpc that's required to be atomic:
 
   
 https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-35#section-15.13
   The CLONE operation is atomic, that is either all changes or no
   changes are seen by the client or other clients.
 
 So that couldn't be really long-running (or the server is nuts).
 
 So that'd mean Anna would rip out the server-side copy loop and we'd
 initially just support btrfs or whatever.

Is this relying on btrfs range cloning being atomic?  It certainly
doesn't look atomic.  It can modify items across an arbitrarily large
number of leaf blocks.  It can make the changes across multiple
transactions which could introduce partial modification on reboot after
crashes.  It can fail (the dynamic duo: enomem, eio) and leave the
destination partially modified.

 I mean the server-side copy loop may also be useful but I'm all for
 wiring up the obvious case first.

Sure, I'm all for wiring up the simple version that doesn't return
partial progress.  If that'll work for you guys.
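
To make "doesn't return partial progress" concrete, here is a userspace sketch using plain pread/pwrite. It is not the proposed syscall or any kernel code; copy_all_or_error() is a hypothetical name, and reporting a short source as -EINVAL is an arbitrary choice for the sketch. Note that on failure the destination may still be partially modified; the caller just isn't told how far it got.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy len bytes between two seekable descriptors.
 * Returns 0 if every byte was copied, otherwise a negative errno value.
 * No byte count or offset is ever returned to the caller.
 */
static int copy_all_or_error(int fd_in, off_t off_in,
			     int fd_out, off_t off_out, size_t len)
{
	char buf[65536];

	assert(fd_in >= 0 && fd_out >= 0);

	while (len > 0) {
		size_t want = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t nr = pread(fd_in, buf, want, off_in);

		if (nr < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		if (nr == 0)
			return -EINVAL;	/* source ended before len bytes */

		/* pwrite() may be short too, so loop until the chunk is out. */
		for (ssize_t done = 0; done < nr; ) {
			ssize_t nw = pwrite(fd_out, buf + done,
					    (size_t)(nr - done), off_out + done);

			if (nw < 0) {
				if (errno == EINTR)
					continue;
				return -errno;
			}
			if (nw == 0)
				return -EIO;	/* no progress possible */
			done += nw;
		}

		off_in += nr;
		off_out += nr;
		len -= (size_t)nr;
	}
	return 0;
}
```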

- z


Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper

2015-04-14 Thread Christoph Hellwig
On Tue, Apr 14, 2015 at 11:54:08AM -0700, Zach Brown wrote:
 Is this relying on btrfs range cloning being atomic?  It certainly
 doesn't look atomic.  It can modify items across an arbitrarily large
 number of leaf blocks.  It can make the changes across multiple
 transactions which could introduce partial modification on reboot after
 crashes.  It can fail (the dynamic duo: enomem, eio) and leave the
 destination partially modified.

I didn't mean atomic in the failure atomic sense, but in the sense of
being atomic vs other writes, similar to how Posix specifies it for
writes vs other writes.  Guess I need to express this intent better.


Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper

2015-04-14 Thread Zach Brown
On Tue, Apr 14, 2015 at 12:23:25PM -0700, Christoph Hellwig wrote:
 On Tue, Apr 14, 2015 at 11:54:08AM -0700, Zach Brown wrote:
  Is this relying on btrfs range cloning being atomic?  It certainly
  doesn't look atomic.  It can modify items across an arbitrarily large
  number of leaf blocks.  It can make the changes across multiple
  transactions which could introduce partial modification on reboot after
  crashes.  It can fail (the dynamic duo: enomem, eio) and leave the
  destination partially modified.
 
 I didn't mean atomic in the failure atomic sense, but in the sense of
 being atomic vs other writes, similar to how Posix specifies it for
 writes vs other writes.  Guess I need to express this intent better.

Ah, right, OK.

- z


Re: [PATCH v2] Btrfs: incremental send, don't rename a directory too soon

2015-04-14 Thread Filipe David Manana
On Tue, Apr 14, 2015 at 12:09 PM, Robbie Ko robbi...@synology.com wrote:
 Hi,

 Sorry for not making it clear.

 2015-04-14 16:16 GMT+08:00 Filipe David Manana fdman...@gmail.com:
 On Tue, Apr 14, 2015 at 8:33 AM, Robbie Ko robbi...@synology.com wrote:
 Hi,

 After applying the patch, I got WARN_ON.
 btrfs progs finished without any error message,
 but received subvolume is not the same as send subvolume.

 Here's the related information.
 thanks,
 robbieko

 uname -a
 Linux ubuntu 4.0.0-rc4-custom #2 SMP Tue Apr 14 11:43:00 CST 2015
 x86_64 x86_64 x86_64 GNU/Linux
 btrfs --version
 Btrfs v3.14.1

 Steps to reproduce:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt
  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt2

 $ mkdir -p /mnt/data
 $ mkdir -p /mnt/data/n1/n2
 $ mkdir -p /mnt/data/n4
 $ mkdir -p /mnt/data/n1/n2/p1
 $ mkdir -p /mnt/data/t4
 $ mkdir -p /mnt/data/p1
 $ mkdir -p /mnt/data/p1/2

   $ btrfs subvolume snapshot -r /mnt /mnt/snap1

 $ mv /mnt/data/n1/n2 /mnt/data/t4
 $ mv /mnt/data/n4 /mnt/data/t4/n2
 $ mv /mnt/data/t4/n2/p1 /mnt/data/t4/p1
 $ mv /mnt/data/p1 /mnt/data/t4/n2

   $ btrfs subvolume snapshot -r /mnt /mnt/snap2

   $ btrfs send /mnt/snap1 | btrfs receive /mnt2
   $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2

 So this is a new case, different from the ones you've sent before, isn't it?

 You should have all previous patches applied too, not just this one
 you're replying to.

 Hi,

 I have applied all the patches fixed recently.
 Then WARN_ON happened with steps mentioned above.
 I tested it without these patches, no WARN_ON but the following error
 appeared instead.
 ERROR: rename data/t4/n2/p1 -> data/t4/n2/p1/p1 failed. Invalid argument

 I started to revert these patches and found that this patch causes the
 WARN_ON problem.

 I'm not sure whether it's a new case.

So it's a case that worked neither before nor after all the recent
fixes, but for different reasons.
I have 2 cases here, one triggered by your fuzz tester script and
another one I've known of for quite some time (involving creation of new
directories and removing old ones in the second snapshot), but I haven't
had the time to find a solution without breaking other cases that are
currently working (and have xfstests). I haven't checked, however, whether
your reproducer fails for the same reasons as those 2 cases I know of.

thanks


 thanks,
 robbieko

 Also, it isn't clear, are you saying this happens only with this
 particular patch applied but doesn't happen without it (and all other
 recent ones)?

 thanks



 Call trace message

 [  135.498533] ------------[ cut here ]------------

 [  135.498557] WARNING: CPU: 1 PID: 2346 at fs/btrfs/send.c:5934
 btrfs_ioctl_send+0xc4c/0x11e0 [btrfs]()

 [  135.498560] Modules linked in: nf_conntrack_ipv4(E)
 nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E)
 nf_reject_ipv4(E) xt_tcpudp(E) iptable_filter(E) ip_tables(E)
 x_tables(E) bridge(E) stp(E) llc(E) snd_intel8x0(E) snd_ac97_codec(E)
 ac97_bus(E) snd_pcm(E) snd_timer(E) snd(E) iosf_mbi(E) soundcore(E)
 ppdev(E) joydev(E) lp(E) serio_raw(E) parport_pc(E) i2c_piix4(E)
 mac_hid(E) parport(E) btrfs(E) xor(E) raid6_pq(E) hid_generic(E)
 usbhid(E) hid(E) ahci(E) psmouse(E) libahci(E) e1000(E) pata_acpi(E)

 [  135.498578] CPU: 1 PID: 2346 Comm: btrfs Tainted: G E
 4.0.0-rc4-custom #3

 [  135.498580] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
 VirtualBox 12/01/2006

 [  135.498583]  c0509016 88007a233c08 817b62f3
 0007

 [  135.498586]   88007a233c48 8107452a
 c9567000

 [  135.498590]  8800799f1400 8800799f1418 88007b3f82d0
 880079f6

 [  135.498593] Call Trace:

 [  135.498602]  [817b62f3] dump_stack+0x45/0x57

 [  135.498609]  [8107452a] warn_slowpath_common+0x8a/0xc0

 [  135.498614]  [8107461a] warn_slowpath_null+0x1a/0x20

 [  135.498626]  [c04f5c6c] btrfs_ioctl_send+0xc4c/0x11e0 [btrfs]

 [  135.498632]  [8118020e] ? __alloc_pages_nodemask+0x1ae/0xab0

 [  135.498638]  [810a4775] ? sched_clock_local+0x25/0x90

 [  135.498643]  [8109182e] ? alloc_pid+0x2e/0x530

 [  135.498655]  [c04bc4f6] btrfs_ioctl+0x286/0x27e0 [btrfs]

 [  135.498660]  [810a6368] ? __enqueue_entity+0x78/0x80

 [  135.498665]  [810add70] ? enqueue_entity+0x400/0xc20

 [  135.498679]  [8101dc3a] ? native_sched_clock+0x2a/0x90

 [  135.498686]  [810ae708] ? enqueue_task_fair+0x178/0x730

 [  135.498698]  [81047c1d] ? native_smp_send_reschedule+0x4d/0x70

 [  135.498703]  [8109d5f0] ? resched_curr+0x70/0xc0

 [  135.498710]  [8109e12a] ? check_preempt_curr+0x5a/0xa0

 [  135.498715]  [810a152f] ? wake_up_new_task+0x12f/0x1b0

 [  135.498722]  [81204010] do_vfs_ioctl+0x2e0/0x4e0

 [  135.498728]  [8107368c] ? do_fork+0x13c/0x370

 [  135.498733]  [81204291] SyS_ioctl+0x81/0xa0


Re: directory defrag

2015-04-14 Thread David Sterba
On Tue, Apr 14, 2015 at 07:37:17AM +0000, Russell Coker wrote:
 The current defragmentation options seem to only support defragmenting named
 files/directories or a recursive defragmentation of files and directories.
 
 I'd like to recursively defragment directories.  One of my systems has a large
 number of large files, the files are write-once and read performance is good
 enough.  However performance of ls -al is often very poor, presumably due to
 metadata fragmentation.

I.e. the directory metadata in the b-tree. That's possible, but not all
of the code is there. So far only whole-tree defragmentation is
implemented, i.e. the extent tree or any subvolume tree. We'd have to
extend the defrag API to take a key range and then use it to span the
directory key range.


Re: [PATCH 1/1] btrfs-progs: improve troubleshooting avoid duplicate error strings

2015-04-14 Thread David Sterba
On Mon, Apr 13, 2015 at 08:37:01PM +0800, Anand Jain wrote:
 my troubleshooting experience says have unique error string per module.
 
 In the below eg, its one additional step to know error line,
 
 cat -n cmds-device.c | egrep "error removing the device"
    185  "ERROR: error removing the device '%s' - %s\n",
    190  "ERROR: error removing the device '%s' - %s\n",
 
 which is completely avoidable.

It is, we can merge both branches into one.

 
 Signed-off-by: Anand Jain anand.j...@oracle.com
 ---
  cmds-device.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/cmds-device.c b/cmds-device.c
 index 1c72e90..1c32771 100644
 --- a/cmds-device.c
 +++ b/cmds-device.c
 @@ -187,7 +187,7 @@ static int cmd_rm_dev(int argc, char **argv)
   ret++;
   } else if (res < 0) {
   fprintf(stderr,
 - "ERROR: error removing the device '%s' - %s\n",
 + "ERROR: ioctl error removing the device '%s' - %s\n",

The only difference is the strerror vs btrfs_err_str. As both ret < 0
and ret > 0 report some kind of error, the wording would be very similar
so I think that one error message would fit better. I'll fix that.
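
Roughly what the merged message could look like, as a sketch only. format_rm_dev_error() and demo_err_str() are hypothetical names; demo_err_str() merely stands in for btrfs_err_str(), and the sketch assumes the positive-return branch is the one that uses the btrfs error string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for btrfs_err_str(); the strings are made up. */
static const char *demo_err_str(int code)
{
	return code == 1 ? "demo btrfs error" : "unknown demo error";
}

/*
 * Build one message for both failure branches.
 * res > 0: filesystem-specific code returned by the ioctl.
 * res < 0: the ioctl itself failed and saved_errno holds errno.
 * Returns the snprintf() result, or 0 when res == 0 (no error).
 */
static int format_rm_dev_error(char *buf, size_t size, const char *dev,
			       int res, int saved_errno)
{
	const char *why;

	assert(buf != NULL && dev != NULL);

	if (res == 0)
		return 0;

	why = res > 0 ? demo_err_str(res) : strerror(saved_errno);
	return snprintf(buf, size,
			"ERROR: error removing the device '%s' - %s\n",
			dev, why);
}
```

Only the reason string differs between the two cases, so a single fprintf site is enough.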


Re: [PATCH 1/1] Btrfs-progs: fix compile warnings

2015-04-14 Thread David Sterba
On Mon, Apr 13, 2015 at 10:48:54PM +0800, Anand Jain wrote:
 simple compile time warning fixes.
 
 cmds-check.c: In function ‘del_file_extent_hole’:
 cmds-check.c:289: warning: ‘prev.len’ may be used uninitialized in this function
 cmds-check.c:289: warning: ‘prev.start’ may be used uninitialized in this function
 cmds-check.c:290: warning: ‘next.len’ may be used uninitialized in this function
 cmds-check.c:290: warning: ‘next.start’ may be used uninitialized in this function
 
 btrfs-calc-size.c: In function ‘print_seek_histogram’:
 btrfs-calc-size.c:221: warning: ‘group_start’ may be used uninitialized in this function
 btrfs-calc-size.c:223: warning: ‘group_end’ may be used uninitialized in this function
 
 Signed-off-by: Anand Jain anand.j...@oracle.com
 ---
  btrfs-calc-size.c | 4 ++--
  cmds-check.c  | 3 +++
  2 files changed, 5 insertions(+), 2 deletions(-)
 
 diff --git a/btrfs-calc-size.c b/btrfs-calc-size.c
 index 1372084..88f92e1 100644
 --- a/btrfs-calc-size.c
 +++ b/btrfs-calc-size.c
 @@ -218,9 +218,9 @@ static void print_seek_histogram(struct root_stats *stat)
   struct rb_node *n = rb_first(stat->seek_root);
   struct seek *seek;
   u64 tick_interval;
 - u64 group_start;
 + u64 group_start = 0;
   u64 group_count = 0;
 - u64 group_end;
 + u64 group_end = 0;
   u64 i;
   u64 max_seek = stat->max_seek_len;
   int digits = 1;
 diff --git a/cmds-check.c b/cmds-check.c
 index ed8c698..de22185 100644
 --- a/cmds-check.c
 +++ b/cmds-check.c
 @@ -293,6 +293,9 @@ static int del_file_extent_hole(struct rb_root *holes,
   int have_next = 0;
   int ret = 0;
  
 + memset(&prev, 0, sizeof(struct file_extent_hole));
 + memset(&next, 0, sizeof(struct file_extent_hole));

While this fixes the warning, I've found that we're not using the
file_extent_hole in a good way. It contains an rb_node that's unused, and
the initialization to 0 is IMO wrong here. It would be better to use plain
variables for prev/next + start/len.

I'll drop that hunk from the patch and apply the first one.

Thanks.