[PATCH] get uuid of subvol and snapshot

2012-08-14 Thread Anand jain
From: Anand Jain anand.j...@oracle.com

 This patch applies on top of the patch set titled
 'Include otime in the snapshot list' that I sent earlier.

 It adds an option to show the UUID of subvolumes and snapshots.

btrfs su list -u /btrfs1
ID 256 gen 6 top level 5 uuid 4b7188e4-7d48-f247-b956-1a260b721e1d path sv1
ID 259 gen 6 top level 5 uuid 3cf8931a-de31-5545-8ede-435d25fe3c3f path sv1/.snapshot/ss1
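
For readers who want to see how the 16 raw UUID bytes map onto the text shown above, here is a minimal sketch. It formats the bytes by hand so that it stays self-contained; the patch itself includes uuid/uuid.h and presumably relies on libuuid instead. The names format_uuid, UUID_BYTES and UUID_STR_LEN are made up for this illustration and do not come from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define UUID_BYTES 16   /* assumed to match BTRFS_UUID_SIZE */
#define UUID_STR_LEN 37 /* 36 characters plus the terminating NUL */

/* Format 16 raw bytes as a lowercase 8-4-4-4-12 hex string. */
void format_uuid(const unsigned char uuid[UUID_BYTES], char out[UUID_STR_LEN])
{
	size_t pos = 0;

	assert(uuid != NULL && out != NULL);
	for (int i = 0; i < UUID_BYTES; i++) {
		/* A dash precedes bytes 4, 6, 8 and 10. */
		if (i == 4 || i == 6 || i == 8 || i == 10)
			out[pos++] = '-';
		/* Two hex digits plus NUL; the NUL is overwritten next round. */
		snprintf(out + pos, 3, "%02x", uuid[i]);
		pos += 2;
	}
	out[pos] = '\0';
}
```

Calling format_uuid() on the bytes 4b 71 88 e4 7d 48 f2 47 b9 56 1a 26 0b 72 1e 1d yields the string printed for sv1 in the listing above.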

Anand Jain (1):
  add -u to show subvol uuid

 btrfs-list.c |  148 ++
 cmds-subvolume.c |   14 +++--
 2 files changed, 124 insertions(+), 38 deletions(-)

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] add -u to show subvol uuid

2012-08-14 Thread Anand jain
From: Anand Jain anand.j...@oracle.com

Applications need to know the UUID to manage the configuration
associated with subvolumes and snapshots.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c |  148 ++
 cmds-subvolume.c |   14 +++--
 2 files changed, 124 insertions(+), 38 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 6e83b31..d6b22a1 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -34,6 +34,7 @@
 #include "ctree.h"
 #include "transaction.h"
 #include "utils.h"
+#include <uuid/uuid.h>
 
 /* we store all the roots we find in an rbtree so that we can
  * search for them later.
@@ -63,6 +64,8 @@ struct root_info {
/* creation time of this root in sec*/
time_t otime;
 
+   u8 uuid[BTRFS_UUID_SIZE];
+
/* path from the subvol we live in to this root, including the
 * root's name.  This is null until we do the extra lookup ioctl.
 */
@@ -188,7 +191,7 @@ static struct root_info *tree_search(struct rb_root *root, u64 root_id)
  */
 static int add_root(struct root_lookup *root_lookup,
u64 root_id, u64 ref_tree, u64 dir_id, char *name,
-   int name_len, u64 *gen, time_t ot)
+   int name_len, u64 *gen, time_t ot, void *uuid)
 {
struct root_info *ri;
struct rb_node *ret;
@@ -210,6 +213,11 @@ static int add_root(struct root_lookup *root_lookup,
    ri->gen = *gen;
    ri->otime = ot;
 
+   if (uuid)
+       memcpy(ri->uuid, uuid, BTRFS_UUID_SIZE);
+   else
+       memset(ri->uuid, 0, BTRFS_UUID_SIZE);
+
    ret = tree_insert(&root_lookup->root, root_id, ref_tree, gen,
                      &ri->rb_node);
if (ret) {
@@ -220,7 +228,7 @@ static int add_root(struct root_lookup *root_lookup,
 }
 
 static int update_root(struct root_lookup *root_lookup, u64 root_id, u64 gen,
-   time_t ot)
+   time_t ot, void *uuid)
 {
struct root_info *ri;
 
@@ -231,6 +239,11 @@ static int update_root(struct root_lookup *root_lookup, u64 root_id, u64 gen,
    }
    ri->gen = gen;
    ri->otime = ot;
+   if (uuid)
+       memcpy(ri->uuid, uuid, BTRFS_UUID_SIZE);
+   else
+       memset(ri->uuid, 0, BTRFS_UUID_SIZE);
+
return 0;
 }
 
@@ -665,6 +678,7 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
    int i;
    int get_gen = 0;
    time_t t;
+   u8 uuid[BTRFS_UUID_SIZE];
 
    root_lookup_init(root_lookup);
    memset(&args, 0, sizeof(args));
@@ -718,16 +732,20 @@ again:
dir_id = btrfs_stack_root_ref_dirid(ref);
 
    add_root(root_lookup, sh->objectid, sh->offset,
-            dir_id, name, name_len, NULL, 0);
+            dir_id, name, name_len, NULL, 0, NULL);
    } else if (get_gen && sh->type == BTRFS_ROOT_ITEM_KEY) {
    ri = (struct btrfs_root_item *)(args.buf + off);
    gen = btrfs_root_generation(ri);
-   if(ri->generation == ri->generation_v2)
+   if(ri->generation == ri->generation_v2) {
        t = ri->otime.sec;
-   else
+       memcpy(uuid, ri->uuid, BTRFS_UUID_SIZE);
+   } else {
        t = 0;
+       memset(uuid, 0, BTRFS_UUID_SIZE);
+   }
 
-   update_root(root_lookup, sh->objectid, gen, t);
+   update_root(root_lookup, sh->objectid, gen, t,
+               uuid);
    }
 
    off += sh->len;
@@ -818,19 +836,24 @@ static int __list_snapshot_search(int fd, struct root_lookup *root_lookup)
    for (i = 0; i < sk->nr_items; i++) {
    struct btrfs_root_item *item;
    time_t  t;
+   u8 uuid[BTRFS_UUID_SIZE];
+
    sh = (struct btrfs_ioctl_search_header *)(args.buf +
                                              off);
    off += sizeof(*sh);
    if (sh->type == BTRFS_ROOT_ITEM_KEY && sh->offset) {
    item = (struct btrfs_root_item *)(args.buf + off);
-   if(item->generation == item->generation_v2)
+   if(item->generation == item->generation_v2) {
        t = item->otime.sec;
-   else
+       memcpy(uuid, item->uuid, BTRFS_UUID_SIZE);
+   } else {
  

[PATCH] btrfs-progs: correct the comment for extent_io.c/clear_extent_bits

2012-08-14 Thread Wang Sheng-Hui
The comment should say "clear" instead of "set" for clear_extent_bits.

Signed-off-by: Wang Sheng-Hui shh...@gmail.com
---
 extent_io.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/extent_io.c b/extent_io.c
index ebb35b2..638ee0e 100644
--- a/extent_io.c
+++ b/extent_io.c
@@ -195,7 +195,7 @@ static int clear_state_bit(struct extent_io_tree *tree,
 }
 
 /*
- * set some bits on a range in the tree.
+ * clear some bits on a range in the tree.
  */
 int clear_extent_bits(struct extent_io_tree *tree, u64 start,
  u64 end, int bits, gfp_t mask)
-- 
1.7.1



Re: [PATCH 1/3] btrfs: extended inode refs

2012-08-14 Thread Jan Schmidt
On Wed, August 08, 2012 at 20:55 (+0200), Mark Fasheh wrote:
 +/*
 + * btrfs_insert_inode_extref() - Inserts an extended inode ref into a tree.
 + *
 + * The caller must have checked against BTRFS_LINK_MAX already.
 + */
 +static int btrfs_insert_inode_extref(struct btrfs_trans_handle *trans,
 +  struct btrfs_root *root,
 +  const char *name, int name_len,
 +  u64 inode_objectid, u64 ref_objectid, u64 index)
 +{
 + struct btrfs_inode_extref *extref;
 + int ret;
 + int ins_len = name_len + sizeof(*extref);
 + unsigned long ptr;
 + struct btrfs_path *path;
 + struct btrfs_key key;
 + struct extent_buffer *leaf;
 + struct btrfs_item *item;
 +
 + key.objectid = inode_objectid;
 + key.type = BTRFS_INODE_EXTREF_KEY;
 + key.offset = btrfs_extref_hash(ref_objectid, name, name_len);
 +
 + path = btrfs_alloc_path();
 + if (!path)
 + return -ENOMEM;
 +
 + path->leave_spinning = 1;
 + ret = btrfs_insert_empty_item(trans, root, path, &key,
 +   ins_len);
 + if (ret == -EEXIST) {
 + if (btrfs_find_name_in_ext_backref(path, name, name_len, NULL))
 + goto out;
 +
 + btrfs_extend_item(trans, root, path, ins_len);
 + }
 + if (ret < 0)
 + goto out;

This doesn't look right. Did you actually test it? I haven't, but I claim that
with this version of the patch, you won't be able to add a hash collision link.

Jeff changed btrfs_extend_item from int to void in March, so we're no longer
setting ret there. I suggest adding ret = 0 within the EEXIST-block.
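
To make the control flow concrete, here is a minimal user-space sketch of the suggested fix. The three stub_* helpers are hypothetical stand-ins for btrfs_insert_empty_item(), btrfs_find_name_in_ext_backref() and btrfs_extend_item(); their behaviour is assumed purely for illustration and is not the kernel code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel helpers (behaviour assumed). */
int stub_insert_empty_item(int key_exists) { return key_exists ? -EEXIST : 0; }
int stub_name_already_present(int present) { return present; }
void stub_extend_item(void) { /* returns void, so it cannot update ret */ }

/* Returns 0 when the caller may go on to fill in the new extref. */
int insert_extref_sketch(int key_exists, int name_present)
{
	int ret = stub_insert_empty_item(key_exists);

	if (ret == -EEXIST) {
		if (stub_name_already_present(name_present))
			goto out;	/* genuine duplicate: keep -EEXIST */

		stub_extend_item();
		ret = 0;		/* suggested fix: the collision was handled */
	}
	if (ret < 0)
		goto out;

	/* ... the real function would fill in the extref here ... */
out:
	return ret;
}
```

Without the ret = 0 line, the hash collision case (key exists, name not present) would fall through to the ret < 0 check with -EEXIST still set and bail out.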

The rest of this patch looks good to me.

-Jan


Re: [GIT PULL] Update LZO compression

2012-08-14 Thread Markus F.X.J. Oberhumer
On 2012-08-14 05:15, Andi Kleen wrote:
 On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
 Hi all,

 as suggested on the mailing list I have converted the updated LZO
 code into git, so please pull my lzo-update branch from

   git://github.com/markus-oberhumer/linux.git lzo-update

 You can browse the branch at

   https://github.com/markus-oberhumer/linux/compare/lzo-update
 
 Looks ok to me from a quick look.
 
 Since kernel lzo is security relevant, I assume the new version
 has been fuzz tested?

Of course!

 I couldn't tell from the github view, but I assume you follow
 standard coding style.

I've tried my best to follow the style guidelines.

Cheers,
Markus

 
 -Andi
 

-- 
Markus Oberhumer, mar...@oberhumer.com, http://www.oberhumer.com/


Re: raw partition or LV for btrfs?

2012-08-14 Thread Calvin Walton
On Mon, 2012-08-13 at 05:48 +0700, Fajar A. Nugraha wrote:
 On Sun, Aug 12, 2012 at 11:46 PM, Daniel Pocock dan...@pocock.com.au wrote:

  d) what about booting from a btrfs system?  Is it recommended to follow
  the ages-old practice of keeping a real partition of 128-500MB,
  formatting it as btrfs, even if all other data is in subvolumes as per (b)?
 
 You can have one single partition only and boot directly from that.
 However btrfs has the same problems as zfs in this regard:
 - grub can read both, but can't write to either. In other words, no
 support for grubenv
 - the best compression method (gzip for zfs, lzo for btrfs) is not
 supported by grub

This is actually not true; the grub 2.00 release does support reading
from lzo-compressed btrfs filesystems. (Of course, if any other new
compression algorithms are added, this issue will happen again.)

 For the first problem, an easy workaround is just to disable the grub
 configuration that uses grubenv. Easy enough, and no major
 functionality loss.
 
 The second one is harder for btrfs. zfs allows you to have separate
 dataset (i.e. subvolume, in btrfs terms) with different compression, so
 you can have a dedicated dataset for /boot with different compression
 setting from the rest of the dataset. With btrfs you're currently
 stuck with using the same compression setting for everything, so if
 you love lzo this might be a major setback.

It's possible to disable compression on individual files on btrfs. If
you disable compression on everything in /boot/grub{2,} and on your
kernels and initramfses then grub will be able to read them no matter
what.

Unfortunately, this is a bit tricky to do at the moment: you have to
remount the filesystem with `-o compress=no`, then run `btrfs fi defrag`
individually on all the files that you want uncompressed.

A patch to add support for `btrfs fi defrag -c none file` or so would
make this easier, and shouldn't be too hard to do :)

 Due to second and third problem, I'd recommend you just use a separate
 partition with ext2/4 for now.

Even with my comments, this is still my recommendation. (Although if
you're using an EFI BIOS, you could just stick all the bootloader stuff
on the VFAT EFI system partition instead.)

-- 
Calvin Walton calvin.wal...@kepstin.ca



Re: [GIT PULL] Update LZO compression

2012-08-14 Thread Johannes Stezenbach
On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
 On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
 
  As stated in the README this version is significantly faster (typically more
  than 2 times faster!) than the current version, has been thoroughly tested
  on x86_64/i386/powerpc platforms and is intended to get included into the
  official Linux 3.6 or 3.7 release.
 
  I encourage all compression users to test and benchmark this new version,
  and I also would ask some official LZO maintainer to convert the updated
  source files into a GIT commit and possibly push it to Linus or linux-next.

Sorry for not reporting earlier, but I didn't have time to do real
benchmarks, just a quick test on ARM926EJ-S using barebox,
and found in the new version decompression is slower:
http://lists.infradead.org/pipermail/barebox/2012-July/008268.html

BTW, do you have userspace code matching the old and new
lzo versions?  It would be easier to benchmark.

Unfortunately I cannot claim high confidence in my benchmark results
because I lacked the time to do them properly. It would be useful if
someone else could run some benchmarks on ARM before this is merged.


Johannes


Re: [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze

2012-08-14 Thread Marco Stornelli

On 14/08/2012 07:01, liub.li...@gmail.com wrote:

From: Liu Bo bo.li@oracle.com

I found this while testing xfstests 068, the story is

     t1                               t2
   sys_sync                         thaw_super
     iterate_supers
       down_read(sb->s_umount)        down_write(sb->s_umount)
                                        --- wait for t1
       sync_fs (with wait mode)
         start_transaction
           sb_start_intwrite
             --- wait for t2 to set s_writers.frozen to SB_UNFROZEN

In this patch, I add a helper sb_start_intwrite_trylock() and use it before we
start_transaction in sync_fs() with wait mode so that we won't hit the deadlock.
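
The trylock idea can be sketched in user space as below. This is a single-threaded model under stated assumptions, not the kernel implementation: struct sb_model and every *_model function are hypothetical names invented for the illustration, and the sketch does not reproduce the race itself. It only shows that a non-blocking attempt lets the sync path back off instead of sleeping while s_umount is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a superblock; only what the sketch needs. */
struct sb_model {
	bool frozen;	/* true until the thaw path marks it unfrozen */
	int writers;	/* active internal writers */
	int commits;	/* transactions committed by sync */
};

/* Non-blocking counterpart of a blocking "start internal write". */
bool sb_start_intwrite_trylock_model(struct sb_model *sb)
{
	assert(sb != NULL);
	if (sb->frozen)
		return false;	/* the blocking variant would sleep here */
	sb->writers++;
	return true;
}

void sb_end_intwrite_model(struct sb_model *sb)
{
	assert(sb != NULL && sb->writers > 0);
	sb->writers--;
}

/* sync_fs in wait mode: back off instead of sleeping on a frozen fs. */
int sync_fs_model(struct sb_model *sb, int wait)
{
	if (!wait)
		return 0;
	if (!sb_start_intwrite_trylock_model(sb))
		return 0;	/* frozen: do not block while holding s_umount */
	sb->commits++;		/* stands in for start + commit transaction */
	sb_end_intwrite_model(sb);
	return 0;
}
```

With frozen set, sync_fs_model() returns without committing anything; once the flag is cleared, the same call commits one transaction and leaves no writer behind.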



IMHO, we should avoid calling the sync operation on a frozen fs. The
freeze operation already includes a sync operation. According to the
man page, no other operation should modify the fs after the freeze. So,
in my view, the change belongs in sync_filesystem (and sync_one_sb).


Marco


Re: raw partition or LV for btrfs?

2012-08-14 Thread Daniel Pocock


On 12/08/12 22:48, Fajar A. Nugraha wrote:
 On Sun, Aug 12, 2012 at 11:46 PM, Daniel Pocock dan...@pocock.com.au wrote:


 I notice this question on the wiki/faq:


 https://btrfs.wiki.kernel.org/index.php/UseCases#What_is_best_practice_when_partitioning_a_device_that_holds_one_or_more_btr-filesystems

 and as it hasn't been answered, can anyone make any comments on the subject

 Various things come to mind:

 a) partition the disk, create an LVM partition, and create lots of small
 LVs, format each as btrfs

 b) partition the disk, create an LVM partition, and create one big LV,
 format as btrfs, make subvolumes

 c) what about using btrfs RAID1?  Does either approach (a) or (b) seem
 better for someone who wants the RAID1 feature?
 
 IMHO when the qgroup feature is stable (i.e. adopted by distros, or
 at least in stable kernel) then simply creating one big partition (and
 letting btrfs handle RAID1, if you use it) is better. When 3.6 is out,
 perhaps?
 
 Until then I'd use LVM.
 

Can you just elaborate on the qgroups feature?
- Does this just mean I can make the subvolume sizes rigid, like LV sizes?
- Or is it per-user restrictions or some other more elaborate solution?

If I create 10 LVs today, with btrfs on each, can I merge them all into
subvolumes on a single btrfs later?

If I just create a 1TB btrfs with subvolumes now, can I upgrade to
qgroups later?  Or would I have to recreate the filesystem?

I really appreciate the answers from people.  Reflecting on some of the
comments and past experience, my feeling is that I should do the following:

a) create the partition table as normal
b) create one big partition as LVM (type 0x8e)
c) create one big LV (for all of the disk)
d) format the LV as btrfs
e) create a subvolume to hold the data from each LV that I have on my
old disk

My reason for doing (b) and (c) is that I may want to have the following
options in future - would these still be possible without LVM at all,
using btrfs on a raw 1TB partition?
- using pvmove to move the filesystem to another physical device (e.g.
if I purchase a 2TB drive to replace the 1TB drive)
- using lvresize to expand the allocation onto such a new drive

If I understand correctly, if I don't use LVM, then such move and resize
operations can't be done for an online filesystem and it has more risk.




Re: raw partition or LV for btrfs?

2012-08-14 Thread Olivier Bonvalet

On 14/08/2012 15:28, Daniel Pocock wrote:

If I create 10 LVs today, with btrfs on each,



From my understanding of Btrfs, it achieves good write performance by
making nearly all writes sequential. But if you split your disk into 10
sub-parts and put btrfs on each of them, Btrfs write operations
will not really be sequential anymore.


So, in my view, to get good performance btrfs should manage the whole disk
(maybe except the /boot/ directory, just to avoid any problem with grub).


Re: [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze

2012-08-14 Thread Liu Bo
On 08/14/2012 08:59 PM, Marco Stornelli wrote:
 On 14/08/2012 07:01, liub.li...@gmail.com wrote:
 From: Liu Bo bo.li@oracle.com

 I found this while testing xfstests 068, the story is

      t1                               t2
    sys_sync                         thaw_super
      iterate_supers
        down_read(sb->s_umount)        down_write(sb->s_umount)
                                         --- wait for t1
        sync_fs (with wait mode)
          start_transaction
            sb_start_intwrite
              --- wait for t2 to set s_writers.frozen to SB_UNFROZEN

 In this patch, I add an helper sb_start_intwrite_trylock() and use it before we
 start_transaction in sync_fs() with wait mode so that we won't hit the deadlock.

 
 IMHO, we should avoid to call the sync operation on a frozen fs. The freeze
 operation, indeed, already include a sync operation.
 According to man page, no other operation should modify the fs after the freeze.
 So for me the modification is inside sync_filesystem (and sync_one_sb).

Do you mean that we should add the trylock check in sync_filesystem?

But it seems to be useless because we already run into down_read(sb->s_umount)
before starting sync_one_sb().

thanks,
liubo

 
 Marco



Re: raw partition or LV for btrfs?

2012-08-14 Thread Fajar A. Nugraha
On Tue, Aug 14, 2012 at 8:28 PM, Daniel Pocock dan...@pocock.com.au wrote:
 Can you just elaborate on the qgroups feature?
 - Does this just mean I can make the subvolume sizes rigid, like LV sizes?

Pretty much.

 - Or is it per-user restrictions or some other more elaborate solution?

No


 If I create 10 LVs today, with btrfs on each, can I merge them all into
 subvolumes on a single btrfs later?

No


 If I just create a 1TB btrfs with subvolumes now, can I upgrade to
 qgroups later?

Yes

  Or would I have to recreate the filesystem?

No

 If I understand correctly, if I don't use LVM, then such move and resize
 operations can't be done for an online filesystem and it has more risk.

You can resize, add, and remove devices from btrfs online without the
need for LVM. IIRC LVM has finer granularity though, you can do
something like "move only the first 10GB now, I'll move the rest
later".

-- 
Fajar


Re: raw partition or LV for btrfs?

2012-08-14 Thread cwillu
 If I understand correctly, if I don't use LVM, then such move and resize
 operations can't be done for an online filesystem and it has more risk.

 You can resize, add, and remove devices from btrfs online without the
 need for LVM. IIRC LVM has finer granularity though, you can do
 something like move only the first 10GB now, I'll move the rest
 later.

You can certainly resize the filesystem itself, but without lvm I
don't believe you can resize the underlying partition online.


Re: [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze

2012-08-14 Thread Marco Stornelli

On 14/08/2012 15:53, Liu Bo wrote:

On 08/14/2012 08:59 PM, Marco Stornelli wrote:

On 14/08/2012 07:01, liub.li...@gmail.com wrote:

From: Liu Bo bo.li@oracle.com

I found this while testing xfstests 068, the story is

     t1                               t2
   sys_sync                         thaw_super
     iterate_supers
       down_read(sb->s_umount)        down_write(sb->s_umount)
                                        --- wait for t1
       sync_fs (with wait mode)
         start_transaction
           sb_start_intwrite
             --- wait for t2 to set s_writers.frozen to SB_UNFROZEN

In this patch, I add an helper sb_start_intwrite_trylock() and use it before we
start_transaction in sync_fs() with wait mode so that we won't hit the deadlock.



IMHO, we should avoid to call the sync operation on a frozen fs. The freeze 
operation, indeed, already include a sync operation.
According to man page, no other operation should modify the fs after the freeze.
So for me the modification is inside sync_filesystem (and sync_one_sb).


Do you mean that we should add the trylock check in sync_filesystem?

But it seems to be useless because we already run into down_read(sb->s_umount)
before starting sync_one_sb().

thanks,
liubo



I meant that we should check whether we are in a complete freeze state
(according to the states of a freeze transaction) and simply skip the
sync operation.


Marco


Re: raw partition or LV for btrfs?

2012-08-14 Thread Fajar A. Nugraha
On Tue, Aug 14, 2012 at 9:09 PM, cwillu cwi...@cwillu.com wrote:
 If I understand correctly, if I don't use LVM, then such move and resize
 operations can't be done for an online filesystem and it has more risk.

 You can resize, add, and remove devices from btrfs online without the
 need for LVM. IIRC LVM has finer granularity though, you can do
 something like move only the first 10GB now, I'll move the rest
 later.

 You can certainly resize the filesystem itself, but without lvm I
 don't believe you can resize the underlying partition online.

I'm pretty sure you can do that with parted. At least, when your
version of parted is NOT 2.2.

-- 
Fajar


Re: raw partition or LV for btrfs?

2012-08-14 Thread Calvin Walton
On Tue, 2012-08-14 at 08:09 -0600, cwillu wrote:
  If I understand correctly, if I don't use LVM, then such move and resize
  operations can't be done for an online filesystem and it has more risk.
 
  You can resize, add, and remove devices from btrfs online without the
  need for LVM. IIRC LVM has finer granularity though, you can do
  something like move only the first 10GB now, I'll move the rest
  later.
 
 You can certainly resize the filesystem itself, but without lvm I
 don't believe you can resize the underlying partition online.

There are actually some patches floating around that will allow
partitions (MBR/GPT) to be resized online, I think they're queued up to
be included in some upcoming linux release:
http://lwn.net/Articles/481141/

You still can't move partitions online, of course.

-- 
Calvin Walton calvin.wal...@kepstin.ca



Re: raw partition or LV for btrfs?

2012-08-14 Thread cwillu
On Tue, Aug 14, 2012 at 8:21 AM, Fajar A. Nugraha l...@fajar.net wrote:
 On Tue, Aug 14, 2012 at 9:09 PM, cwillu cwi...@cwillu.com wrote:
 If I understand correctly, if I don't use LVM, then such move and resize
 operations can't be done for an online filesystem and it has more risk.

 You can resize, add, and remove devices from btrfs online without the
 need for LVM. IIRC LVM has finer granularity though, you can do
 something like move only the first 10GB now, I'll move the rest
 later.

 You can certainly resize the filesystem itself, but without lvm I
 don't believe you can resize the underlying partition online.

 I'm pretty sure you can do that with parted. At least, when your
 version of parted is NOT 2.2.

block/ioctl.c:blkdev_reread_part calls into
block/partition-generic.c:rescan_partitions, which fails out early
with EBUSY if block/partition-generic.c:drop_partitions sees a
non-zero bdev->bd_part_count, which is a count of the open partition
bdev's.
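
A small user-space model of that call chain, for illustration only. struct bdev_model and the *_model functions are hypothetical names that mirror the description above; they are not the kernel sources, and everything except the bd_part_count check is elided.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the block device; one field is modelled. */
struct bdev_model {
	int bd_part_count;	/* number of open partition bdevs */
};

/* Refuses to drop the partitions while any of them is still open. */
int drop_partitions_model(const struct bdev_model *bdev)
{
	assert(bdev != NULL);
	if (bdev->bd_part_count)
		return -EBUSY;
	/* ... the partition entries would be dropped here ... */
	return 0;
}

/* Fails out early if the old partitions cannot be dropped. */
int rescan_partitions_model(const struct bdev_model *bdev)
{
	int ret = drop_partitions_model(bdev);

	if (ret)
		return ret;
	/* ... the partition table would be re-read here ... */
	return 0;
}

/* Entry point reached when user space asks for a partition re-read. */
int blkdev_reread_part_model(const struct bdev_model *bdev)
{
	return rescan_partitions_model(bdev);
}
```

In this model a mounted partition keeps bd_part_count above zero, so the re-read request comes back with -EBUSY until every partition on the device is closed.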

Calvin mentions below that this may be changing shortly, but I'm going
to claim to be right on this one.


Re: [RFC] [PATCH] Btrfs: manage metadata cache ourselves

2012-08-14 Thread Liu Bo
On 08/02/2012 05:06 AM, Josef Bacik wrote:
 ===
 PLEASE REVIEW AND TEST THIS CAREFULLY
 
 I've dug this patch out of the bin and cleaned it up but who knows what kind of
 crust I've missed.  This makes the create empty files until the file system is
 full run 5 minutes faster on my hardware so it's a pretty awesome improvement,
 plus it lets us get rid of a lot of complexity.  I think it works pretty well,
 and I've been going through and whittling it down, but now I need somebody
 *cough*Dave*cough* to go through it with a fine toothed comb and point out all
 the stupid mistakes I've made.
 
 ===
 This patch moves the management of the metadata cache from pagecache to our own
 internal caching which can choose to evict things based on what is no longer in
 use.  Thanks,
 


I'll try to look into the patch :)

but slab complains about a memory leak on extent buffers with this patch on the
latest 3.6.0-rc1:

[14856.442224] BUG extent_buffers (Tainted: G   O): Objects remaining on kmem_cache_close()
[14856.442224] 
-
[14856.442224] 
[14856.442225] INFO: Slab 0xea000405d980 objects=22 used=12 fp=0x8801017673b0 flags=0x404080
[14856.442226] Pid: 29913, comm: rmmod Tainted: G   O 3.6.0-rc1+ #6
[14856.442227] Call Trace:
[14856.442229]  [81174341] slab_err+0x91/0xc0
[14856.442230]  [8117729c] ? __kmalloc+0x14c/0x1b0
[14856.442232]  [81176bb0] ? deactivate_slab+0x580/0x580
[14856.442233]  [811777d3] list_slab_objects.constprop.22+0x63/0x170
[14856.442234]  [81178c58] kmem_cache_destroy+0x108/0x1f0
[14856.442242]  [a062baa4] extent_io_exit+0x54/0x100 [btrfs]
[14856.442250]  [a066f8c4] exit_btrfs_fs+0x18/0x754 [btrfs]
[14856.442252]  [810bd796] sys_delete_module+0x1a6/0x2b0
[14856.442254]  [810d7ecc] ? __audit_syscall_entry+0xcc/0x310
[14856.442255]  [81618329] system_call_fastpath+0x16/0x1b
[14856.442258] INFO: Object 0x880101766000 @offset=0
[14856.442258] INFO: Object 0x880101766168 @offset=360
[14856.442259] INFO: Object 0x8801017662d0 @offset=720
[14856.442260] INFO: Object 0x8801017665a0 @offset=1440
[14856.442260] INFO: Object 0x8801017669d8 @offset=2520
[14856.442261] INFO: Object 0x880101766b40 @offset=2880
[14856.442262] INFO: Object 0x880101766ca8 @offset=3240
[14856.442262] INFO: Object 0x880101766e10 @offset=3600
[14856.442263] INFO: Object 0x880101766f78 @offset=3960
[14856.442264] INFO: Object 0x880101767518 @offset=5400
[14856.442264] INFO: Object 0x8801017677e8 @offset=6120
[14856.442265] INFO: Object 0x880101767ab8 @offset=6840

thanks,
liubo


Re: [PATCH 1/3] btrfs: extended inode refs

2012-08-14 Thread Mark Fasheh
On Tue, Aug 14, 2012 at 11:32:43AM +0200, Jan Schmidt wrote:
 On Wed, August 08, 2012 at 20:55 (+0200), Mark Fasheh wrote:
  +/*
  + * btrfs_insert_inode_extref() - Inserts an extended inode ref into a tree.
  + *
  + * The caller must have checked against BTRFS_LINK_MAX already.
  + */
  +static int btrfs_insert_inode_extref(struct btrfs_trans_handle *trans,
  +struct btrfs_root *root,
  +const char *name, int name_len,
  +u64 inode_objectid, u64 ref_objectid, u64 index)
  +{
  +   struct btrfs_inode_extref *extref;
  +   int ret;
  +   int ins_len = name_len + sizeof(*extref);
  +   unsigned long ptr;
  +   struct btrfs_path *path;
  +   struct btrfs_key key;
  +   struct extent_buffer *leaf;
  +   struct btrfs_item *item;
  +
  +   key.objectid = inode_objectid;
  +   key.type = BTRFS_INODE_EXTREF_KEY;
  +   key.offset = btrfs_extref_hash(ref_objectid, name, name_len);
  +
  +   path = btrfs_alloc_path();
  +   if (!path)
  +   return -ENOMEM;
  +
  +   path->leave_spinning = 1;
  +   ret = btrfs_insert_empty_item(trans, root, path, &key,
  + ins_len);
  +   if (ret == -EEXIST) {
  +   if (btrfs_find_name_in_ext_backref(path, name, name_len, NULL))
  +   goto out;
  +
  +   btrfs_extend_item(trans, root, path, ins_len);
  +   }
  +   if (ret < 0)
  +   goto out;
 
 This doesn't look right. Did you actually test it? I haven't, but I claim that
 with this version of the patch, you won't be able to add a hash collision 
 link.
 
 Jeff changed btrfs_extend_item from int to void in March, so we're no longer
 setting ret there. I suggest adding ret = 0 within the EEXIST-block.

Doh, yeah I didn't test collisions on this version of the patch (as you can
obviously see, I messed up the merge of Jeff's change).


 The rest of this patch looks good to me.

Awesome, I'll send the above as a fix to this patch series (I have at least
one other fix to send soon too)
--Mark

--
Mark Fasheh


Hung I/O, Kernel BUG with corrupt leaf (bad key order)

2012-08-14 Thread Peter Marheine
Hi all,

I'm running btrfs in a 3-disk RAID1 configuration. After a hard
power-off, I'm seeing a lot of hung I/O tasks on this volume,
apparently due to a corrupt leaf. I first noticed the problem on
kernel 3.4.7, and it's persisted with 3.4.8. Relevant parts of the
kernel log follow.

[   85.179621] block group 38684065792 has an wrong amount of free space
[   85.179667] btrfs: failed to load free space cache for block group 38684065792
[  136.969477] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[  136.998953] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[  137.000492] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[  137.000708] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[  153.912922] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[  153.913020] [ cut here ]
[  153.913055] kernel BUG at fs/btrfs/inode.c:828!
[  153.913087] invalid opcode:  [#1] PREEMPT SMP
[  153.913142] CPU 1
[  153.913155] Modules linked in: nfsd exportfs arc4 snd_hda_codec_idt
snd_hda_intel snd_hda_codec snd_hwdep snd_pcm ath5k ath microcode i915
video i2c_algo_bit acpi_cpufreq drm_kms_helper mperf mac80211 cfg80211
i2c_i801 rfkill serio_raw drm processor evdev snd_page_alloc snd_timer
snd coretemp soundcore mei(C) psmouse pcspkr e1000e iTCO_wdt i2c_core
button iTCO_vendor_support intel_agp intel_gtt nfs nfs_acl lockd
auth_rpcgss sunrpc fscache dm_mod floppy btrfs crc32c libcrc32c
zlib_deflate ext4 crc16 jbd2 mbcache uhci_hcd ehci_hcd usbcore
usb_common sd_mod ahci libahci pata_marvell libata scsi_mod
[  153.913685]
[  153.913698] Pid: 325, comm: btrfs-transacti Tainted: G C
3.4.8-1-ARCH #1  /DG33TL
[  153.913767] RIP: 0010:[a0197cd0]  [a0197cd0]
cow_file_range+0x3d0/0x4b0 [btrfs]
[  153.913841] RSP: 0018:8801a1fb1580  EFLAGS: 00010246
[  153.913873] RAX: 88019cd38000 RBX: 8801a1fb18e8 RCX: 
[  153.913911] RDX: 88019d8bb800 RSI: ea00060d0040 RDI: 88017dff47f0
[  153.913951] RBP: 8801a1fb1640 R08: 8801a1fb18d4 R09: 8801a1fb18e8
[  153.913990] R10: 0001 R11: 0001 R12: 
[  153.914029] R13:  R14: 1000 R15: 88017dff47f0
[  153.914068] FS:  () GS:8801abc8()
knlGS:
[  153.914112] CS:  0010 DS:  ES:  CR0: 8005003b
[  153.914144] CR2: 7f085106b000 CR3: 000198736000 CR4: 07e0
[  153.914182] DR0:  DR1:  DR2: 
[  153.914221] DR3:  DR6: 0ff0 DR7: 0400
[  153.914261] Process btrfs-transacti (pid: 325, threadinfo
8801a1fb, task 88019cd7b790)
[  153.914308] Stack:
[  153.914322]   880162624b60 0286
0003
[  153.914377]   88017dff4620 8801a1fb15f0
ea00060d0040
[  153.914431]  8801a1fb15f0 88019d8bb800 8801a09ad360
8801a1fb18d4
[  153.914485] Call Trace:
[  153.914516]  [a01b687f] ? free_extent_buffer+0x2f/0x70 [btrfs]
[  153.914565]  [a0198173] run_delalloc_nocow+0x3c3/0x950 [btrfs]
[  153.914615]  [a0198a31] run_delalloc_range+0x331/0x3a0 [btrfs]
[  153.914665]  [a01b52f1] __extent_writepage+0x341/0x7c0 [btrfs]
[  153.914715]  [a01b5a52]
extent_write_cache_pages.isra.26.constprop.44+0x2e2/0x3e0 [btrfs]
[  153.914775]  [a01b5da5] extent_writepages+0x45/0x60 [btrfs]
[  153.914823]  [a0194330] ? btrfs_writepage+0x70/0x70 [btrfs]
[  153.914871]  [a01b191e] ? free_extent_state+0x1e/0x30 [btrfs]
[  153.914919]  [a0193338] btrfs_writepages+0x28/0x30 [btrfs]
[  153.916201]  [81118082] do_writepages+0x22/0x50
[  153.916315]  [8110d5fb] __filemap_fdatawrite_range+0x5b/0x60
[  153.916315]  [8110d61f] filemap_fdatawrite+0x1f/0x30
[  153.920013]  [8110d665] filemap_write_and_wait+0x35/0x60
[  153.920013]  [a01cf622] __btrfs_write_out_cache+0x792/0x9a0 [btrfs]
[  153.920013]  [a0175b25] ? __find_space_info+0x85/0xa0 [btrfs]
[  153.920013]  [a017f28b] ?
btrfs_run_delayed_refs+0x1cb/0x450 [btrfs]
[  153.920013]  [a01cf8c5] btrfs_write_out_cache+0x95/0xf0 [btrfs]
[  153.920013]  [a017fa2f]
btrfs_write_dirty_block_groups+0x51f/0x5f0 [btrfs]
[  153.920013]  [a01e9b2a] commit_cowonly_roots+0xec/0x1c6 [btrfs]
[  153.920013]  [a0190895]
btrfs_commit_transaction+0x575/0xaa0 [btrfs]
[  153.920013]  [81073b50] ? abort_exclusive_wait+0xb0/0xb0
[  153.920013]  [a0188e15] transaction_kthread+0x235/0x2b0 [btrfs]
[  153.920013]  [a0188be0] ? btrfs_alloc_root+0x50/0x50 [btrfs]
[  153.920013]  [810731c3] kthread+0x93/0xa0
[  153.920013]  [8146bfa4] kernel_thread_helper+0x4/0x10
[  153.920013]  

linux 3.5.0: BTRFS error in compress_file_range:581 (failed to join transaction)

2012-08-14 Thread Marc MERLIN
My laptop oopsed due to a wireless bug

When I rebooted, the system came back ok, and seemed to work, but soon went
to read only with the error in the subject line.

I have hourly snapshots for each of the 5 subvolumes in that btrfs
filesystem.

How do I recover from this? Revert all the snapshots one hour, find/guess
which one caused the problem somehow and revert just that one? (the error
message didn't give a subvolume or directory).

Also, before I do this, is there debug info I can get off my system?

Marc
-- 
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


[PATCH] Btrfs: do not allocate chunks as agressively

2012-08-14 Thread Josef Bacik
Swinging this pendulum back the other way.  We've been allocating chunks up
to 2% of the disk no matter how much we actually have allocated.  So instead
fix this calculation to only allocate chunks if we have more than 80% of the
space available allocated.  Please test this as it will likely cause all
sorts of ENOSPC problems to pop up suddenly.  Thanks,

Signed-off-by: Josef Bacik jba...@fusionio.com
---
 fs/btrfs/extent-tree.c |   12 +++---------
 1 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ce494b9..eaf1a9e 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3487,7 +3487,8 @@ static int should_alloc_chunk(struct btrfs_root *root,
 	 * and purposes it's used space.  Don't worry about locking the
 	 * global_rsv, it doesn't change except when the transaction commits.
 	 */
-	num_allocated += global_rsv->size;
+	if (sinfo->flags & BTRFS_BLOCK_GROUP_METADATA)
+		num_allocated += global_rsv->size;
 
 	/*
 	 * in limited mode, we want to have some free space up to
@@ -3501,15 +3502,8 @@ static int should_alloc_chunk(struct btrfs_root *root,
 		if (num_bytes - num_allocated < thresh)
 			return 1;
 	}
-	thresh = btrfs_super_total_bytes(root->fs_info->super_copy);
 
-	/* 256MB or 2% of the FS */
-	thresh = max_t(u64, 256 * 1024 * 1024, div_factor_fine(thresh, 2));
-	/* system chunks need a much small threshold */
-	if (sinfo->flags & BTRFS_BLOCK_GROUP_SYSTEM)
-		thresh = 32 * 1024 * 1024;
-
-	if (num_bytes > thresh && sinfo->bytes_used < div_factor(num_bytes, 8))
+	if (num_allocated + alloc_bytes < div_factor(num_bytes, 8))
 		return 0;
 	return 1;
 }
-- 
1.7.7.6
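
For readers who want to try the new threshold outside the kernel, here is a
minimal userspace sketch of the final test in should_alloc_chunk() as it reads
after this patch. It is an illustration under stated assumptions, not the
kernel code: the function name should_alloc_chunk_sketch is hypothetical,
div_factor() is re-implemented here as "num * factor / 10", and the force
modes and the global reserve adjustment are left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel helper div_factor():
 * returns factor tenths of num (factor 8 means 80%).
 * Like the kernel version, it does not guard against overflow
 * of num * factor for very large values.
 */
static uint64_t div_factor(uint64_t num, int factor)
{
	assert(factor >= 0 && factor <= 10);
	if (factor == 10)
		return num;
	return num * (uint64_t)factor / 10;
}

/*
 * Hypothetical simplification of the post-patch check.
 *
 * num_bytes     - usable bytes in this space_info
 * num_allocated - bytes already used or reserved
 * alloc_bytes   - size of the request being considered
 *
 * Returns 0 (do not allocate a new chunk) while the space in use
 * plus the request stays below 80% of num_bytes, and 1 otherwise.
 */
static int should_alloc_chunk_sketch(uint64_t num_bytes,
				     uint64_t num_allocated,
				     uint64_t alloc_bytes)
{
	if (num_allocated + alloc_bytes < div_factor(num_bytes, 8))
		return 0;
	return 1;
}
```

With num_bytes = 1000, the cutoff is 800: a request that brings the total to
750 is declined, while one that reaches exactly 800 asks for a new chunk,
because the comparison is a strict less-than.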



Re: Hung I/O, Kernel BUG with corrupt leaf (bad key order)

2012-08-14 Thread Peter Marheine
> Is there some way to fix this corruption? I noticed what looks like
> the same problem in an earlier message on the list (btrfs unmountable
> after failed suspend, February 7), but with no resolution. I have
> offline backups, but recovering those in their entirety will take some
> time, so a solution that doesn't require wiping the entire FS would be
> preferred.

I did some further investigation into the problem, and I have
determined the problematic directory (by seeing where `ls -R` hangs).
If I skip the corrupt directory, everything works properly, but
attempting to list its contents causes the entire volume to stop
responding.

At this point I'd like to simply unlink the corrupt directory (without
enumerating it). Is that possible, or should I just image the volume
minus the corrupt directory and recreate my fs?


Re: linux 3.5.0: BTRFS error in compress_file_range:581 (failed to join transaction)

2012-08-14 Thread Marc MERLIN
On Tue, Aug 14, 2012 at 11:23:14AM -0700, Marc MERLIN wrote:
> My laptop oopsed due to a wireless bug
>
> When I rebooted, the system came back ok, and seemed to work, but soon went
> to read only with the error in the subject line.
>
> I have hourly snapshots for each of the 5 subvolumes in that btrfs
> filesystem.
>
> How do I recover from this? Revert all the snapshots one hour, find/guess
> which one caused the problem somehow and revert just that one? (the error
> message didn't give a subvolume or directory).
>
> Also, before I do this, is there debug info I can get off my system?

I'm likely to have to do this tonight to get back to a working system.

If someone wants debug info before I lose it potentially, please ask soon ;)

Marc


Re: linux 3.5.0: BTRFS error in compress_file_range:581 (failed to join transaction)

2012-08-14 Thread Liu Bo
On 08/15/2012 10:48 AM, Marc MERLIN wrote:
> On Tue, Aug 14, 2012 at 11:23:14AM -0700, Marc MERLIN wrote:
>> My laptop oopsed due to a wireless bug
>>
>> When I rebooted, the system came back ok, and seemed to work, but soon went
>> to read only with the error in the subject line.
>>
>> I have hourly snapshots for each of the 5 subvolumes in that btrfs
>> filesystem.
>>
>> How do I recover from this? Revert all the snapshots one hour, find/guess
>> which one caused the problem somehow and revert just that one? (the error
>> message didn't give a subvolume or directory).
>>
>> Also, before I do this, is there debug info I can get off my system?
>
> I'm likely to have to do this tonight to get back to a working system.
>
> If someone wants debug info before I lose it potentially, please ask soon ;)
>
> Marc
>

What does 'ret' show?  Is it -ENOSPC?

thanks,
liubo


Re: linux 3.5.0: BTRFS error in compress_file_range:581 (failed to join transaction)

2012-08-14 Thread Marc MERLIN
On Wed, Aug 15, 2012 at 11:02:31AM +0800, Liu Bo wrote:
> On 08/15/2012 10:48 AM, Marc MERLIN wrote:
> > On Tue, Aug 14, 2012 at 11:23:14AM -0700, Marc MERLIN wrote:
> >> My laptop oopsed due to a wireless bug
> >>
> >> When I rebooted, the system came back ok, and seemed to work, but soon went
> >> to read only with the error in the subject line.
> >>
> >> I have hourly snapshots for each of the 5 subvolumes in that btrfs
> >> filesystem.
> >>
> >> How do I recover from this? Revert all the snapshots one hour, find/guess
> >> which one caused the problem somehow and revert just that one? (the error
> >> message didn't give a subvolume or directory).
> >>
> >> Also, before I do this, is there debug info I can get off my system?
> >
> > I'm likely to have to do this tonight to get back to a working system.
> >
> > If someone wants debug info before I lose it potentially, please ask soon ;)
>
> What does 'ret' show?  Is it -ENOSPC?

I got nothing else in my logs.

I powered the laptop back on and it came up like nothing ever happened.

[   15.626700] device label btrfs_pool1 devid 1 transid 10222 /dev/mapper/cryptroot
[   15.627161] btrfs: disk space caching is enabled
[   15.631704] btrfs: bdev /dev/mapper/cryptroot errs: wr 0, rd 0, flush 0, corrupt 0, gen 0

It looks like the SSD was loose inside the laptop (the tray that holds it isn't
quite the right size, it seems).
I think it may not have had a good connection, but what's interesting is that I
got absolutely no lower-level errors in dmesg (which didn't get written to
disk).

All that said, I think it may just have been a write error due to a hardware
connection problem (no way to prove this now).
If there is nothing suspicious in the code around that line, let's just ignore
my report.

For what it's worth, I do have plenty of space left:
Label: 'btrfs_pool1'  uuid: 92584fa9-85cd-4df6-b182-d32198b76a0b
	Total devices 1 FS bytes used 227.44GB
	devid    1 size 441.70GB used 297.04GB path /dev/dm-0

Thanks for your reply.
Marc