Re: strange btrfs sub list output

2011-06-02 Thread C Anthony Risinger
On Tue, May 31, 2011 at 2:32 PM, C Anthony Risinger anth...@xtfx.me wrote:
 On Tue, May 31, 2011 at 1:50 PM, Andreas Philipp
 philipp.andr...@gmail.com wrote:


 On 31.05.2011 19:40, C Anthony Risinger wrote:
 On Tue, May 31, 2011 at 5:00 AM, Stephane Chazelas
 stephane_chaze...@yahoo.fr wrote:
 2011-05-27 13:49:52 +0200, Andreas Philipp: [...]
 Thanks, I can understand that. What I don't get is how one
 creates a subvol with a top-level other than 5. I might be
 missing the obvious, though.

 If I do:

 btrfs sub create A
 btrfs sub create A/B
 btrfs sub snap A A/B/C

 A, A/B, A/B/C have their top-level being 5. How would I get a
 new snapshot to be a child of A/B for instance?

 In my case, 285, was not appearing in the btrfs sub list
 output, 287 was a child of 285 with path data while all I
 did was create a snapshot of 284 (path
 u6:10022/vm+xfs@u8/xvda1/g8/v3/data in vol 5) in
 u6:10022/vm+xfs@u8/xvda1/g8/v3/snapshots/2011-03-30

 So I did manage to get a volume with a parent other than 5,
 but I did not ask for it.
 [...]
 Reconsidering the explanations on btrfs subvolume list in this
 thread I get the impression that a line in the output of btrfs
 subvolume list with top level other than 5 indicates that the
 backrefs from one subvolume to its parent are broken.

 What's your opinion on this?
 [...]

 Given that I don't really get what the parent-child relationship
 means in that context, I can't really comment.

 In effect, the snapshot had been created and was attached to the
 right directory (but didn't appear in the sub list), and there
 was an additional data volume that I had not asked for nor
 created that had the snapshot above as parent and that did appear
 in the sub list.

 It pretty much looks like a bug to me, I'd like to understand
 more so that I can maybe try and avoid running into it again.

 i'm actually really interested in the conclusion to this thread
 because i _want_ to create subvols with a new parent ... i didn't
 realize this wasn't possible (nor the mount option) until reading
 this thread. this would give me a little more flexibility with
 initcpio hooks and the like vs. packing the btrfs root with tons of
 hidden files [subvols] or using IDs directly ...

 i tried absolutely everything i could think of to reproduce this
 but all subvols ended up having a top level id of `5`.

 ... so, is there any known way to _purposefully_ create parented
 subvols with the current tools?

 Hopefully, I can help clarify this a little bit. In fact, this is the
 'usual' case. With the attached patch to the master branch of
 btrfs-progs-unstable you can 'watch' how the btrfs subvolume list
 command builds the full path of the listed subvolumes. Additionally,
 it gives you the IDs of the parent subvolumes. See the following example.

 ID 256 top level 5 path test1
 ID 257 top level 256 path test1.1
 ID 257 top level 5 path test1/test1.1
 ID 258 top level 5 path test2
 ID 259 top level 258 path test2.1
 ID 259 top level 5 path test2/test2.1

 From the second line you see that subvolume ID 256 really is ID 257's
 parent. Additionally, only test1 and test2 have parent ID 5 or, in your
 terminology, are in the btrfs root.
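
 The way the listing above pairs each "raw" line (parent ID plus the name
 inside that parent) with a resolved full path can be sketched roughly as
 follows. This is only an illustration in Python, not the code from the
 patch: the `subvols` table is hypothetical data copied from the example
 listing, and `full_path` is a made-up helper name.

```python
# Hypothetical records: subvolume ID -> (parent subvolume ID, name inside parent).
# The values mirror the example listing; ID 5 is the top-level tree.
FS_ROOT_ID = 5


def full_path(subvol_id, subvols):
    """Walk parent links up to the top-level tree and join the names."""
    parts = []
    seen = set()
    while subvol_id != FS_ROOT_ID:
        # A missing parent (or a loop) is what a "broken backref" would look like here.
        if subvol_id in seen or subvol_id not in subvols:
            raise LookupError("broken parent chain at ID %d" % subvol_id)
        seen.add(subvol_id)
        parent_id, name = subvols[subvol_id]
        parts.append(name)
        subvol_id = parent_id
    return "/".join(reversed(parts))


subvols = {
    256: (5, "test1"),
    257: (256, "test1.1"),
    258: (5, "test2"),
    259: (258, "test2.1"),
}

for sid in sorted(subvols):
    parent = subvols[sid][0]
    print("ID %d parent %d path %s" % (sid, parent, full_path(sid, subvols)))
```

 If a parent entry is missing from the table, the walk cannot reach ID 5
 and the sketch raises an error instead of printing a partial path.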

 aaah, ok ... this is what i thought was happening too after taking a
 peek at the sources (albeit i don't write any C) and seems to match
 what Hugo was saying if i understand him correctly.

 this also makes sense what you said about a broken link ... since
 normally the `btrfs` tool will not let you remove a subvol that has
 other subvols nested within it ... though *technically* it does not
 seem to matter, yes?  must have been a fluke/bug in the `btrfs` tool
 where a higher level subvol was removed before its child somehow, is
 this correct?  or is the FS itself slightly broken when this happens?

 yeah i know that's kind of my terminology :-) ... i've spent a lot
 of time explaining btrfs concepts to others and that term always
 seemed to make the most sense to people ... `top-level` can change,
 `default` can change, etc, etc ... but `the btrfs root` can only mean
 one thing -- the most bottomest of the bottom (or top, if you prefer
 :-)

 i'll try this out later tonight, thanks.

after booting the correct kernel in KVM, this works exactly as
advertised by the commit that added it, and by your explanation --
thanks -- this will be of much use wrt designing sub-root layouts
for advanced initramfs recovery options ... i always felt limited by
the requirement to be in the btrfs root, and mounting by id loses
some flexibility, eg. when trying to use names like pointers/symlinks.

... now i can put subvols anywhere, and user/script only needs to
determine the stable parent ids once.  nice ... to the laboratory!

-- 

C Anthony
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Announcing btrfs-gui

2011-06-02 Thread Fajar A. Nugraha
On Thu, Jun 2, 2011 at 6:20 AM, Hugo Mills h...@carfax.org.uk wrote:
   Over the last few weeks, I've been playing with a foolish idea,
 mostly triggered by a cluster of people being confused by btrfs's free
 space reporting (df vs btrfs fi df vs btrfs fi show). I also wanted an
 excuse, and some code, to mess around in the depths of the FS data
 structures.

   Like all silly ideas, this one got a bit out of hand, and seems to
 have turned into something vaguely useful. I'm therefore pleased to
 announce the first major public release of btrfs-gui[1]: a point-and-
 click tool for managing btrfs filesystems.

   The tool currently can scan for and list btrfs filesystems and the
 volumes they live on. It can show the allocation and usage of data in
 a selected filesystem, categorised by use, replication, and device. It
 can show and manipulate subvolumes and snapshots: creation, deletion,
 and setting the default.


Some comments:
(1) Currently it needs to be run from the directory where it's
downloaded, even after a python3 setup.py install. When run from
other directory, it bails with

Traceback (most recent call last):
  File "/usr/local/bin/btrfs-gui", line 5, in <module>
    btrfsgui.main.main()
  File "/usr/local/lib/python3.2/dist-packages/btrfsgui/main.py", line 24, in main
    subproc = init_root_process(options)
  File "/usr/local/lib/python3.2/dist-packages/btrfsgui/sudo.py", line 31, in init_root_process
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)
  File "/usr/lib/python3.2/subprocess.py", line 736, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.2/subprocess.py", line 1330, in _execute_child
    raise child_exception_type(errno_num, err_msg)
OSError: [Errno 2] No such file or directory: './btrfs-gui-helper'

Is this intentional?

(2) When showing space usage for a single-device FS and selecting "Show
unallocated space as raw space", why are the top and bottom graphs
different? Shouldn't they be the same, since there's only one device?

(3) Not directly related to btrfs-gui, but I've been wondering what's
the correct way to SHOW the current default subvolume?

-- 
Fajar


Re: Announcing btrfs-gui

2011-06-02 Thread Hugo Mills
On Thu, Jun 02, 2011 at 03:31:16PM +0700, Fajar A. Nugraha wrote:
 On Thu, Jun 2, 2011 at 6:20 AM, Hugo Mills h...@carfax.org.uk wrote:
    Over the last few weeks, I've been playing with a foolish idea,
  mostly triggered by a cluster of people being confused by btrfs's free
  space reporting (df vs btrfs fi df vs btrfs fi show). I also wanted an
  excuse, and some code, to mess around in the depths of the FS data
  structures.
 
    Like all silly ideas, this one got a bit out of hand, and seems to
  have turned into something vaguely useful. I'm therefore pleased to
  announce the first major public release of btrfs-gui[1]: a point-and-
  click tool for managing btrfs filesystems.
 
    The tool currently can scan for and list btrfs filesystems and the
  volumes they live on. It can show the allocation and usage of data in
  a selected filesystem, categorised by use, replication, and device. It
  can show and manipulate subvolumes and snapshots: creation, deletion,
  and setting the default.
 
 
 Some comments:
 (1) Currently it needs to be run from the directory where it's
 downloaded, even after a python3 setup.py install. When run from
 other directory, it bails with
 
 Traceback (most recent call last):
   File "/usr/local/bin/btrfs-gui", line 5, in <module>
     btrfsgui.main.main()
   File "/usr/local/lib/python3.2/dist-packages/btrfsgui/main.py", line 24, in main
     subproc = init_root_process(options)
   File "/usr/local/lib/python3.2/dist-packages/btrfsgui/sudo.py", line 31, in init_root_process
     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
   File "/usr/lib/python3.2/subprocess.py", line 736, in __init__
     restore_signals, start_new_session)
   File "/usr/lib/python3.2/subprocess.py", line 1330, in _execute_child
     raise child_exception_type(errno_num, err_msg)
 OSError: [Errno 2] No such file or directory: './btrfs-gui-helper'
 
 Is this intentional?

   No, and will be fixed later today. I foresee an emergency 0.2.1
coming shortly. :)
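
   For illustration only (this is not the actual fix that went into
btrfs-gui): the traceback shows the helper being started through the
relative path './btrfs-gui-helper', which only resolves when the current
directory happens to contain it. One possible approach is to look next to
the launcher script first and then fall back to PATH. The function name
find_helper and its parameters are made up for this sketch, and
shutil.which needs Python 3.3 or later, so it would not run on the 3.2
shown in the traceback.

```python
import os
import shutil
import sys

HELPER_NAME = "btrfs-gui-helper"


def find_helper(script_path=None, search_path=None):
    """Return a path to the helper executable, or None if it cannot be found."""
    script_path = script_path or sys.argv[0]
    # 1. Look next to the launcher script itself, wherever it was installed.
    script_dir = os.path.dirname(os.path.abspath(script_path))
    candidate = os.path.join(script_dir, HELPER_NAME)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    # 2. Fall back to PATH (or to the explicit search path given by the caller).
    return shutil.which(HELPER_NAME, path=search_path)
```

   The caller would then pass the returned path to subprocess.Popen
instead of the relative name, and report a readable error when the result
is None.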

 (2) When showing space usage for a single-device FS, selecting Show
 unallocated space as raw space, why is the top and bottom graph
 different? Shouldn't it be the same, since there's only one device?

   Good question. I shall investigate what's going on.

 (3) Not directly related to btrfs-gui, but I've been wondering what's
 the correct way to SHOW the current default subvolume?

   The btrfs-progs tools don't have a way of doing this. It's
relatively easy to do: use the tree-search ioctl to search for a
DIR_ITEM key in the tree of tree roots with the name "default". The
corresponding FS tree is the default subvolume. It's just not
implemented yet.
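
   A rough sketch of that lookup in Python, for the curious. Everything
numeric here is an assumption taken from my reading of the kernel headers
and should be checked against them before use: the ioctl number, the
layout of the search key, search header and on-disk dir item, and the
constants for the tree of tree roots (1), its directory object (6) and the
DIR_ITEM key type (84). The function names are made up. The ioctl itself
needs root and a mounted btrfs, so only the packing and parsing helpers
can be tried without one.

```python
import fcntl
import os
import struct

# Assumed values; verify against the kernel's btrfs headers.
BTRFS_IOC_TREE_SEARCH = 0xD0009411   # _IOWR(0x94, 17, 4096-byte argument)
BTRFS_ROOT_TREE_OBJECTID = 1         # the tree of tree roots
BTRFS_ROOT_TREE_DIR_OBJECTID = 6     # directory object holding "default"
BTRFS_DIR_ITEM_KEY = 84
U64_MAX = 2 ** 64 - 1
ARGS_SIZE = 4096

SEARCH_KEY = struct.Struct("=7Q4I4Q")   # search key, host byte order, 104 bytes
SEARCH_HEADER = struct.Struct("=3Q2I")  # transid, objectid, offset, type, len
DIR_ITEM = struct.Struct("<QBQQHHB")    # on-disk dir item, little-endian, 30 bytes


def build_search_args(max_items=16):
    """Pack a search for DIR_ITEM keys of the root tree directory."""
    key = SEARCH_KEY.pack(
        BTRFS_ROOT_TREE_OBJECTID,
        BTRFS_ROOT_TREE_DIR_OBJECTID, BTRFS_ROOT_TREE_DIR_OBJECTID,
        0, U64_MAX,                              # offset range
        0, U64_MAX,                              # transid range
        BTRFS_DIR_ITEM_KEY, BTRFS_DIR_ITEM_KEY,  # type range
        max_items, 0,                            # nr_items, unused
        0, 0, 0, 0)                              # reserved fields
    return bytearray(key.ljust(ARGS_SIZE, b"\0"))


def find_default(buf, nr_items):
    """Scan returned items; return the tree ID that "default" points at."""
    off = SEARCH_KEY.size
    for _ in range(nr_items):
        length = SEARCH_HEADER.unpack_from(buf, off)[4]
        off += SEARCH_HEADER.size
        fields = DIR_ITEM.unpack_from(buf, off)
        location_objectid, name_len = fields[0], fields[5]
        start = off + DIR_ITEM.size
        if bytes(buf[start:start + name_len]) == b"default":
            return location_objectid
        off += length
    return None


def default_subvol_id(mountpoint):
    """Run the search on a mounted filesystem (requires root)."""
    args = build_search_args()
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BTRFS_IOC_TREE_SEARCH, args, True)
    finally:
        os.close(fd)
    nr_items = SEARCH_KEY.unpack_from(args)[9]  # kernel writes back the count
    return find_default(args, nr_items)
```

   The sketch only looks at the first entry of each returned item, which
ignores the case of several names sharing one hash.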

   Thanks for the feedback.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- Welcome to Rivendell,  Mr Anderson... ---  




filesystem seeding ... BUGs on .38, .39, loopback, real devices, tmp branch ... everything

2011-06-02 Thread C Anthony Risinger
hello,

i'm trying to set up a seeded FS -- was only able to find this:

http://thread.gmane.org/gmane.comp.file-systems.btrfs/10529

... and announcement-like info from 2009 or so.  i keep hitting
bugs/oops, and even though the FS *appears* to work correctly
afterwards, sometimes mount/strace/etc will hang (could only be .38
here).

i tried with loop devices at first, then real devices -- this is all
under KVM/QEMU, and with FSs that are/will be smaller than 1G.

i've tried with .38, .39, mixed block groups, the `tmp` branch from
btrfs-progs, and everything else i can think of ... what am i doing
wrong?  are these known problems (per the other thread ...)?  what is
the correct way to seed a filesystem?  `btrfs filesystem show` reports
the FS as missing a device ... but it seems to mount fine.  the
operation appears to spawn a whole new FS with the same label -- would
anyone mind elaborating more?

i used the --rootdir feature of mkfs.btrfs to populate the image
with ~100MB of barebone filesystem.  the `mtime` of the seeded image
never changes, so it *seems* to be working ... but i don't understand
how to reuse this image or what's really going on (per my above
questions :-).

... procedures follow along with the oops it produces; let me know
what, if anything, else i should try.  i get similar results no matter
what i attempt, so i only included the one for now.

thanks,

C Anthony

2.6.39, loopback, `tmp` branch
-
#!/bin/bash

mkdir ro rw
truncate -s700MB output.img.rw
mkfs.btrfs -M -r /root/btrfs/fs -m raid0 -d raid0 -L _btrfs_ro_seed output.img
btrfstune -S1 output.img
losetup /dev/loop0 output.img
losetup /dev/loop1 output.img.rw
mount /dev/loop0 ro
btrfs device add /dev/loop1 ro
# !!! BUG !!!
mount /dev/loop1 rw
echo there > ro/hi
# FAILS: Read-only filesystem
echo there > rw/hi
# SEEMS TO WORK ...
-

# btrfs filesystem show
-
Label: '_btrfs_ro_seed'  uuid: 1f29a49c-e437-4329-ac81-d50adea03688
	Total devices 1 FS bytes used 97.29MB
	devid    1 size 256.00MB used 169.96MB path /dev/loop0

Label: '_btrfs_ro_seed'  uuid: 5824e2ce-6008-4d6d-8e1a-a5b6621214ac
	Total devices 2 FS bytes used 97.29MB
	devid    2 size 667.57MB used 82.75MB path /dev/loop1
	*** Some devices missing

Btrfs v0.19-50-ge6bd18d-dirty
-

START OOPS
-
[ 2808.826972] device label _btrfs_ro_seed devid 1 transid 13 /dev/loop0
[ 2808.832205] device label _btrfs_ro_seed devid 1 transid 13 /dev/loop0
[ 2808.834058] btrfs: disk space caching is enabled
[ 2808.906916] ------------[ cut here ]------------
[ 2808.907673] WARNING: at fs/btrfs/extent-tree.c:5790 btrfs_alloc_free_block+0x1ec/0x330 [btrfs]()
[ 2808.908981] Hardware name: Bochs
[ 2808.909604] Modules linked in: btrfs zlib_deflate crc32c libcrc32c
loop ipv6 ext2 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer
snd uhci_hcd i2c_piix4 psmouse floppy ehci_hcd processor soundcore
snd_page_alloc pcspkr evdev button usbcore serio_raw sg i2c_core ext4
mbcache jbd2 crc16 sr_mod cdrom pata_acpi ata_piix libata scsi_mod
virtio_rng virtio_pci virtio_net virtio_console virtio_blk
virtio_balloon virtio_ring virtio
[ 2808.920256] Pid: 2695, comm: btrfs Tainted: G  D W   2.6.39-ARCH #1
[ 2808.921097] Call Trace:
[ 2808.921724]  [810577da] warn_slowpath_common+0x7a/0xb0
[ 2808.922532]  [81057825] warn_slowpath_null+0x15/0x20
[ 2808.923371]  [a036c74c] btrfs_alloc_free_block+0x1ec/0x330 [btrfs]
[ 2808.924257]  [a039f337] ? read_extent_buffer+0xd7/0x1e0 [btrfs]
[ 2808.925134]  [a0358d19] __btrfs_cow_block+0x159/0x890 [btrfs]
[ 2808.925992]  [a0359563] btrfs_cow_block+0x113/0x350 [btrfs]
[ 2808.926863]  [a035ef82] btrfs_search_slot+0x1d2/0x9f0 [btrfs]
[ 2808.927736]  [a039a202] ? free_extent_state+0x32/0x50 [btrfs]
[ 2808.928589]  [a0360930] btrfs_insert_empty_items+0x70/0xd0 [btrfs]
[ 2808.929455]  [a03609f0] btrfs_insert_item+0x60/0xe0 [btrfs]
[ 2808.930323]  [a036e40c] btrfs_make_block_group+0x25c/0x2d0 [btrfs]
[ 2808.931204]  [a03a41c4] __btrfs_alloc_chunk+0x524/0x970 [btrfs]
[ 2808.932083]  [a03a4874] init_first_rw_device+0x74/0x130 [btrfs]
[ 2808.932945]  [813c95a9] ? __mutex_lock_slowpath+0x239/0x320
[ 2808.933805]  [a03a6512] btrfs_init_new_device+0x642/0xcd0 [btrfs]
[ 2808.934669]  [a03ac128] ? btrfs_ioctl+0x7c8/0xa20 [btrfs]
[ 2808.935519]  [a03ac14a] btrfs_ioctl+0x7ea/0xa20 [btrfs]
[ 2808.936328]  [81174d8e] ? __blkdev_put+0x1ee/0x200
[ 2808.937151]  [8115264e] do_vfs_ioctl+0x8e/0x500
[ 2808.937938]  [8115ec2a] ? mntput+0x1a/0x30
[ 2808.938710]  [81142927] ? fput+0x167/0x210
[ 2808.939458]  [81152b51] sys_ioctl+0x91/0xa0
[ 2808.940236]  [813cb312] 

Re: filesystem seeding ... BUGs on .38, .39, loopback, real devices, tmp branch ... everything

2011-06-02 Thread Geoff Ritter
On Thu, 2011-06-02 at 04:20 -0500, C Anthony Risinger wrote:
 
 i tried with loop devices at first, then real devices -- this is all
 under KVM/QEMU, and with FSs that are/will be smaller than 1G.

I have tried the seed option as well.  I was able to successfully mount
the read write partition after setting up the seed.  However, both had
to be independent partitions on a real device.

During testing, both .38 and .39rc could NOT create a seed if one or
both partitions were encrypted.  I believe encrypted partitions also
work with a loop device for the unlocked version you write to.  The
response I got after a few days is as follows:

 Chris Mason chris.ma...@oracle.com
 cwillu cwi...@cwillu.com
 date  Thu, May 5, 2011 at 4:42 PM
 Ok, looks like I busted the seed support
 when I fixed up some of the chunk
 allocations.  I'll reproduce this and
 work out a fix.

I just assumed it would take a while to fix so I haven't tried again
since.  If the root of the problem appears to be loop devices, you might
want to report that.  Err I guess you did.  To me, this doesn't explain
why it wouldn't work in a Virtual Machine.  I would have thought the VM
would treat it as a real device.





[PATCH v6 11/20] evm: add evm_inode_post_init call in btrfs

2011-06-02 Thread Mimi Zohar
After creating the initial LSM security extended attribute, call
evm_inode_post_init_security() to create the 'security.evm'
extended attribute.

Signed-off-by: Mimi Zohar zo...@us.ibm.com
---
 fs/btrfs/xattr.c |   39 +--
 1 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/xattr.c b/fs/btrfs/xattr.c
index f3107e4..8e9afcb2 100644
--- a/fs/btrfs/xattr.c
+++ b/fs/btrfs/xattr.c
@@ -22,6 +22,7 @@
 #include <linux/rwsem.h>
 #include <linux/xattr.h>
 #include <linux/security.h>
+#include <linux/evm.h>
 #include "ctree.h"
 #include "btrfs_inode.h"
 #include "transaction.h"
@@ -367,31 +368,49 @@ int btrfs_xattr_security_init(struct btrfs_trans_handle *trans,
 			      const struct qstr *qstr)
 {
 	int err;
-	size_t len;
-	void *value;
-	char *suffix;
+	struct xattr lsm_xattr;
+	struct xattr evm_xattr;
 	char *name;
 
-	err = security_inode_init_security(inode, dir, qstr, &suffix, &value,
-					   &len);
+	err = security_inode_init_security(inode, dir, qstr, &lsm_xattr.name,
+					   &lsm_xattr.value,
+					   &lsm_xattr.value_len);
 	if (err) {
 		if (err == -EOPNOTSUPP)
 			return 0;
 		return err;
 	}
 
-	name = kmalloc(XATTR_SECURITY_PREFIX_LEN + strlen(suffix) + 1,
+	name = kmalloc(XATTR_SECURITY_PREFIX_LEN + strlen(lsm_xattr.name) + 1,
 		       GFP_NOFS);
 	if (!name) {
 		err = -ENOMEM;
 	} else {
 		strcpy(name, XATTR_SECURITY_PREFIX);
-		strcpy(name + XATTR_SECURITY_PREFIX_LEN, suffix);
-		err = __btrfs_setxattr(trans, inode, name, value, len, 0);
+		strcpy(name + XATTR_SECURITY_PREFIX_LEN, lsm_xattr.name);
+		err = __btrfs_setxattr(trans, inode, name, lsm_xattr.value,
+				       lsm_xattr.value_len, 0);
 		kfree(name);
 	}
+	if (err)
+		goto out;
+
+	err = evm_inode_post_init_security(inode, &lsm_xattr, &evm_xattr);
+	if (err)
+		goto out;
 
-	kfree(suffix);
-	kfree(value);
+	name = kasprintf(GFP_NOFS, "%s%s", XATTR_SECURITY_PREFIX,
+			 evm_xattr.name);
+	if (!name)
+		err = -ENOMEM;
+	else {
+		err = __btrfs_setxattr(trans, inode, name, evm_xattr.value,
+				       evm_xattr.value_len, 0);
+		kfree(name);
+	}
+	kfree(evm_xattr.value);
+out:
+	kfree(lsm_xattr.name);
+	kfree(lsm_xattr.value);
 	return err;
 }
-- 
1.7.3.4



[PATCH][RESEND] btrfs: separate superblock items out of fs_info

2011-06-02 Thread David Sterba
fs_info is now ~9kb, more than fits into one page. This will cause
mount failure when memory is too fragmented. Top space consumers are
super block structures super_copy and super_for_commit, ~2.8kb each.
Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64)

Add a wrapper for freeing fs_info.

Signed-off-by: David Sterba dste...@suse.cz
---
based on top of 3.0-rc1
pushed to git://repo.or.cz/linux-2.6/btrfs-unstable.git #cleanups
passed xfstests

 fs/btrfs/compression.c |3 ++-
 fs/btrfs/ctree.h   |   12 ++--
 fs/btrfs/disk-io.c |   21 ++---
 fs/btrfs/extent-tree.c |   18 +-
 fs/btrfs/file-item.c   |   17 ++---
 fs/btrfs/inode.c   |2 +-
 fs/btrfs/ioctl.c   |6 +++---
 fs/btrfs/scrub.c   |2 +-
 fs/btrfs/super.c   |   20 ++--
 fs/btrfs/sysfs.c   |6 +++---
 fs/btrfs/transaction.c |   10 +-
 fs/btrfs/tree-log.c|4 ++--
 fs/btrfs/volumes.c |   24 
 13 files changed, 78 insertions(+), 67 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index bfe42b0..2c7aa1a 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -85,7 +85,8 @@ struct compressed_bio {
 static inline int compressed_bio_size(struct btrfs_root *root,
 				      unsigned long disk_size)
 {
-	u16 csum_size = btrfs_super_csum_size(&root->fs_info->super_copy);
+	u16 csum_size = btrfs_super_csum_size(root->fs_info->super_copy);
+
 	return sizeof(struct compressed_bio) +
 		((disk_size + root->sectorsize - 1) / root->sectorsize) *
 		csum_size;
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 6c093fa..26b27da 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -941,8 +941,8 @@ struct btrfs_fs_info {
 	wait_queue_head_t transaction_blocked_wait;
 	wait_queue_head_t async_submit_wait;
 
-	struct btrfs_super_block super_copy;
-	struct btrfs_super_block super_for_commit;
+	struct btrfs_super_block *super_copy;
+	struct btrfs_super_block *super_for_commit;
 	struct block_device *__bdev;
 	struct super_block *sb;
 	struct inode *btree_inode;
@@ -2350,6 +2350,14 @@ int btrfs_drop_subtree(struct btrfs_trans_handle *trans,
 			struct btrfs_root *root,
 			struct extent_buffer *node,
 			struct extent_buffer *parent);
+
+static inline void free_fs_info(struct btrfs_fs_info *fs_info)
+{
+	kfree(fs_info->super_copy);
+	kfree(fs_info->super_for_commit);
+	kfree(fs_info);
+}
+
 /* root-item.c */
 int btrfs_find_root_ref(struct btrfs_root *tree_root,
 			struct btrfs_path *path,
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 98b6a71..b6f8aeb 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -211,8 +211,7 @@ void btrfs_csum_final(u32 crc, char *result)
 static int csum_tree_block(struct btrfs_root *root, struct extent_buffer *buf,
 			   int verify)
 {
-	u16 csum_size =
-		btrfs_super_csum_size(&root->fs_info->super_copy);
+	u16 csum_size = btrfs_super_csum_size(root->fs_info->super_copy);
 	char *result = NULL;
 	unsigned long len;
 	unsigned long cur_len;
@@ -1737,14 +1736,14 @@ struct btrfs_root *open_ctree(struct super_block *sb,
 		goto fail_alloc;
 	}
 
-	memcpy(&fs_info->super_copy, bh->b_data, sizeof(fs_info->super_copy));
-	memcpy(&fs_info->super_for_commit, &fs_info->super_copy,
-	       sizeof(fs_info->super_for_commit));
+	memcpy(fs_info->super_copy, bh->b_data, sizeof(*fs_info->super_copy));
+	memcpy(fs_info->super_for_commit, fs_info->super_copy,
+	       sizeof(*fs_info->super_for_commit));
 	brelse(bh);
 
-	memcpy(fs_info->fsid, fs_info->super_copy.fsid, BTRFS_FSID_SIZE);
+	memcpy(fs_info->fsid, fs_info->super_copy->fsid, BTRFS_FSID_SIZE);
 
-	disk_super = &fs_info->super_copy;
+	disk_super = fs_info->super_copy;
 	if (!btrfs_super_root(disk_super))
 		goto fail_alloc;
 
@@ -2132,10 +2131,10 @@ fail_srcu:
 fail:
 	kfree(extent_root);
 	kfree(tree_root);
-	kfree(fs_info);
 	kfree(chunk_root);
 	kfree(dev_root);
 	kfree(csum_root);
+	free_fs_info(fs_info);
 	return ERR_PTR(err);
 }
 
@@ -2304,10 +2303,10 @@ int write_all_supers(struct btrfs_root *root, int max_mirrors)
 	int total_errors = 0;
 	u64 flags;
 
-	max_errors = btrfs_super_num_devices(&root->fs_info->super_copy) - 1;
+	max_errors = btrfs_super_num_devices(root->fs_info->super_copy) - 1;
 	do_barriers = !btrfs_test_opt(root, NOBARRIER);
 
-	sb = &root->fs_info->super_for_commit;
+	sb = root->fs_info->super_for_commit;
 	dev_item = &sb->dev_item;
 
 	mutex_lock(&root->fs_info->fs_devices->device_list_mutex);
@@ -2596,7 

Re: filesystem seeding ... BUGs on .38, .39, loopback, real devices, tmp branch ... everything

2011-06-02 Thread C Anthony Risinger
On Thu, Jun 2, 2011 at 6:40 AM, Geoff Ritter geoff.rit...@gmail.com wrote:
 On Thu, 2011-06-02 at 04:20 -0500, C Anthony Risinger wrote:

 i tried with loop devices at first, then real devices -- this is all
 under KVM/QEMU, and with FSs that are/will be smaller than 1G.

 I have tried the seed option as well.  I was able to successfully mount
 the read write partition after setting up the seed.  However, both had
 to be independent partitions on a real device.

 During testing, both .38 and .39rc could NOT create a seed if one or
 both partitions were encrypted.  I believe encrypted partitions also
 work with a loop device for the unlocked version you write to.  The
 response I got after a few days is as follows:

 Chris Mason chris.ma...@oracle.com
 cwillu cwi...@cwillu.com
 date  Thu, May 5, 2011 at 4:42 PM
 Ok, looks like I busted the seed support
 when I fixed up some of the chunk
 allocations.  I'll reproduce this and
 work out a fix.

 I just assumed it would take a while to fix so I haven't tried again
 since.  If the root of the problem appears to be loop devices, you might
 want to report that.  Err I guess you did.  To me, this doesn't explain
 why it wouldn't work in a Virtual Machine.  I would have thought the VM
 would treat it as a real device.

yeah ... i wasn't sure if this was the same exact problem you had or
what, i can't find much info at all about anyone using seed support.

i tried loop devices on my real machine too (.38), and because of
continuous oops/locks i moved to a VM so i didn't hose my system.
however, when i tried with real devices in the VM, these were not
loopbacks, they were just regular raw files used as backing for QEMU
(though i don't know if it internally uses loopback) ... they were
exposed as virtio devices /dev/vdb and /dev/vdc.  i got the exact same
results using those devices, using btrfs-vol instead of btrfs, and a
whole slew of other trial and error that all led to the same issue.

the 10 lines or so i provided earlier reproduce consistently for me
... in the end, it *seemed* to work, but still :-)

what i REALLY want though, is simply more information on how seeding
works and should be used ... the wiki et al seem to imply that i can
reuse the seed device for MULTIPLE filesystems ... how can i do this?
i tried adding the device to an existing array but i couldn't see any
files ... can anyone shed some light on this feature?

thanks much,

C Anthony


Re: Having parent transid verify failed

2011-06-02 Thread Johannes Hirte
On Thursday 05 May 2011 22:32:42 Chris Mason wrote:
 Excerpts from Konstantinos Skarlatos's message of 2011-05-05 16:27:54 -0400:
  I think i made some progress. When i tried to remove the directory that
  i suspect contains the problematic file, i got this on the console
  
  rm -rf serverloft/
 
 Ok, our one bad block is in the extent allocation tree.  This is going
 to be the very hardest thing to fix.
 
 Until I finish off the code to rebuild parts of the extent allocation
 tree, I think your best bet is to copy the files off.
 
 The big question is, what happened to make this error?  Can you describe
 your setup in more detail?
 
 -chris

It seems that I ran into the same problem:

parent transid verify failed on 32940560384 wanted 210334 found 210342
BUG: scheduling while atomic: chrome/17058/0x0002
Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid snd_intel8x0 
snd_ac97_codec sr_mod cdrom ac97_bus snd_pcm sg snd_timer snd e1000 fschmd 
uhci_hcd snd_page_alloc i2c_i801 [last unloaded: microcode]
Pid: 17058, comm: chrome Tainted: GW   2.6.39 #29
Call Trace:
 [c13cf70c] ? schedule+0x78/0x6ef
 [c11acabb] ? generic_make_request+0x1d5/0x22f
 [c11acbad] ? submit_bio+0x98/0x9f
 [c118026a] ? btrfs_map_bio+0x1ab/0x1b5
 [c13cfdc2] ? io_schedule+0x3f/0x50
 [c105723d] ? sleep_on_page+0x5/0x8
 [c13d0292] ? __wait_on_bit+0x31/0x58
 [c1057238] ? __lock_page+0x52/0x52
 [c1057388] ? wait_on_page_bit+0x5a/0x62
 [c1037f92] ? autoremove_wake_function+0x29/0x29
 [c117ab39] ? read_extent_buffer_pages+0x33a/0x3b5
 [c115891f] ? btree_read_extent_buffer_pages.clone.51+0x44/0x9e
 [c11578b0] ? verify_parent_transid+0x147/0x147
 [c11593aa] ? read_tree_block+0x2d/0x3e
 [c1144f90] ? read_block_for_search.clone.36+0xc3/0x35d
 [c11863bf] ? btrfs_tree_unlock+0x19/0x3a
 [c11420bb] ? unlock_up+0x88/0x9f
 [c1146f7e] ? btrfs_search_slot+0x39d/0x4fe
 [c1149fa1] ? lookup_inline_extent_backref+0x116/0x49b
 [c11773b0] ? set_extent_dirty+0x19/0x1d
 [c114cbd0] ? __btrfs_free_extent+0xe2/0x6c6
 [c114fa28] ? run_clustered_refs+0x6ad/0x720
 [c1191330] ? btrfs_find_ref_cluster+0x53/0x11f
 [c114fb53] ? btrfs_run_delayed_refs+0xb8/0x18d
 [c115d395] ? __btrfs_end_transaction+0x5a/0x17f
 [c115d4dc] ? btrfs_end_transaction+0x9/0xb
 [c1165e19] ? btrfs_evict_inode+0x190/0x1a7
 [c1092c45] ? evict+0x56/0xeb
 [c108baa8] ? do_unlinkat+0xc3/0x103
 [c13d1c90] ? sysenter_do_call+0x12/0x26
 [c13d] ? console_conditional_schedule+0x8/0xf
parent transid verify failed on 32940560384 wanted 210334 found 210342
BUG: scheduling while atomic: chrome/17058/0x0002
Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid snd_intel8x0 
snd_ac97_codec sr_mod cdrom ac97_bus snd_pcm sg snd_timer snd e1000 fschmd 
uhci_hcd snd_page_alloc i2c_i801 [last unloaded: microcode]
Pid: 17058, comm: chrome Tainted: GW   2.6.39 #29
Call Trace:
 [c13cf70c] ? schedule+0x78/0x6ef
 [c11acabb] ? generic_make_request+0x1d5/0x22f
 [c11acbad] ? submit_bio+0x98/0x9f
 [c118026a] ? btrfs_map_bio+0x1ab/0x1b5
 [c13cfdc2] ? io_schedule+0x3f/0x50
 [c105723d] ? sleep_on_page+0x5/0x8
 [c13d0292] ? __wait_on_bit+0x31/0x58
 [c1057238] ? __lock_page+0x52/0x52
 [c1057388] ? wait_on_page_bit+0x5a/0x62
 [c1037f92] ? autoremove_wake_function+0x29/0x29
 [c117ab39] ? read_extent_buffer_pages+0x33a/0x3b5
 [c116bd50] ? lookup_extent_mapping+0x5a/0x148
 [c115891f] ? btree_read_extent_buffer_pages.clone.51+0x44/0x9e
 [c11578b0] ? verify_parent_transid+0x147/0x147
 [c11593aa] ? read_tree_block+0x2d/0x3e
 [c1144f90] ? read_block_for_search.clone.36+0xc3/0x35d
 [c11863bf] ? btrfs_tree_unlock+0x19/0x3a
 [c11420bb] ? unlock_up+0x88/0x9f
 [c1146f7e] ? btrfs_search_slot+0x39d/0x4fe
 [c1149fa1] ? lookup_inline_extent_backref+0x116/0x49b
 [c11773b0] ? set_extent_dirty+0x19/0x1d
 [c114cbd0] ? __btrfs_free_extent+0xe2/0x6c6
 [c114fa28] ? run_clustered_refs+0x6ad/0x720
 [c1191330] ? btrfs_find_ref_cluster+0x53/0x11f
 [c114fb53] ? btrfs_run_delayed_refs+0xb8/0x18d
 [c115d395] ? __btrfs_end_transaction+0x5a/0x17f
 [c115d4dc] ? btrfs_end_transaction+0x9/0xb
 [c1165e19] ? btrfs_evict_inode+0x190/0x1a7
 [c1092c45] ? evict+0x56/0xeb
 [c108baa8] ? do_unlinkat+0xc3/0x103
 [c13d1c90] ? sysenter_do_call+0x12/0x26
 [c13d] ? console_conditional_schedule+0x8/0xf
parent transid verify failed on 32940560384 wanted 210334 found 210342
BUG: scheduling while atomic: chrome/17058/0x0002
Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid snd_intel8x0 
snd_ac97_codec sr_mod cdrom ac97_bus snd_pcm sg snd_timer snd e1000 fschmd 
uhci_hcd snd_page_alloc i2c_i801 [last unloaded: microcode]
Pid: 17058, comm: chrome Tainted: GW   2.6.39 #29
Call Trace:
 [c13cf70c] ? schedule+0x78/0x6ef
 [c11acabb] ? generic_make_request+0x1d5/0x22f
 [c11acbad] ? submit_bio+0x98/0x9f
 [c118026a] ? 

Re: [PATCH][RESEND] btrfs: separate superblock items out of fs_info

2011-06-02 Thread Sergei Trofimovich
On Thu,  2 Jun 2011 18:13:22 +0200
David Sterba dste...@suse.cz wrote:

 fs_info is now ~9kb, more than fits into one page. This will cause
 mount failure when memory is too fragmented. Top space consumers are
 super block structures super_copy and super_for_commit, ~2.8kb each.
 Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64)
 
 Add a wrapper for freeing fs_info.
 
 Signed-off-by: David Sterba dste...@suse.cz
 ---
...
 @@ -801,6 +800,15 @@ static struct dentry *btrfs_mount(struct file_system_type *fs_type, int flags,
   fs_info->fs_devices = fs_devices;
   tree_root->fs_info = fs_info;
  
 + fs_info->super_copy = kzalloc(sizeof(struct btrfs_super_block),
 + GFP_NOFS);
 + fs_info->super_for_commit = kzalloc(sizeof(struct btrfs_super_block),
 + GFP_NOFS);

In light of 
http://thread.gmane.org/gmane.comp.file-systems.btrfs/11004/focus=11038
what do you think of allocating BTRFS_SUPER_INFO_SIZE instead of sizeof?

 + if (!fs_info->super_copy || !fs_info->super_for_commit) {
 +         error = -ENOMEM;
 +         goto error_close_devices;
 + }
 +
   bdev = fs_devices->latest_bdev;
   s = sget(fs_type, btrfs_test_super, btrfs_set_super, tree_root);
   if (IS_ERR(s))
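
A minimal userspace sketch of the suggestion above, not the actual patch: both copies are allocated at the fixed on-disk superblock size rather than sizeof(struct). The names SUPER_INFO_SIZE, super_block_stub, fs_info_stub, alloc_super_copies and free_super_copies are hypothetical stand-ins; SUPER_INFO_SIZE is assumed to mirror BTRFS_SUPER_INFO_SIZE (4096), and calloc() stands in for kzalloc(..., GFP_NOFS).

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: fixed on-disk superblock size, mirroring BTRFS_SUPER_INFO_SIZE. */
#define SUPER_INFO_SIZE 4096

/* Hypothetical stand-in for struct btrfs_super_block; the real layout differs. */
struct super_block_stub {
	unsigned char csum[32];
	unsigned char fsid[16];
	unsigned long long bytenr;
};

/* Hypothetical stand-in for the two fs_info members touched by the patch. */
struct fs_info_stub {
	struct super_block_stub *super_copy;
	struct super_block_stub *super_for_commit;
};

/* Free both copies; free(NULL) is a no-op, so a partial allocation is fine. */
static void free_super_copies(struct fs_info_stub *fs_info)
{
	free(fs_info->super_copy);
	free(fs_info->super_for_commit);
	fs_info->super_copy = NULL;
	fs_info->super_for_commit = NULL;
}

/* Allocate zeroed, full-size buffers. Returns 0 on success, -1 on failure. */
static int alloc_super_copies(struct fs_info_stub *fs_info)
{
	/* The structure must fit inside the buffer we hand out for it. */
	assert(sizeof(struct super_block_stub) <= SUPER_INFO_SIZE);

	fs_info->super_copy = calloc(1, SUPER_INFO_SIZE);
	fs_info->super_for_commit = calloc(1, SUPER_INFO_SIZE);
	if (!fs_info->super_copy || !fs_info->super_for_commit) {
		free_super_copies(fs_info);
		return -1;
	}
	return 0;
}
```

With buffers of the full on-disk size, code that reads or checksums a whole superblock block would stay inside the allocation even if the structure itself is smaller.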

-- 

  Sergei




Re: [BUG 3.0-rc1] oops during file removal, severe lock contention

2011-06-02 Thread Andi Kleen
Dave Chinner da...@fromorbit.com writes:

 Also, there is massive lock contention while running these workloads.
 perf top shows this for the create after about 5m inodes have been
 created:

We saw pretty much the same thing in some simple tests on large systems
(extent io tree locking and higher-level b*tree locks are a problem).
It is being looked at, I believe.

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


btrfs w/ckd

2011-06-02 Thread kathleen.hodge
Hello,
I'm trying to use mod3 ckds which already have RAID10 protection. (most of the 
doc I'm looking at uses fba instead of ckd, so I didn't know if this was a 
limitation)
I'm addressing the head device and able to use these devices with no problem as 
ext3.  I've been reading that cache may remember that it used to be ext3, so I 
detached and reattached, but get the same result when trying to mount after 
formatting.

I use the following to make the file system after I've attached, varied online, 
formatted and partitioned (also tried without the journaling switch).
mke2fs -j -t btrfs /dev/dasdbj1

Then the mount command:
mount -t btrfs /dev/dasdbj1 /home/sdasd/d01
mount: /dev/dasdbj1: can't read superblock

Can you please give me some direction?

Thanks,
  Kathy Hodge

   EMC²
Mainframe Software QA
Phone:  1-508-293-6121 x76121
Cell:  1-303 -887-9292
Email:kathleen.ho...@emc.com





Re: btrfs w/ckd

2011-06-02 Thread Hugo Mills
On Thu, Jun 02, 2011 at 03:45:07PM -0400, kathleen.ho...@emc.com wrote:
 Hello,
 I'm trying to use mod3 ckds which are already RAID10 protection. (most of the 
 doc I'm looking at uses fba instead of ckd, so I didn't know if this was a 
 limitation)
 I'm addressing the head device and able to use these devices with no problem 
 as ext3.  I've been reading that cache may remember that it used to be ext3, 
 so I detached and reattached, but get the same result when trying to mount 
 after formatting.
 
 I use the following to make the file system after I've attached, varied 
 online, formated and partitioned (also tried without the journaling switch).
 mke2fs -j -t btrfs /dev/dasdbj1

   Quoting the man page for that command:

mke2fs  is used to create an ext2, ext3, or ext4 filesystem, [...]

The -t option specifies which of ext2, ext3, and ext4 to create.  I'm
surprised it did anything at all given a type of btrfs...

   You probably wanted mkfs.btrfs instead:

# mkfs.btrfs -d single -m single /dev/dasdbj1

 Then the mount command:
 mount -t btrfs /dev/dasdbj1 /home/sdasd/d01
 mount: /dev/dasdbj1: can't read superblock

   If it created an ext2/3/4 filesystem, I'm not entirely surprised
that it can't find a btrfs superblock there. :)
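
A small sketch of how one could check whether a device or image actually carries a btrfs superblock. It assumes the primary superblock sits at byte offset 64 KiB and that the 8-byte magic "_BHRfS_M" starts 0x40 bytes into it; the function name has_btrfs_magic and the macro names are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SB_OFFSET    0x10000L    /* assumed: primary superblock at 64 KiB */
#define MAGIC_OFFSET 0x40L       /* assumed: magic field inside superblock */
#define MAGIC        "_BHRfS_M"  /* btrfs superblock magic */
#define MAGIC_LEN    8

/*
 * Returns 1 if the btrfs magic is present, 0 if it is not (including
 * when the file is too short to hold it), and -1 on open or I/O error.
 */
static int has_btrfs_magic(const char *path)
{
	unsigned char buf[MAGIC_LEN];
	FILE *f;
	int ret = -1;

	assert(path != NULL);

	f = fopen(path, "rb");
	if (!f)
		return -1;

	if (fseek(f, SB_OFFSET + MAGIC_OFFSET, SEEK_SET) == 0) {
		size_t n = fread(buf, 1, MAGIC_LEN, f);

		if (n == MAGIC_LEN)
			ret = memcmp(buf, MAGIC, MAGIC_LEN) == 0;
		else if (!ferror(f))
			ret = 0;	/* short read: nothing stored there */
	}

	fclose(f);
	return ret;
}
```

Called on a partition formatted with mke2fs, this would be expected to return 0, which matches the "can't read superblock" message from mount -t btrfs.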

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- We teach people management skills by examining characters in ---   
Shakespeare.  You could look at Claudius's crisis
   management techniques, for example.   




Re: [PATCH 0/9] some fixes for bugs spotted by valgrind

2011-06-02 Thread Sergei Trofimovich
On Thu, 02 Jun 2011 13:17:55 -0700
Andi Kleen a...@firstfloor.org wrote:

 Sergei Trofimovich sly...@gentoo.org writes:
 
  Am I too paranoid about the issue?
 
 It sounds weird, because if the kernel would really checksum
 mutexes on disk you would have a lot of on disk
 format incompatibility between different kernel versions
 (e.g. between lockdep and normal kernels or kernels
 running on different architectures)

 If it would really happen (no opinion on that) it would
 be a serious bug.

Oh, I don't think things are so bad.

For that to be a problem, the superblock would have to be loaded
exactly the same way as it's stored, but it isn't.
At least the super copies (baked into btrfs_fs_info) are read
into a separate data block (buffer_head) and then copied properly
(in open_ctree) to super_copy/super_for_commit:

    bh = btrfs_read_dev_super(fs_devices->latest_bdev);
    if (!bh) {
            err = -EINVAL;
            goto fail_alloc;
    }

    memcpy(&fs_info->super_copy, bh->b_data, sizeof(fs_info->super_copy));
    memcpy(&fs_info->super_for_commit, &fs_info->super_copy,
           sizeof(fs_info->super_for_commit));
    brelse(bh);

But the way superblocks are written looks racy.
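
A minimal userspace sketch of the load path described above, assuming the superblock copies are embedded members of the info structure. All names (super_stub, fs_info_stub, load_super, lock_stub) are hypothetical stand-ins; the raw block plays the role of bh->b_data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the on-disk superblock structure. */
struct super_stub {
	unsigned char magic[8];
	unsigned long long generation;
};

/* Hypothetical stand-in for btrfs_fs_info with embedded super copies. */
struct fs_info_stub {
	int lock_stub;			/* in-memory-only state (e.g. a mutex) */
	struct super_stub super_copy;
	struct super_stub super_for_commit;
};

/*
 * Copy a raw block that was read from disk into the embedded copies.
 * Only the superblock members are written; in-memory-only state such as
 * lock_stub is never overlaid by on-disk bytes.
 */
static void load_super(struct fs_info_stub *fs_info,
		       const unsigned char *block, size_t block_len)
{
	assert(block_len >= sizeof(fs_info->super_copy));

	memcpy(&fs_info->super_copy, block, sizeof(fs_info->super_copy));
	memcpy(&fs_info->super_for_commit, &fs_info->super_copy,
	       sizeof(fs_info->super_for_commit));
}
```

Note that sizeof(fs_info->super_copy) gives the structure size only while the member is embedded; if it were turned into a pointer, the same expression would yield the pointer size and the copy length would have to be spelled differently.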

-- 

  Sergei

