Re: Recover Corruption - verify_parent_transid

2010-08-09 Thread A. James Lewis
Not being one of the developers on this project, I cannot offer you a
solution to recovering data from this volume, and my guess is that a
ready solution is unlikely to be forthcoming, simply because if this were
possible then btrfsck would already include the code to recover the
filesystem.

However, rather than simply observing that anything not backed up is
lost, I thought I would offer the solution of the 4th dimension.
The data is not "lost", but simply unavailable to you until the tools to
repair the filesystem have been developed further.  If I had something
critical which had become inaccessible, I might be tempted to put the
drive on a shelf for 6 months and see if btrfsck was able to repair
filesystems by then.  It may be that the on disk format is changed
before that happens, but I understand that this is now relatively
unlikely.

A. James Lewis


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Recover Corruption - verify_parent_transid

2010-08-09 Thread Jason Switzer
I have a btrfs partition that is failing to mount and I was hoping I
could recover it somehow.
Mount returns immediately with a bad superblock error:

mount: wrong fs type, bad option, bad superblock on /dev/sdb1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

Looking in the syslog, we see the following errors:

[ 4328.614123] device fsid d4838bde94c38c2-ac2179bda6fc3b8b devid 1
transid 1133 /dev/sdb1
[ 4328.618807] parent transid verify failed on 117069225984 wanted
1133 found 812
[ 4328.619570] parent transid verify failed on 117069225984 wanted
1133 found 812
[ 4328.619928] parent transid verify failed on 117069225984 wanted
1133 found 812
[ 4328.621323] btrfs: open_ctree failed
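
For anyone wanting to compare the "wanted" and "found" generations across many such messages, here is a minimal sketch in Python. It assumes the message format shown in the log above; the function name parse_transid_failures is made up for this illustration and is not part of any btrfs tool. It only reads log text and does not touch the filesystem.

```python
import re

# Message format as it appears in the log above (an assumption for this
# sketch; other kernel versions may word the message differently).
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def parse_transid_failures(log_text):
    """Return a list of (block_address, wanted, found) integer tuples."""
    # Mail clients wrap long dmesg lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [tuple(int(g) for g in m.groups()) for m in TRANSID_RE.finditer(flat)]


if __name__ == "__main__":
    sample = (
        "[ 4328.618807] parent transid verify failed on 117069225984 wanted\n"
        "1133 found 812\n"
    )
    for address, wanted, found in parse_transid_failures(sample):
        print(f"block {address}: wanted {wanted}, found {found}, "
              f"gap {wanted - found}")
```

For the log above, every message refers to the same block, with the found generation 321 transactions behind the wanted one.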

These are the same errors that btrfsck returns; I learned in retrospect
that btrfsck will not fix errors. Here is what btrfs-show returns:

Label: none  uuid: c2384ce9-bd38-480d-8b3b-fca6bd7921ac
  Total devices 1 FS bytes used 82.52GB
  devid    1 size 1.82TB used 112.04GB path /dev/sdb1

Btrfs Btrfs v0.19

This system is running the following kernel:

$ uname -a
Linux erebus 2.6.32-gentoo-r7 #7 Sun May 30 21:33:03 CDT 2010 i686
Pentium III (Coppermine) GenuineIntel GNU/Linux

The system was busy copying files to the partition when it died due to
a power failure. The partition/device was not renamed between reboots.
The partition is a 2TB partition (big media drive), so if I have to
lose the data associated with the errors, I'd be okay with that
(rollback the transaction for that object or just remove the data
outright).

-Jason "s1n" Switzer


Re: Poor read performance on high-end server

2010-08-09 Thread Chris Mason
On Mon, Aug 09, 2010 at 04:45:45PM +0200, Freek Dijkstra wrote:
> Hi all,
> 
> Thanks a lot for the great feedback from before the weekend. Since one
> of my colleagues needed the machine, I could only do the tests today.
> 
> In short: just installing 2.6.35 did make some difference, but I was
> mostly impressed with the speedup gained by the hardware acceleration of
> the crc32c_intel module.
> 
> Here is some quick data.
> 
> Reference figures:
> 16* single disk (theoretical limit):     4092 MiByte/s
> fio data layer tests (achievable limit): 3250 MiByte/s
> ZFS performance:                         2505 MiByte/s
> 
> BtrFS figures:
> IOzone on 2.6.32:                         919 MiByte/s
> fio btrfs tests on 2.6.35:               1460 MiByte/s

Was this one with O_DIRECT?

> IOzone on 2.6.35 with crc32c:            1250 MiByte/s
> IOzone on 2.6.35 with crc32c_intel:      1629 MiByte/s
> IOzone on 2.6.35, using -o nodatasum:    1955 MiByte/s
> 
> For those who find this message and want a howto: the easiest way to use
> crc32c_intel is to add the module name to /etc/modules:
>  # echo "crc32c_intel" >> /etc/modules
>  # reboot
> 
> Now the next step for us is to tune the block sizes. We only did that
> preliminarily, but now that we have a good knowledge of what software to
> use, we can start tuning that in more detail.
> 
> If there is interest on this list, I'll gladly post our results here.

Definitely, please do.

-chris


Re: Intermittent no space errors

2010-08-09 Thread Simon Kirby
On Wed, Aug 04, 2010 at 07:21:00PM +0800, Yan, Zheng  wrote:

> > We're seeing this too, since upgrading from 2.6.33.2 + merged old git btrfs
> > unstable HEAD to plain 2.6.35.
> >
> > [sr...@backup01:.../.rmagic]# rm *
> > rm: cannot remove `WEEKLY_bar3d.png': No space left on device
> > rm: cannot remove `WEEKLY.html': No space left on device
> > rm: cannot remove `YEARLY_bar3d.png': No space left on device
> > rm: cannot remove `YEARLY.html': No space left on device
>...
> > Aug  3 18:44:44 backup01 kernel: [ cut here ]
> > Aug  3 18:44:44 backup01 kernel: WARNING: at fs/btrfs/extent-tree.c:3441 btrfs_block_rsv_check+0x151/0x180()
>...
> 
> This warning is because btrfs in 2.6.35 reserves more metadata space
> for internal use than older kernels. Your FS is too full, so btrfs
> can't reserve enough metadata space.

Hello!

Is it possible that 2.6.33.2 btrfs has mucked up the on-disk stuff in a
way that causes 2.6.35 to be unhappy?  The file system in question was
reported to be 85% full, according to "df".

In the meantime, we've been having some other problems on 2.6.35; for
example, rsync has been trying to append a block to a file for the past
5 days.  The file system is reported as 45% full:

[sr...@backup01:/root]# df -Pt btrfs /backup/bu000/vol05/
Filesystem 1024-blocks  Used Available Capacity Mounted on
/dev/mapper/bu000-vol05 3221225472 1429529324 1791696148  45% /backup/bu000/vol05
[sr...@backup01:/root]# btrfs files df /backup/bu000/vol05
Data: total=1.57TB, used=1.31TB
Metadata: total=15.51GB, used=10.48GB
System: total=12.00MB, used=192.00KB
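
To make the numbers above easier to compare, here is a small sketch in Python that turns this kind of output into a used fraction per chunk type. It assumes the line format shown above and that the unit suffixes are binary multiples; parse_fi_df and used_fraction are hypothetical helper names, not part of btrfs-progs.

```python
import re

# Assumption: the suffixes printed by the tool are binary multiples.
UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}

LINE_RE = re.compile(
    r"^(\w+): total=([\d.]+)(KB|MB|GB|TB), used=([\d.]+)(KB|MB|GB|TB)$"
)


def parse_fi_df(text):
    """Map chunk type -> (total_bytes, used_bytes) for each recognised line."""
    usage = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip prompts and anything not in the expected format
        kind, total, total_unit, used, used_unit = match.groups()
        usage[kind] = (float(total) * UNITS[total_unit],
                       float(used) * UNITS[used_unit])
    return usage


def used_fraction(total_bytes, used_bytes):
    """Fraction of the allocated chunk space that is actually in use."""
    return used_bytes / total_bytes


if __name__ == "__main__":
    sample = """Data: total=1.57TB, used=1.31TB
Metadata: total=15.51GB, used=10.48GB
System: total=12.00MB, used=192.00KB"""
    for kind, (total, used) in parse_fi_df(sample).items():
        print(f"{kind}: {100 * used_fraction(total, used):.1f}% of allocated space used")
```

With the figures above, the allocated metadata chunks are roughly two-thirds used and the data chunks a little over 80% used.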

At some point today, the kernel also spat this out:

BUG: soft lockup - CPU#3 stuck for 61s! [rsync:21903]
Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler aoe bnx2
CPU 3
Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler aoe bnx2

Pid: 21903, comm: rsync Tainted: GW   2.6.35-hw #2 0NK937/PowerEdge 1950
RIP: 0010:[]  [] iput+0x5d/0x70
RSP: 0018:8802c14abb48  EFLAGS: 0246
RAX:  RBX: 8802c14abb58 RCX: 0003
RDX:  RSI: 0002 RDI: 88007c075980
RBP: 8100a84e R08: 0001 R09: 8000
R10: 0002 R11:  R12: ff66
R13: 81af04e0 R14:  R15: 7fff
FS:  7fd13bbb06e0() GS:880001cc() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 02f5a108 CR3: 0001eb94a000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process rsync (pid: 21903, threadinfo 8802c14aa000, task 880080b04b00)
Stack:
88007c075888 88007c0757b0 8802c14abb98 812d7439
<0> 81664cde 0001 04d8 4000
<0> 88042a708178 88042a708000 8802c14abc08 812c599c
Call Trace:
[] ? btrfs_start_one_delalloc_inode+0x129/0x160
[] ? _raw_spin_lock+0xe/0x20
[] ? shrink_delalloc+0x8c/0x130
[] ? btrfs_delalloc_reserve_metadata+0x189/0x190
[] ? file_update_time+0x11e/0x180
[] ? btrfs_delalloc_reserve_space+0x43/0x60
[] ? btrfs_file_aio_write+0x508/0x970
[] ? apic_timer_interrupt+0xe/0x20
[] ? do_sync_write+0xd1/0x120
[] ? poll_select_copy_remaining+0xf7/0x140
[] ? vfs_write+0xcb/0x1a0
[] ? sys_write+0x50/0x90
[] ? system_call_fastpath+0x16/0x1b
Code: 00 01 00 00 48 c7 c2 a0 2c 10 81 48 8b 40 30 48 85 c0 74 12 48 8b 50 20 
48 c7 c0 a0 2c 10 81 48 85 d2 48 0
Call Trace:
[] ? btrfs_start_one_delalloc_inode+0x129/0x160
[] ? _raw_spin_lock+0xe/0x20
[] ? shrink_delalloc+0x8c/0x130
[] ? btrfs_delalloc_reserve_metadata+0x189/0x190
[] ? file_update_time+0x11e/0x180
[] ? btrfs_delalloc_reserve_space+0x43/0x60
[] ? btrfs_file_aio_write+0x508/0x970
[] ? apic_timer_interrupt+0xe/0x20
[] ? do_sync_write+0xd1/0x120
[] ? poll_select_copy_remaining+0xf7/0x140
[] ? vfs_write+0xcb/0x1a0
[] ? sys_write+0x50/0x90
[] ? system_call_fastpath+0x16/0x1b

[sr...@backup01:/root]# ls -l /proc/21903/fd/1
lrwx-- 1 root root 64 2010-08-09 18:21 /proc/21903/fd/1 -> /backup/bu000/vol05/vg005_web11_backup/2010-08-04-17-00/64/54/.../customer file.mov.aYX4Js
[sr...@backup01:/root]# ls -lL /proc/21903/fd/1
-rw--- 1 root root 977797120 2010-08-04 20:39 /proc/21903/fd/1
[sr...@backup01:/root]# ps auxw|grep rsync
root 21903 73.2  0.0  12912   192 ?        R    Aug04 5177:08 rsync -aHq --numeric-ids --exclude-from=/etc/backups/backup.exclude --delete --delete-excluded /storage/vg005/web11/64/54/ /backup/bu000/vol05/vg005_web11_backup/2010-08-04-17-00/64/54

In other words, the rsync has made no progress for 5 days (or at least
the mtime hasn't changed since then).

"perf top" still shows output like this, showing that btrfs is trying
to btrfs_find_space_cluster all of the time:

 samples  pcnt function   DSO
 ___ _ __

Code bug or data bug?

2010-08-09 Thread K. Richard Pixley

 I've just gotten:

r...@diamonds:~$ time sudo /sbin/btrfsck /dev/sda7
btrfsck: btrfsck.c:585: splice_shared_node: Assertion `!(src == &src_node->root_cache)' failed.

Aborted

Does this indicate a coding error in btrfsck or a data error in my file 
system?


--rich

r...@diamonds:~$ dpkg -l | grep btrfs
ii  btrfs-tools  0.19+20100601-3  Checksumming Copy on Write Filesystem utilit

r...@diamonds:~$ uname -a
Linux diamonds 2.6.32-24-generic-pae #38-Ubuntu SMP Mon Jul 5 10:54:21 UTC 2010 i686 GNU/Linux

r...@diamonds:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 10.04.1 LTS
Release:        10.04
Codename:       lucid



Re: Poor read performance on high-end server

2010-08-09 Thread Freek Dijkstra
Hi all,

Thanks a lot for the great feedback from before the weekend. Since one
of my colleagues needed the machine, I could only do the tests today.

In short: just installing 2.6.35 did make some difference, but I was
mostly impressed with the speedup gained by the hardware acceleration of
the crc32c_intel module.

Here is some quick data.

Reference figures:
16* single disk (theoretical limit):     4092 MiByte/s
fio data layer tests (achievable limit): 3250 MiByte/s
ZFS performance:                         2505 MiByte/s

BtrFS figures:
IOzone on 2.6.32:                         919 MiByte/s
fio btrfs tests on 2.6.35:               1460 MiByte/s
IOzone on 2.6.35 with crc32c:            1250 MiByte/s
IOzone on 2.6.35 with crc32c_intel:      1629 MiByte/s
IOzone on 2.6.35, using -o nodatasum:    1955 MiByte/s
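
To put these figures in proportion, the following Python sketch expresses each btrfs result as a percentage of the achievable limit and computes the speedup between two results. The numbers are copied from the table above; the helper names percent_of and speedup are made up for this illustration.

```python
# Figures copied from the table above, all in MiByte/s.
ACHIEVABLE_LIMIT = 3250  # fio data layer tests

BTRFS_RESULTS = {
    "IOzone on 2.6.32": 919,
    "fio btrfs tests on 2.6.35": 1460,
    "IOzone on 2.6.35 with crc32c": 1250,
    "IOzone on 2.6.35 with crc32c_intel": 1629,
    "IOzone on 2.6.35, using -o nodatasum": 1955,
}


def percent_of(value, reference):
    """Express value as a percentage of reference."""
    return 100.0 * value / reference


def speedup(new, old):
    """Ratio of the new throughput to the old throughput."""
    return new / old


if __name__ == "__main__":
    for name, rate in BTRFS_RESULTS.items():
        print(f"{name}: {percent_of(rate, ACHIEVABLE_LIMIT):.1f}% of achievable limit")
    generic = BTRFS_RESULTS["IOzone on 2.6.35 with crc32c"]
    intel = BTRFS_RESULTS["IOzone on 2.6.35 with crc32c_intel"]
    print(f"crc32c_intel vs crc32c: {speedup(intel, generic):.2f}x")
```

By this arithmetic, crc32c_intel is about 30% faster than the generic crc32c module in the IOzone test and reaches about half of the achievable limit.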

For those who find this message and want a howto: the easiest way to use
crc32c_intel is to add the module name to /etc/modules:
 # echo "crc32c_intel" >> /etc/modules
 # reboot

Now the next step for us is to tune the block sizes. We only did that
preliminarily, but now that we have a good knowledge of what software to
use, we can start tuning that in more detail.

If there is interest on this list, I'll gladly post our results here.


Jens Axboe wrote:

>>> Also, I didn't see Chris mention this, but if you have a newer intel box
>>> you can use hw accelerated crc32c instead. For some reason my test box
>>> always loads crc32c and not crc32c-intel, so I need to do that manually.
> 
> it is pretty annoying to have to do it manually. Sometimes
> you forget. And it's not possible to de-select CRC32C and have
> the intel variant loaded.

You can, but only if you first unmount the partition:

 # umount /mnt/mybtrfsdisk
 # rmmod btrfs
 # rmmod libcrc32c
 # rmmod crc32c
 # modprobe crc32c_intel
 # mount -t btrfs /dev/sda1 /mnt/mybtrfsdisk
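
To confirm which crc32c module ended up loaded after the steps above, one option is to look at /proc/modules. The following Python sketch does that. It assumes the usual /proc/modules layout with the module name as the first field on each line; the function names are made up for this illustration.

```python
def crc32c_modules(proc_modules_text):
    """Return names of loaded modules whose name contains 'crc32c'."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The module name is the first whitespace-separated field.
        if fields and "crc32c" in fields[0]:
            names.append(fields[0])
    return names


def read_loaded_crc32c_modules(path="/proc/modules"):
    """Read the kernel's module list from path and filter it."""
    with open(path) as handle:
        return crc32c_modules(handle.read())


if __name__ == "__main__":
    print(read_loaded_crc32c_modules())
```

If the list contains crc32c_intel, the hardware-accelerated variant is loaded; whether btrfs actually uses it depends on the kernel's own selection, which this sketch does not check.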




We encountered a small bug: the btrfs partition with RAID0 that was made
on 2.6.32 did not mount after a reboot or after unmounting. Running
btrfsck fixes this, but after the next umount, we had to run btrfsck
again. After recreating the btrfs partition on 2.6.35, all was well.
btrfs partitions that don't use (software) RAID work fine.

~# mount -t btrfs -o ssd /dev/sdd /mnt/ssd3
mount: wrong fs type, bad option, bad superblock on /dev/sdd,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

~# dmesg | tail
device fsid ec4d518ec61d4496-81e5aeda2d8ef7b5 devid 1 transid 69 /dev/sdd
btrfs: use ssd allocation scheme
btrfs: failed to read the system array on sdd
btrfs: open_ctree failed

~# btrfsck /dev/sdd
found 550511136768 bytes used err is 0
total csum bytes: 536870912
total tree bytes: 755322880
total fs tree bytes: 77824
btree space waste bytes: 169152328
file data blocks allocated: 549755813888
 referenced 549755813888
Btrfs Btrfs v0.19

~# mount -t btrfs -o ssd /dev/sdd /mnt/ssd3
[and it mounts fine now]


Regards,
Freek Dijkstra
SARA High Performance Computing and Networking


unable to handle kernel NULL pointer dereference

2010-08-09 Thread Smets, Jan (Jan)
Hi list

Today I was running bonnie++ on a network ceph volume. The storage backend used 
is btrfs.

You can find a full dmesg output on http://jan.sin.khk.be/dmesg

If any other action is required from my side, please let me know. I hope this 
report is of some use.

Thanks.
 - Jan



[ 2400.226987] Call Trace:
[ 2400.226987]  [] ? btrfs_file_aio_write+0x686/0x7da [btrfs]
[ 2400.226987]  [] ? __d_lookup+0xc1/0x107
[ 2400.226987]  [] ? btrfs_file_aio_write+0x0/0x7da [btrfs]
[ 2400.226987]  [] ? do_sync_readv_writev+0x9a/0xd5
[ 2400.226987]  [] ? do_filp_open+0x510/0x58e
[ 2400.226987]  [] ? copy_from_user+0x18/0x30
[ 2400.226987]  [] ? rw_copy_check_uvector+0x6a/0xe1
[ 2400.226987]  [] ? do_readv_writev+0xa4/0x118
[ 2400.226987]  [] ? finish_task_switch+0x34/0xb4
[ 2400.226987]  [] ? mutex_lock+0xd/0x33
[ 2400.226987]  [] ? sys_writev+0x45/0x90
[ 2400.226987]  [] ? system_call_fastpath+0x16/0x1b


[ 2400.966380] Call Trace:
[ 2400.966380]  [] ? btrfs_file_aio_write+0x36a/0x7da [btrfs]
[ 2400.966380]  [] ? btrfs_file_aio_write+0x36a/0x7da [btrfs]
[ 2400.966380]  [] ? set_extent_buffer_dirty+0x4d/0x5e [btrfs]
[ 2401.148493]  [] ? btrfs_run_delayed_iputs+0x35/0x104 [btrfs]
[ 2401.148493]  [] ? __btrfs_end_transaction+0x19a/0x1a8 [btrfs]
[ 2401.148493]  [] ? __d_lookup+0xc1/0x107
[ 2401.148493]  [] ? btrfs_file_aio_write+0x0/0x7da [btrfs]
[ 2401.148493]  [] ? do_sync_readv_writev+0x9a/0xd5
[ 2401.148493]  [] ? do_filp_open+0x510/0x58e
[ 2401.148493]  [] ? copy_from_user+0x18/0x30
[ 2401.148493]  [] ? rw_copy_check_uvector+0x6a/0xe1
[ 2401.148493]  [] ? do_readv_writev+0xa4/0x118
[ 2401.148493]  [] ? mutex_lock+0xd/0x33
[ 2401.148493]  [] ? sys_writev+0x45/0x90
[ 2401.148493]  [] ? system_call_fastpath+0x16/0x1b

[ 2404.104503] Call Trace:
[ 2404.104503]  [] ? kstrdup+0x2a/0x40
[ 2404.104503]  [] ? vfs_rename+0xa9/0x3e4
[ 2404.104503]  [] ? __lookup_hash+0xc2/0xe9
[ 2404.104503]  [] ? sys_renameat+0x1aa/0x22b
[ 2404.104503]  [] ? rcu_start_gp+0x1f7/0x218
[ 2404.104503]  [] ? fput+0x18f/0x1c4
[ 2404.104503]  [] ? mntput_no_expire+0x23/0xde
[ 2404.104503]  [] ? filp_close+0x5f/0x6a
[ 2404.104503]  [] ? system_call_fastpath+0x16/0x1b
[ 2404.104503] Code: 44 00 00 48 89 44 24 08 fa 66 0f 1f 44 00 00 65 4c 8b 04 
25 80 ea 00 00 48 8b 45 00 49 01 c0 49 8b 18 48 85 db 74 0d 48 63 45 18 <48> 8b 
04 03 49 89 00 eb 14 4c 89 f9 83 ca ff 44 89 f6 48 89 ef 
[ 2404.104503] RIP  [] __kmalloc_track_caller+0xc7/0x12d



[PATCH] Function btree_get_extent: Incorrect if-else if statement

2010-08-09 Thread André Nogueira
The btree_get_extent function (in file disk-io.c) calls
add_extent_mapping (in file extent_map.c). The add_extent_mapping
function can return two values: 0 or -EEXIST.

After the call, an if-else if statement is used. If the result is
-EEXIST, the if branch is executed. If the result is 0, the else if
branch is never executed, because its condition (ret) is false.
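
The reachability argument can be checked with a small model. The sketch below is a Python stand-in for the C control flow, not the kernel code; branch_taken is a hypothetical helper, and it assumes only the two return values named above.

```python
import errno

EEXIST = errno.EEXIST


def branch_taken(ret, use_plain_else):
    """Model `if (ret == -EEXIST) {...} else if (ret) {...}`.

    With use_plain_else=True the second branch is modelled as a plain
    `else {...}` instead of `else if (ret) {...}`.
    """
    if ret == -EEXIST:
        return "if"
    if use_plain_else or ret:  # a nonzero ret is true in C as well
        return "else"
    return "none"


if __name__ == "__main__":
    for ret in (0, -EEXIST):
        print(ret,
              "else-if form:", branch_taken(ret, use_plain_else=False),
              "| plain-else form:", branch_taken(ret, use_plain_else=True))
```

Under the stated assumption, the else-if form never enters its second branch, while the plain-else form enters it when the return value is 0.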

Thank you.

Signed-off-by: Andre Nogueira 

---
 fs/btrfs/disk-io.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 34f7c37..76eb161 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -164,7 +164,7 @@ static struct extent_map *btree_get_extent(struct inode *inode,
 						   failed_len);
 			ret = -EIO;
 		}
-	} else if (ret) {
+	} else {
 		free_extent_map(em);
 		em = NULL;
 	}
-- 
1.6.3.3