Re: [3.0-rc1] kernel BUG at fs/btrfs/relocation.c:4285!

2011-06-01 Thread liubo
On 05/31/2011 08:27 AM, Tsutomu Itoh wrote:
> The panic occurred when 'btrfs fi bal /test5' was executed.
> 
> /test5 is as follows:
> # mount -o space_cache,compress=lzo /dev/sdc3 /test5
> #
> # btrfs fi sh /dev/sdc3
> Label: none  uuid: 38ec48b2-a64b-4225-8cc6-5eb08024dc64
> Total devices 5 FS bytes used 7.87MB
> devid1 size 10.00GB used 2.02GB path /dev/sdc3
> devid2 size 15.01GB used 3.00GB path /dev/sdc5
> devid3 size 15.01GB used 3.00GB path /dev/sdc6
> devid4 size 20.01GB used 2.01GB path /dev/sdc7
> devid5 size 10.00GB used 2.01GB path /dev/sdc8
> 
> Btrfs v0.19-50-ge6bd18d
> # btrfs fi df /test5
> Data, RAID0: total=10.00GB, used=3.52MB
> Data: total=8.00MB, used=1.60MB
> System, RAID1: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=1.00GB, used=216.00KB
> Metadata: total=8.00MB, used=0.00
> 

Hi, Itoh san, 

I've come up with a patch aiming to fix this bug.
The problem is that the inode allocator stores one inode cache per root,
which is not good for the relocation tree at least, because we only allocate
new inode numbers from the fs tree or a file tree (subvol/snapshot).

I've tested with your run.sh and it works well on my box, so you can try this:

===
based on 3.0, commit d6c0cb379c5198487e4ac124728cbb2346d63b1f
===
diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index 0009705..ebc2a7b 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -372,6 +372,10 @@ int btrfs_save_ino_cache(struct btrfs_root *root,
 	int prealloc;
 	bool retry = false;
 
+	if (root->root_key.objectid != BTRFS_FS_TREE_OBJECTID &&
+	    root->root_key.objectid < BTRFS_FIRST_FREE_OBJECTID)
+		return 0;
+
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;



thanks,
liubo

> ---
> Tsutomu
> 
> 
> 
> <6>device fsid 25424ba6b248ec38-64dc2480b05ec68c devid 5 transid 4 /dev/sdc8
> <6>device fsid 25424ba6b248ec38-64dc2480b05ec68c devid 1 transid 7 /dev/sdc3
> <6>btrfs: enabling disk space caching
> <6>btrfs: use lzo compression
> <6>device fsid 69423c117ae771dd-c275f966f982cf84 devid 1 transid 7 /dev/sdd4
> <6>btrfs: disk space caching is enabled
> <6>btrfs: relocating block group 1103101952 flags 9
> <6>btrfs: found 318 extents
> <0>[ cut here ]
> <2>kernel BUG at fs/btrfs/relocation.c:4285!
> <0>invalid opcode:  [#1] SMP
> <4>CPU 1
> <4>Modules linked in: btrfs autofs4 sunrpc 8021q garp stp llc 
> cpufreq_ondemand acpi_cpufreq freq_table m
> perf ipv6 zlib_deflate libcrc32c ext3 jbd dm_mirror dm_region_hash dm_log 
> dm_mod kvm uinput ppdev parpor
> t_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 
> shpchp i3000_edac edac_core ex
> t4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi 
> ata_generic ata_piix floppy [last
> unloaded: btrfs]
> <4>Pid: 6173, comm: btrfs Not tainted 3.0.0-rc1btrfs-test #1 FUJITSU-SV  
> PRIMERGY/D2399
> <4>RIP: 0010:[]  [] 
> btrfs_reloc_cow_block+0x22c/0x270 [btrfs]
> <4>RSP: 0018:8801514236a8  EFLAGS: 00010246
> <4>RAX: 8801930dc000 RBX: 8801936f5800 RCX: 880163241d60
> <4>RDX: 88016325dd18 RSI: 8801931a3000 RDI: 8801632fb3e0
> <4>RBP: 880151423708 R08: 880151423784 R09: 0100
> <4>R10:  R11: 880163224d58 R12: 8801931a3000
> <4>R13: 88016325dd18 R14: 8801632fb3e0 R15: 
> <4>FS:  7f41577ce740() GS:88019fd0() 
> knlGS:
> <4>CS:  0010 DS:  ES:  CR0: 8005003b
> <4>CR2: 010afb80 CR3: 00015142e000 CR4: 06e0
> <4>DR0:  DR1:  DR2: 
> <4>DR3:  DR6: 0ff0 DR7: 0400
> <4>Process btrfs (pid: 6173, threadinfo 880151422000, task 
> 880151997580)
> <0>Stack:
> <4> 88016325dd18 8801632fb3e0 880151423708 a042b2ed
> <4>  0001 880151423708 8801931a3000
> <4> 880163241d60 88016325dd18 8801632fb3e0 
> <0>Call Trace:
> <4> [] ? update_ref_for_cow+0x22d/0x330 [btrfs]
> <4> [] __btrfs_cow_block+0x451/0x5e0 [btrfs]
> <4> [] btrfs_cow_block+0x10b/0x250 [btrfs]
> <4> [] btrfs_search_slot+0x557/0x870 [btrfs]
> <4> [] ? generic_bin_search+0x1f2/0x210 [btrfs]
> <4> [] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
> <4> [] btrfs_update_inode+0xc2/0x140 [btrfs]
> <4> [] btrfs_save_ino_cache+0x7c/0x200 [btrfs]
> <4> [] commit_fs_roots+0xad/0x180 [btrfs]
> <4> [] btrfs_commit_transaction+0x385/0x7d0 [btrfs]
> <4> [] ? wake_up_bit+0x40/0x40
> <4> [] prepare_to_relocate+0xdf/0xf0 [btrfs]
> <4> [] relocate_block_group+0x41/0x600 [btrfs]
> <4> [] ? mutex_lock+0x1e/0x50
> <4> [] ? btrfs_clean_old_snapshots+0xa9/0x150 [btrfs]
> <4> [] btrfs_relocate_block_group+0x1b3/0x2e0 [btrf

Re: [3.0-rc1] kernel BUG at fs/btrfs/relocation.c:4285!

2011-06-01 Thread liubo
On 06/01/2011 03:44 PM, liubo wrote:
> On 05/31/2011 08:27 AM, Tsutomu Itoh wrote:
>> The panic occurred when 'btrfs fi bal /test5' was executed.
>>
>> /test5 is as follows:
>> # mount -o space_cache,compress=lzo /dev/sdc3 /test5
>> #
>> # btrfs fi sh /dev/sdc3
>> Label: none  uuid: 38ec48b2-a64b-4225-8cc6-5eb08024dc64
>> Total devices 5 FS bytes used 7.87MB
>> devid1 size 10.00GB used 2.02GB path /dev/sdc3
>> devid2 size 15.01GB used 3.00GB path /dev/sdc5
>> devid3 size 15.01GB used 3.00GB path /dev/sdc6
>> devid4 size 20.01GB used 2.01GB path /dev/sdc7
>> devid5 size 10.00GB used 2.01GB path /dev/sdc8
>>
>> Btrfs v0.19-50-ge6bd18d
>> # btrfs fi df /test5
>> Data, RAID0: total=10.00GB, used=3.52MB
>> Data: total=8.00MB, used=1.60MB
>> System, RAID1: total=8.00MB, used=4.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, RAID1: total=1.00GB, used=216.00KB
>> Metadata: total=8.00MB, used=0.00
>>
> 
> Hi, Itoh san, 
> 
> I've come up with a patch aiming to fix this bug.
> The problem is that the inode allocator stores one inode cache per root,
> which is not good for the relocation tree at least, because we only allocate
> new inode numbers from the fs tree or a file tree (subvol/snapshot).
> 
> I've tested with your run.sh and it works well on my box, so you can try this:
> 

Sorry, I mixed up BTRFS_FIRST_FREE_OBJECTID and BTRFS_LAST_FREE_OBJECTID,
please ignore this patch.

> ===
> based on 3.0, commit d6c0cb379c5198487e4ac124728cbb2346d63b1f
> ===
> diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
> index 0009705..ebc2a7b 100644
> --- a/fs/btrfs/inode-map.c
> +++ b/fs/btrfs/inode-map.c
> @@ -372,6 +372,10 @@ int btrfs_save_ino_cache(struct btrfs_root *root,
>   int prealloc;
>   bool retry = false;
>  
> + if (root->root_key.objectid != BTRFS_FS_TREE_OBJECTID &&
> + root->root_key.objectid < BTRFS_FIRST_FREE_OBJECTID)
> + return 0;
> +
>   path = btrfs_alloc_path();
>   if (!path)
>   return -ENOMEM;
> 
> 
> 
> thanks,
> liubo
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs hang on brd

2011-06-01 Thread David Sterba
On Tue, May 31, 2011 at 10:03:12AM +0300, Adrian Hunter wrote:
> Hi
> 
> I seem to be able to get btrfs reproducibly to
> produce warnings and finally hang when running
> a stress test on a ramdisk.
> 
> Testing was done using the "integration-test"
> branch of btrfs-unstable.  Note that I also tested
> v2.6.39 and "integration-test" took much longer to
> hang i.e. it is an improvement
> 
> The test script and stack dumps are below.
> 
> Is this a valid test?
> 
> Is it worth me investigating these?

I've tried to reproduce this myself, but the fsstress utility (taken from
the latest LTP suite) sometimes crashes, so I cannot count it as a proper
reproduction. Can you point me to the exact version you used?

(No warning or hang was observed, though, on top of 3.0-rc1 + cmason/for-linus.)

> Test
> 
> 
> #!/bin/sh
> 
> sudo modprobe brd rd_size=262144

This is the minimum size possible, 256MB.

> 
> sudo umount /mnt/test/ 2> /dev/null
> 
> echo 'mkfs.btrfs /dev/ram0'
> 
> sudo mkfs.btrfs /dev/ram0
> 
> sudo mkdir -p /mnt/test
> 
> echo 'mount -t btrfs /dev/ram0 /mnt/test'
> 
> sudo mount -t btrfs /dev/ram0 /mnt/test
> 
> sudo mkdir -p /mnt/test/test
> 
> sudo chown $USER /mnt/test/test
> sudo chgrp $USER /mnt/test/test
> 
> sudo umount /mnt/test
> 
> full=0
> i=0
> while true; do
>   sudo mount -t btrfs /dev/ram0 /mnt/test
> 
>   if df | grep ram0 | grep 100% > /dev/null; then
>   full=`expr $full \+ 1`
>   if test $full -gt 6;then
>   rm -rf /mnt/test/test/*
>   full=0
>   fi
>   else
>   full=0
>   fi
> 
>   fsstress -c -r -d /mnt/test/test -p 3 -n 1000 -l 10
> 
>   sudo umount /mnt/test
> 
>   i=`expr $i \+ 1`
>   echo $i
> done
> 
> 
> 
> Stack dumps for warnings
> 
> 
> 
> [ 7481.520750] WARNING: at fs/btrfs/extent-tree.c:5648

5644         ret = block_rsv_use_bytes(block_rsv, blocksize);
5645         if (!ret)
5646                 return block_rsv;
5647         if (ret) {
5648                 WARN_ON(1);
5649                 ret = reserve_metadata_bytes(trans, root, block_rsv, blocksize,
5650                                              0);

and block_rsv_use_bytes() returns nonzero in case of ENOSPC.
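
As a rough user-space model of that behaviour (this is not the kernel code: the struct is cut down to two fields, locking is left out, and the `_model` names are made up for this sketch), a reservation that cannot cover the request yields -ENOSPC, and that is the return value that leads into the WARN_ON branch quoted above:

```c
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for a block reservation; hypothetical, not the kernel struct. */
struct block_rsv_model {
	uint64_t size;     /* bytes the reservation is meant to hold */
	uint64_t reserved; /* bytes actually set aside right now */
};

/*
 * Take num_bytes out of the reservation.
 * Returns 0 on success, or -ENOSPC if the reservation does not
 * currently hold enough bytes (nothing is consumed in that case).
 */
static inline int block_rsv_use_bytes_model(struct block_rsv_model *rsv,
					    uint64_t num_bytes)
{
	if (rsv->reserved < num_bytes)
		return -ENOSPC;
	rsv->reserved -= num_bytes;
	return 0;
}
```

With 4096 bytes reserved, a first 4096-byte request succeeds and a second one gets -ENOSPC, which on a small, nearly full ramdisk is presumably what the caller keeps running into.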

> [ 7481.521176] WARNING: at fs/btrfs/extent-tree.c:5648 
> btrfs_alloc_free_block+0x14e/0x357 [btrfs]()
> [ 7481.521178] Hardware name: XPS 8300
> [ 7481.521180] Modules linked in: tcp_lp tun btrfs zlib_deflate
> libcrc32c brd fuse cpufreq_ondemand acpi_cpufreq freq_table mperf
> ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter
> ip6_tables ipv6 uinput snd_hda_codec_hdmi snd_hda_codec_realtek
> snd_hda_intel snd_hda_codec broadcom tg3 snd_hwdep snd_seq
> snd_seq_device snd_pcm joydev pcspkr iTCO_wdt iTCO_vendor_support
> dcdbas serio_raw i2c_i801 snd_timer snd microcode soundcore
> snd_page_alloc usb_storage i915 drm_kms_helper drm i2c_algo_bit
> i2c_core video [last unloaded: scsi_wait_scan]
> [ 7481.521237] Pid: 3980, comm: btrfs-endio-wri Tainted: GW
> 2.6.39-integration-test-20110526-01+ #2
> [ 7481.521240] Call Trace:
> [ 7481.521245]  [] warn_slowpath_common+0x85/0x9d
> [ 7481.521250]  [] warn_slowpath_null+0x1a/0x1c
> [ 7481.521288]  [] btrfs_alloc_free_block+0x14e/0x357 
> [btrfs]
> [ 7481.521303]  [] ? map_private_extent_buffer+0xb1/0xd5 
> [btrfs]
> [ 7481.521313]  [] __btrfs_cow_block+0x102/0x31e [btrfs]
> [ 7481.521322]  [] ? btrfs_set_item_key+0x3/0x20 [btrfs]
> [ 7481.521341]  [] btrfs_cow_block+0x104/0x14d [btrfs]
> [ 7481.521353]  [] btrfs_search_slot+0x162/0x502 [btrfs]
> [ 7481.521378]  [] btrfs_lookup_file_extent+0x3c/0x3e 
> [btrfs]
> [ 7481.521388]  [] ? btrfs_alloc_path+0x1a/0x2b [btrfs]
> [ 7481.521405]  [] btrfs_drop_extents+0x10e/0x731 [btrfs]
> [ 7481.521410]  [] ? need_resched+0x23/0x2d
> [ 7481.521415]  [] ? _cond_resched+0xe/0x22
> [ 7481.521420]  [] ? slab_pre_alloc_hook.clone.32+0x2d/0x31
> [ 7481.521426]  [] ? kmem_cache_alloc+0x29/0xf7
> [ 7481.521441]  [] 
> insert_reserved_file_extent.clone.34+0x70/0x1fc [btrfs]
> [ 7481.521470]  [] ? lock_extent_bits+0x5e/0xa8 [btrfs]
> [ 7481.521496]  [] btrfs_endio_direct_write+0x171/0x29a 
> [btrfs]
> [ 7481.521511]  [] ? end_workqueue_fn+0xf6/0x10e [btrfs]
> [ 7481.521516]  [] bio_endio+0x2d/0x2f
> [ 7481.521539]  [] end_workqueue_fn+0x101/0x10e [btrfs]
> [ 7481.521565]  [] worker_loop+0x193/0x4ca [btrfs]
> [ 7481.521581]  [] ? btrfs_queue_worker+0x214/0x214 [btrfs]
> [ 7481.521586]  [] kthread+0x82/0x8a
> [ 7481.521591]  [] kernel_thread_helper+0x4/0x10
> [ 7481.521596]  [] ? kthread_worker_fn+0x14b/0x14b
> [ 7481.521601]  [] ? gs_change+0x13/0x13
> [ 7481.521604] ---[ end trace abb147a5624a0a25 ]---
> [ 7481.521639] [ cut here ]
> 

> Stack dumps for  more warnings
> --
> 
> [21983.399906] WARNING: at fs/btrfs/extent-tree.c:3832

3829 static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
3830 {
3831 block_rsv_re

Re: [3.0-rc1] kernel BUG at fs/btrfs/relocation.c:4285!

2011-06-01 Thread liubo
On 06/01/2011 04:12 PM, liubo wrote:
> On 06/01/2011 03:44 PM, liubo wrote:
>> On 05/31/2011 08:27 AM, Tsutomu Itoh wrote:
>>> The panic occurred when 'btrfs fi bal /test5' was executed.
>>>
>>> /test5 is as follows:
>>> # mount -o space_cache,compress=lzo /dev/sdc3 /test5
>>> #
>>> # btrfs fi sh /dev/sdc3
>>> Label: none  uuid: 38ec48b2-a64b-4225-8cc6-5eb08024dc64
>>> Total devices 5 FS bytes used 7.87MB
>>> devid1 size 10.00GB used 2.02GB path /dev/sdc3
>>> devid2 size 15.01GB used 3.00GB path /dev/sdc5
>>> devid3 size 15.01GB used 3.00GB path /dev/sdc6
>>> devid4 size 20.01GB used 2.01GB path /dev/sdc7
>>> devid5 size 10.00GB used 2.01GB path /dev/sdc8
>>>
>>> Btrfs v0.19-50-ge6bd18d
>>> # btrfs fi df /test5
>>> Data, RAID0: total=10.00GB, used=3.52MB
>>> Data: total=8.00MB, used=1.60MB
>>> System, RAID1: total=8.00MB, used=4.00KB
>>> System: total=4.00MB, used=0.00
>>> Metadata, RAID1: total=1.00GB, used=216.00KB
>>> Metadata: total=8.00MB, used=0.00
>>>
>>
>> Hi, Itoh san,
>>
>> I've come up with a patch aiming to fix this bug.
>> The problem is that the inode allocator stores one inode cache per root,
>> which is not good for the relocation tree at least, because we only allocate
>> new inode numbers from the fs tree or a file tree (subvol/snapshot).
>>
>> I've tested with your run.sh and it works well on my box, so you can try this:
>>

I've tested the following patch for about 1.5 hours without hitting the bug.
Could you please test this patch as well?

thanks,

From: Liu Bo 

[PATCH] Btrfs: fix save ino cache bug

New inode numbers are only allocated from the fs root or a subvol/snap root,
so we only want to save the inode cache of fs/subvol/snap roots to disk.

Signed-off-by: Liu Bo 
---
 fs/btrfs/inode-map.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index 0009705..8c0c25b 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -372,6 +372,12 @@ int btrfs_save_ino_cache(struct btrfs_root *root,
 	int prealloc;
 	bool retry = false;
 
+	/* only fs tree and subvol/snap needs ino cache */
+	if (root->root_key.objectid != BTRFS_FS_TREE_OBJECTID &&
+	    (root->root_key.objectid < BTRFS_FIRST_FREE_OBJECTID ||
+	     root->root_key.objectid > BTRFS_LAST_FREE_OBJECTID))
+		return 0;
+
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;
-- 
1.6.5.2
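
For readers following the thread, here is a small user-space sketch of the two versions of the check, to show what mixing up the two limits meant in practice. The constant values are what I believe fs/btrfs/ctree.h defines in this kernel era (treat them as assumptions and check your own tree), and the two function names are made up for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, as in fs/btrfs/ctree.h around 3.0; verify against your tree. */
#define BTRFS_FS_TREE_OBJECTID    5ULL
#define BTRFS_FIRST_FREE_OBJECTID 256ULL
#define BTRFS_LAST_FREE_OBJECTID  (-256ULL) /* wraps to 2^64 - 256 */
#define BTRFS_TREE_RELOC_OBJECTID (-8ULL)   /* wraps to 2^64 - 8 */

/* Earlier attempt: skip saving only for reserved ids below the free range. */
static inline bool skip_ino_cache_v1(uint64_t objectid)
{
	return objectid != BTRFS_FS_TREE_OBJECTID &&
	       objectid < BTRFS_FIRST_FREE_OBJECTID;
}

/* Later attempt: also skip reserved ids above the free range. */
static inline bool skip_ino_cache_v2(uint64_t objectid)
{
	return objectid != BTRFS_FS_TREE_OBJECTID &&
	       (objectid < BTRFS_FIRST_FREE_OBJECTID ||
		objectid > BTRFS_LAST_FREE_OBJECTID);
}
```

Under these assumed values the relocation tree id sits above BTRFS_LAST_FREE_OBJECTID, so `skip_ino_cache_v1()` returns false for it (the cache would still be saved) while `skip_ino_cache_v2()` returns true. The fs tree (5) and any id in the free range are not skipped by either version.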



Re: btrfs hang on brd

2011-06-01 Thread Adrian Hunter

On 01/06/11 11:54, David Sterba wrote:
> On Tue, May 31, 2011 at 10:03:12AM +0300, Adrian Hunter wrote:
>> Hi
>>
>> I seem to be able to get btrfs reproducibly to
>> produce warnings and finally hang when running
>> a stress test on a ramdisk.
>>
>> Testing was done using the "integration-test"
>> branch of btrfs-unstable.  Note that I also tested
>> v2.6.39 and "integration-test" took much longer to
>> hang i.e. it is an improvement
>>
>> The test script and stack dumps are below.
>>
>> Is this a valid test?
>>
>> Is it worth me investigating these?
>
> I've tried to reproduce myself, but the fsstress utility (taken from
> latest LTP suite) crashes sometimes and I cannot take it as a proper
> reproduction. Can you point me to the exact version you used?


The LTP version does not compile properly:

make[4]: Entering directory 
`/home/ahunter/Desktop/Projects/ltp/ltp-full-20110228/testcases/kernel/fs/fsstress'
gcc -g -O2 -g -O2 -fno-strict-aliasing -pipe -Wall  -DNO_XFS 
-I/home/ahunter/Desktop/Projects/ltp/ltp-full-20110228/testcases/kernel/fs/fsstress 
-D_LARGEFILE64_SOURCE -D_GNU_SOURCE -Wno-error -I../../../../include 
-I../../../../include   -L../../../../lib  fsstress.c   -o fsstress

fsstress.c: In function 'dread_f':
fsstress.c:1829:2: warning: implicit declaration of function 'memalign'
fsstress.c:1829:6: warning: assignment makes pointer from integer 
without a cast

fsstress.c: In function 'dwrite_f':
fsstress.c:1912:6: warning: assignment makes pointer from integer 
without a cast
fsstress.c:1844:17: warning: 'diob.d_miniosz' may be used uninitialized 
in this function
fsstress.c:1844:17: warning: 'diob.d_maxiosz' may be used uninitialized 
in this function
fsstress.c:1844:17: warning: 'diob.d_mem' may be used uninitialized in 
this function

fsstress.c: In function 'dread_f':
fsstress.c:1750:17: warning: 'diob.d_miniosz' may be used uninitialized 
in this function
fsstress.c:1750:17: warning: 'diob.d_maxiosz' may be used uninitialized 
in this function
fsstress.c:1750:17: warning: 'diob.d_mem' may be used uninitialized in 
this function



I hacked a couple of changes but I need to check them before
mailing to the ltp-list:


From: Adrian Hunter 
Date: Wed, 1 Jun 2011 13:01:48 +0300
Subject: [PATCH] fsstress: quick fix for compile errors

Signed-off-by: Adrian Hunter 
---
 testcases/kernel/fs/fsstress/fsstress.c |2 ++
 testcases/kernel/fs/fsstress/global.h   |1 +
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/testcases/kernel/fs/fsstress/fsstress.c b/testcases/kernel/fs/fsstress/fsstress.c
index e3b48ea..83c23ed 100644
--- a/testcases/kernel/fs/fsstress/fsstress.c
+++ b/testcases/kernel/fs/fsstress/fsstress.c
@@ -1757,6 +1757,7 @@ dread_f(int opno, long r)
 	struct stat64	stb;
 	int		v;
 
+	memset(&diob, 0, sizeof(struct dioattr));
 	init_pathname(&f);
 	if (!get_fname(FT_REGFILE, r, &f, NULL, NULL, &v)) {
 		if (v)
@@ -1851,6 +1852,7 @@ dwrite_f(int opno, long r)
 	struct stat64	stb;
 	int		v;
 
+	memset(&diob, 0, sizeof(struct dioattr));
 	init_pathname(&f);
 	if (!get_fname(FT_REGFILE, r, &f, NULL, NULL, &v)) {
 		if (v)
diff --git a/testcases/kernel/fs/fsstress/global.h b/testcases/kernel/fs/fsstress/global.h
index f788395..5ab5d56 100644
--- a/testcases/kernel/fs/fsstress/global.h
+++ b/testcases/kernel/fs/fsstress/global.h
@@ -58,6 +58,7 @@
 #include 
 #include 
 #include 
+#include 

 #ifndef O_DIRECT
 #define O_DIRECT 04
--
1.7.4.4
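
A stand-alone sketch of the two things the compiler warnings point at, for anyone who wants to try the idea outside of LTP. Everything here is an assumption for illustration: `struct dioattr_sketch` only borrows the field names from the warnings, `alloc_dio_buffer()` and the 512-byte default are made up, and `<malloc.h>` is where glibc declares memalign() (which header the patch above actually adds is not visible in this archive).

```c
#include <malloc.h> /* glibc: declares memalign(); without it the call is implicitly int */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in with the fields named in the warnings. */
struct dioattr_sketch {
	uint32_t d_mem;     /* required buffer alignment */
	uint32_t d_miniosz; /* minimum I/O size */
	uint32_t d_maxiosz; /* maximum I/O size */
};

/*
 * Zero the attributes first so every field has a defined value even if
 * nothing fills them in later, then allocate an aligned buffer.
 * The 512-byte alignment is an arbitrary default for this sketch.
 */
static inline void *alloc_dio_buffer(struct dioattr_sketch *diob, size_t len)
{
	memset(diob, 0, sizeof(*diob));
	if (diob->d_mem == 0)
		diob->d_mem = 512;
	return memalign(diob->d_mem, len);
}
```

Without a declaration in scope, the compiler assumes memalign() returns int, which is what the "assignment makes pointer from integer" warning is about; on a 64-bit box that can truncate the returned pointer, which might explain the occasional fsstress crashes mentioned earlier in the thread.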



> (But no warning or hang observed, on top of 3.0-rc1 + cmason/for-linus)


I will try it tonight.



Re: btrfs hang on brd

2011-06-01 Thread ajh mls

On 01/06/11 13:07, Adrian Hunter wrote:
> On 01/06/11 11:54, David Sterba wrote:
>> On Tue, May 31, 2011 at 10:03:12AM +0300, Adrian Hunter wrote:
>>> Hi
>>>
>>> I seem to be able to get btrfs reproducibly to
>>> produce warnings and finally hang when running
>>> a stress test on a ramdisk.
>>>
>>> Testing was done using the "integration-test"
>>> branch of btrfs-unstable. Note that I also tested
>>> v2.6.39 and "integration-test" took much longer to
>>> hang i.e. it is an improvement
>>>
>>> The test script and stack dumps are below.
>>>
>>> Is this a valid test?
>>>
>>> Is it worth me investigating these?
>>
>> I've tried to reproduce myself, but the fsstress utility (taken from
>> latest LTP suite) crashes sometimes and I cannot take it as a proper
>> reproduction. Can you point me to the exact version you used?
>
> The LTP version does not compile properly:
>
> make[4]: Entering directory
> `/home/ahunter/Desktop/Projects/ltp/ltp-full-20110228/testcases/kernel/fs/fsstress'
>
> gcc -g -O2 -g -O2 -fno-strict-aliasing -pipe -Wall -DNO_XFS
> -I/home/ahunter/Desktop/Projects/ltp/ltp-full-20110228/testcases/kernel/fs/fsstress
> -D_LARGEFILE64_SOURCE -D_GNU_SOURCE -Wno-error -I../../../../include
> -I../../../../include -L../../../../lib fsstress.c -o fsstress
> fsstress.c: In function 'dread_f':
> fsstress.c:1829:2: warning: implicit declaration of function 'memalign'
> fsstress.c:1829:6: warning: assignment makes pointer from integer
> without a cast
> fsstress.c: In function 'dwrite_f':
> fsstress.c:1912:6: warning: assignment makes pointer from integer
> without a cast
> fsstress.c:1844:17: warning: 'diob.d_miniosz' may be used uninitialized
> in this function
> fsstress.c:1844:17: warning: 'diob.d_maxiosz' may be used uninitialized
> in this function
> fsstress.c:1844:17: warning: 'diob.d_mem' may be used uninitialized in
> this function
> fsstress.c: In function 'dread_f':
> fsstress.c:1750:17: warning: 'diob.d_miniosz' may be used uninitialized
> in this function
> fsstress.c:1750:17: warning: 'diob.d_maxiosz' may be used uninitialized
> in this function
> fsstress.c:1750:17: warning: 'diob.d_mem' may be used uninitialized in
> this function
>
> I hacked a couple of changes but I need to check them before
> mailing to the ltp-list:



In fact there is already a fix here:

http://sourceforge.net/mailarchive/message.php?msg_id=27212868


Re: [3.0-rc1] kernel BUG at fs/btrfs/relocation.c:4285!

2011-06-01 Thread Tsutomu Itoh
Hi, liubo,

(2011/06/01 18:42), liubo wrote:
> On 06/01/2011 04:12 PM, liubo wrote:
>> On 06/01/2011 03:44 PM, liubo wrote:
>>> On 05/31/2011 08:27 AM, Tsutomu Itoh wrote:
>>>> The panic occurred when 'btrfs fi bal /test5' was executed.
>>>>
>>>> /test5 is as follows:
>>>> # mount -o space_cache,compress=lzo /dev/sdc3 /test5
>>>> #
>>>> # btrfs fi sh /dev/sdc3
>>>> Label: none  uuid: 38ec48b2-a64b-4225-8cc6-5eb08024dc64
>>>>  Total devices 5 FS bytes used 7.87MB
>>>>  devid1 size 10.00GB used 2.02GB path /dev/sdc3
>>>>  devid2 size 15.01GB used 3.00GB path /dev/sdc5
>>>>  devid3 size 15.01GB used 3.00GB path /dev/sdc6
>>>>  devid4 size 20.01GB used 2.01GB path /dev/sdc7
>>>>  devid5 size 10.00GB used 2.01GB path /dev/sdc8
>>>>
>>>> Btrfs v0.19-50-ge6bd18d
>>>> # btrfs fi df /test5
>>>> Data, RAID0: total=10.00GB, used=3.52MB
>>>> Data: total=8.00MB, used=1.60MB
>>>> System, RAID1: total=8.00MB, used=4.00KB
>>>> System: total=4.00MB, used=0.00
>>>> Metadata, RAID1: total=1.00GB, used=216.00KB
>>>> Metadata: total=8.00MB, used=0.00
>>>>
>>>
>>> Hi, Itoh san,
>>>
>>> I've come up with a patch aiming to fix this bug.
>>> The problem is that the inode allocator stores one inode cache per root,
>>> which is not good for the relocation tree at least, because we only allocate
>>> new inode numbers from the fs tree or a file tree (subvol/snapshot).
>>>
>>> I've tested with your run.sh and it works well on my box, so you can try this:

> 
> I've tested the following patch for about 1.5 hours without hitting the bug.
> Could you please test this patch as well?

Thank you for your investigation.

I will also test it again, but I cannot do so until next week because I
will be at LinuxCon tomorrow and the day after tomorrow.

Thanks,
Tsutomu


> 
> thanks,
> 
> From: Liu Bo
> 
> [PATCH] Btrfs: fix save ino cache bug
> 
> We just get new inode number from fs root or subvol/snap root,
> so we'd like to save fs/subvol/snap root's inode cache into disk.
> 
> Signed-off-by: Liu Bo
> ---
>   fs/btrfs/inode-map.c |6 ++
>   1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
> index 0009705..8c0c25b 100644
> --- a/fs/btrfs/inode-map.c
> +++ b/fs/btrfs/inode-map.c
> @@ -372,6 +372,12 @@ int btrfs_save_ino_cache(struct btrfs_root *root,
>   int prealloc;
>   bool retry = false;
> 
> + /* only fs tree and subvol/snap needs ino cache */
> + if (root->root_key.objectid != BTRFS_FS_TREE_OBJECTID &&
> + (root->root_key.objectid < BTRFS_FIRST_FREE_OBJECTID ||
> +  root->root_key.objectid > BTRFS_LAST_FREE_OBJECTID))
> + return 0;
> +
>   path = btrfs_alloc_path();
>   if (!path)
>   return -ENOMEM;



Re: btrfs error after using kernel 3.0-rc1

2011-06-01 Thread Fajar A. Nugraha
On Wed, Jun 1, 2011 at 6:06 AM, Fajar A. Nugraha  wrote:
> While using btrfs as root on kernel 3.0-rc1, there were some errors (I
> wasn't able to capture them) that forced me to do a hard reset.
>
> Now during startup the system drops to a busybox shell because it's unable
> to mount the root partition.
> Is there a way to recover the data, as at least grub2 was still happy
> enough to load the kernel and initrd (both of which are located on the same
> btrfs partition)?
>
> This is what dmesg says
>
> [    4.536798] device label SSD-ROOT devid 1 transid 38245 /dev/sda2
> [    9.552086] device label SSD-ROOT devid 1 transid 38245
> /dev/disk/by-label/SSD-ROOT
> [    9.554563] btrfs: disk space caching is enabled
> [    9.564301] parent transid verify failed on 44040192 wanted 38240 found 
> 32526
> [    9.564535] parent transid verify failed on 44040192 wanted 38240 found 
> 32526
> [    9.564778] parent transid verify failed on 44040192 wanted 38240 found 
> 32526
> [    9.575679] parent transid verify failed on 44052480 wanted 38240 found 
> 31547
> [    9.575904] parent transid verify failed on 44052480 wanted 38240 found 
> 31547
> [    9.576176] parent transid verify failed on 44052480 wanted 38240 found 
> 31547
> [    9.586121] parent transid verify failed on 44064768 wanted 38240 found 
> 34145
> [    9.586319] parent transid verify failed on 44064768 wanted 38240 found 
> 34145
> [    9.586515] parent transid verify failed on 44064768 wanted 38240 found 
> 34145
> [    9.587027] parent transid verify failed on 44068864 wanted 38240 found 
> 34476
> [    9.589732] Btrfs detected SSD devices, enabling SSD mode
> [    9.592923] block group 29360128 has an wrong amount of free space
> [    9.592959] btrfs: failed to load free space cache for block group 29360128


For anyone who got the same problem,

I was finally able to mount the fs using Ubuntu Natty's
2.6.38-8-generic (the one on live CD).
Previously I tried using 2.6.38-9-generic and 3.0-rc1; neither worked.
Now I'm copying the files somewhere else before reinstalling this
system.

On another note, does anybody know how btrfs allocates ID for subvols?
It doesn't seem to reuse deleted subvol's ID. What happens when the
last subvol ID is 999?

-- 
Fajar


[GIT PULL] scrub fix for -rc2

2011-06-01 Thread Arne Jansen
Hi Chris,

please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/arne/btrfs-unstable-arne.git
for-chris

for the bio-reuse fix. I also included the small fix for the false BUG_ON
in volumes.c. Here's the shortlog:

Arne Jansen (2):
  btrfs: scrub: don't reuse bios and pages
  btrfs: false BUG_ON when degraded

Thanks,
Arne


Re: [PATCH] Btrfs: don't save the inode cache if we are deleting this root

2011-06-01 Thread David Sterba
Hi,

On Tue, May 31, 2011 at 03:33:33PM -0400, Josef Bacik wrote:
> Signed-off-by: Josef Bacik 
Tested-by: David Sterba 

This is really needed in order to run xfstests, thanks.


david


Re: linux-next: build warnings in Linus' tree

2011-06-01 Thread Mitch Harder
On Tue, May 31, 2011 at 12:57 PM, David Sterba  wrote:
> Hi,
>
> On Mon, May 30, 2011 at 11:36:53AM +1000, Stephen Rothwell wrote:
>> After merging the Linus' tree, today's linux-next build (powerpc
>> ppc64_defconfig) produced these warnings:
>>
>> fs/btrfs/sysfs.c:76:26: warning: 'btrfs_root_attrs' defined but not used
>> fs/btrfs/sysfs.c:97:26: warning: 'btrfs_super_attrs' defined but not used
>> fs/btrfs/sysfs.c:153:13: warning: 'btrfs_super_release' defined but not used
>> fs/btrfs/sysfs.c:160:13: warning: 'btrfs_root_release' defined but not used
>>
>> I have started using gcc v4.5.2 (instead of v4.4.4) if that makes a
>> difference.
>
> The warnings probably started to show up after one of my cleanup patches,
> which removed unused functions (f2a97a9dbd86eb1ef956bdf20e05c507b32beb96).
> The sysfs interface is not being used right now, but there's an unmerged
> patchset which adds the interesting bits like info about available btrfs
> filesystems and devices. I don't know what the intentions regarding
> sysfs are.
>
>
> david

I've been playing around with resurrecting the basic sysfs
capabilities that had been previously incorporated into btrfs.

As it stands right now, it was relatively easy to re-implement sysfs
as it was originally.  However, that implementation of sysfs wasn't
populated with much information (only total_blocks, blocks_used, and
blocksize).

I also had to reverse a small portion of code that was in the last clean-up.

If a CONFIG_BTRFS_DEBUG type configuration flag is ever introduced, it
would be interesting to resurrect btrfs' sysfs capabilities.


btrfs kernel module crash

2011-06-01 Thread Jacques
Hi all,

I have been using btrfs as my / partition on my fedora 14 box for a
while now and until yesterday it was working a treat.

I had an unexpected power failure during the night and now my OS crashes when
mounting /

Most of my data is backed up, but there is some data which I don't want
to lose on the partition...

I put the drive in a SATA - USB cradle and hooked it to my fedora 13
laptop which promptly became unresponsive.

So I upgraded the laptop to fc15 and installed the latest 2.6.39.1
kernel from rawhide. When I try to mount the first time I get the
following crash: http://pastebin.com/qA3kztzh but at least the system
stays alive. When I manually try to 'mount /dev/sdb5 /mnt' the process
just stalls; I can't even Ctrl+C to quit.

On recommendations from the IRC folks I compiled the git
btrfs-progs-unstable including the 'btrfs-zero-log' tool. I have tried
running the tool against the affected partition to no avail, with no error
messages either.

btrfsck reports no errors either: http://pastebin.com/P4uXteLC

I have no idea where to go from here, if anyone can help it would be
much appreciated!


Re: btrfs error after using kernel 3.0-rc1

2011-06-01 Thread Chris Mason
Excerpts from Fajar A. Nugraha's message of 2011-06-01 08:22:40 -0400:
> On Wed, Jun 1, 2011 at 6:06 AM, Fajar A. Nugraha  wrote:
> > While using btrfs as root on kernel 3.0-rc1, there were some errors (I
> > wasn't able to capture them) that forced me to do a hard reset.
> >
> > Now during startup the system drops to a busybox shell because it's unable
> > to mount the root partition.
> > Is there a way to recover the data, as at least grub2 was still happy
> > enough to load the kernel and initrd (both of which are located on the same
> > btrfs partition)?
> >
> > This is what dmesg says
> >
> > [    4.536798] device label SSD-ROOT devid 1 transid 38245 /dev/sda2
> > [    9.552086] device label SSD-ROOT devid 1 transid 38245
> > /dev/disk/by-label/SSD-ROOT
> > [    9.554563] btrfs: disk space caching is enabled
> > [    9.564301] parent transid verify failed on 44040192 wanted 38240 found 
> > 32526
> > [    9.564535] parent transid verify failed on 44040192 wanted 38240 found 
> > 32526
> > [    9.564778] parent transid verify failed on 44040192 wanted 38240 found 
> > 32526
> > [    9.575679] parent transid verify failed on 44052480 wanted 38240 found 
> > 31547
> > [    9.575904] parent transid verify failed on 44052480 wanted 38240 found 
> > 31547
> > [    9.576176] parent transid verify failed on 44052480 wanted 38240 found 
> > 31547
> > [    9.586121] parent transid verify failed on 44064768 wanted 38240 found 
> > 34145
> > [    9.586319] parent transid verify failed on 44064768 wanted 38240 found 
> > 34145
> > [    9.586515] parent transid verify failed on 44064768 wanted 38240 found 
> > 34145
> > [    9.587027] parent transid verify failed on 44068864 wanted 38240 found 
> > 34476
> > [    9.589732] Btrfs detected SSD devices, enabling SSD mode
> > [    9.592923] block group 29360128 has an wrong amount of free space
> > [    9.592959] btrfs: failed to load free space cache for block group 
> > 29360128
> 
> 
> For anyone who got the same problem,
> 
> I was finally able to mount the fs using Ubuntu Natty's
> 2.6.38-8-generic (the one on live CD).
> Previously I tried using 2.6.38-9-generic and 3.0-rc1; neither worked.
> Now I'm copying the files somewhere else before reinstalling this
> system.

The tools have a command to zero out the btrfs log tree, that would have
allowed you to mount.  Do you still have the busted FS?

Thanks a lot for this bug report, I'll try to reproduce it.

> 
> On another note, does anybody know how btrfs allocates ID for subvols?
> It doesn't seem to reuse deleted subvol's ID. What happens when the
> last subvol ID is 999?
> 

We don't reuse the ids for subvols or snapshots, but we can have a
little less than 2^64 of them.  An id can be reused as long as there are
no blocks with refs for it in the extent allocation tree, but that needs
to be checked before we reuse it.

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Announcing btrfs-gui

2011-06-01 Thread Hugo Mills
   Over the last few weeks, I've been playing with a foolish idea,
mostly triggered by a cluster of people being confused by btrfs's free
space reporting (df vs btrfs fi df vs btrfs fi show). I also wanted an
excuse, and some code, to mess around in the depths of the FS data
structures.

   Like all silly ideas, this one got a bit out of hand, and seems to
have turned into something vaguely useful. I'm therefore pleased to
announce the first major public release of btrfs-gui[1]: a point-and-
click tool for managing btrfs filesystems.

   The tool currently can scan for and list btrfs filesystems and the
volumes they live on. It can show the allocation and usage of data in
a selected filesystem, categorised by use, replication, and device. It
can show and manipulate subvolumes and snapshots: creation, deletion,
and setting the default. For those with servers to manage, it also has
the ability to ssh into a remote machine, and manage its filesystems
remotely (so you don't have to have X installed on your servers just
to use btrfs-gui on them).

   You can get the latest version from git[2] (or gitweb[3]), or
tarball download of the sources from [4]. To install and run the GUI,
you will need python3 and the python tk libraries (package python3-tk
on my Debian system). The root helper component (which can be
installed independently on an X-less server) will run under python2 or
python3, depending on how it's installed. Installation instructions
can be found on the main web page, and in the README file.

   Unless the traffic gets too high-volume, or unless someone
important objects, I'm going to suggest that bug reports should go to
this list for now (cc'd me, if you like). Note that this isn't an
"official" btrfs project -- it's just something I knocked together on
my own.

   Finally, I'd like to thank David Sterba for testing the pre-release
versions, reporting bugs, and making many good suggestions for
improvements. Any deviation from his instructions is entirely my
fault. :)

   Hugo.

[1] http://carfax.org.uk/btrfs-gui/
[2] http://git.darksatanic.net/repo/btrfs-gui.git/
[3] http://git.darksatanic.net/cgi/gitweb.cgi?p=btrfs-gui.git;a=summary
[4] http://carfax.org.uk/node/79

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- Great oxymorons of the world, no. 7: The Simple Truth ---  




btrfs-progs-unstable tmp branch build error

2011-06-01 Thread Fajar A. Nugraha
When building from tmp branch I got this error:

mkfs.c: In function ‘main’:
mkfs.c:730:6: error: ‘ret’ may be used uninitialized in this function
mkfs.c:841:43: error: ‘parent_dir_entry’ may be used uninitialized in
this function
make: *** [mkfs.o] Error 1


"git blame" shows the last commit for both lines was
commit e3736c698e8b490bea1375576b718a2de6e89603
Author: Donggeun Kim 
Date:   Thu Jul 8 09:17:59 2010 +

btrfs-progs: Add new feature to mkfs.btrfs to make file system
image file from source directory


Removing the "-Werror" flag from the Makefile made it compile successfully though.

-- 
Fajar


Re: btrfs error after using kernel 3.0-rc1

2011-06-01 Thread liubo
On 06/01/2011 08:22 PM, Fajar A. Nugraha wrote:
> On Wed, Jun 1, 2011 at 6:06 AM, Fajar A. Nugraha  wrote:
>> While using btrfs as root on kernel 3.0-rc1, there were some errors (I
>> wasn't able to capture the error) that forced me to do hard reset.
>>
>> Now during startup system drops to busybox shell because it's unable
>> to mount root partition.
>> Is there a way to recover the data, as at least grub2 was still happy
>> enough to load kernel and initrd (both of which are located on the same
>> btrfs partition)?
>>
>> This is what dmesg says
>>
>> [4.536798] device label SSD-ROOT devid 1 transid 38245 /dev/sda2
>> [9.552086] device label SSD-ROOT devid 1 transid 38245
>> /dev/disk/by-label/SSD-ROOT
>> [9.554563] btrfs: disk space caching is enabled
>> [9.564301] parent transid verify failed on 44040192 wanted 38240 found 
>> 32526
>> [9.564535] parent transid verify failed on 44040192 wanted 38240 found 
>> 32526
>> [9.564778] parent transid verify failed on 44040192 wanted 38240 found 
>> 32526
>> [9.575679] parent transid verify failed on 44052480 wanted 38240 found 
>> 31547
>> [9.575904] parent transid verify failed on 44052480 wanted 38240 found 
>> 31547
>> [9.576176] parent transid verify failed on 44052480 wanted 38240 found 
>> 31547
>> [9.586121] parent transid verify failed on 44064768 wanted 38240 found 
>> 34145
>> [9.586319] parent transid verify failed on 44064768 wanted 38240 found 
>> 34145
>> [9.586515] parent transid verify failed on 44064768 wanted 38240 found 
>> 34145
>> [9.587027] parent transid verify failed on 44068864 wanted 38240 found 
>> 34476
>> [9.589732] Btrfs detected SSD devices, enabling SSD mode
>> [9.592923] block group 29360128 has an wrong amount of free space
>> [9.592959] btrfs: failed to load free space cache for block group 
>> 29360128
> 
> 
> For anyone who got the same problem,
> 
> I was finally able to mount the fs using Ubuntu Natty's
> 2.6.38-8-generic (the one on live CD).
> Previously I tried using 2.6.38-9-generic and 3.0-rc1; neither worked.
> Now I'm copying the files somewhere else before reinstalling this
> system.
> 
> On another note, does anybody know how btrfs allocates ID for subvols?
> It doesn't seem to reuse deleted subvol's ID. What happens when the
> last subvol ID is 999?
> 

Yes, no reuse.

A new subvol will be 1000, one larger than 999.

thanks,
liubo


different st_dev's in one subvolume

2011-06-01 Thread Stephane Chazelas
Hiya,

please consider this:

~# truncate -s1G ./a
~# mkfs.btrfs ./a
~# sudo mount -o loop ./a /mnt/1
~# cd /mnt/1
/mnt/1# ls
/mnt/1# btrfs sub c A
Create subvolume './A'
/mnt/1# btrfs sub c A/B
Create subvolume 'A/B'
/mnt/1# touch A/inA A/B/inB
/mnt/1# btrfs sub snap A A.snap
Create a snapshot of 'A' in './A.snap'
/mnt/1# zmodload zsh/stat
/mnt/1# zstat +device ./**/*
. 25
A 26
A/B 27
A/B/inB 27
A/inA 26
A.snap 28
A.snap/B 23
A.snap/inA 28

Why does A.snap/B have a different st_dev from A.snap's?

Also:

/mnt/1# touch A.snap/B/foo
touch: cannot touch `A.snap/B/foo': Permission denied

I can rmdir that directory OK though.

Also note that the permissions are different:

/mnt/1# ll A
total 0
drwx------ 1 root root 6 Jun  2 00:54 B/
-rw-r--r-- 1 root root 0 Jun  2 00:54 inA
/mnt/1# ll A.snap
total 0
drwxr-xr-x 1 root root 0 Jun  2 01:29 B/
-rw-r--r-- 1 root root 0 Jun  2 00:54 inA

If I create another snap of A or A.snap, the "B" in there gets
the same st_dev (23).

/mnt/1# btrfs sub create A.snap/B/C
Create subvolume 'A.snap/B/C'
ERROR: cannot create subvolume
# btrfs sub snap A.snap/B B.snap
ERROR: 'A.snap/B' is not a subvolume

-- 
Stephane


Re: different st_dev's in one subvolume

2011-06-01 Thread Stephane Chazelas
2011-06-02 01:39:41 +0100, Stephane Chazelas:
[...]
> /mnt/1# zstat +device ./**/*
> . 25
> A 26
> A/B 27
> A/B/inB 27
> A/inA 26
> A.snap 28
> A.snap/B 23
> A.snap/inA 28
> 
> Why does A.snap/B have a different st_dev from A.snap's?
[...]
> If I create another snap of A or A.snap, the "B" in there gets
> the same st_dev (23).
[...]

And same inode, ctime, mtime, atime... And when I create a new
snapshot, all those (regardless of where they are) have their
times updated at once.

I also noticed the st_nlink is always one but then came across
http://thread.gmane.org/gmane.comp.file-systems.btrfs/4580

-- 
Stephane


[BUG 3.0-rc1] oops during file removal, severe lock contention

2011-06-01 Thread Dave Chinner
Hi Folks,

Running on 3.0-rc1 on an 8p/4G RAM VM with a 16TB filesystem (12
disk DM stripe) a 50 million inode 8-way fsmark creation workload
via:

$ /usr/bin/time ./fs_mark -D 1 -S0 -n 10 -s 0 -L 63 \
> -d /mnt/scratch/0 -d /mnt/scratch/1 \
> -d /mnt/scratch/2 -d /mnt/scratch/3 \
> -d /mnt/scratch/4 -d /mnt/scratch/5 \
> -d /mnt/scratch/6 -d /mnt/scratch/7

followed by an 8-way rm -rf on the result via:

$ for i in /mnt/scratch/*; do /usr/bin/time rm -rf $i 2>&1 & done

resulted in this oops:

[ 2671.052861] device fsid 84f7a99b2f193c6-c3228aae4c5a2f8a devid 1 transid 7 
/dev/vda
[ 8626.879250] BUG: unable to handle kernel paging request at 88012000
[ 8626.880020] IP: [] chksum_update+0x23/0x50
[ 8626.880020] PGD 1ef2063 PUD 11fffa067 PMD 0
[ 8626.880020] Oops:  [#1] SMP
[ 8626.880020] CPU 5
[ 8626.880020] Modules linked in:
[ 8626.880020]
[ 8626.880020] Pid: 3326, comm: btrfs-transacti Not tainted 3.0.0-rc1-dgc+ 
#1272 Bochs Bochs
[ 8626.880020] RIP: 0010:[]  [] 
chksum_update+0x23/0x50
[ 8626.880020] RSP: 0018:88010fba7a30  EFLAGS: 00010283
[ 8626.880020] RAX: 2dda3ac0 RBX: 88010fba7a50 RCX: 88012001
[ 8626.880020] RDX: 009d RSI: 88012000 RDI: 88010fba7a50
[ 8626.880020] RBP: 88010fba7a30 R08: 880217846000 R09: 1025
[ 8626.880020] R10: 88011affe0c0 R11: dead00200200 R12: 88010fba7bd0
[ 8626.880020] R13: 880117846025 R14:  R15: 0001
[ 8626.880020] FS:  () GS:88011fd4() 
knlGS:
[ 8626.880020] CS:  0010 DS:  ES:  CR0: 8005003b
[ 8626.880020] CR2: 88012000 CR3: 01ef1000 CR4: 06e0
[ 8626.880020] DR0:  DR1:  DR2: 
[ 8626.880020] DR3:  DR6: 0ff0 DR7: 0400
[ 8626.880020] Process btrfs-transacti (pid: 3326, threadinfo 88010fba6000, 
task 88011affe0c0)
[ 8626.880020] Stack:
[ 8626.880020]  88010fba7a40 81651688 88010fba7a90 
81688967
[ 8626.880020]  880119b065c0 0008 8801 
005005818de0
[ 8626.880020]  88010fba7bd0 880105818de0 88010fba7aa0 
8800cc6c63a0
[ 8626.880020] Call Trace:
[ 8626.880020]  [] crypto_shash_update+0x18/0x30
[ 8626.880020]  [] crc32c+0x47/0x60
[ 8626.880020]  [] btrfs_csum_data+0x12/0x20
[ 8626.880020]  [] __btrfs_write_out_cache+0x601/0xc70
[ 8626.880020]  [] ? __btrfs_prealloc_file_range+0x196/0x220
[ 8626.880020]  [] ? _raw_spin_lock+0xe/0x20
[ 8626.880020]  [] btrfs_write_out_ino_cache+0x62/0xb0
[ 8626.880020]  [] btrfs_save_ino_cache+0x11e/0x210
[ 8626.880020]  [] commit_fs_roots+0xad/0x180
[ 8626.880020]  [] ? mutex_lock+0x1e/0x50
[ 8626.880020]  [] ? btrfs_free_path+0x2a/0x40
[ 8626.880020]  [] btrfs_commit_transaction+0x375/0x7b0
[ 8626.880020]  [] ? wake_up_bit+0x40/0x40
[ 8626.880020]  [] transaction_kthread+0x293/0x2b0
[ 8626.880020]  [] ? btrfs_bio_wq_end_io+0x90/0x90
[ 8626.880020]  [] kthread+0x96/0xa0
[ 8626.880020]  [] kernel_thread_helper+0x4/0x10
[ 8626.880020]  [] ? kthread_worker_fn+0x190/0x190
[ 8626.880020]  [] ? gs_change+0x13/0x13
[ 8626.880020] Code: ea ff ff ff c9 c3 66 90 55 48 89 e5 66 66 66 66 90 8b 47 
10 85 d2 74 2d 48 8d 4e 01 44 8d 42 ff 4e 8d 04 01 eb 05 66 90 48 ff c1 <0f> b6 
16 48 89
[ 8626.880020] RIP  [] chksum_update+0x23/0x50
[ 8626.880020]  RSP 
[ 8626.880020] CR2: 88012000
[ 8626.880020] ---[ end trace dad2f8b74a28cc71 ]---

Also, there is massive lock contention while running these workloads.
perf top shows this for the create after about 5m inodes have been
created:

   samples  pcnt function                    DSO
   _______ _____ ___________________________ _________________

  20626.00 25.6% __ticket_spin_lock          [kernel.kallsyms]
   5148.00  6.4% _raw_spin_unlock_irqrestore [kernel.kallsyms]
   3769.00  4.7% test_range_bit              [kernel.kallsyms]
   2239.00  2.8% chksum_update               [kernel.kallsyms]
   2143.00  2.7% finish_task_switch          [kernel.kallsyms]
   1912.00  2.4% inode_tree_add              [kernel.kallsyms]
   1825.00  2.3% radix_tree_lookup           [kernel.kallsyms]
   1449.00  1.8% generic_bin_search          [kernel.kallsyms]
   1205.00  1.5% btrfs_search_slot           [kernel.kallsyms]
   1198.00  1.5% btrfs_tree_lock             [kernel.kallsyms]
   1104.00  1.4% mutex_spin_on_owner         [kernel.kallsyms]
   1023.00  1.3% kmem_cache_alloc            [kernel.kallsyms]
   1016.00  1.3% map_private_extent_buffer   [kernel.kallsyms]
    931.00  1.2% verify_parent_transid       [kernel.kallsyms]
    895.00  1.1% find_extent_buffer          [kernel.kallsyms]
    785.00  1.0% kmem_cache_free

Re: btrfs error after using kernel 3.0-rc1

2011-06-01 Thread Fajar A. Nugraha
On Thu, Jun 2, 2011 at 4:48 AM, Chris Mason  wrote:
> Excerpts from Fajar A. Nugraha's message of 2011-06-01 08:22:40 -0400:
>> On Wed, Jun 1, 2011 at 6:06 AM, Fajar A. Nugraha  wrote:
>> > While using btrfs as root on kernel 3.0-rc1, there were some errors (I
>> > wasn't able to capture the error) that forced me to do hard reset.
>> >
>> > Now during startup system drops to busybox shell because it's unable
>> > to mount root partition.

>> For anyone who got the same problem,
>>
>> I was finally able to mount the fs using Ubuntu Natty's
>> 2.6.38-8-generic (the one on live CD).

> The tools have a command to zero out the btrfs log tree, that would have
> allowed you to mount.

Do you mean btrfs-zero-log?
It's not compiled by default, is it? I didn't know about that until I
read another thread that mentions it, and by that time I was already
able to mount it.

>  Do you still have the busted FS?

Yup. Made an image, put it on an external disk (which also uses btrfs),
and created a snapshot.

Here's what I get using btrfs-progs-unstable tmp branch:

$ btrfsck sda2.img
parent transid verify failed on 44040192 wanted 38240 found 32526
parent transid verify failed on 44040192 wanted 38240 found 32526
parent transid verify failed on 44052480 wanted 38240 found 31547
parent transid verify failed on 44052480 wanted 38240 found 31547
parent transid verify failed on 44064768 wanted 38240 found 34145
parent transid verify failed on 44064768 wanted 38240 found 34145
parent transid verify failed on 44068864 wanted 38240 found 34476
parent transid verify failed on 44068864 wanted 38240 found 34476
leaf parent key incorrect 44032000
bad block 44032000
warning, start mismatch 10833383424 10833408000
Aborted


$ btrfs-zero-log sda2.img
parent transid verify failed on 44040192 wanted 38240 found 32526
parent transid verify failed on 44040192 wanted 38240 found 32526
parent transid verify failed on 44052480 wanted 38240 found 31547
parent transid verify failed on 44052480 wanted 38240 found 31547
parent transid verify failed on 44064768 wanted 38240 found 34145
parent transid verify failed on 44064768 wanted 38240 found 34145
parent transid verify failed on 44068864 wanted 38240 found 34476
parent transid verify failed on 44068864 wanted 38240 found 34476


After that the filesystem is mountable again, although syslog still
shows this entry:
Jun  2 07:50:26 HP kernel: [ 2095.290057] parent transid verify failed
on 44032000 wanted 38240 found 24586


When copying some of the files, these logs appear on syslog (the same
logs appear whether I use the image mounted on kernel
2.6.38-9-generic, or the one fixed with btrfs-zero-log):
Jun  2 07:50:26 HP kernel: [ 2095.756842] btrfs no csum found for
inode 61485 start 743616512
Jun  2 07:50:26 HP kernel: [ 2095.756950] btrfs csum failed ino 61485
extent 23713038336 csum 1645309641 wanted 0 mirror 1

What does "wanted 0" mean here?

During the copy of that particular file, the system would consistently
lock up at some point (there was no call trace available). I was able to
copy it with the help of "mount -o nodatasum,ro" and "rsync --append".
This particular file also appears undamaged (it's a Virtualbox disk
image, and the OS & application on it ran fine).


It'd be great if we can find out what's causing these errors, but for
the time being I'm happy enough to get my data back :D

Thanks,

Fajar


[RFC PATCH] Btrfs-progs: Backref walking utilities

2011-06-01 Thread Liu Bo
This patch comes from one of the project ideas on btrfs's wiki:
Quote:
Given a block number on a disk, the Btrfs metadata can find all the files and
directories that use or care about that block. Some utilities to walk these
back refs and print the results would help debug corruptions.

Given an inode, the Btrfs metadata can find all the directories that point to
the inode. We should have utils to walk these back refs as well.
end quote.

And the patch brings us:
1) -i aaa
   Walk the inode refs belonging to 'aaa' ('aaa' is an inode number).
2) -b aaa
   Walk the extent backrefs of the extent that starts at 'aaa' ('aaa' is a
   logical address).
3) -s aaa -i bbb
   Similar to 1): '-s aaa' selects which snapshot to search through, while
   '-i bbb' is still an inode number.
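
A minimal sketch of how the three option forms listed above could be parsed with POSIX getopt(). This is not the code from walk-backref.c: the struct, the function names and the rule that exactly one of -i and -b must be given are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical container for the parsed command line. */
struct walk_opts {
	int have_ino, have_bytenr, have_snap;
	uint64_t ino;       /* -i: inode number */
	uint64_t bytenr;    /* -b: logical address of an extent */
	uint64_t snap;      /* -s: snapshot (tree) to search */
	const char *device; /* trailing positional argument */
};

/* Parse a full unsigned number; reject empty or trailing garbage. */
static int parse_u64(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;

	errno = 0;
	v = strtoull(s, &end, 0);
	if (errno != 0 || end == s || *end != '\0')
		return -1;
	*out = v;
	return 0;
}

/* Returns 0 on success, -1 on a usage error. */
static int parse_walk_opts(int argc, char **argv, struct walk_opts *o)
{
	int c;

	assert(argv != NULL && o != NULL);
	*o = (struct walk_opts){0};
	optind = 1; /* reset so the function can be called more than once */
	while ((c = getopt(argc, argv, "i:b:s:")) != -1) {
		switch (c) {
		case 'i':
			if (parse_u64(optarg, &o->ino))
				return -1;
			o->have_ino = 1;
			break;
		case 'b':
			if (parse_u64(optarg, &o->bytenr))
				return -1;
			o->have_bytenr = 1;
			break;
		case 's':
			if (parse_u64(optarg, &o->snap))
				return -1;
			o->have_snap = 1;
			break;
		default:
			return -1;
		}
	}
	if (optind != argc - 1)
		return -1; /* exactly one device argument expected */
	o->device = argv[optind];
	/* Assumed rules: exactly one of -i / -b, and -s only with -i. */
	if (o->have_ino == o->have_bytenr)
		return -1;
	if (o->have_snap && !o->have_ino)
		return -1;
	return 0;
}
```

Under these assumptions, `-i 257 /dev/sda10` and `-s 258 -i 257 /dev/sda10` are accepted, while combining `-i` with `-b` is rejected.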

Here are some results:
===
$ btrfs-walk-backref -i 257 /dev/sda10
FS tree
file (inode: 257):
inode ref index 2 namelen 3 name: tmp
inode ref index 4 namelen 4 name: foo1
|
`---dir (inode: 256):
inode ref index 0 namelen 2 name: ..

file (inode: 257):
inode ref index 2 namelen 4 name: foo2
|
`---dir (inode: 258):
inode ref index 5 namelen 3 name: dir

file tree (256)
file (inode: 257):
inode ref index 2 namelen 3 name: tmp
|
`---dir (inode: 256):
inode ref index 0 namelen 2 name: ..

file tree (257)
file (inode: 257):
inode ref index 2 namelen 3 name: tmp
|
`---dir (inode: 256):
inode ref index 0 namelen 2 name: ..

file tree (258)
file (inode: 257):
inode ref index 2 namelen 3 name: tmp
|
`---dir (inode: 256):
inode ref index 0 namelen 2 name: ..

Btrfs v0.19-36-g96dbd42
===

Here we track a file, whose ino is 257, and the file is in 4 trees,
the sole FS tree and three snapshots.

Signed-off-by: Liu Bo 
---
 Makefile   |5 +-
 disk-io.c  |6 +-
 print-tree.c   |4 +-
 print-tree.h   |3 +
 walk-backref.c |  434 
 5 files changed, 448 insertions(+), 4 deletions(-)
 create mode 100644 walk-backref.c

diff --git a/Makefile b/Makefile
index 6e6f6c6..b3808b2 100644
--- a/Makefile
+++ b/Makefile
@@ -18,7 +18,7 @@ LIBS=-luuid
 
 progs = btrfsctl mkfs.btrfs btrfs-debug-tree btrfs-show btrfs-vol btrfsck \
btrfs \
-   btrfs-map-logical
+   btrfs-map-logical btrfs-walk-backref
 
 # make C=1 to enable sparse
 ifdef C
@@ -59,6 +59,9 @@ mkfs.btrfs: $(objects) mkfs.o
 btrfs-debug-tree: $(objects) debug-tree.o
gcc $(CFLAGS) -o btrfs-debug-tree $(objects) debug-tree.o $(LDFLAGS) 
$(LIBS)
 
+btrfs-walk-backref: $(objects) walk-backref.o
+   gcc $(CFLAGS) -o btrfs-walk-backref $(objects) walk-backref.o 
$(LDFLAGS) $(LIBS)
+
 btrfs-zero-log: $(objects) btrfs-zero-log.o
gcc $(CFLAGS) -o btrfs-zero-log $(objects) btrfs-zero-log.o $(LDFLAGS) 
$(LIBS)
 
diff --git a/disk-io.c b/disk-io.c
index a6e1000..342a884 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -407,7 +407,11 @@ static int find_and_setup_root(struct btrfs_root 
*tree_root,
 root, fs_info, objectid);
ret = btrfs_find_last_root(tree_root, objectid,
   &root->root_item, &root->root_key);
-   BUG_ON(ret);
+   if (ret) {
+   if (ret == 1)
+   ret = -ENOENT;
+   return ret;
+   }
 
blocksize = btrfs_level_size(root, btrfs_root_level(&root->root_item));
generation = btrfs_root_generation(&root->root_item);
diff --git a/print-tree.c b/print-tree.c
index ac575d5..7d02f9f 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -55,7 +55,7 @@ static int print_dir_item(struct extent_buffer *eb, struct 
btrfs_item *item,
return 0;
 }
 
-static int print_inode_ref_item(struct extent_buffer *eb, struct btrfs_item 
*item,
+int print_inode_ref_item(struct extent_buffer *eb, struct btrfs_item *item,
struct btrfs_inode_ref *ref)
 {
u32 total;
@@ -159,7 +159,7 @@ static void print_file_extent_item(struct extent_buffer *eb,
   btrfs_file_extent_compression(eb, fi));
 }
 
-static void print_extent_item(struct extent_buffer *eb, int slot)
+void print_extent_item(struct extent_buffer *eb, int slot)
 {
struct btrfs_extent_item *ei;
struct btrfs_extent_inline_ref *iref;
diff --git a/print-tree.h b/print-tree.h
index 495b81a..2b4664c 100644
--- a/print-tree.h
+++ b/print-tree.h
@@ -21,4 +21,7 @@
 void btrfs_print_leaf(struct btrfs_root *root, struct extent_buffer *l);
 void btrfs_print_tree(struct btrfs_root *root, struct extent_buffer *t, int 
follow);
 void btrfs_print_key(struct btrfs_disk_key *disk_key);
+int print_inode_ref_item(struct extent_buffer *eb, struct btrfs_item *item,
+ 

[PATCH] make "btrfs filesystem label" command actually work

2011-06-01 Thread Fajar A. Nugraha
This simple patch makes "btrfs filesystem label" command actually work.

On tmp branch, commit d1dc6a9, "btrfs filesystem label" functionality
was introduced. However the commit lacks one component that lets
"btrfs" accept "filesystem label" command.
Test case:

#===

# truncate -s 1G /tmp/dev.img

# losetup -f
/dev/loop0

# losetup /dev/loop0 /tmp/dev.img

# mkfs.btrfs -L old /dev/loop0

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label old on /dev/loop0
nodesize 4096 leafsize 4096 sectorsize 4096 size 1.00GB
Btrfs Btrfs v0.19

# btrfs fi la /dev/loop0
old

# btrfs fi la /dev/loop0 new

# btrfs fi la /dev/loop0
new

# mount /dev/disk/by-label/new /mnt/tmp

# btrfs fi la /dev/loop0
FATAL: the filesystem has to be unmounted

# umount /dev/loop0

# btrfs fi la /dev/loop0
new

#===

Not sure if you need a signoff for something as trivial as this, but
here it is just in case.

Signed-off-by: Fajar A. Nugraha 
---
 btrfs.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 4cd4210..84c2337 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -95,6 +95,12 @@ static struct Command commands[] = {
  "filesystem balance", "\n"
"Balance the chunks across the device."
},
+   { do_change_label, -1,
+ "filesystem label", "<device> [<newlabel>]\n"
+   "With one argument, get the label of filesystem on <device>.\n"
+   "If <newlabel> is passed, set the filesystem label to <newlabel>.\n"
+   "The filesystem must be unmounted.\n"
+   },
{ do_scan, 999,
  "device scan", "[...]\n"
"Scan all device for or the passed device for a btrfs\n"
---

-- 
Fajar


Re: strange btrfs sub list output

2011-06-01 Thread C Anthony Risinger
On Tue, May 31, 2011 at 2:32 PM, C Anthony Risinger  wrote:
> On Tue, May 31, 2011 at 1:50 PM, Andreas Philipp
>  wrote:
>>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 31.05.2011 19:40, C Anthony Risinger wrote:
>>> On Tue, May 31, 2011 at 5:00 AM, Stephane Chazelas
>>>  wrote:
>>>> 2011-05-27 13:49:52 +0200, Andreas Philipp: [...]
>>>>>> Thanks, I can understand that. What I don't get is how one
>>>>>> creates a subvol with a top-level other than 5. I might be
>>>>>> missing the obvious, though.
>>>>>>
>>>>>> If I do:
>>>>>>
>>>>>> btrfs sub create A
>>>>>> btrfs sub create A/B
>>>>>> btrfs sub snap A A/B/C
>>>>>>
>>>>>> A, A/B, A/B/C have their top-level being 5. How would I get a
>>>>>> new snapshot to be a child of A/B for instance?
>>>>>>
>>>>>> In my case, 285, was not appearing in the btrfs sub list
>>>>>> output, 287 was a child of 285 with path "data" while all I
>>>>>> did was create a snapshot of 284 (path
>>>>>> u6:10022/vm+xfs@u8/xvda1/g8/v3/data in vol 5) in
>>>>>> u6:10022/vm+xfs@u8/xvda1/g8/v3/snapshots/2011-03-30
>>>>>>
>>>>>> So I did manage to get a volume with a parent other than 5,
>>>>>> but I did not ask for it.
>>>> [...]
>>>>> Reconsidering the explanations on btrfs subvolume list in this
>>>>> thread I get the impression that a line in the output of btrfs
>>>>> subvolume list with top level other than 5 indicates that the
>>>>> backrefs from one subvolume to its parent are broken.
>>>>>
>>>>> What's your opinion on this?
>>>> [...]
>>>>
>>>> Given that I don't really get what the parent-child relationship
>>>> means in that context, I can't really comment.
>>>>
>>>> In effect, the snapshot had been created and was attached to the
>>>> right directory (but didn't appear in the sub list), and there
>>>> was an additional "data" volume that I had not asked for nor
>>>> created that had the snapshot above as parent and that did appear
>>>> in the sub list.
>>>>
>>>> It pretty much looks like a bug to me, I'd like to understand
>>>> more so that I can maybe try and avoid running into it again.
>>>
>>> i'm actually really interested in the conclusion to this thread
>>> because i _want_ to create subvols with a new parent ... i didn't
>>> realize this wasn't possible (nor the mount option) until reading
>>> this thread. this would give me a little more flexibility with
>>> initcpio hooks and the like vs. packing the btrfs root with tons of
>>> hidden files [subvols] or using IDs directly ...
>>>
>>> i tried absolutely everything i could think of to reproduce this
>>> but all subvols ended up having a top level id of `5`.
>>>
>>> ... so, is there any known way to _purposefully_ create parented
>>> subvols with the current tools?
>>
>> Hopefully, I can help clarify this a little bit. In fact, this is the
>> 'usual' case. With the attached patch to the master branch of
>> btrfs-progs-unstable you can 'watch' how the btrfs subvolume list
>> command builds the full path of the listed subvolumes. Additionally,
>> it gives you the IDs of the parent subvolumes. See the following example.
>>
>> ID 256 top level 5 path test1
>> ID 257 top level 256 path test1.1
>> ID 257 top level 5 path test1/test1.1
>> ID 258 top level 5 path test2
>> ID 259 top level 258 path test2.1
>> ID 259 top level 5 path test2/test2.1
>>
>> From the second line you see that subvolume ID 256 really is ID 257's
>> parent. Additionally, only test1 and test2 have parent ID 5 or in your
>> terminology are in the btrfs root.
>
> aaah, ok ... this is what i thought was happening too after taking a
> peek at the sources (albeit i don't write any C) and seems to match
> what Hugo was saying if i understand him correctly.
>
> this also makes sense what you said about a broken link ... since
> normally the `btrfs` tool will not let you remove a subvol that has
> other subvols nested within it ... though *technically* it does not
> seem to matter, yes?  must have been a fluke/bug in the `btrfs` tool
> where a higher level subvol was removed before its child somehow, is
> this correct?  or is the FS itself slightly broken when this happens?
>
> yeah i know that's kind of "my terminology" :-) ... i've spent a lot
> of time explaining btrfs concepts to others and that term always
> seemed to make the most sense to people ... `top-level` can change,
> `default` can change, etc, etc ... but `the btrfs root` can only mean
> one thing -- the most "bottomest" of the bottom (or top, if you prefer
> :-)
>
> i'll try this out later tonight, thanks.

after booting the correct kernel in KVM, this works exactly as
advertised by the commit that added it, and by your explanation --
thanks -- this will be of much use wrt designing "sub-root" layouts
for advanced initramfs recovery options ... i always felt limited by
the requirement to be in the "btrfs root", and mounting by id loses
some flexibility, eg. when trying to use names like pointers/symlinks.
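
The path building that Andreas describes above (each subvolume records the ID of its parent and its name inside that parent, and the full path is assembled by following parents up to ID 5) can be sketched as follows. This is an illustration only, not the btrfs-progs implementation: the in-memory table, the struct and the function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define FS_TREE_ID 5 /* the top-level tree, "the btrfs root" in this thread */

/* Hypothetical flattened view of one root backref. */
struct subvol {
	uint64_t id;      /* subvolume ID */
	uint64_t parent;  /* ID of the containing subvolume */
	const char *name; /* name inside the parent */
};

static const struct subvol *find_subvol(const struct subvol *t, size_t n,
					uint64_t id)
{
	for (size_t i = 0; i < n; i++)
		if (t[i].id == id)
			return &t[i];
	return NULL;
}

/*
 * Write the path of subvolume `id`, relative to the top level, into buf.
 * Returns 0 on success, -1 if a parent is missing from the table, the
 * parent chain loops, or the path does not fit.
 */
static int subvol_path(const struct subvol *t, size_t n, uint64_t id,
		       char *buf, size_t len)
{
	char tmp[4096];
	size_t depth = 0;

	assert(t != NULL && buf != NULL && len > 0);
	buf[0] = '\0';
	while (id != FS_TREE_ID) {
		const struct subvol *s = find_subvol(t, n, id);
		int w;

		if (s == NULL || ++depth > n)
			return -1;
		/* Prepend this component to what has been built so far. */
		if (buf[0] != '\0')
			w = snprintf(tmp, sizeof(tmp), "%s/%s", s->name, buf);
		else
			w = snprintf(tmp, sizeof(tmp), "%s", s->name);
		if (w < 0 || (size_t)w >= sizeof(tmp) || (size_t)w >= len)
			return -1;
		memcpy(buf, tmp, (size_t)w + 1);
		id = s->parent;
	}
	return 0;
}
```

Fed the four subvolumes from the example listing (256 "test1" under 5, 257 "test1.1" under 256, 258 "test2" under 5, 259 "test2.1" under 258), the path for ID 257 comes out as test1/test1.1, matching the "top level 5" line in that output.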

... now i can put subvols anywhere, and user/script only needs to
determine the st