Re: Crash in io_ctl_drop_pages after mount with csum errors

2012-01-06 Thread David Sterba
On Fri, Jan 06, 2012 at 03:17:59PM +0800, Li Zefan wrote:
> > [ 1499.946409] BUG: unable to handle kernel NULL pointer dereference at 0001
> > [ 1499.946437] IP: [] io_ctl_drop_pages+0x37/0x70 [btrfs]
> 
> 0x01 is weird; I don't know how it occurred. Nevertheless, we need this fix:
> 
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index ec23d43..81771ca 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -319,9 +319,11 @@ static void io_ctl_drop_pages(struct io_ctl *io_ctl)
>  	io_ctl_unmap_page(io_ctl);
>  
>  	for (i = 0; i < io_ctl->num_pages; i++) {
> -		ClearPageChecked(io_ctl->pages[i]);
> -		unlock_page(io_ctl->pages[i]);
> -		page_cache_release(io_ctl->pages[i]);
> +		if (io_ctl->pages[i]) {
> +			ClearPageChecked(io_ctl->pages[i]);
> +			unlock_page(io_ctl->pages[i]);
> +			page_cache_release(io_ctl->pages[i]);
> +		}
>  	}
>  }

Mount did not crash with this fix, but anything that touches files
still causes the crash. umount is still stuck the same way as before.
I won't touch the partitions, in case you have patches to test.


david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Crash in io_ctl_drop_pages after mount with csum errors

2012-01-05 Thread Li Zefan
David Sterba wrote:
> I mounted a multi-volume fs, created not long ago under a 3.1-based
> kernel, with v3.2-rc7-83-g115e8e7, and it crashed immediately.
> It's quite possible that the disk is to blame (it's an old 160G
> SP1614C), but syslog does not contain any error messages. I'm not sure
> whether the fs was cleanly unmounted (it seems not), but either way I do
> not expect a crash.
> 
> Label: none  uuid: 5f06f9eb-9736-49f7-91a2-2f45522512ef
> Total devices 4 FS bytes used 1.38GB
> devid 4 size 34.00GB used 34.00GB path /dev/sdg8
> devid 3 size 34.00GB used 34.00GB path /dev/sdg7
> devid 2 size 34.00GB used 34.00GB path /dev/sdg6
> devid 1 size 34.00GB used 34.00GB path /dev/sdg5
> 
> mount options: compress-force=lzo,space_cache,autodefrag,inode_cache
> 
> [ 1461.732855] btrfs: force lzo compression
> [ 1461.732876] btrfs: enabling auto defrag
> [ 1461.732893] btrfs: enabling inode map caching
> [ 1461.732907] btrfs: disk space caching is enabled
> [ 1499.796181] btrfs: csum mismatch on free space cache
> [ 1499.796266] btrfs: failed to load free space cache for block group 29360128
> [ 1499.888699] btrfs csum failed ino 18446744073709551604 off 65536 csum 2566472073 private 1925235876
> [ 1499.26] btrfs csum failed ino 18446744073709551604 off 327680 csum 2566472073 private 1925235876
> [ 1499.906229] btrfs csum failed ino 18446744073709551604 off 0 csum 1695430581 private 1170642078
> [ 1499.906345] btrfs csum failed ino 18446744073709551604 off 262144 csum 2566472073 private 1925235876
> [ 1499.906446] btrfs csum failed ino 18446744073709551604 off 524288 csum 2566472073 private 1925235876
> [ 1499.924469] btrfs csum failed ino 18446744073709551604 off 196608 csum 2566472073 private 1925235876
> [ 1499.924574] btrfs csum failed ino 18446744073709551604 off 458752 csum 2566472073 private 1925235876
> [ 1499.946076] btrfs csum failed ino 18446744073709551604 off 131072 csum 2566472073 private 1925235876
> [ 1499.946217] btrfs csum failed ino 18446744073709551604 off 393216 csum 2566472073 private 1925235876
> [ 1499.946318] btrfs csum failed ino 18446744073709551604 off 0 csum 1695430581 private 1170642078
> [ 1499.946362] btrfs: error reading free space cache

We have inconsistent data on disk in both the free space cache and the free ino cache.

> [ 1499.946409] BUG: unable to handle kernel NULL pointer dereference at 0001
> [ 1499.946437] IP: [] io_ctl_drop_pages+0x37/0x70 [btrfs]

0x01 is weird; I don't know how it occurred. Nevertheless, we need this fix:

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index ec23d43..81771ca 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -319,9 +319,11 @@ static void io_ctl_drop_pages(struct io_ctl *io_ctl)
 	io_ctl_unmap_page(io_ctl);
 
 	for (i = 0; i < io_ctl->num_pages; i++) {
-		ClearPageChecked(io_ctl->pages[i]);
-		unlock_page(io_ctl->pages[i]);
-		page_cache_release(io_ctl->pages[i]);
+		if (io_ctl->pages[i]) {
+			ClearPageChecked(io_ctl->pages[i]);
+			unlock_page(io_ctl->pages[i]);
+			page_cache_release(io_ctl->pages[i]);
+		}
 	}
 }

I'll resend the patch along with my other pending patches for 3.3.


Crash in io_ctl_drop_pages after mount with csum errors

2012-01-05 Thread David Sterba
I mounted a multi-volume fs, created not long ago under a 3.1-based
kernel, with v3.2-rc7-83-g115e8e7, and it crashed immediately.
It's quite possible that the disk is to blame (it's an old 160G
SP1614C), but syslog does not contain any error messages. I'm not sure
whether the fs was cleanly unmounted (it seems not), but either way I do
not expect a crash.

Label: none  uuid: 5f06f9eb-9736-49f7-91a2-2f45522512ef
Total devices 4 FS bytes used 1.38GB
devid 4 size 34.00GB used 34.00GB path /dev/sdg8
devid 3 size 34.00GB used 34.00GB path /dev/sdg7
devid 2 size 34.00GB used 34.00GB path /dev/sdg6
devid 1 size 34.00GB used 34.00GB path /dev/sdg5

mount options: compress-force=lzo,space_cache,autodefrag,inode_cache

[ 1461.732855] btrfs: force lzo compression
[ 1461.732876] btrfs: enabling auto defrag
[ 1461.732893] btrfs: enabling inode map caching
[ 1461.732907] btrfs: disk space caching is enabled
[ 1499.796181] btrfs: csum mismatch on free space cache
[ 1499.796266] btrfs: failed to load free space cache for block group 29360128
[ 1499.888699] btrfs csum failed ino 18446744073709551604 off 65536 csum 2566472073 private 1925235876
[ 1499.26] btrfs csum failed ino 18446744073709551604 off 327680 csum 2566472073 private 1925235876
[ 1499.906229] btrfs csum failed ino 18446744073709551604 off 0 csum 1695430581 private 1170642078
[ 1499.906345] btrfs csum failed ino 18446744073709551604 off 262144 csum 2566472073 private 1925235876
[ 1499.906446] btrfs csum failed ino 18446744073709551604 off 524288 csum 2566472073 private 1925235876
[ 1499.924469] btrfs csum failed ino 18446744073709551604 off 196608 csum 2566472073 private 1925235876
[ 1499.924574] btrfs csum failed ino 18446744073709551604 off 458752 csum 2566472073 private 1925235876
[ 1499.946076] btrfs csum failed ino 18446744073709551604 off 131072 csum 2566472073 private 1925235876
[ 1499.946217] btrfs csum failed ino 18446744073709551604 off 393216 csum 2566472073 private 1925235876
[ 1499.946318] btrfs csum failed ino 18446744073709551604 off 0 csum 1695430581 private 1170642078
[ 1499.946362] btrfs: error reading free space cache
[ 1499.946409] BUG: unable to handle kernel NULL pointer dereference at 0001
[ 1499.946437] IP: [] io_ctl_drop_pages+0x37/0x70 [btrfs]
[ 1499.946515] PGD 125ce4067 PUD 126941067 PMD 0
[ 1499.946539] Oops: 0002 [#1] PREEMPT SMP
[ 1499.946560] CPU 0
[ 1499.946569] Modules linked in: btrfs zlib_deflate aoe nfs lockd fscache auth_rpcgss nfs_acl sunrpc af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave powernow_k8 mperf snd_hda_codec_analog snd_hda_intel snd_hda_codec sg sp5100_tco snd_hwdep snd_pcm amd64_edac_mod snd_timer pcspkr edac_core snd edac_mce_amd firewire_ohci firewire_core crc_itu_t i2c_piix4 k8temp asus_atk0110 soundcore snd_page_alloc sky2 autofs4 nouveau ttm drm_kms_helper drm processor i2c_algo_bit mxm_wmi wmi video thermal_sys button pata_via sata_promise sata_via ata_generic sata_sil pata_atiixp
[ 1499.946832]
[ 1499.946843] Pid: 2799, comm: rm Not tainted 3.2.0-rc7-1-desktop #1
[ 1499.946880] RIP: 0010:[]  [] io_ctl_drop_pages+0x37/0x70 [btrfs]
[ 1499.946936] RSP: 0018:880127c6bc48  EFLAGS: 00010202
[ 1499.946951] RAX: 0001 RBX: 880127c6bcf0 RCX: 88012ffa3000
[ 1499.946971] RDX:  RSI: ea0003ec0c80 RDI: ea0003ec0c80
[ 1499.946989] RBP: 0001 R08: 6400 R09: a8000fb03200
[ 1499.947008] R10: 57ffda4fd1ec0c80 R11:  R12: 0001
[ 1499.947028] R13: 880126d519b0 R14: 0002005a R15: 0001
[ 1499.947052] FS:  7f6a9aa1c700() GS:88012fc0() knlGS:
[ 1499.947078] CS:  0010 DS:  ES:  CR0: 8005003b
[ 1499.947097] CR2: 0001 CR3: 0001275e5000 CR4: 06f0
[ 1499.947120] DR0:  DR1:  DR2: 
[ 1499.947143] DR3:  DR6: 0ff0 DR7: 0400
[ 1499.947167] Process rm (pid: 2799, threadinfo 880127c6a000, task 880126378280)
[ 1499.947551] Stack:
[ 1499.947551]   880127c6bcf0  a0457e2e
[ 1499.947551]  0020 ea0003ec0c80 880126d51980 880127c6bd48
[ 1499.947551]  880126d51980 0de0 880125d13720 8801267e6600
[ 1499.947551] Call Trace:
[ 1499.947551]  [] io_ctl_prepare_pages.isra.31+0x9e/0x150 [btrfs]
[ 1499.947551]  [] __load_free_space_cache+0x1ff/0x610 [btrfs]
[ 1499.947551]  [] load_free_ino_cache+0xd4/0x100 [btrfs]
[ 1499.947551]  [] start_caching+0x86/0x130 [btrfs]
[ 1499.947551]  [] btrfs_return_ino+0xb5/0x170 [btrfs]
[ 1499.947551]  [] btrfs_evict_inode+0x2cb/0x320 [btrfs]
[ 1499.947551]  [] evict+0x9f/0x1a0
[ 1499.947551]  [] do_unlinkat+0x15f/0x1d0
[ 1499.947551]  [] system_call_fastpath+0x16/0x1b
[ 1499.947551]  [<7f6a9a5539b7>] 0x7f6a9a5539b6
[ 1499.94