Hi,

Today I ran cleanerd on 2 clean partitions.

One worked flawlessly. On the other one, this error occurred:

BUG: unable to handle kernel NULL pointer dereference at 00000ccd
IP: [<f8341fcc>] nilfs_gc_iget+0x4c/0x130 [nilfs2]
*pdpt = 0000000013d32001 *pde = 0000000000000000
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1f.0/resource
Modules linked in: nvidia(P) vmnet vmblock vmci vmmon fcpci(P) capi 
capifs kernelcapi nilfs2 scsi_wait_scan

Pid: 8551, comm: nilfs_cleanerd Tainted: P           (2.6.29.2server #1) 
P5QL-E
EIP: 0060:[<f8341fcc>] EFLAGS: 00010202 CPU: 3
EIP is at nilfs_gc_iget+0x4c/0x130 [nilfs2]
EAX: 00000ccd EBX: 00000000 ECX: 00000002 EDX: f6897c00
ESI: 0000004e EDI: 00000002 EBP: 00000000 ESP: c3801ca0
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process nilfs_cleanerd (pid: 8551, ti=c3800000 task=d4d10330 
task.ti=c3800000)
Stack:
 e11854c0 f6857a00 f6897c3c 00000000 00000000 00000000 e1185500 f8342e06
 00000002 00000000 c3801d60 00000044 f7450990 d4d10484 00000001 00000001
 00000000 00000000 00020050 00000202 00000000 00000000 00000000 c3801d58
Call Trace:
 [<f8342e06>] nilfs_ioctl_do_move_blocks+0x76/0x3e0 [nilfs2]
 [<f8342379>] nilfs_ioctl_wrap_copy+0x169/0x1f0 [nilfs2]
 [<f834292e>] nilfs_ioctl_prepare_clean_segments+0x6e/0x130 [nilfs2]
 [<f8342d90>] nilfs_ioctl_do_move_blocks+0x0/0x3e0 [nilfs2]
 [<f833deb3>] nilfs_clean_segments+0x83/0x200 [nilfs2]
 [<f83423c6>] nilfs_ioctl_wrap_copy+0x1b6/0x1f0 [nilfs2]
 [<f8342810>] nilfs_ioctl+0x3d0/0x480 [nilfs2]
 [<f8342c90>] nilfs_ioctl_do_get_bdescs+0x0/0xb0 [nilfs2]
 [<c0312ddf>] ehci_irq+0x17f/0x340
 [<c0168a78>] page_add_new_anon_rmap+0x28/0x60
 [<c013ddfe>] getnstimeofday+0x4e/0x120
 [<f8342440>] nilfs_ioctl+0x0/0x480 [nilfs2]
 [<c017f50b>] vfs_ioctl+0x2b/0x90
 [<c017f87b>] do_vfs_ioctl+0x1eb/0x530
 [<c012d45b>] run_timer_softirq+0x15b/0x190
 [<c0128d74>] __do_softirq+0x94/0x140
 [<c017fbfd>] sys_ioctl+0x3d/0x70
 [<c0103131>] sysenter_do_call+0x12/0x25
 [<c0400000>] pci_read_bridge_bases+0x20/0x350
Code: f8 69 c0 01 00 37 9e c1 e8 18 c1 e0 02 89 44 24 08 8b 92 dc 00 00 
00 01 d0 89 44 24 08 8b 00 85 c0 75 08 eb 2b 85 c9 74 27 89 c8 <8b> 08 
0f 18 01 90 3b 70 20 89 c3 75 ed 8b 50 9c 8b 40 98 31 ea
EIP: [<f8341fcc>] nilfs_gc_iget+0x4c/0x130 [nilfs2] SS:ESP 0068:c3801ca0
---[ end trace 573da78de6d7c815 ]---
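
For anyone reading the trace above, here is a minimal sketch of how the key fields of such an oops can be pulled out with a script. The helper names (fault_address, registers, faulting_frame) are made up for this illustration and are not part of any nilfs or kernel tool; the sample text is copied from the report above. The script only checks which register holds the faulting address. Here that is EAX, which is consistent with the marked bytes <8b> 08 in the Code: line (a load through EAX), but it does not by itself explain why the pointer was bad.

```python
import re

# Lines copied from the oops report above.
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at 00000ccd
EIP is at nilfs_gc_iget+0x4c/0x130 [nilfs2]
EAX: 00000ccd EBX: 00000000 ECX: 00000002 EDX: f6897c00
ESI: 0000004e EDI: 00000002 EBP: 00000000 ESP: c3801ca0
"""


def fault_address(text):
    """Return the faulting address from the 'dereference at' line, or None."""
    m = re.search(r"dereference at ([0-9a-f]+)", text)
    return int(m.group(1), 16) if m else None


def registers(text):
    """Return a dict of 32-bit register dumps such as 'EAX: 00000ccd'."""
    pairs = re.findall(r"\b(E[A-Z]{2}): ([0-9a-f]{8})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_frame(text):
    """Return (symbol, offset, size, module) from the 'EIP is at' line."""
    m = re.search(r"EIP is at (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?", text)
    if m is None:
        return None
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16), m.group(4)


addr = fault_address(OOPS)
regs = registers(OOPS)
# Registers whose value equals the faulting address are candidates for
# the pointer that was dereferenced.
suspects = [name for name, value in regs.items() if value == addr]

symbol, offset, size, module = faulting_frame(OOPS)
print(f"fault at {addr:#x} in {symbol}+{offset:#x}/{size:#x} [{module}]")
print("registers holding the faulting address:", suspects)
```

The symbol+offset pair is what one would feed to a disassembler or debugger on the matching nilfs2 module build to find the source line.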

Bye,
Arendt David

Ryusuke Konishi wrote:
> Hi,
> On Tue, 05 May 2009 21:32:27 +0200, David Arendt wrote:
>   
>> Hi,
>>
>> After the cleaner had been running for 2 hours and had freed up 200 GB of space,
>> I had the following crash:
>>
>> nilfs_cpfile_delete_checkpoints: cannot delete block: cno=76377, range = 
>> [75980, 76972)
>> NILFS: GC failed during preparation: cannot delete checkpoints: err=-2
>> NILFS_PAGE_BUG(c10d67e0): cnt=2 index#=74049180 flags=0x40000835 
>> mapping=f71d10d4 ino=0
>>  BH[0] d3cbdb30: cnt=2 block#=74049180 state=0x2002b
>> ------------[ cut here ]------------
>> kernel BUG at /home/admin/x/nilfs-2.0.12/fs/btnode.c:233!
>>     
>
> The log shows that a btree routine, nilfs_btree_propagate(), has detected an
> orphan btree node in the page cache.  This looks like another inconsistency.
>
> I'd like to know whether this is a regression from the previous patch or not
> (I guess it's not). If you see this on new volumes, please let me
> know.
>
> I'll dig into the btree code to hunt this down later.
>
> Thanks,
> Ryusuke Konishi
>
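
As a side note on the quoted log lines "cno=76377, range = [75980, 76972)" and "err=-2": kernel functions conventionally report failure as a negated errno value, so the number can be decoded with a few lines of Python. This is only a sketch; describe_kernel_err is a name invented here, and the mapping shown is the generic errno table, not anything specific to nilfs.

```python
import errno
import os


def describe_kernel_err(err):
    """Map a negative kernel-style return value to (symbolic name, message)."""
    code = -err
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


# Values copied from the quoted log lines.
cno, start, end = 76377, 75980, 76972
err = -2

name, message = describe_kernel_err(err)
print(f"err={err} is {name} ({message})")

# The failing checkpoint number lies inside the half-open range that the
# cleaner asked to delete.
print("cno inside requested range:", start <= cno < end)
```

So the message reads as: a block for checkpoint 76377, which is inside the requested range, was reported as not found (ENOENT) while the checkpoints were being deleted.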
>   
>> invalid opcode: 0000 [#1] PREEMPT SMP
>> last sysfs file: /sys/devices/pci0000:00/0000:00:1f.0/resource
>> Modules linked in: nvidia(P) vmnet vmblock vmci vmmon fcpci(P) capi 
>> capifs kernelcapi nilfs2 scsi_wait_scan
>>
>> Pid: 2285, comm: segctord Tainted: P           (2.6.29.2server #1) P5QL-E
>> EIP: 0060:[<f8331680>] EFLAGS: 00010282 CPU: 2
>> EIP is at nilfs_btnode_prepare_change_key+0x170/0x180 [nilfs2]
>> EAX: 00000038 EBX: 003ba23a ECX: 00000092 EDX: 0307b000
>> ESI: 00000000 EDI: 00000000 EBP: f2783afc ESP: f6c13ce0
>>  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>> Process segctord (pid: 2285, ti=f6c12000 task=f75d5cc0 task.ti=f6c12000)
>> Stack:
>>  f83366b8 00000001 f2783af8 00000000 f71d10d4 d3cbdb30 003ba248 00000000
>>  f833184d 00000000 f2783ac8 f2783ad4 f71d1044 f83328c9 f2783ae8 f83436a4
>>  00000000 f2783a78 f71d1044 f83342fe 00000001 00000001 02783a78 f2783ac8
>> Call Trace:
>>  [<f83366b8>] nilfs_dat_prepare_entry+0x18/0x20 [nilfs2]
>>  [<f833184d>] nilfs_bmap_prepare_update+0x2d/0x60 [nilfs2]
>>  [<f83328c9>] nilfs_btree_prepare_update_v+0xe9/0x100 [nilfs2]
>>  [<f83342fe>] nilfs_btree_propagate_v+0x17e/0x210 [nilfs2]
>>  [<f833538a>] nilfs_btree_propagate+0xba/0x160 [nilfs2]
>>  [<f8331aa6>] nilfs_bmap_propagate+0x26/0x40 [nilfs2]
>>  [<f833e42e>] nilfs_collect_file_node+0x1e/0x50 [nilfs2]
>>  [<f833a5a1>] nilfs_segctor_apply_buffers+0x51/0xb0 [nilfs2]
>>  [<f833a975>] nilfs_segctor_scan_file+0x125/0x1f0 [nilfs2]
>>  [<f833e410>] nilfs_collect_file_node+0x0/0x50 [nilfs2]
>>  [<c019177b>] __getblk+0x7b/0x210
>>  [<f8339a5c>] nilfs_segbuf_extend_segsum+0x1c/0x50 [nilfs2]
>>  [<f833cb5d>] nilfs_segctor_do_construct+0x166d/0x18c0 [nilfs2]
>>  [<f8341898>] nilfs_palloc_commit_free_entry+0xc8/0x100 [nilfs2]
>>  [<c011c25b>] update_curr+0x7b/0xe0
>>  [<c011f9bb>] finish_task_switch+0x2b/0xa0
>>  [<f833199f>] nilfs_bmap_test_and_clear_dirty+0x2f/0x40 [nilfs2]
>>  [<f8330e2e>] nilfs_mdt_fetch_dirty+0xe/0x30 [nilfs2]
>>  [<f833a4c3>] nilfs_test_metadata_dirty+0x93/0xb0 [nilfs2]
>>  [<f833a534>] nilfs_segctor_confirm+0x54/0x70 [nilfs2]
>>  [<f833d009>] nilfs_segctor_construct+0x99/0xb0 [nilfs2]
>>  [<f833d7ba>] nilfs_segctor_thread+0x11a/0x2b0 [nilfs2]
>>  [<f833d310>] nilfs_construction_timeout+0x0/0x10 [nilfs2]
>>  [<f833d6a0>] nilfs_segctor_thread+0x0/0x2b0 [nilfs2]
>>  [<c0136e92>] kthread+0x42/0x70
>>  [<c0136e50>] kthread+0x0/0x70
>>  [<c010391b>] kernel_thread_helper+0x7/0x1c
>> Code: ff ff ff 8b 54 24 14 8b 42 08 e8 1c b8 e1 c7 89 f8 83 c4 24 5b 5e 
>> 5f 5d c3 e8 3d 78 0d c8 eb b4 0f 0b eb fe 89 d0 e8 40 e7 ff ff <0f> 0b 
>> eb fe 89 d0 e8 25 b7 e1 c7 e9 2d ff ff ff 53 b9 ff ff ff
>> EIP: [<f8331680>] nilfs_btnode_prepare_change_key+0x170/0x180 [nilfs2] 
>> SS:ESP 0068:f6c13ce0
>> ---[ end trace 0a4368694028129d ]---
>> note: segctord[2285] exited with preempt_count 1
>>
>> Bye,
>> David Arendt
>>
>> David Arendt wrote:
>>     
>>> Hi,
>>>
>>> I have applied your patch now, and the garbage collector hasn't crashed
>>> so far. I have chosen not to reformat for further testing, as there
>>> are only temporary files on this partition, and losing them would not
>>> be a big problem.
>>>
>>> Bye,
>>> David Arendt
>>>
>>> Ryusuke Konishi wrote:
>>>
>>>> Hi!
>>>> On Tue,  5 May 2009 17:26:48 +0200, [email protected] wrote:
>>>>
>>>>> Thank you.
>>>>> I will try this patch in a few hours.  If I see it correctly the
>>>>> patch will prevent this error in future and will not correct the
>>>>> current error, so I suppose that after applying the patch I will
>>>>> need to reformat the volume.
>>>>>
>>>> I expect the patch will even fix the current error on the next GC, but
>>>> you had better reformat the volume for safety.
>>>>
>>>> Ryusuke Konishi
>>>>
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://www.nilfs.org/mailman/listinfo/users
