On Wednesday January 24, [EMAIL PROTECTED] wrote:
> Here you go Neil:
> 
> p34:~# echo 512 > /sys/block/md3/md/stripe_cache_size
> p34:~# echo 1024 > /sys/block/md3/md/stripe_cache_size
> p34:~# echo 2048 > /sys/block/md3/md/stripe_cache_size
> p34:~# echo 4096 > /sys/block/md3/md/stripe_cache_size
> p34:~# echo 8192 > /sys/block/md3/md/stripe_cache_size
> <...... FROZEN ........> 
> 
> I ran echo t > /proc/sysrq-trigger and then copied the relevant parts of 
> kern.log and I am attaching them to this e-mail.
> 
> Please confirm this is what you needed.

Perfect.  Thanks.

This bit:

Jan 24 18:22:21 p34 kernel: [273475.825645] bash          D C7BEBAAC     0 16821  16820                     (NOTLB)
Jan 24 18:22:21 p34 kernel: [273475.825653]        c7bebac0 00000082 00000002 c7bebaac c7bebaa8 00000000 5b48e428 c6cdc560
Jan 24 18:22:21 p34 kernel: [273475.825665]        c7bebad8 00010b03 00000011 00000009 cb093a53 0000f8b2 00017216 c6cdc66c
Jan 24 18:22:21 p34 kernel: [273475.825838]        c1fe3280 00000001 c20c70c0 c3272058 f75c4a80 c7bebad8 c016a258 f7b12520
Jan 24 18:22:21 p34 kernel: [273475.825850] Call Trace:
Jan 24 18:22:21 p34 kernel: [273475.825853]  [<c016a258>] dput+0x18/0x150
Jan 24 18:22:21 p34 kernel: [273475.825857]  [<c0161f84>] __link_path_walk+0xb04/0xc90
Jan 24 18:22:21 p34 kernel: [273475.825862]  [<c03600ad>] md_write_start+0x8d/0x120
Jan 24 18:22:21 p34 kernel: [273475.825867]  [<c012eac0>] autoremove_wake_function+0x0/0x50
Jan 24 18:22:21 p34 kernel: [273475.825871]  [<c03557a8>] make_request+0x38/0x560
Jan 24 18:22:21 p34 kernel: [273475.825876]  [<c02409ce>] xfs_log_move_tail+0x3e/0x1b0
Jan 24 18:22:21 p34 kernel: [273475.825881]  [<c023c9fa>] xfs_iomap+0x2ca/0x720
Jan 24 18:22:21 p34 kernel: [273475.825885]  [<c026d77a>] generic_make_request+0xda/0x150
Jan 24 18:22:21 p34 kernel: [273475.825890]  [<c026fe32>] submit_bio+0x72/0x110
Jan 24 18:22:21 p34 kernel: [273475.825895]  [<c013da6b>] mempool_alloc+0x2b/0xf0
Jan 24 18:22:21 p34 kernel: [273475.825899]  [<c034f1a0>] raid5_mergeable_bvec+0x0/0x90
Jan 24 18:22:21 p34 kernel: [273475.825904]  [<c017c052>] __bio_add_page+0x102/0x190
Jan 24 18:22:21 p34 kernel: [273475.825909]  [<c017c117>] bio_add_page+0x37/0x50
Jan 24 18:22:21 p34 kernel: [273475.826073]  [<c025be8b>] xfs_submit_ioend_bio+0x1b/0x30
Jan 24 18:22:21 p34 kernel: [273475.826078]  [<c025c10e>] xfs_page_state_convert+0x26e/0xff0
Jan 24 18:22:21 p34 kernel: [273475.826082]  [<c0155509>] slab_destroy+0x59/0x90
Jan 24 18:22:21 p34 kernel: [273475.826088]  [<c025d102>] xfs_vm_writepage+0x62/0x100
Jan 24 18:22:21 p34 kernel: [273475.826092]  [<c014396d>] shrink_inactive_list+0x5dd/0x8a0
Jan 24 18:22:21 p34 kernel: [273475.826097]  [<c0143cd1>] shrink_zone+0xa1/0x100
Jan 24 18:22:21 p34 kernel: [273475.826102]  [<c01447e0>] try_to_free_pages+0x140/0x260
Jan 24 18:22:21 p34 kernel: [273475.826106]  [<c013fb4f>] __alloc_pages+0x13f/0x2f0
Jan 24 18:22:21 p34 kernel: [273475.826111]  [<c0350dd3>] grow_one_stripe+0x93/0x100
Jan 24 18:22:21 p34 kernel: [273475.826115]  [<c0350ee6>] raid5_store_stripe_cache_size+0xa6/0xc0
Jan 24 18:22:21 p34 kernel: [273475.826120]  [<c0361a83>] md_attr_store+0x73/0x90
Jan 24 18:22:21 p34 kernel: [273475.826125]  [<c0192302>] sysfs_write_file+0xa2/0x100
Jan 24 18:22:21 p34 kernel: [273475.826129]  [<c01595f6>] vfs_write+0xa6/0x160
Jan 24 18:22:21 p34 kernel: [273475.826134]  [<c0192260>] sysfs_write_file+0x0/0x100
Jan 24 18:22:21 p34 kernel: [273475.826138]  [<c0159d31>] sys_write+0x41/0x70
Jan 24 18:22:21 p34 kernel: [273475.826303]  [<c0103138>] syscall_call+0x7/0xb
Jan 24 18:22:21 p34 kernel: [273475.826307]  =======================

Tells me what is happening.
We try to allocate memory to grow the stripe cache (__alloc_pages).
That requires memory to be freed, so shrink_zone gets called, which
calls into the 'xfs' filesystem, which eventually tries to write to
the raid5 array.  The raid5 array is currently 'clean', so we need to
mark the superblock as dirty first (md_write_start), but that needs a
lock that is being held while we grow the stripe cache.  Deadlock.
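
To make the cycle easier to see, here is a toy user-space model of it.
This is only a sketch, not kernel code: the function names echo the
ones in the trace, but reconfig_lock, memory_is_tight and the string
return values are made up for illustration.  The lock is taken
non-blocking so the model reports the problem instead of hanging.

```python
import threading

# Hypothetical stand-in for the lock held while the stripe cache is resized.
# The real kernel lock is not named in this thread; this name is an assumption.
reconfig_lock = threading.Lock()


def md_write_start():
    """Model of marking a clean array dirty, which needs the lock."""
    # Non-blocking acquire: a real caller would sleep here forever.
    if not reconfig_lock.acquire(blocking=False):
        return "would block"
    reconfig_lock.release()
    return "ok"


def reclaim_memory():
    """Model of shrink_zone -> xfs writepage -> raid5 make_request."""
    return md_write_start()


def alloc_page(memory_is_tight):
    """Model of an allocation that is allowed to start IO to free memory."""
    if memory_is_tight:
        return reclaim_memory()
    return "ok"


def grow_stripe_cache(memory_is_tight):
    """Model of the sysfs store path: allocate while holding the lock."""
    with reconfig_lock:
        return alloc_page(memory_is_tight)


if __name__ == "__main__":
    print(grow_stripe_cache(memory_is_tight=False))
    print(grow_stripe_cache(memory_is_tight=True))
```

With plenty of free memory the allocation never enters reclaim and the
resize completes.  Once memory is tight, the same thread ends up
wanting a lock it already holds.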

So the patch I posted (changing GFP_KERNEL to GFP_NOIO) will avoid
this as it will then fail the allocation rather than initiate IO.
However it might be better if I can find a way to avoid the
deadlock....
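
As a rough illustration of the trade-off, the sketch below models an
allocator with two modes.  It is a deliberately simplified toy, not the
kernel allocator: free_pages, lock_held and the result strings are
hypothetical, and a real GFP_NOIO allocation can still reclaim memory
in ways that need no IO.

```python
GFP_KERNEL = "GFP_KERNEL"  # allocation may start IO to free memory
GFP_NOIO = "GFP_NOIO"      # allocation must not start IO


def alloc_page(gfp, free_pages, lock_held):
    """Toy allocator: returns 'allocated', 'failed' or 'deadlock'."""
    if free_pages > 0:
        return "allocated"
    if gfp == GFP_NOIO:
        # Not allowed to write anything back, so give up.
        return "failed"
    # GFP_KERNEL: reclaim writes to the array, which needs the lock.
    return "deadlock" if lock_held else "allocated"


def grow_stripe_cache(target, current, free_pages, gfp):
    """Grow from current to target stripes while holding the (modelled) lock."""
    while current < target:
        result = alloc_page(gfp, free_pages, lock_held=True)
        if result != "allocated":
            # Stop at whatever size was reached.
            return current, result
        free_pages -= 1
        current += 1
    return current, "ok"


if __name__ == "__main__":
    print(grow_stripe_cache(8, 4, 10, GFP_KERNEL))
    print(grow_stripe_cache(8, 4, 2, GFP_KERNEL))
    print(grow_stripe_cache(8, 4, 2, GFP_NOIO))
```

In this model the GFP_NOIO case leaves the cache smaller than was asked
for but returns, whereas the GFP_KERNEL case never comes back.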

I'll see what I can come up with.

Thanks,
NeilBrown