Hello Matthew,

Wednesday, October 25, 2006, 7:08:56 AM, you wrote:

MA> Robert Milkowski wrote:
>> Hi.
>> 
>>     On NFS clients which mount the file system f3-1/d611 I can see 3-5s
>> periods of 100% busy (iostat) with almost no IOs issued to the NFS server;
>> on the NFS server at the same time disk activity is almost 0 (in both
>> iostat and zpool iostat). However, CPU activity in SYS increases during
>> those periods.

>> A different time period, when disk activity was small:
>> 
>> # lockstat -kgIw sleep 10 | less

MA> Did you happen to get 'lockstat -kgIW' output while the problem was 
MA> occurring?  (note the capital W)  I'm not sure how to interpret the -w
MA> output... (and sorry I gave you the wrong flags before).


Sorry, I just blindly copied the previous command (only changing i to I) -
I'll use the capital W next time.
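
When the problem shows up again I'll run:

# lockstat -kgIW sleep 10

and post the output here.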

>> Now, during another period when disk activity is low and the NFS clients
>> see the problem:
>> 
>> # dtrace -n fbt:::entry'{self->vt=vtimestamp;}' -n
>> fbt:::return'/self->vt/{@[probefunc]=sum(vtimestamp-self->vt);self->vt=0;}'
>> -n tick-5s'{printa(@);exit(0);}'
>> [...]
>>   page_next_scan_large                                       23648600
>>   generic_idle_cpu                                           69234100
>>   disp_getwork                                              139261800
>>   avl_walk                                                  669424900

MA> Hmm, that's a possibility, but the method you're using to gather this 
MA> information (tracing *every* function entry and exit) is a bit 
MA> heavy-handed, and it may be distorting the results.

That could be the case - however, during normal periods avl_walk, measured
the same way, is not at the top of the list.
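
A lighter-weight check next time might be sampling with the profile
provider instead of tracing every fbt entry/return - a minimal sketch
(the 997 Hz rate is just an arbitrary choice):

# dtrace -n 'profile-997 /arg0/ { @[func(arg0)] = count(); }' \
         -n 'tick-5s { printa(@); exit(0); }'

That should show which kernel functions are actually on-CPU, with far
less probe effect.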


>> Heh, I'm sure I have seen avl_walk consuming a lot of CPU before...
>> 
>> So I waited for another such period and ran (for 6-7 seconds):
>> 
>> # dtrace -n fbt::avl_walk:entry'{@[stack()]=count();}'
>> [...]

>>               zfs`metaslab_ff_alloc+0x9c
>>               zfs`space_map_alloc+0x10
>>               zfs`metaslab_group_alloc+0x1e4
>>               zfs`metaslab_alloc_dva+0x114
>>               zfs`metaslab_alloc+0x2c
>>               zfs`zio_alloc_blk+0x70
>>               zfs`zil_lwb_write_start+0x8c
>>               zfs`zil_lwb_commit+0x1ac
>>               zfs`zil_commit+0x1b0
>>               zfs`zfs_fsync+0xa8
>>               genunix`fop_fsync+0x14
>>               nfssrv`rfs3_create+0x700
>>               nfssrv`common_dispatch+0x444
>>               rpcmod`svc_getreq+0x154
>>               rpcmod`svc_run+0x198
>>               nfs`nfssys+0x1c4
>>               unix`syscall_trap32+0xcc
>>           1415957

MA> Hmm, assuming that avl_walk() is actually consuming most of our CPU 
MA> (which the lockstat -kgIW will confirm), this seems to indicate that 
MA> we're taking a long time trying to find free chunks of space.  This may
MA> happen if you have lots of small fragments of free space, but no chunks
MA> large enough to hold the block we're trying to allocate.  We try to 
MA> avoid this situation by trying to allocate from the metaslabs with the
MA> most free space, but it's possible that there's an error in this algorithm.
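
If it really is time spent searching for free space, next time I can
also time the allocator directly - a sketch using the same fbt probes
that appear in the stack above:

# dtrace -n 'fbt::metaslab_ff_alloc:entry { self->ts = timestamp; }' \
         -n 'fbt::metaslab_ff_alloc:return /self->ts/
             { @["alloc ns"] = quantize(timestamp - self->ts); self->ts = 0; }'

Allocations taking milliseconds rather than microseconds during the
stalls would support the fragmentation theory.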


>> So let's destroy the oldest snapshot:
>> 
>> # zfs destroy f3-1/[EMAIL PROTECTED]
>> [it took about 4 minutes!]

>> After the snapshot was destroyed the problem was completely gone.

MA> FYI, destroying the snapshot probably helped simply by (a) returning 
MA> some big chunks of space to the pool and/or (b) perturbing the system 
MA> enough so that we try different metaslabs which aren't so fragmented.


Well, that's scary - it means that when I fill the pool up a little
more I could hit the same big problem even without using snapshots,
which is much worse, as there would be no easy and quick way to relieve
the symptoms like simply destroying a snapshot.
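
For now I guess I'll keep an eye on how full the pool gets and on how
much space the snapshots hold, e.g.:

# zpool list f3-1
# zfs list -t snapshot

and try to leave enough headroom so the allocator can still find large
free chunks.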

-- 
Best regards,
 Robert                            mailto:[EMAIL PROTECTED]
                                       http://milek.blogspot.com
