Re: [OpenAFS] Re: accessing R/O volume becomes slow

2014-11-28 Thread Hans-Werner Paulsen


On 11/27/2014 01:11 PM, Stephan Wiesand wrote:
> On 27 Nov 2014, at 11:26, Hans-Werner Paulsen h...@mpa-garching.mpg.de wrote:
>> Yesterday, on another machine I created and deleted 4 million files on AFS. The
>> number of afs_inode_cache slabs grew from 1 million to 5 million. Today there
>> are still 5 million entries.
>
> It should shrink when there's memory pressure. If you're still worried, there's
> the -disable-dynamic-vcaches switch for afsd.

On my desktop PC (Linux 3.16.5 x86_64, OpenAFS 1.6.10) I set the
-disable-dynamic-vcaches option; the -stat option has a value of 65536.
When I create 100,000 files, I still see 100,000 more afs_inode_cache slab
objects. The fileserver, however, does see the effect of this option: there
are only 65253 nFEs and 65253 nCBs (4194304 nblks). Without
-disable-dynamic-vcaches the number of CBs is about the number of created
files. And if I try to create more files than nCBs on the fileserver, the
fileserver (dafileserver) hangs for about 15 minutes (dafileserver at
100-120% CPU!), and I get a connection timeout on the client.
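
For anyone reproducing these numbers: the afs_inode_cache object counts discussed in this thread can be read from /proc/slabinfo. The following is only a minimal sketch in Python, assuming the usual slabinfo layout in which each data line starts with the cache name followed by the active and total object counts. The helper name slab_counts is made up for illustration and is not part of OpenAFS.

```python
def slab_counts(slabinfo_text, cache_name="afs_inode_cache"):
    """Return (active_objs, num_objs) for cache_name, or None if it is absent.

    Assumes data lines of the form:
        name active_objs num_objs objsize objperslab pagesperslab : ...
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Header lines start with "slabinfo" or "#", so they never match.
        if len(fields) >= 3 and fields[0] == cache_name:
            return int(fields[1]), int(fields[2])
    return None
```

To use it, pass in the contents of /proc/slabinfo (reading that file usually requires root) before and after creating the files, and compare the two results.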


___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: accessing R/O volume becomes slow

2014-11-27 Thread Hans-Werner Paulsen


On 11/26/2014 09:15 PM, Andrew Deason wrote:
> On Wed, 26 Nov 2014 10:51:00 +0100
> Hans-Werner Paulsen h...@mpa-garching.mpg.de wrote:
>
>> Checking the machine I see more than 5 million of afs_inode_cache slab
>> entries. Is this normal? Any hint how to proceed?
>
> That's not unusual if you are accessing a lot of files (say, about 5
> million recently accessed). But having a lot of vcaches in memory can
> cause certain operations to be slow; there was a fix just added in
> 1.6.10 to improve speed for a background cleanup process with lots of
> files (well, and PAGs): 94f1d4.

Yesterday, on another machine I created and deleted 4 million files on
AFS. The number of afs_inode_cache slabs grew from 1 million to 5
million. Today there are still 5 million entries.



Re: [OpenAFS] Re: accessing R/O volume becomes slow

2014-11-27 Thread Stephan Wiesand

On 27 Nov 2014, at 11:26, Hans-Werner Paulsen h...@mpa-garching.mpg.de wrote:
> On 11/26/2014 09:15 PM, Andrew Deason wrote:
>> On Wed, 26 Nov 2014 10:51:00 +0100
>> Hans-Werner Paulsen h...@mpa-garching.mpg.de wrote:
>>
>>> Checking the machine I see more than 5 million of afs_inode_cache slab
>>> entries. Is this normal? Any hint how to proceed?
>> That's not unusual if you are accessing a lot of files (say, about 5
>> million recently accessed). But having a lot of vcaches in memory can
>> cause certain operations to be slow; there was a fix just added in
>> 1.6.10 to improve speed for a background cleanup process with lots of
>> files (well, and PAGs): 94f1d4.
> Yesterday, on another machine I created and deleted 4 million files on AFS.
> The number of afs_inode_cache slabs grew from 1 million to 5 million. Today
> there are still 5 million entries.

It should shrink when there's memory pressure. If you're still worried, there's 
the -disable-dynamic-vcaches switch for afsd.



[OpenAFS] Re: accessing R/O volume becomes slow

2014-11-26 Thread Andrew Deason
On Wed, 26 Nov 2014 10:51:00 +0100
Hans-Werner Paulsen h...@mpa-garching.mpg.de wrote:

> this is on Linux 3.14.8 x86_64, and OpenAFS 1.6.9. The machine is
> running normally for several months, and then accessing a specific R/O
> volume (e.g. ls -lR large_volume) becomes slow.

Do you mean it's slow when you hit the net, or even when you expect
everything to be cached? (That is, if you run ls -lR twice in a row,
does it still remain slow?)
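
The "twice in a row" check can be scripted so that the two timings are directly comparable. This is a rough sketch in Python, not a substitute for ls -lR itself: it assumes that calling lstat on every entry under a directory is a fair stand-in for what ls -lR does, and the function names timed_walk and compare_passes are invented for this example.

```python
import os
import time


def timed_walk(root):
    """lstat every entry under root; return (entries_seen, elapsed_seconds)."""
    count = 0
    start = time.monotonic()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
            except OSError:
                # Entry vanished or is unreadable; skip it.
                continue
            count += 1
    return count, time.monotonic() - start


def compare_passes(root):
    """Walk root twice in a row; the second pass should mostly hit the cache."""
    first = timed_walk(root)
    second = timed_walk(root)
    return first, second
```

If the second pass over the volume is about as slow as the first, the time is probably not being spent fetching data over the network.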

I also echo Ben's suggestion to try other volumes on the same server.
Try to isolate if it's stuff on the server that's causing the problem,
or the specific partition, or just that one volume. Or maybe it could be
a specific dir somewhere in the volume.

> Checking the machine I see more than 5 million of afs_inode_cache slab
> entries. Is this normal? Any hint how to proceed?

That's not unusual if you are accessing a lot of files (say, about 5
million recently accessed). But having a lot of vcaches in memory can
cause certain operations to be slow; there was a fix just added in
1.6.10 to improve speed for a background cleanup process with lots of
files (well, and PAGs): 94f1d4.

Other information that could be gathered: fstrace data (but if data is
going by too quickly, it can be hard to get useful data out of this), or
'strace' syscall timing information (to see what syscalls are slow), or
a network dump, if you are hitting the net in the cases you're talking
about; that could help show if it's the client or server that's being
slow (when comparing a 'success' run vs a 'slow' run).

Traces like that are hard to look at when you have a ton of data to sort
through, but it's still feasible to compare timings from a 'slow' run to
a 'fast' run to try to see if the speed difference is coming from a
particular place.
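
One way to make that comparison manageable is to total the time per syscall in each trace and look at where the totals differ most. The sketch below is in Python and assumes both traces were captured with strace -T, which appends the time spent in each call as <seconds> at the end of the line. Lines that do not fit that shape (signals, exit markers, unfinished/resumed calls) are simply ignored, so the totals are approximate. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Optional leading pid, then "name(...", ending in "<seconds>" as written by strace -T.
LINE_RE = re.compile(r"^(?:\d+\s+)?(\w+)\(.*<(\d+\.\d+)>\s*$")


def syscall_totals(trace_text):
    """Sum the time reported by strace -T for each syscall name."""
    totals = defaultdict(float)
    for line in trace_text.splitlines():
        m = LINE_RE.match(line)
        if m:
            totals[m.group(1)] += float(m.group(2))
    return dict(totals)


def biggest_differences(slow_text, fast_text, top=5):
    """Return the syscalls whose total time grew most from the fast to the slow run."""
    slow = syscall_totals(slow_text)
    fast = syscall_totals(fast_text)
    diffs = {
        name: slow.get(name, 0.0) - fast.get(name, 0.0)
        for name in set(slow) | set(fast)
    }
    return sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

Feeding it the text of a 'slow' trace and a 'fast' trace gives a short ranked list of syscalls to look at first, instead of two full traces to read side by side.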

-- 
Andrew Deason
adea...@sinenomine.net
