Hello folks

I'm having an issue with QFS and I'm all out of ideas. In short, 
operations on many small files are 3 times faster on the QFS metadata 
server than on the QFS clients.

The details.

I have 4 T2000s in a Sun Cluster configuration that will run some 
services, and QFS is there (as a cluster service) to make a shared 
directory available to all the systems. No SAM is configured; there is 
no archiving being done on this file system.

samcmd f gives:
ty     eq  state           device_name      status high low mountpoint server
ma     10      on                 qfs1  m----2c---d 80% 70% /tibco     caos
 mm    11      on    /dev/did/dsk/d5s0
 md    12      on    /dev/did/dsk/d6s0

Both the mm and the md device are EMC DMX4 LUNs (yes, I know that 
putting an mm device on a monolithic storage LUN isn't the best 
scenario, but it's what I have and I can't change it).
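
For completeness, in mcf terms that amounts to roughly the following 
(reconstructed from the samcmd output above, not copied from the 
actual file):

# Equipment Id        Eq  Type  Family  State  Params
qfs1                  10  ma    qfs1    on     shared
 /dev/did/dsk/d5s0    11  mm    qfs1    on
 /dev/did/dsk/d6s0    12  md    qfs1    on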

On the QFS metadata server, copying 5000 files (1.5 KB each) from 
/tmp/jaimec into the QFS file system (measured with the time command) 
gives:
caos#./testCP.ksh

real    0m10.30s
user    0m0.25s
sys     0m4.53s

The same thing on a metadata client:
cronos#./testCP.ksh

real    0m34.03s
user    0m0.26s
sys     0m4.60s

The real time on the client is 3x the one on the server.
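
For reference, testCP.ksh is nothing fancy, essentially just a timed 
copy of that directory. A sketch from memory (the target directory 
name is made up; the real script may differ in details):

#!/bin/ksh
# copy the small test files from /tmp/jaimec into the shared QFS
# file system (/tibco) and time the whole run
SRC=/tmp/jaimec          # the 5000 x 1.5KB test files
DST=/tibco/cptest.$$     # per-run target dir on the QFS mount (assumed name)

mkdir -p $DST
time cp $SRC/* $DST
rm -rf $DST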

Further tests showed that:
- doing it with 10000 files instead of 5000 keeps the same proportion 
(clients are still 3x slower)

- copying 5 big files (a few GB each) works fine, with the server and 
the clients showing the same performance.

- The QFS sharing goes over the cluster interconnects (10/100/1000). 
Thinking there might be an issue with the switch card causing high 
latency, I replaced the interconnect network with a small switch 
(Cisco 2960, 100 Mbps), but the results weren't all that different 
(slightly worse, I'd say, on a first approach). A basic latency check 
is sketched after this list.

- Using other commands like "rm" or "touch" in the script instead of 
"cp" shows the same difference between the master and the clients.

- Using sls -D on a few files created sequentially, I can see the 
inode numbers are all over the place, which suggests we may now have 
inode fragmentation. I'm going to do a sammkfs before this goes into 
production but, even if fragmentation is enough to delay inode 
assignment, it doesn't explain the performance difference between a 
client and the server (or does it?)
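
The basic latency check mentioned above is just a plain Solaris ping 
over the private interconnect, something like this (caos-priv is an 
assumed name for the private hostname; substitute whatever the 
interconnect actually uses):

# 100 pings with 1400-byte packets over the interconnect; the summary
# at the end gives min/avg/max round-trip times
cronos# ping -s caos-priv 1400 100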

Using the DTraceToolkit 0.99 shows that running the operation on the 
master or on a client looks exactly the same in terms of syscalls, CPU 
time or anything else I could think of. Only the elapsed times of the 
write operations are significantly slower on the clients than on the 
server.
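
The elapsed-time numbers came from the toolkit scripts, but a 
one-liner along these lines shows the same picture (a sketch; the 
predicate on "cp" just matches my test script's copies):

dtrace -n '
  /* time each write() issued by cp, print a latency histogram in us */
  syscall::write:entry /execname == "cp"/ { self->ts = timestamp; }
  syscall::write:return /self->ts/ {
    @["write latency (us)"] = quantize((timestamp - self->ts) / 1000);
    self->ts = 0;
  }'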

Sun has already tuned several parameters and, so far, the best results 
were found with these values:
minallocsz = 256
maxallocsz = 8192

- rdlease, aplease and wrlease are now back at 30 after testing values 
ranging from 15 to 300 (with no significant changes)
- removing mh_write, qwrite and forcedirectio from the configuration 
improved things from 4x to 3x slower (which is where we are now)
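
Put together, the relevant stanza in /etc/opt/SUNWsamfs/samfs.cmd 
currently looks more or less like this (reconstructed from the values 
above, not a verbatim copy of the file):

fs = qfs1
  # allocation sizes Sun settled on
  minallocsz = 256
  maxallocsz = 8192
  # leases back at 30 after testing values from 15 to 300
  rdlease = 30
  aplease = 30
  wrlease = 30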


The only thing I can think of that would justify slower performance on 
the metadata clients than on the server is high network latency, but I 
went as far as a dedicated switch and got nothing. I still don't know 
enough about Solaris 10's TCP stack to tell whether there is a magical 
tunable for QFS performance (ndd -get /dev/tcp make_qfs_fly? comes up 
with nothing).
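
(If such a tunable exists, my guess would be the obvious candidates, 
the TCP send/receive buffer sizes and the Nagle limit:

cronos# ndd -get /dev/tcp tcp_xmit_hiwat
cronos# ndd -get /dev/tcp tcp_recv_hiwat
cronos# ndd -get /dev/tcp tcp_naglim_def

but pointers are welcome.)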

Oh, I almost forgot ...
pkginfo -l SUNWqfsr |grep VERSION
   VERSION:  4.6.5,REV=5.10.2007.03.12

Any ideas?
