Re: [CentOS] waiting IOs...

2009-09-10 Thread James Pearson
John Doe wrote:
 Hi,
 
 We have a storage server (HP DL360G5 + MSA20 (12 disks in RAID 6) on a 
 SmartArray6400).
 10 directories are exported through nfs to 10 clients 
 (rsize=32768,wsize=32768,soft,intr,nosuid,proto=udp,vers=3).
 The server is apparently not doing much but... we have very high waiting IOs.

Probably not connected, but personally I would use 'hard' and 
'proto=tcp' instead of 'soft' and 'proto=udp' on the clients
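
For illustration only, a client-side fstab entry with those changes might look
like this (the hostname, export path, and mount point below are placeholders,
not from the original setup):

```shell
# Hypothetical /etc/fstab line on a client -- only the 'hard' and
# 'proto=tcp' changes are the point; names and paths are made up:
# storage1:/export/dir1  /mnt/dir1  nfs  rsize=32768,wsize=32768,hard,intr,nosuid,proto=tcp,vers=3  0 0

# Or try the options on a single mount without touching fstab:
# mount -t nfs -o rsize=32768,wsize=32768,hard,intr,nosuid,proto=tcp,vers=3 \
#     storage1:/export/dir1 /mnt/dir1
```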

James Pearson
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] waiting IOs...

2009-09-10 Thread Les Mikesell
James Pearson wrote:
 John Doe wrote:
 Hi,

 We have a storage server (HP DL360G5 + MSA20 (12 disks in RAID 6) on a 
 SmartArray6400).
 10 directories are exported through nfs to 10 clients 
 (rsize=32768,wsize=32768,soft,intr,nosuid,proto=udp,vers=3).
 The server is apparently not doing much but... we have very high waiting IOs.
 
 Probably not connected, but personally I would use 'hard' and 
 'proto=tcp' instead of 'soft' and 'proto=udp' on the clients


I'd usually blame disk seek time first.  If your raid level requires several 
drives to move their heads together, and/or the data layout lands on the same 
drive set, consider what happens when your 10 users all want the disk head(s) 
to be in different places.  Disk drives allow random access, but they really 
aren't that good at it if they have to spend most of their time seeking.  
RAID6 is particularly bad at write performance, so a different raid level 
might help - or, if you know the data access pattern, you might split the 
drives into separate volumes that don't affect each other and arrange the 
data accordingly.  Mounting with the async option might also give much 
better performance - I'm not sure what the default is these days.
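
On the Linux server side, async is an export option rather than a mount
option - a sketch with made-up paths and client patterns (note that async
acknowledges writes before they hit disk, trading crash safety for speed;
sync has long been the default in nfs-utils):

```shell
# Hypothetical /etc/exports line -- path and client pattern are
# placeholders, not from the original setup:
# /export/dir1  192.168.0.0/24(rw,async,no_subtree_check)

# Re-export after editing, without restarting nfsd:
# exportfs -ra
```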

-- 
   Les Mikesell
lesmikes...@gmail.com


Re: [CentOS] waiting IOs...

2009-09-10 Thread John Doe
# iostat -m -x 10
Linux 2.6.18-8.1.6.el5 (data1.iol)  09/10/2009

. . .

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.20    0.00    0.31    8.79    0.00   90.70

Device:      rrqm/s  wrqm/s    r/s    w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await  svctm  %util
cciss/c0d0     0.00    0.20   0.92   0.51   0.00   0.00    10.29     0.04  25.29  25.93   3.71
cciss/c0d1     0.00    0.20   3.68   2.45   0.02   0.01    11.07     0.06   9.87   7.27   4.45
cciss/c0d2     0.00    0.20   0.41   2.76   0.00   0.01     8.52     0.03   9.97   2.81   0.89
cciss/c0d3     1.23    0.51   3.98   1.53   0.03   0.01    14.52     0.05   9.69   8.07   4.45
cciss/c0d4     0.00    0.00   0.00   0.00   0.00   0.00     0.00     0.00   0.00   0.00   0.00
cciss/c0d5     0.00    0.00   1.02   0.10   0.00   0.00     8.00     0.01   9.36   9.36   1.05
cciss/c0d6     2.45    0.20   0.92   0.51   0.06   0.00    87.43     0.01   9.64   7.21   1.03
cciss/c0d7     0.00    0.00   0.31   0.10   0.00   0.00     8.00     0.01  14.25  14.25   0.58
cciss/c0d8     0.00    0.00   0.10   1.84   0.00   0.01     8.00     0.01   7.26   1.42   0.28
cciss/c0d9     0.00    0.10   0.41   3.78   0.00   0.02     9.56     0.05  12.24   1.59   0.66
cciss/c1d0     0.00    6.03   0.00   1.74   0.00   0.02    26.94     0.04  25.35  12.59   2.19

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.05    0.00    0.36   25.77    0.00   73.82

Device:      rrqm/s  wrqm/s    r/s    w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await  svctm  %util
cciss/c0d0     0.00    0.82   4.60   0.20   0.03   0.00    13.11     0.06  13.55  13.00   6.25
cciss/c0d1     1.23    1.43   1.94   0.20   0.02   0.01    29.33     0.02  11.48  11.48   2.46
cciss/c0d2     0.00    0.00   0.82   0.00   0.00   0.00     8.00     0.01  14.00  14.00   1.15
cciss/c0d3     2.45    1.43  11.25   0.20   0.07   0.01    14.43     0.12  10.62  10.52  12.04
cciss/c0d4     0.00    1.64   7.98   0.20   0.03   0.01    10.00     0.08   9.24   9.10   7.44
cciss/c0d5     5.93    0.20  22.09   0.20   2.19   0.00   201.03     0.58  26.06   1.88   4.19
cciss/c0d6     2.45    1.12   5.62   0.41   0.08   0.01    29.15     0.06  10.66   9.66   5.83
cciss/c0d7     0.00    1.23   5.42   0.20   0.02   0.01    10.76     0.05   8.87   8.73   4.91
cciss/c0d8     0.00    0.72   3.17   0.20   0.02   0.00    13.09     0.04  12.33  12.21   4.12
cciss/c0d9     0.92    0.82   3.17   0.20   0.03   0.00    18.67     0.04  13.12  12.67   4.27
cciss/c1d0     0.00    2.66   0.41   3.07   0.00   0.01     8.29     0.28  81.91  10.94   3.80

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.10    0.00    0.20   15.95    0.00   83.74

Device:      rrqm/s  wrqm/s    r/s    w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await  svctm  %util
cciss/c0d0     0.00    4.19  14.72   0.51   0.10   0.02    15.52     0.17  10.98   9.05  13.79
cciss/c0d1     0.00    0.20   0.31   0.20   0.00   0.00    11.20     0.01  10.20  10.20   0.52
cciss/c0d2     0.00    0.31   0.41   0.41   0.00   0.00    13.00     0.01   7.88   7.88   0.64
cciss/c0d3     0.00    0.31   4.50   0.20   0.02   0.00     9.74     0.05  10.61  10.37   4.88
cciss/c0d4     1.23    0.31   2.76   0.41   0.28   0.00   182.71     0.06  19.97   4.00   1.27
cciss/c0d5     0.00    0.82   3.17   0.20   0.02   0.00    11.64     0.04  11.30  11.30   3.81
cciss/c0d6     2.45    0.51   3.68   0.41   0.28   0.00   143.60     0.07  16.27   5.30   2.17
cciss/c0d7     1.23    0.10   1.94   0.20   0.01   0.00    15.24     0.03  13.33  13.33   2.86
cciss/c0d8     0.00    0.31   0.51   0.41   0.00   0.00    11.56     0.01   8.00   8.00   0.74
cciss/c0d9     0.00    0.10   0.61   0.20   0.00   0.00    10.00     0.01  11.88  11.75   0.96
cciss/c1d0     0.00    3.07   0.00   1.33   0.00   0.01    19.08     0.02  16.77  13.31   1.77

I tried nmon but I did not see anything out of the ordinary...
But when I try to view the NFS stats, nmon (11f-1.el5.rf) coredumps.
I tried the iostat nfs option (kernel 2.6.18-8.1.6.el5 should support it) but 
it did not show anything.
nfsstat shows no apparent activity...
nfs is normally only used to put files or modify small (1K) files from time 
to time, while http is used to get files.

From: Les Mikesell lesmikes...@gmail.com
 I'd usually blame disk seek time first.  If your raid level requires several 
 drives to move their heads together and/or the data layout lands on the same 
 drive set, consider what happens when your 10 users all want the disk head(s) 
 to be in different places.  Disk drives allow random access but they really 
 aren't that good at it if they have to spend most of their time seeking.

That could be it...
It's always a 

Re: [CentOS] waiting IOs...

2009-09-10 Thread Les Mikesell
John Doe wrote:
 # iostat -m -x 10
 Linux 2.6.18-8.1.6.el5 (data1.iol)  09/10/2009
 

Are you also following the 'Excessive NFS operations' thread?  There's 
some interesting information there about buggy kernels and apps.

-- 
   Les Mikesell
lesmikes...@gmail.com



[CentOS] waiting IOs...

2009-09-09 Thread John Doe
Hi,

We have a storage server (HP DL360G5 + MSA20 (12 disks in RAID 6) on a 
SmartArray6400).
10 directories are exported through nfs to 10 clients 
(rsize=32768,wsize=32768,soft,intr,nosuid,proto=udp,vers=3).
The server is apparently not doing much but... we have very high waiting IOs.

dstat shows very little activity, but high 'wai'...

# dstat 
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  0   0  88  12   0   0| 413k   98k|   0 0 |   0 0 | 188   132 
  0   1  46  53   0   0| 716k   48k|  19k  420k|   0 0 |1345   476 
  0   1  49  50   0   1| 492k   32k|  12k  181k|   0 0 |1269   482 
  0   1  63  37   0   0| 316k  159k|  58k  278k|   0 0 |1789  1562 
  0   0  74  26   0   0|  84k  512k|1937B 6680B|   0 0 |1200   106 
  0   1  44  55   0   1| 612k   80k|  14k  221k|   0 0 |1378   538 
  1   1  52  47   0   0| 628k0 |  17k  318k|   0 0 |1327   520 
  0   1  50  49   0   0| 484k   60k|  14k  178k|   0 0 |1303   494 
  0   0  87  13   0   0| 124k0 |7745B  116k|   0 0 |1083   139 
  0   1  59  41   0   0| 316k   60k|4828B   67k|   0 0 |1179   346 

top shows that one nfsd is usually in state 'D' (waiting).

# top -i    (sorted by CPU usage)
top - 18:11:28 up 207 days,  7:13,  2 users,  load average: 0.99, 1.07, 1.00
Tasks: 124 total,   1 running, 123 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.2%sy,  0.0%ni, 54.3%id, 45.3%wa,  0.2%hi,  0.0%si,  0.0%st
Mem:   3089252k total,  3068112k used,    21140k free,   928468k buffers
Swap:  2008116k total,      164k used,  2007952k free,   293716k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
16571 root      15   0 12708 1076  788 R    1  0.0   0:00.02 top
 2580 root      15   0     0    0    0 D    0  0.0   2:36.70 nfsd
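
As a cross-check, tasks stuck in uninterruptible sleep can be listed directly
with ps; a sketch using standard procps and awk:

```shell
# List every process currently in uninterruptible sleep ('D');
# these are almost always blocked on disk or NFS I/O.
ps -eo state,pid,comm | awk '$1 ~ /^D/ {print $2, $3}'

# The same filter applied to a captured sample:
printf 'S 1 init\nD 2580 nfsd\n' | awk '$1 ~ /^D/ {print $2, $3}'
# -> 2580 nfsd
```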

# cat /proc/net/rpc/nfsd
rc 8872 34768207 38630969
fh 142 0 0 0 0
io 2432226534 884662242
th 32 394 4851.311 2437.416 370.949 238.432 542.241 4.942 2.239 1.000 0.427 0.541
ra 64 3876274 5025 3724 2551 2030 2036 1506 1607 1219 1154 1136249
net 73410453 73261524 0 0
rpc 73408119 0 0 0 0
proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
proc3 22 33 9503937 1315066 11670859 7139862 0 5033349 28129122 3729031 0 0 0 487614 0 1116215 0 0 2054329 21225 66 0 2351744
proc4 2 0 0
proc4ops 40 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
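
For scale, the 'io' line above is cumulative bytes read and written by nfsd
since boot; a quick awk conversion to megabytes (the field interpretation is
an assumption on my part, not from the stats themselves):

```shell
# Convert the nfsd io counters quoted above from bytes to MB:
echo "io 2432226534 884662242" | \
  awk '{printf "read: %.0f MB, written: %.0f MB\n", $2/1048576, $3/1048576}'
# -> read: 2320 MB, written: 844 MB
```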

Do you think nfs is the problem here?
If so, is there something wrong with our config?
Is it too much to have 10 dir x 10 clients, even if there is almost no traffic?

Thx,
JD


  



Re: [CentOS] waiting IOs...

2009-09-09 Thread nate
John Doe wrote:
 Hi,

 We have a storage server (HP DL360G5 + MSA20 (12 disks in RAID 6) on a
 SmartArray6400).
 10 directories are exported through nfs to 10 clients
 (rsize=32768,wsize=32768,soft,intr,nosuid,proto=udp,vers=3).
 The server is apparently not doing much but... we have very high waiting
 IOs.

How about running 'iostat -x'?  Sounds like the system is doing a lot
more than you think it is...
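
To make that output easier to scan, one can filter for devices with a high
average wait; a sketch - the field number (10 = await) assumes the same
'iostat -m -x' column layout as the output quoted elsewhere in this thread:

```shell
# Show only devices whose await exceeds 20 ms; intended to be fed
# real output, e.g.:
#   iostat -m -x 10 | awk '$1 ~ /^cciss/ && $10+0 > 20 {print $1, "await=" $10}'
# Demonstrated here on one sample row:
printf 'cciss/c1d0 0.00 2.66 0.41 3.07 0.00 0.01 8.29 0.28 81.91 10.94 3.80\n' | \
  awk '$1 ~ /^cciss/ && $10+0 > 20 {print $1, "await=" $10}'
# -> cciss/c1d0 await=81.91
```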

nate




Re: [CentOS] waiting IOs...

2009-09-09 Thread Alan McKay
 How about running iostat -x ? Sounds like the system is doing a lot
 more than you think it is..

You might want to set yourself up with a performance monitoring system
like Munin to give you more extensive data, as well.

If you get that far, you'll find the iostat plugin to be a bit lacking
- I've written a more useful one that I'd be happy to share.


-- 
“Don't eat anything you've ever seen advertised on TV”
 - Michael Pollan, author of In Defense of Food