Dear All,

I am again facing problems with COMSTAR iSCSI, this time with a fresh install of snv_134. I do not believe it is a hardware or performance problem of the OpenSolaris box, because I have fought with this before: 2009.06 had iSCSI/VMware compatibility problems, so I did an image-update to the latest dev build at the time (I forget which one, this was possibly a few months ago), and after that the iSCSI write speed was blazing fast.
The target (a whitebox) has 6 x 1.5 TB SATA disks in RAIDZ and 2.4 GB of RAM allocated to it, running OpenSolaris snv_134 as an ESXi 4.0.1 guest. The initiator (a proper HP server) is ESXi 4.0.0. CIFS performance is as expected, but iSCSI writes are very slow. Or, to be more precise, they are not so much slow as "bursty", and a lot of the time the initiator loses the connection due to a timeout. Sometimes there are no problems, though. I run a script that backs up all VMs to the iSCSI LUN, and on the initial test run I had very bad results: no VM was transferred successfully, because every VM clone (sizes ranging from 8 GB to 50 GB) timed out and the initiator lost the connection to the target. Eventually, on one run, only 4 out of 14 VMs failed due to timeouts, the total size being around 400 GB. I have not ruled out a problem on the initiator side at this stage; maybe ESXi has a problem... At the moment the same 3 VMs being copied (cloned) time out after multiple attempts (~10!).

At first I tried with dedup turned on, but the initiator constantly lost the connection to the target due to timeouts and the setup was completely unusable - maybe the CPU on the target (a Core 2 Duo E8400) could not keep up with it. With dedup turned off for this volume, I can at least get writes somewhat working.

On a related note, I tried increasing the CPU count for the OpenSolaris guest in ESXi from 1 to 2, but for some reason that totally bogged down the system: while testing this, the whole system hung and was extremely laggy at times, with both cores at 100%.

I also tried a cross-over cable to rule out a problem with the switch. MTU is 1500, and the network settings are the same as in the test a few months ago when I got the speed I was expecting. I am very new to OpenSolaris but have plenty of experience with other Unix systems, so any debugging pointers are appreciated.

I have done some testing in the hope that someone can make sense of this; below is some output from zpool iostat and iostat:

@fs1:~$ zpool iostat pool 2
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
--- this is slightly after the writes start, numbers were similar to this before ---
pool         389G  6.43T      4  17.8K  9.61K  41.6M
pool         390G  6.43T     22  14.8K  51.0K  84.2M
pool         390G  6.43T     76  1.30K   187K  3.05M
--- at least at this point the write is still working, but very "bursty" as you can see ---
pool         389G  6.43T      0  8.82K      0  69.1M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.77K      0  68.1M
pool         389G  6.43T      0     41      0  76.5K
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  6.48K      0  51.7M
pool         389G  6.43T      0  1.98K      0  15.0M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.46K      0  66.5M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.38K      0  66.3M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  7.90K      0  62.8M
pool         389G  6.43T      0    114      0   310K
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.37K      0  66.2M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.37K      0  66.3M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  6.74K      0  53.6M
pool         389G  6.43T      0     43      0  83.2K
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.42K      0  66.9M
pool         389G  6.43T      0     35      0  71.9K
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  7.34K      0  58.4M
pool         389G  6.43T      0     44      0  84.6K
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.15K      0  64.8M
pool         389G  6.43T      0     43      0  81.7K
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0    537      0  4.15M
pool         389G  6.43T      0  7.81K      0  61.9M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.36K      0  66.2M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.39K      0  66.5M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0  4.00K      0
pool         389G  6.43T      0  7.10K      0  56.5M
pool         389G  6.43T      0    820      0  6.09M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  9.06K      0  71.8M
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  7.35K      0  58.5M
pool         389G  6.43T      0     43      0  84.7K
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0      0      0      0
pool         389G  6.43T      0  8.46K      0  67.1M
--- at this point "something" happens and only reads occur until the initiator eventually times out. The ESXi initiator does not /always/ lose the connection to the target; it is just that the VM cloning reports "timed out" ---
pool         389G  6.43T      2      0  15.0K      0
pool         389G  6.43T     13      0  81.0K      0
pool         389G  6.43T     15      0  96.0K      0
pool         389G  6.43T     24  8.41K  98.5K  66.9M
pool         389G  6.43T    103      0   296K      0
pool         389G  6.43T     98      0   311K      0
pool         389G  6.43T    109      0   327K      0
pool         389G  6.43T    106      0   280K      0
pool         389G  6.43T     94      0   220K      0
pool         389G  6.43T     85      0   203K      0
pool         389G  6.43T     91      0   219K      0
pool         389G  6.43T     88      0   214K      0
pool         389G  6.43T     85      0   201K      0
pool         389G  6.43T     84      0   201K      0
pool         389G  6.43T     91      0   214K      0
pool         389G  6.43T     82      0   193K      0
pool         389G  6.43T     82      0   196K      0
pool         389G  6.43T     86      0   209K      0
pool         389G  6.43T     93      0   215K      0
pool         389G  6.43T    111      0   252K      0
pool         389G  6.43T    137      0   304K      0
pool         389G  6.43T     98      0   225K      0
pool         389G  6.43T    102      0   238K      0
pool         389G  6.43T     85      0   204K      0
pool         389G  6.43T     85      0   203K      0
pool         389G  6.43T     77      0   182K      0
pool         389G  6.43T    100      0   229K      0
pool         389G  6.43T     92      0   215K      0
pool         389G  6.43T     89      0   209K      0
pool         389G  6.43T     91      0   219K      0
pool         389G  6.43T     86      0   204K      0
pool         389G  6.43T     87      0   209K      0
pool         389G  6.43T     91      0   209K      0
pool         389G  6.43T    112      0   256K      0
pool         389G  6.43T     89      0   212K      0
pool         389G  6.43T     98      0   226K      0
pool         389G  6.43T    108      0   246K      0
pool         389G  6.43T     90      0   213K      0
pool         389G  6.43T    150      0   341K      0
pool         389G  6.43T     98      0   227K      0
pool         389G  6.43T     92      0   215K      0
pool         389G  6.43T     89      0   211K      0
pool         389G  6.43T     88      0   211K      0
pool         389G  6.43T    105      0   240K      0
pool         389G  6.43T    128      0   292K      0
pool         389G  6.43T     82      0   198K      0
pool         389G  6.43T     86      0   204K      0
pool         389G  6.43T     96      0   224K      0
pool         389G  6.43T     91      0   214K      0
pool         389G  6.43T    102      0   235K      0

From a separate attempt, here is the iostat output:

@fs1:~$ iostat 2 1000
 tty        sd0           sd1           sd2            sd3            cpu
tin tout  kps tps serv  kps tps serv  kps tps serv   kps tps serv   us sy wt id
--- everything goes fine for a while ---
0 2  34 1 11  0 0 0  1699 36 6  1698 36 6  7 12 0 81
0 118  0 0 0  0 0 0  18589 172 5  18589 172 5  8 86 0 6
0 41  0 0 0  0 0 0  10409 224 10  9897 202 11  10 86 0 4
0 42  1 1 8  0 0 0  12153 214 11  12619 236 13  11 83 0 6
0 42  0 0 0  0 0 0  18153 231 5  18090 231 5  9 85 0 6
0 42  0 0 0  0 0 0  17643 213 5  17963 215 5  7 88 0 5
0 42  0 0 0  0 0 0  3374 56 4  1799 42 7  13 77 0 10
0 40  0 0 0  0 0 0  15884 193 3  17128 203 2  10 85 0 5
0 44  0 0 0  0 0 0  18951 181 3  18952 184 3  9 86 0 5
^C

@fs1:~$ iostat 5 1000
 tty        sd0           sd1           sd2            sd3            cpu
tin tout  kps tps serv  kps tps serv  kps tps serv   kps tps serv   us sy wt id
0 2  34 1 11  0 0 0  1702 36 6  1702 36 6  7 12 0 81
0 41  39 10 2  0 0 0  16486 160 6  16155 157 5  10 84 0 6
0 20  0 0 1  0 0 0  9062 121 5  9017 120 4  12 53 0 35
--- at this point it seems the write has stalled ---
0 16  0 0 0  0 0 0  87 89 13  78 84 11  12 3 0 85
0 16  0 0 0  0 0 0  65 81 9  55 81 9  12 4 0 84
0 16  0 0 0  0 0 0  1598 136 15  1576 138 12  12 26 0 62
0 16  0 0 0  0 0 0  7549 138 7  7577 133 6  11 32 0 57
0 16  0 0 0  0 0 0  46 78 7  47 82 7  12 4 0 84
0 16  10 0 14  0 0 0  44 78 7  44 78 7  11 4 0 84
0 16  0 0 0  0 0 0  45 80 7  45 80 6  11 3 0 86
0 16  0 0 0  0 0 0  43 75 7  41 74 7  10 4 0 86
0 16  0 0 0  0 0 0  47 80 7  45 80 7  10 4 0 86
0 16  0 0 0  0 0 0  44 80 7  45 82 7  9 4 0 87

I have also tried to rule out the vmkfstools clone operation itself by writing directly to a file on the iSCSI LUN:

# time dd if=/dev/zero of=/vmfs/volumes/fs1-vmbackup/test bs=65536 count=10000

Same thing: in the beginning the speed is good, and eventually it degrades to a few bursts with 12-15 seconds between them. I believe the gap between bursts is the initiator timing out on the connection and then re-establishing it.

For those who understand ESXi:

May 7 06:14:04 vmkernel: 1:00:17:17.510 cpu7:4179)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410005074f40) to NMP device "naa.600144f065f30c0000004be283ba0003" failed on physical path "vmhba34:C0:T3:L1" H:0x5 D:0x40 P:0x0 Possible sense data:
May 7 06:14:05 0x2 0x3a 0x1.
May 7 06:14:04 vmkernel: 1:00:17:17.510 cpu7:4179)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.600144f065f30c0000004be283ba0003" state in doubt; requested fast path state update...
May 7 06:14:04 vmkernel: 1:00:17:17.510 cpu7:4179)ScsiDeviceIO: 747: Command 0x2a to device "naa.600144f065f30c0000004be283ba0003" failed H:0x5 D:0x40 P:0x0 Possible sense data: 0x2 0x3a 0x1.

--
This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
storage-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss