Hi Kenneth,
Thanks so much for your reply. As I understand it, COMSTAR is software running under Solaris that can turn a Sun server into a storage array.
As for the vdbench problem, try a wider range of block sizes and the problem should appear. The configuration file I used is listed below, and with it the test is always interrupted. I have also run vdbench with only a single block size in the forxfersize field, and in that case it ran smoothly every time. That is very interesting.
At present, the most crucial problem I am facing is the incorrect test data I am getting, which might be caused by the file system's cache. To bypass the cache, I created some raw devices and bound them to the LUN devices, and I edited the udev rules file as well.
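For reference, the entry I added to /etc/udev/rules.d/60-raw.rules is along these lines (the disk name and raw number are just from my setup):
ACTION=="add", KERNEL=="sdc", RUN+="/bin/raw /dev/raw/raw3 %N"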
Following is the info for the raw device:
# ll /dev/raw/raw3
crwxrwxrwx. 1 root disk 162, 3 Mar 26 06:48 /dev/raw/raw3
# ll /dev/sdc
brw-rw----. 1 root disk 8, 32 Mar 26 06:48 /dev/sdc
# raw -qa
/dev/raw/raw3: bound to major 8, minor 32
...
When I executed dd on raw3, it failed:
# time -p dd if=/dev/raw/raw3 of=/dev/null bs=1024k count=1024
dd: opening `/dev/raw/raw3': Device or resource busy
real 0.00
user 0.00
sys 0.00
However, dd on sdc ran without error:
# time -p dd if=/dev/sdc of=/dev/null bs=1024k count=1024
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.0453987 s, 2.3 GB/s
real 0.05
user 0.00
sys 0.04
I have googled for solutions, but none of them worked. Do you know what is wrong, or how else to bypass the cache?
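(One alternative I am considering, instead of the legacy raw driver: GNU dd can open the block device with O_DIRECT itself via iflag=direct, which should keep the page cache out of the measurement. For example, against the same LUN:
# dd if=/dev/sdc of=/dev/null bs=1024k count=1024 iflag=direct
If that produces sane numbers, the cache is most likely the culprit.)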
The vdbench configuration file:
# cat fcoe-perf.conf
* SD: Storage Definition
* fcoe <-> ramdisk
sd=sd0,lun=/dev/sdc
sd=sd1,lun=/dev/sdd
sd=sd2,lun=/dev/sde
sd=sd3,lun=/dev/sdf
sd=sd4,lun=/dev/sdg
sd=sd5,lun=/dev/sdh
sd=sd6,lun=/dev/sdi
sd=sd7,lun=/dev/sdj
sd=sd8,lun=/dev/sdk
sd=sd9,lun=/dev/sdl
sd=sd10,lun=/dev/sdm
sd=sd11,lun=/dev/sdn
sd=sd12,lun=/dev/sdo
sd=sd13,lun=/dev/sdp
sd=sd14,lun=/dev/sdq
sd=sd15,lun=/dev/sdr
sd=sd16,lun=/dev/sds
sd=sd17,lun=/dev/sdt
sd=sd18,lun=/dev/sdu
sd=sd19,lun=/dev/sdv
sd=sd20,lun=/dev/sdw
sd=sd21,lun=/dev/sdx
* WD: Workload Definition
wd=wd1,sd=(sd0,sd21),seekpct=seq,rdpct=100,rhpct=100
wd=wd2,sd=(sd1-sd20),seekpct=seq,rdpct=0,whpct=100
* RD: Run Definition
rd=IOPS_RO_CACHE,wd=wd1,forthreads=(10),forxfersize=(1k,2k,4k,8k),iorate=max,elapsed=160,interval=3
rd=IOPS_WO_CACHE,wd=wd2,forthreads=(1),forxfersize=(1k,2k,4k,8k),iorate=max,elapsed=160,interval=3
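By the way, if vdbench itself can be told to bypass the page cache, the raw binding might not be needed at all. I believe newer vdbench builds accept an openflags parameter on the SD lines, something like the following, though I have not verified it on my v5.02:
sd=sd0,lun=/dev/sdc,openflags=o_direct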
On 03/27/12 02:44 AM, Zhang, Kenneth wrote:
There should be no problem running vdbench with open-fcoe. Even though we do not routinely run vdbench, we have never encountered the issue you reported. I just ran a simple vdbench (v5.02) test on a RHEL 6.2 system, and the I/O completed successfully from 512 bytes to 1024KB.
I am not sure whether your target setup is correct; I am also not sure what COMSTAR is.
Have you tried other I/O tools (not necessarily performance benchmark tools) to see whether you can run I/O successfully?
-----Original Message-----
From: devel-boun...@open-fcoe.org [mailto:devel-boun...@open-fcoe.org] On
Behalf Of Zou, Yi
Sent: Monday, March 26, 2012 10:25 AM
To: Jevon
Cc: carol....@oracle.com; devel@open-fcoe.org
Subject: Re: [Open-FCoE] Ask help about performance test on Open-FCoE
Hi Yi and Experts,
Currently, I am trying to run a performance test with the benchmark tool vdbench on Open-FCoE. For this, I set up an environment with one ixgbe-based initiator under Linux 2.6.32 and 22 COMSTAR LUNs on the target under Solaris 11. The COMSTAR LUNs are based on the SCSI target. The block sizes used in the test cover 1k, 2k, 4k, 8k, ..., 128k.
During the test I encountered two problems. The first was that vdbench was always interrupted while executing small-block I/O such as 1k and 2k; starting from 32k, the test finishes successfully.
I have never used vdbench; fio/dd/iozone are what I normally use. Someone on the list who has experience with vdbench may know better.
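For what it's worth, a quick direct-I/O read test with fio (which bypasses the page cache) would look roughly like this; the device name is just an example:
# fio --name=rawread --filename=/dev/sdc --rw=read --bs=64k --direct=1 --ioengine=libaio --runtime=60 --time_based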
The second problem concerned the test results. Take the 64k block-size I/O test for example: the mean throughput was about 57000 MB/sec, which is simply impossible with current technology. My colleague suspected this was caused by the file system's cache. I then googled how to bypass the cache, but unfortunately could not find a useful answer. Finally I turned to 'raw' to bind the LUNs to raw devices, but an error came up saying "opening `/devices/virtual/raw/raw3': No such file or directory", even though I had configured the raw devices by editing /etc/udev/rules.d/60-raw.rules.
The throughput number you have here does not look right to me. I don't know your setup; what devices are in your 'fcoeadm -t' output? Can you simply run dd over those devices before running vdbench? FCoE would be in '/sys/devices/virtual/net/ethX', where ethX is the interface you pass to fcoeadm to create the FCoE instance on; it is also available from 'fcoeadm -i'.
Make sure vdbench is pointing to the actual LUNs discovered by FCoE. For Intel 82599 NICs, 'ethtool -S ethX | grep fcoe' gives you statistics. Observe the counters while running your I/O to verify that FCoE traffic is actually happening on that NIC port (or, alternatively, run tshark).
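Something along these lines, where ethX and sdX are placeholders for your actual FCoE interface and LUN (iflag=direct keeps the page cache out of the dd number):
# fcoeadm -i
# fcoeadm -t
# ethtool -S ethX | grep fcoe
# dd if=/dev/sdX of=/dev/null bs=1M count=1024 iflag=direct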
Do any of the experts have solutions to the above problems? Has anyone tested the performance of Open-FCoE, or can anyone recommend a benchmark tool under Linux?
I am looking forward to your replies; thanks so much in advance.
Best Regards,
Jevon
The following logs correspond to the above problems:
Problem 1: the error message.
01:07:30.979 Slave aborted. Abort message received:
01:07:30.980 Task WG_task stopped after 3 minutes of trying to get it to terminate itself. Unpredictable results may occur.
01:07:30.980
01:07:30.980 Look at file localhost-0.stdout.html for more information.
Have you found anything from the above output file?
yi
01:07:30.980
01:07:30.981 Slave localhost-0 prematurely terminated.
01:07:30.981
java.lang.RuntimeException: Slave localhost-0 prematurely terminated.
at Vdb.common.failure(common.java:234)
at Vdb.SlaveStarter.startSlave(SlaveStarter.java:185)
at Vdb.SlaveStarter.run(SlaveStarter.java:68)
01:07:31.984 common.exit(): -99
common.exit(): -99
Problem 2: incorrect results
19:18:30.881 All slaves are now connected
19:18:32.001 Starting RD=IOPS_RO_CACHE; I/O rate: Uncontrolled MAX; Elapsed=160; For loops: xfersize=65536 threads=10
Mar 25, 2012  interval          i/o    MB/sec   bytes    read    resp     resp    resp     cpu%   cpu%
                               rate   1024**2     i/o     pct    time      max  stddev  sys+usr    sys
19:18:38.074         1    915879.83  57242.49   65536  100.00   0.017    3.208   0.004     74.2   58.5
19:18:44.048         2    923004.00  57687.75   65536  100.00   0.017    3.121   0.003     86.5   68.5
19:18:50.049         3    914239.33  57139.96   65536  100.00   0.017    1.994   0.003     86.6   68.6
19:18:56.050         4    916271.67  57266.98   65536  100.00   0.017    0.983   0.003     86.6   68.5
19:19:02.056         5    916500.33  57281.27   65536  100.00   0.017    1.566   0.003     86.5   68.6
19:19:08.046         6    921123.83  57570.24   65536  100.00   0.017    0.617   0.003     86.7   68.8
19:19:14.046         7    921845.17  57615.32   65536  100.00   0.017    2.264   0.003     86.9   68.9
19:19:20.047         8    921880.83  57617.55   65536  100.00   0.017    1.999   0.003     86.6   68.7
19:19:26.066         9    918347.00  57396.69   65536  100.00   0.017    0.955   0.003     86.3   68.4
19:19:32.074        10    919466.50  57466.66   65536  100.00   0.017    1.490   0.003     86.6   68.6
19:19:38.075        11    915937.50  57246.09   65536  100.00   0.017    1.302   0.003     86.4   68.3
19:19:44.048        12    922523.50  57657.72   65536  100.00   0.017    2.417   0.003     87.0   68.8
19:19:50.066        13    924455.00  57778.44   65536  100.00   0.017    4.046   0.003     86.9   69.0
19:19:56.067        14    923076.83  57692.30   65536  100.00   0.017    1.041   0.003     87.0   69.2
19:20:02.048        15    917671.33  57354.46   65536  100.00   0.017    1.271   0.003     86.9   68.8
19:20:08.046        16    918901.00  57431.31   65536  100.00   0.017    1.287   0.003     86.6   68.7
19:20:14.048        17    923838.67  57739.92   65536  100.00   0.017    1.059   0.003     86.6   68.4
19:20:20.051        18    921362.17  57585.14   65536  100.00   0.017    3.675   0.003     86.8   68.8
19:20:26.047        19    922135.00  57633.44   65536  100.00   0.017    0.820   0.003     86.7   68.7
19:20:32.056        20    926624.33  57914.02   65536  100.00   0.017    1.014   0.003     86.7   69.0
19:20:38.046        21    925666.83  57854.18   65536  100.00   0.017    1.620   0.003     86.7   68.9
19:20:44.057        22    925109.33  57819.33   65536  100.00   0.017    1.010   0.003     86.6   68.8
19:20:50.049        23    914419.83  57151.24   65536  100.00   0.017    1.953   0.003     86.6   68.7
19:20:56.051        24    918441.33  57402.58   65536  100.00   0.017    0.615   0.003     86.7   68.7
19:21:02.051        25    919283.00  57455.19   65536  100.00   0.017    1.987   0.003     86.8   68.9
19:21:08.048        26    920740.50  57546.28   65536  100.00   0.017    1.588   0.003     86.6   68.7
19:21:08.050  avg_2-26    920514.59  57532.16   65536  100.00   0.017    4.046   0.003     86.7   68.7
19:21:08.050 *
19:21:08.050 host=localhost
19:21:08.050 * Warning: average processor utilization 86.69%
19:21:08.050 * Any processor utilization over 80% could mean that your system
19:21:08.050 * does not have enough cycles to run the highest rate possible
19:21:08.050 *
19:21:09.001 Starting RD=IOPS_RO_CACHE; I/O rate: Uncontrolled MAX; Elapsed=160; For loops: xfersize=131072 threads=10
19:28:59.067  avg_2-26    467986.12  58498.26  131072    0.00   0.041    8.290   0.009     87.0   79.4
19:28:59.067 *
19:28:59.067 host=localhost
19:28:59.068 * Warning: average processor utilization 87.04%
19:28:59.068 * Any processor utilization over 80% could mean that your system
19:28:59.068 * does not have enough cycles to run the highest rate possible
19:28:59.068 *
19:28:59.508 Slave localhost-0 terminated
19:28:59.508 Slave localhost-6 terminated
19:28:59.509 Slave localhost-4 terminated
19:28:59.514 Slave localhost-1 terminated
19:28:59.515 Slave localhost-5 terminated
19:28:59.520 Vdbench execution completed successfully. Output directory: /usr/fcoe-perf/vdb5/output
19:28:59.520 Slave localhost-2 terminated
19:28:59.522 Slave localhost-7 terminated
19:28:59.523 Slave localhost-3 terminated
_______________________________________________
devel mailing list
devel@open-fcoe.org
https://lists.open-fcoe.org/mailman/listinfo/devel