Hi Yi,
Thanks for your quick reply!
On 03/27/12 01:24 AM, Zou, Yi wrote:
Hi Yi and Experts,
Currently, I am trying to do a performance test of Open-FCoE by running the benchmark
tool vdbench. For this, I set up an environment with one ixgbe-based initiator under
Linux 2.6.32 and 22 COMSTAR LUNs on a Solaris 11 target; the LUNs are exported through
the COMSTAR SCSI target. The block sizes used in the test cover 1k, 2k, 4k, 8k, ..., 128k.
During the test I ran into two problems. The first is that vdbench is always
interrupted while running small-block I/O such as 1k and 2k; once the block size
reaches 32k, the test finishes successfully.
I have never used vdbench; fio/dd/iozone are what I normally use. Someone on
the list who has experience with vdbench may know better.
The second problem is about the test results. Take the 64k block size I/O test
for example: the mean throughput was about 57000 MB/sec, which is simply
impossible on a 10 Gbit link. My colleague suspected this was an effect of the
file system cache. I googled how to bypass the cache but unfortunately could
not find anything useful, so finally I tried using 'raw' to bind the LUNs to
raw devices. That also produced an error, "opening `/devices/virtual/raw/raw3':
No such file or directory", even though I had configured the raw devices by
editing /etc/udev/rules.d/60-raw.rules.
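For context, a minimal sketch of the kind of udev rule and manual binding this
refers to (the LUN, the raw device number, and the path to the raw binary are
assumptions and vary by distribution); note that the resulting character device
node normally appears under /dev/raw/, not /devices/virtual/raw/:

# cat /etc/udev/rules.d/60-raw.rules
ACTION=="add", KERNEL=="sdc", RUN+="/bin/raw /dev/raw/raw3 %N"

# raw /dev/raw/raw3 /dev/sdc     (bind by hand for a quick test)
# raw -qa                        (list current bindings)
# ls -l /dev/raw/raw3            (the node the benchmark should point at)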
The throughput number you have here does not look right to me. I don't
know your setup; what devices show up in your 'fcoeadm -t' output? Can
you simply do dd over those devices before running vdbench? FCoE would
be under '/sys/devices/virtual/net/ethX', where ethX is the interface you
pass to fcoeadm to create the FCoE instance on; it is also shown by 'fcoeadm -i'.
# fcoeadm -i
Description: 82599EB 10-Gigabit SFI/SFP+ Network Connection
Revision: 01
Manufacturer: Intel Corporation
Serial Number: 001B21832970
Driver: ixgbe 3.4.24-NAPI
Number of Ports: 1
Symbolic Name: fcoe v0.1 over eth5
OS Device Name: host20
Node Name: 0x1000001B21832973
Port Name: 0x2000001B21832973
FabricName: 0x2007000DECA39341
Speed: 10 Gbit
Supported Speed: 10 Gbit
MaxFrameSize: 2112
FC-ID (Port ID): 0xE0002B
State: Online
# fcoeadm -t
Interface: eth5
Roles: FCP Target
Node Name: 0x200000C0DD15DDBB
Port Name: 0x210000C0DD15DDBB
Target ID: 0
MaxFrameSize: 2112
OS Device Name: rport-20:0-2
FC-ID (Port ID): 0xE0001C
State: Online
LUN ID  Device Name  Capacity   Block Size  Description
------  -----------  ---------  ----------  ----------------------------
0 /dev/sdc 100.00 MB 512 SUN COMSTAR (rev 1.0)
1 /dev/sdd 100.00 MB 512 SUN COMSTAR (rev 1.0)
2 /dev/sde 100.00 MB 512 SUN COMSTAR (rev 1.0)
3 /dev/sdf 100.00 MB 512 SUN COMSTAR (rev 1.0)
4 /dev/sdg 100.00 MB 512 SUN COMSTAR (rev 1.0)
5 /dev/sdh 100.00 MB 512 SUN COMSTAR (rev 1.0)
6 /dev/sdi 100.00 MB 512 SUN COMSTAR (rev 1.0)
7 /dev/sdj 100.00 MB 512 SUN COMSTAR (rev 1.0)
8 /dev/sdk 100.00 MB 512 SUN COMSTAR (rev 1.0)
9 /dev/sdl 100.00 MB 512 SUN COMSTAR (rev 1.0)
10 /dev/sdm 100.00 MB 512 SUN COMSTAR (rev 1.0)
11 /dev/sdn 100.00 MB 512 SUN COMSTAR (rev 1.0)
12 /dev/sdo 100.00 MB 512 SUN COMSTAR (rev 1.0)
13 /dev/sdp 100.00 MB 512 SUN COMSTAR (rev 1.0)
14 /dev/sdq 100.00 MB 512 SUN COMSTAR (rev 1.0)
15 /dev/sdr 100.00 MB 512 SUN COMSTAR (rev 1.0)
16 /dev/sds 100.00 MB 512 SUN COMSTAR (rev 1.0)
17 /dev/sdt 100.00 MB 512 SUN COMSTAR (rev 1.0)
18 /dev/sdu 100.00 MB 512 SUN COMSTAR (rev 1.0)
19 /dev/sdv 100.00 MB 512 SUN COMSTAR (rev 1.0)
20 /dev/sdw 100.00 MB 512 SUN COMSTAR (rev 1.0)
21 /dev/sdx 100.00 MB 512 SUN COMSTAR (rev 1.0)
'dd' appears to give reasonable results:
# dd if=/dev/sdc of=/dev/null bs=1024k count=1024
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.0349097 s, 3.0 GB/s
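One thing to note (my own assumption, not from the output above): the LUN is only
100 MB, so after the first pass it sits entirely in the Linux page cache and a
buffered dd mostly measures memory bandwidth. Re-running the read with the cache
dropped and O_DIRECT (iflag=direct is a standard GNU dd flag; the block size is
just an example) should show something closer to the real wire speed:

# echo 3 > /proc/sys/vm/drop_caches
# dd if=/dev/sdc of=/dev/null bs=64k iflag=direct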
Make sure vdbench is pointing to the actual LUNs discovered by FCoE. For
Intel 82599 NICs, 'ethtool -S | grep fcoe' gives you statistics; observe the
counters while running your I/O to verify that FCoE traffic is actually
happening on that NIC port (or alternatively run tshark).
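As a concrete sketch of that check (the interface name is taken from the fcoeadm
output above, the exact ixgbe counter names depend on the driver version, and
0x8906 is the FCoE Ethertype):

# watch -n 1 'ethtool -S eth5 | grep -i fcoe'
# tshark -i eth5 ether proto 0x8906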
I am sure the actual LUNs discovered by FCoE are used in the test. I run
vdbench with one configuration file, and the devices corresponding to the LUNs
are listed in that file.
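Roughly, the file follows the usual vdbench pattern sketched below (illustrative
only, not my exact file; openflags=o_direct is, as far as I know, the vdbench
option for bypassing the Linux page cache on raw devices, and the xfersize list
and thread counts are just examples):

sd=sd0,lun=/dev/sdc,openflags=o_direct
sd=sd1,lun=/dev/sdd,openflags=o_direct
...
wd=wd_read,sd=sd*,xfersize=4k,rdpct=100
rd=IOPS_RO_CACHE,wd=wd_read,iorate=max,elapsed=160,interval=6,forxfersize=(1k,2k,4k,8k,16k,32k,64k,128k),threads=10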
Does anyone have a solution to the above problems? Has anyone tested the
performance of Open-FCoE, or does anyone know of another benchmark tool
under Linux?
I am looking forward to your replies; thanks so much in advance.
Best Regards,
Jevon
The following is the log corresponding to the problems above.
Problem 1: the error message.
01:07:30.979 Slave aborted. Abort message received:
01:07:30.980 Task WG_task stopped after 3 minutes of trying to get it to
terminate itself. Unpredictable results may occur.
01:07:30.980
01:07:30.980 Look at file localhost-0.stdout.html for more information.
Have you found anything in the above output file?
yi
From that output file, the new information beyond what I already posted is:
19:30:31.109 19:30:31.109 Trying to terminate tasks: WG_task
19:30:31.110 19:30:31.109 Interrupts issued for tasks WG_task: 984
.....
In my test configuration, ten threads are started for reads and one thread is
used for writes. I am not sure whether the problem is that the multiple threads
exhaust the CPU cycles, because after each block-size test finishes, warnings
like the following are always printed:
19:14:49.059 *
19:14:49.059 host=localhost
19:14:49.059 * Warning: average processor utilization 92.83%
19:14:49.059 * Any processor utilization over 80% could mean that your system
19:14:49.059 * does not have enough cycles to run the highest rate possible
19:14:49.059 *
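A quick way to confirm whether the CPUs really are saturated during a run
(a sketch; mpstat comes from the sysstat package and the 5-second interval is
arbitrary) is to watch per-CPU utilization from another terminal while vdbench
is running:

# mpstat -P ALL 5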
01:07:30.980
01:07:30.981 Slave localhost-0 prematurely terminated.
01:07:30.981
java.lang.RuntimeException: Slave localhost-0 prematurely terminated.
at Vdb.common.failure(common.java:234)
at Vdb.SlaveStarter.startSlave(SlaveStarter.java:185)
at Vdb.SlaveStarter.run(SlaveStarter.java:68)
01:07:31.984 common.exit(): -99
common.exit(): -99
Problem 2: incorrect results
19:18:30.881 All slaves are now connected
19:18:32.001 Starting RD=IOPS_RO_CACHE; I/O rate: Uncontrolled MAX; Elapsed=160; For loops: xfersize=65536 threads=10
Mar 25, 2012      interval        i/o   MB/sec   bytes    read    resp    resp    resp     cpu%  cpu%
                                 rate  1024**2     i/o     pct    time     max  stddev  sys+usr   sys
19:18:38.074             1  915879.83  57242.49   65536  100.00   0.017   3.208   0.004    74.2  58.5
19:18:44.048             2  923004.00  57687.75   65536  100.00   0.017   3.121   0.003    86.5  68.5
19:18:50.049             3  914239.33  57139.96   65536  100.00   0.017   1.994   0.003    86.6  68.6
19:18:56.050             4  916271.67  57266.98   65536  100.00   0.017   0.983   0.003    86.6  68.5
19:19:02.056             5  916500.33  57281.27   65536  100.00   0.017   1.566   0.003    86.5  68.6
19:19:08.046             6  921123.83  57570.24   65536  100.00   0.017   0.617   0.003    86.7  68.8
19:19:14.046             7  921845.17  57615.32   65536  100.00   0.017   2.264   0.003    86.9  68.9
19:19:20.047             8  921880.83  57617.55   65536  100.00   0.017   1.999   0.003    86.6  68.7
19:19:26.066             9  918347.00  57396.69   65536  100.00   0.017   0.955   0.003    86.3  68.4
19:19:32.074            10  919466.50  57466.66   65536  100.00   0.017   1.490   0.003    86.6  68.6
19:19:38.075            11  915937.50  57246.09   65536  100.00   0.017   1.302   0.003    86.4  68.3
19:19:44.048            12  922523.50  57657.72   65536  100.00   0.017   2.417   0.003    87.0  68.8
19:19:50.066            13  924455.00  57778.44   65536  100.00   0.017   4.046   0.003    86.9  69.0
19:19:56.067            14  923076.83  57692.30   65536  100.00   0.017   1.041   0.003    87.0  69.2
19:20:02.048            15  917671.33  57354.46   65536  100.00   0.017   1.271   0.003    86.9  68.8
19:20:08.046            16  918901.00  57431.31   65536  100.00   0.017   1.287   0.003    86.6  68.7
19:20:14.048            17  923838.67  57739.92   65536  100.00   0.017   1.059   0.003    86.6  68.4
19:20:20.051            18  921362.17  57585.14   65536  100.00   0.017   3.675   0.003    86.8  68.8
19:20:26.047            19  922135.00  57633.44   65536  100.00   0.017   0.820   0.003    86.7  68.7
19:20:32.056            20  926624.33  57914.02   65536  100.00   0.017   1.014   0.003    86.7  69.0
19:20:38.046            21  925666.83  57854.18   65536  100.00   0.017   1.620   0.003    86.7  68.9
19:20:44.057            22  925109.33  57819.33   65536  100.00   0.017   1.010   0.003    86.6  68.8
19:20:50.049            23  914419.83  57151.24   65536  100.00   0.017   1.953   0.003    86.6  68.7
19:20:56.051            24  918441.33  57402.58   65536  100.00   0.017   0.615   0.003    86.7  68.7
19:21:02.051            25  919283.00  57455.19   65536  100.00   0.017   1.987   0.003    86.8  68.9
19:21:08.048            26  920740.50  57546.28   65536  100.00   0.017   1.588   0.003    86.6  68.7
19:21:08.050      avg_2-26  920514.59  57532.16   65536  100.00   0.017   4.046   0.003    86.7  68.7
19:21:08.050 *
19:21:08.050 host=localhost
19:21:08.050 * Warning: average processor utilization 86.69%
19:21:08.050 * Any processor utilization over 80% could mean that your system
19:21:08.050 * does not have enough cycles to run the highest rate possible
19:21:08.050 *
19:21:09.001 Starting RD=IOPS_RO_CACHE; I/O rate: Uncontrolled MAX; Elapsed=160; For loops: xfersize=131072 threads=10
19:28:59.067      avg_2-26  467986.12  58498.26  131072    0.00   0.041   8.290   0.009    87.0  79.4
19:28:59.067 *
19:28:59.067 host=localhost
19:28:59.068 * Warning: average processor utilization 87.04%
19:28:59.068 * Any processor utilization over 80% could mean that your system
19:28:59.068 * does not have enough cycles to run the highest rate possible
19:28:59.068 *
19:28:59.508 Slave localhost-0 terminated
19:28:59.508 Slave localhost-6 terminated
19:28:59.509 Slave localhost-4 terminated
19:28:59.514 Slave localhost-1 terminated
19:28:59.515 Slave localhost-5 terminated
19:28:59.520 Vdbench execution completed successfully. Output directory: /usr/fcoe-perf/vdb5/output
19:28:59.520 Slave localhost-2 terminated
19:28:59.522 Slave localhost-7 terminated
19:28:59.523 Slave localhost-3 terminated
stdout/stderr for slave=localhost-0
Links to Run Definitions:
  IOPS_RO_CACHE For loops: xfersize=1024 threads=10
  IOPS_RO_CACHE For loops: xfersize=2048 threads=10
  IOPS_RO_CACHE For loops: xfersize=4096 threads=10
  IOPS_RO_CACHE For loops: xfersize=8192 threads=10
  IOPS_WO_CACHE For loops: xfersize=1024 threads=1
  IOPS_WO_CACHE For loops: xfersize=2048 threads=1
  IOPS_WO_CACHE For loops: xfersize=4096 threads=1
  IOPS_WO_CACHE For loops: xfersize=8192 threads=1
19:27:40.804 19:27:40.803 SlaveJvm execution parameter: '-m 10.113.197.235'
19:27:40.804 19:27:40.803 SlaveJvm execution parameter: '-n
localhost-10-120326-19.27.40.352'
19:27:40.804 19:27:40.803 SlaveJvm execution parameter: '-l localhost-0'
19:27:40.805 19:27:40.804 SlaveJvm execution parameter: '-p 5570'
19:27:40.805 19:27:40.804 SlaveJvm positional parameter: 'SlaveJvm'
19:27:40.829 19:27:40.828 successfully connected to master 10.113.197.235
19:27:40.829 19:27:40.828 Connection to 10.113.197.235 using port 5570
successful
19:27:41.220 19:27:41.220 execute(): /usr/fcoe-perf/vdb5/linux/linux_clock.suse
19:27:41.303 19:27:41.303 ticks_per_second: 100
19:27:41.782 19:27:41.782 Opening sd=sd0,lun=/dev/sdc; write: false; flags: 0
19:27:41.791 19:27:41.790 Started 1 Workload Generator threads.
19:27:41.791 19:27:41.790 work.use_waiter: false
19:27:41.797 19:27:41.796 Started 10 i/o threads
19:27:41.797 19:27:41.797 Waiting for task synchronization
19:27:41.797 19:27:41.797 task_wait_start_complete() end
/*19:27:42.005 Starting RD=IOPS_RO_CACHE; I/O rate: Uncontrolled MAX;
Elapsed=160; For loops: xfersize=1024 threads=10*/
19:27:42.008 19:27:42.007 task_run_all(): 11 tasks
19:30:31.109 19:30:31.109 Trying to terminate tasks: WG_task
19:30:31.110 19:30:31.109 Interrupts issued for tasks WG_task: 984
19:30:41.128 19:30:41.127 Trying to terminate tasks: WG_task
19:30:41.128 19:30:41.128 Interrupts issued for tasks WG_task: 1969
19:30:51.143 19:30:51.142 Trying to terminate tasks: WG_task
19:30:51.143 19:30:51.143 Interrupts issued for tasks WG_task: 2955
19:31:01.155 19:31:01.154 Trying to terminate tasks: WG_task
19:31:01.155 19:31:01.155 Interrupts issued for tasks WG_task: 3941
19:31:11.167 19:31:11.167 Trying to terminate tasks: WG_task
19:31:11.168 19:31:11.167 Interrupts issued for tasks WG_task: 4927
19:31:21.179 19:31:21.178 Trying to terminate tasks: WG_task
19:31:21.179 19:31:21.179 Interrupts issued for tasks WG_task: 5913
19:31:31.190 19:31:31.190 Trying to terminate tasks: WG_task
19:31:31.191 19:31:31.190 Interrupts issued for tasks WG_task: 6899
19:31:41.202 19:31:41.201 Trying to terminate tasks: WG_task
19:31:41.202 19:31:41.202 Interrupts issued for tasks WG_task: 7885
19:31:51.213 19:31:51.213 Trying to terminate tasks: WG_task
19:31:51.214 19:31:51.213 Interrupts issued for tasks WG_task: 8871
19:32:01.225 19:32:01.224 Trying to terminate tasks: WG_task
19:32:01.225 19:32:01.225 Interrupts issued for tasks WG_task: 9857
19:32:11.236 19:32:11.236 Trying to terminate tasks: WG_task
19:32:11.237 19:32:11.236 Interrupts issued for tasks WG_task: 10843
19:32:21.247 19:32:21.247 Trying to terminate tasks: WG_task
19:32:21.248 19:32:21.247 Interrupts issued for tasks WG_task: 11829
19:32:31.269 19:32:31.268 Trying to terminate tasks: WG_task
19:32:31.269 19:32:31.269 Interrupts issued for tasks WG_task: 12816
19:32:41.289 19:32:41.289 Trying to terminate tasks: WG_task
19:32:41.290 19:32:41.289 Interrupts issued for tasks WG_task: 13803
19:32:51.301 19:32:51.300 Trying to terminate tasks: WG_task
19:32:51.301 19:32:51.301 Interrupts issued for tasks WG_task: 14789
19:33:01.312 19:33:01.312 Trying to terminate tasks: WG_task
19:33:01.312 19:33:01.312 Interrupts issued for tasks WG_task: 15775
19:33:11.324 19:33:11.324 Trying to terminate tasks: WG_task
19:33:11.324 19:33:11.324 Interrupts issued for tasks WG_task: 16761
19:33:21.114 19:33:21.113
19:33:21.114 19:33:21.114 Task WG_task stopped after 3 minutes of trying to get
it to terminate itself. Unpredictable results may occur.
19:33:21.114 19:33:21.114
19:33:21.115 java.lang.RuntimeException: Task WG_task stopped after 3 minutes
of trying to get it to terminate itself. Unpredictable results may occur.
19:33:21.115 at Vdb.common.failure(common.java:234)
19:33:21.115 at Vdb.Task_num.interrupt_tasks(Task_num.java:334)
19:33:21.115 at Vdb.SlaveWorker.doRegularWorkload(SlaveWorker.java:185)
19:33:21.115 at Vdb.SlaveWorker.run(SlaveWorker.java:106)
19:33:22.121 19:33:22.120 common.exit(): -99
19:33:22.121 common.exit(): -99
_______________________________________________
devel mailing list
devel@open-fcoe.org
https://lists.open-fcoe.org/mailman/listinfo/devel