Darren J Moffat wrote:
Jeff Anderson-Lee wrote:
Rick McNeal wrote:
On Jan 18, 2007, at 12:00 PM, Mark A. Carlson wrote:
I would imagine the argument is that software drivers
for the storage stack consume much less overhead by
cutting out the IP part - less CPU consumed, perhaps
better throughput. As for cost, it leverages the NIC
commodity pricing curve without requiring TCP offload.
The amount of CPU being consumed really only matters for
underpowered machines. Any modern desktop has more than enough
horsepower to completely fill a 1GbE link with traffic at 4KB packet
sizes.
I don't have OpenSolaris numbers at hand, but under Linux x86_64 on a
server-class motherboard that doesn't seem to be the case.
In a recent "echo"-style test, 4KiB UDP packets pegged one CPU of a
dual 3.6GHz Xeon EM64T but obtained only 90MiB/s for UDP and 63MiB/s
for TCP (with no IPsec). It didn't saturate the network until sending
16KiB packets for UDP and never did for TCP. Perhaps with a TCP
offload engine under Solaris one might do better but... that's a lot
of CPU power devoted just to flinging the bits.
Giving Linux network performance numbers and assuming things about
OpenSolaris network performance isn't actually helpful; the two have
very different network stack implementations.
It also isn't useful info until you give full details of the hardware,
especially the NIC being used and the switching hardware involved
(unless this was back to back).
The original statement was: "Any modern desktop has more than enough
horsepower to completely fill a 1GbE link with traffic at 4KB packet
sizes." That's not exactly a hardware- or O/S-specific statement.
My reply acknowledged that the numbers were from Linux, not OpenSolaris,
and from a server versus a desktop. I made no assumptions about
OpenSolaris performance, but merely contested the original statement. I
would hope that this group would not be so closed as to ignore what the
"other guy" can or cannot do. (Besides, Sun sell servers with RHEL
too.) I was hoping to stimulate discussion.
As for the configuration:
The data passes between two identical servers, routed through two
lightly loaded Extreme Networks Summit5i routers and two lightly loaded
Asante 35516-T switches, all of which have more than enough bandwidth
to stream the data (and do for larger packet sizes). The setup can
sustain an "echo" throughput of 114.2MiB/s in each direction with 16KB
UDP packets. With 4KB packets from Linux to Linux, however, I only see
90MiB/s.
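
For concreteness, here is a minimal sketch in Python of the kind of
"echo" test I mean. It is not the exact harness I used, and the port
number and window size are arbitrary, but it shows the essential idea:
keep a small window of datagrams in flight so the measurement reflects
sustained throughput rather than a single round trip.

import socket
import time

PORT = 9999                          # hypothetical port, pick any free one

def echo_server(port=PORT):
    # Reflect every UDP datagram back to its sender.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("", port))
    while True:
        data, peer = s.recvfrom(65536)
        s.sendto(data, peer)

def measure(host, size, count=20000, window=8, port=PORT):
    # Return (ns/packet, MiB/s) for size-byte datagrams, keeping
    # `window` packets in flight so the link stays busy. A lost
    # datagram raises socket.timeout; a real harness must cope.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(2.0)
    s.connect((host, port))
    payload = b"\0" * size
    start = time.time()
    for _ in range(window):          # prime the pipe
        s.send(payload)
    for _ in range(count - window):  # lock-step: one out per one back
        s.recv(65536)
        s.send(payload)
    for _ in range(window):          # drain the tail
        s.recv(65536)
    elapsed = time.time() - start
    return elapsed * 1e9 / count, size * count / elapsed / 2**20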
The servers have Supermicro X6DHE-XG2 motherboards:
CPU: Dual Intel Xeon - 3.6GHz - EM64T - 2MB Cache - 800 FSB
RAM: 4GB (4 x 1GB) DDR2-400 Registered ECC - Interleaved
NIC: Dual 10/100/1000 Mbps NIC (Intel 82541GI) - Integrated
The systems were not otherwise idle during the test (nor would they
likely be in practice); however, the other jobs were running at lowered
priorities.
Linux 2.6.17-1.2142_FC4smp, UDP

  ns/B    B/packet    ns/packet     MiB/s
373.99          64       23,935     2.550
 90.14         256       23,076    10.580
 51.80         512       26,523    18.410
 27.80        1024       28,471    34.300
 18.07        2048       37,014    52.767
 10.62        4096       43,483    89.833
  8.60        8192       70,415   110.950
  7.09       16384      116,187   134.482
  6.04       65536      395,837   157.893  [extrapolated]
  5.78      262144    1,514,435   165.078  [extrapolated]
  5.71     1048576    5,988,830   166.978  [extrapolated]
  5.69     4194304   23,886,410   167.459  [extrapolated]
  5.69    16777216   95,476,730   167.580  [extrapolated]

M      5.69 ns/byte
C    22,970 ns/packet
1/M  167.62 MiB/s
1/C  43,534 packets/s
When graphed, the performance is a good fit to ns/packet =
bytes/packet*M + C, where M and C come from a least-squares fit to the
measured (non-extrapolated) data.
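
For the record, that fit is just an ordinary least-squares line over the
measured rows; a quick sketch in Python, with the numbers taken from the
table above:

import numpy as np

# Measured (non-extrapolated) rows of the Linux table above.
bytes_per_packet = np.array([64, 256, 512, 1024, 2048, 4096, 8192, 16384])
ns_per_packet    = np.array([23935, 23076, 26523, 28471,
                             37014, 43483, 70415, 116187])

# Least-squares line: ns/packet = M*bytes_per_packet + C.
M, C = np.polyfit(bytes_per_packet, ns_per_packet, 1)
print(f"M   = {M:.2f} ns/byte")               # ~5.69
print(f"C   = {C:,.0f} ns/packet")            # ~22,970
print(f"1/M = {1e9 / M / 2**20:.2f} MiB/s")   # asymptotic throughput
print(f"1/C = {1e9 / C:,.0f} packets/s")      # zero-length packet rate

The same script run on the Solaris-x86 table below should reproduce the
M and C quoted there.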
Unfortunately I do not currently have two OpenSolaris systems running to
try the test between them; however, I have now run a test between a
Linux client and a Solaris-x86 echo server on the same hardware (but
over a shorter network route):
  ns/B    B/packet    ns/packet     MiB/s
223.34          64       14,294     4.270
 57.80         256       14,796    16.500
 30.08         512       15,403    31.700
 16.08        1024       16,468    59.300
 11.82        2048       24,202    80.700
  8.63        4096       35,351   110.500
  8.40        8192       68,772   113.600
  7.42       16384      121,561   128.536  [extrapolated]
  6.89       65536      451,735   138.355  [extrapolated]
  6.76      262144    1,772,434   141.049  [extrapolated]
  6.73     1048576    7,055,228   141.739  [extrapolated]
  6.72     4194304   28,186,403   141.912  [extrapolated]
  6.72    16777216  112,711,106   141.956  [extrapolated]

M      6.72 ns/byte
C    11,503 ns/packet
1/M  141.97 MiB/s
1/C  86,937 packets/s
The numbers for this case show a significant improvement in both
throughput for a given packet size and packets/s. Furthermore, the echo
server was running at about 60%-70% idle depending on the packet size.
Thus the true capability for packets per second may be somewhat higher
(at least for smaller packets where the network is not saturated).
[Note however that 30%-40% of a dual processor system is 60%-80% of a
single processor.]
It therefore does appear that the Solaris network stack runs
significantly faster than the Linux network stack (at least for UDP).
I will see if I can get a second Solaris server running so that I can
test Solaris to Solaris.
As for the contention that "Any modern desktop has more than enough
horsepower to completely fill a 1GbE link with traffic at 4KB packet
sizes," that appears to depend on the O/S that said desktop is running.
With OpenSolaris that might be true, but at a price: using 30% to 40%
of a dual-processor plus hyper-threaded 3.6GHz Xeon server for an "echo"
service (nearly as bare-bones as a service can get) seems to challenge
the contention that "the amount of CPU being consumed really only
matters for underpowered machines."
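
As a back-of-the-envelope check (my fitted numbers, not a direct
measurement, and assuming a raw 1 Gb/s with no protocol overhead):

M, C = 5.69, 22970                # ns/byte, ns/packet from the Linux fit above
size = 4096                       # bytes per packet
line_rate = 1e9 / 8               # 1 Gb/s in bytes/s, ignoring overhead

needed  = line_rate / size        # ~30,500 packets/s to fill the link
allowed = 1e9 / (M * size + C)    # ~21,600 packets/s the fitted model permits
print(f"needed {needed:,.0f} pkt/s, model allows {allowed:,.0f} pkt/s")

By that estimate the Linux box falls well short of 1GbE line rate at
4KB, which is consistent with the ~90MiB/s I observed.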
Jeff Anderson-Lee
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss