[Openstack] question on the GRE Performance

2013-03-15 Thread 小包
Hi Guys,

In my test, I found that OVS GRE performance is much lower than expected.
For example, on a 100 Mbit/s switch, GRE reaches only about 26 Mbit/s,
while a Linux bridge reaches 95 Mbit/s.

So my question is: why is the GRE speed so low, or is my configuration
not right?

Thanks,
Tommy


Re: [Openstack] question on the GRE Performance

2013-03-15 Thread Rick Jones

On 03/15/2013 08:05 AM, tommy(小包) wrote:

> Hi Guys,
>
> In my test, I found that OVS GRE performance is much lower than expected.
> For example, on a 100 Mbit/s switch, GRE reaches only about 26 Mbit/s,
> while a Linux bridge reaches 95 Mbit/s.
>
> So my question is: why is the GRE speed so low, or is my configuration
> not right?


95 and 26 Mbit/s measured at what level?  On the wire (including all 
the protocol headers) or to user level (after all the protocol headers)? 
That you were seeing 95 Mbit/s suggests user level but I'd like to make 
certain.


GRE adds header overhead, but I wouldn't think enough to take one from 
95 down to 26 Mbit/s to user level.  I would suggest looking at, in no 
particular order:


*) Netstat stats on your sender - is it retransmitting in one case and 
not the other?


*) per-CPU CPU utilization - is any one CPU on the sending, receiving or 
intervening iron saturating in one case and not the other?


and go from there. I'm guessing your tests are all bulk-transfer - you 
might want to consider adding some latency and/or aggregate small-packet 
performance tests.
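
For example, several concurrent netperf TCP_RR streams can give a rough
feel for aggregate small-packet behaviour (a sketch only - "remote" and
the 64-byte request/response sizes are placeholders, and netserver is
assumed to already be running on the other host):

   for i in 1 2 3 4; do
       netperf -t TCP_RR -l 30 -H remote -- -r 64,64 &
   done
   wait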


happy benchmarking,

rick jones
the applicability varies, but attached is some boilerplate I've
built up over time on the matter of "why is my network performance
slow?"  PS - the beforeafter utility mentioned is no longer available
via ftp.cup.hp.com because ftp.cup.hp.com no longer exists.  I probably
ought to put it up on ftp.netperf.org...



Some of my checklist items when presented with assertions of poor
network performance, in no particular order, numbered only for
convenience of reference:

1) Is *any one* CPU on either end of the transfer at or close to 100%
   utilization?  A given TCP connection cannot really take advantage
   of more than the services of a single core in the system, so
   average CPU utilization being low does not a priori mean things are
   OK.
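
   One quick way to watch per-CPU utilization during a transfer
   (assuming the sysstat package is installed; top with its "1" toggle
   shows much the same thing) is:

   mpstat -P ALL 1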

2) Are there TCP retransmissions being registered in netstat
   statistics on the sending system?  Take a snapshot of netstat -s -t
   from just before the transfer, and one from just after and run it
   through beforeafter from
   ftp://ftp.cup.hp.com/dist/networking/tools:

   netstat -s -t > before
   transfer or wait 60 or so seconds if the transfer was already going
   netstat -s -t > after
   beforeafter before after > delta

3) Are there packet drops registered in ethtool -S statistics on
   either side of the transfer?  Take snapshots in a manner similar to
   that with netstat.
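
   For example, in the same before/after style as above ("eth0" is just
   a placeholder for whichever interface carries the traffic):

   ethtool -S eth0 > before
   transfer or wait 60 or so seconds if the transfer was already going
   ethtool -S eth0 > after
   beforeafter before after > delta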

4) Are there packet drops registered in the stats for the switch(es)
   being traversed by the transfer?  These would be retrieved via
   switch-specific means.
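
   If the switch can be reached via SNMP, something along these lines
   may work (hypothetical - the community string, hostname and
   supported MIBs will differ from switch to switch):

   snmpwalk -v2c -c public switch1 IF-MIB::ifInDiscards
   snmpwalk -v2c -c public switch1 IF-MIB::ifOutDiscards
   snmpwalk -v2c -c public switch1 IF-MIB::ifInErrors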

5) What is the latency between the two end points?  Install netperf on
   both sides, start netserver on one side and on the other side run:

   netperf -t TCP_RR -l 30 -H remote

   and invert the transaction/s rate to get the RTT latency.  There
   are caveats involving NIC interrupt coalescing settings defaulting
   in favor of throughput/CPU util over latency:

   ftp://ftp.cup.hp.com/dist/networking/briefs/nic_latency_vs_tput.txt

   but when the connections are over a WAN latency is important and
   may not be clouded as much by NIC settings.
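
   To put numbers on "invert the transaction/s rate" (made-up figures
   purely for illustration): a TCP_RR result of 5000 transactions per
   second corresponds to an average round-trip time of about

   1 / 5000 trans/s = 0.0002 s = 200 microseconds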

   This all leads into:

6) What is the *effective* TCP (or other) window size for the
   connection?  One limit to the performance of a TCP bulk transfer
   is:

   Tput = W(eff)/RTT
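
   As a back-of-the-envelope example with made-up numbers, a 64 KB
   effective window over a 10 millisecond RTT cannot do better than
   roughly

   65536 bytes / 0.010 s = 6,553,600 bytes/s ~= 52 Mbit/s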

   The effective window size will be the lesser of:

   a) The classic TCP window advertised by the receiver. This is the
  value in the TCP header's window field shifted by the window
  scaling factor which was exchanged during connection
  establishment. The window scale factor is why one wants to get
  traces including the connection establishment.
   
  The size of the classic window will depend on whether/what the
  receiving application has requested via a setsockopt(SO_RCVBUF)
  call and the sysctl limits set in the OS.  If the receiving
  application does not call setsockopt(SO_RCVBUF) then under Linux
  the stack will autotune the advertised window based on other
  sysctl limits in the OS.  Other stacks may or may not autotune.  (A
  quick way to inspect those Linux limits is sketched at the end of
  this list.)
 
   b) The computed congestion window on the sender - this will be
  affected by the packet loss rate over the connection, hence the
  interest in the netstat and ethtool stats.

   c) The quantity of data to which the sending TCP can maintain a
  reference while waiting for it to be ACKnowledged by the
  receiver - this will be akin to the classic TCP window case
  above, but on the sending side, and concerning
  setsockopt(SO_SNDBUF) and sysctl settings.

   d) The quantity of data the sending application is willing/able to
  send at any one time before waiting for some sort of
  application-level acknowledgement.  FTP and rcp will just blast
  all the data of the file into the socket as fast as the socket
  will take it.  Scp has some application-layer windowing which
  may cause it to put less data out onto the connection
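
   Regarding the sysctl limits mentioned in (a) and (c) above: on Linux
   they can be inspected with something along these lines (a rough
   sketch - the exact defaults vary by kernel version):

   sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
   sysctl net.core.rmem_max net.core.wmem_max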