Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-06-25 Thread Harry Edmon
I understand the saying beggars can't be choosers, but I have heard nothing on 
this issue since June 19th.  Does anyone have any ideas on what is going on?  Is 
there more information I can collect that would help diagnose this problem?  And 
again, thanks for any and all help!

--
 Dr. Harry Edmon                E-MAIL: [EMAIL PROTECTED]
 206-543-0547   [EMAIL PROTECTED]
 Dept of Atmospheric Sciences   FAX:206-543-0308
 University of Washington, Box 351640, Seattle, WA 98195-1640
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-06-19 Thread Harry Edmon

Stephen Hemminger wrote:


Does this fix it?
   # sysctl -w net.ipv4.tcp_abc=0


That did not help.  I have 1 minute outputs from tcpdump under both 2.6.11.12 
and 2.6.16.20.  You will see a large size difference between the files.  Since 
the 2.6.11.12 one is 2 MBytes, I thought I would post them via the web instead 
of via attachments.   Look at:


http://www.atmos.washington.edu/~harry/linux/2.6.11.12.out.1min
http://www.atmos.washington.edu/~harry/linux/2.6.16.20.out.1min

And again, thanks to all of you for looking into this.
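
One way to put a number on the size difference between the two files is to count packets per second in each. The sketch below assumes the posted files are tcpdump text output in which every packet line starts with the default HH:MM:SS.ffffff timestamp; the function name `packets_per_second` is made up for illustration.

```shell
# Count packets per one-second bucket in tcpdump text output.
# Assumption: packet lines start with an HH:MM:SS.ffffff timestamp
# (tcpdump's default); any other lines, such as hex dumps, are skipped.
packets_per_second() {
    awk '$1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\./ { n[substr($1, 1, 8)]++ }
         END { for (t in n) print t, n[t] }' "$1" | sort
}

# Usage, after downloading the two files named above:
#   packets_per_second 2.6.11.12.out.1min > old.pps
#   packets_per_second 2.6.16.20.out.1min > new.pps
```

Comparing the two result files side by side should show whether the newer kernel moves fewer packets throughout the minute or only stalls in bursts.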

--
 Dr. Harry Edmon                E-MAIL: [EMAIL PROTECTED]
 206-543-0547   [EMAIL PROTECTED]
 Dept of Atmospheric Sciences   FAX:206-543-0308
 University of Washington, Box 351640, Seattle, WA 98195-1640


Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-06-19 Thread Harry Edmon



Jesper Dangaard Brouer wrote:



Harry Edmon [EMAIL PROTECTED] wrote:

I have a system with a strange network performance degradation from 
2.6.11.12 to most recent kernels including 2.6.16.20 and 
2.6.17-rc6. The system has dual single-core Xeons with 
hyperthreading on.

[cut]

Hi Harry

Can you check which high-res timesource you are using?

In the kernel log look for:
 kernel: Using tsc for high-res timesource
 kernel: Using pmtmr for high-res timesource

I have experienced some network performance degradation when using the 
pmtmr timesource on an AMD Opteron system.  It seems that the 
default timesource changed between 2.6.15 and 2.6.16.


If you are using pmtmr, try rebooting with the kernel option clock=tsc.

On my AMD Opteron system I can normally route 400 kpps, but with 
the pmtmr timesource I could only route around 83 kpps.  (I found the 
timer to be the issue by using oprofile.)




We have CONFIG_HPET_TIMER=y, so we do not see these messages.
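
For anyone repeating the check Jesper describes, a minimal sketch that pulls the time source name out of kernel log lines of the form he quotes. The helper name `timesource_from_log` is made up here, and the pattern only matches that exact message wording.

```shell
# Print the time source named in kernel log lines such as
#   "kernel: Using tsc for high-res timesource"
# Reads the log on stdin; prints nothing if no such line is present.
timesource_from_log() {
    sed -n 's/.*Using \([A-Za-z0-9_]*\) for high-res timesource.*/\1/p'
}

# Usage:
#   dmesg | timesource_from_log
```

An empty result means the kernel logged no such line, which matches what is reported above for this system.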




Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-06-18 Thread Harry Edmon

Stephen Hemminger wrote:

  

Does this fix it?
   # sysctl -w net.ipv4.tcp_abc=0


Thanks for the suggestion.  I will give it a try later tonight.  Also Andrew - 
sorry for the incorrect placement of my follow-up comments.  I do appreciate 
everyone's help in figuring this out.
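
Before flipping a setting like this it can help to record the current value so it can be restored afterwards. A minimal sketch, assuming a simple dotted key that maps directly onto a file under /proc/sys; the helper names `sysctl_path` and `sysctl_show` are made up for illustration.

```shell
# Map a dotted sysctl key to its file under /proc/sys.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print the current value of a key, or a note if this kernel lacks it.
sysctl_show() {
    p=$(sysctl_path "$1")
    if [ -r "$p" ]; then cat "$p"; else echo "$1: not available"; fi
}

# Usage: note the old value, then apply the suggested change as root.
#   sysctl_show net.ipv4.tcp_abc
#   sysctl -w net.ipv4.tcp_abc=0
```

The "not available" branch matters because not every kernel version exposes net.ipv4.tcp_abc.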


--
 Dr. Harry Edmon                E-MAIL: [EMAIL PROTECTED]
 206-543-0547   [EMAIL PROTECTED]
 Dept of Atmospheric Sciences   FAX:206-543-0308
 University of Washington, Box 351640, Seattle, WA 98195-1640


Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-06-17 Thread Harry Edmon
I assume you are talking about using TCP_NODELAY as a socket option within the 
LDM software.  I could give that a try.


There is a lot of traffic on this node, on the order of 2000 packets in and out 
per second, so the tcpdump output will grow pretty fast.  How long a tcpdump 
would be useful, and what options would you suggest?
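
As a rough guide to how fast a capture grows, the sketch below estimates an upper bound on the size of a binary pcap file. It relies on the pcap file layout (a 24-byte file header plus a 16-byte record header per packet) and treats the 2000 packets per second mentioned above as the combined rate; the 60-second duration, the 96-byte snap length, and the function name are assumptions chosen for illustration.

```shell
# Upper bound on pcap file size in bytes:
#   24-byte file header + packets * (16-byte record header + snaplen)
# Packets shorter than snaplen take less space, so real files are smaller.
pcap_size_upper_bound() {
    pps=$1; seconds=$2; snaplen=$3
    echo $(( 24 + pps * seconds * (16 + snaplen) ))
}

pcap_size_upper_bound 2000 60 96   # prints 13440024

# A matching bounded capture (run as root; eth0 is a placeholder):
#   timeout 60 tcpdump -i eth0 -n -s 96 -w ldm.pcap tcp port 388
```

At these numbers a one-minute capture stays under about 13.5 MB, so truncating packets with -s and filtering on the LDM port (388) keeps the file manageable.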


I should also note that my network interfaces are Intel, using the latest e1000 
driver.



Andrew Morton wrote:

On Fri, 16 Jun 2006 09:01:23 -0700
Harry Edmon [EMAIL PROTECTED] wrote:

I have a system with a strange network performance degradation from 
2.6.11.12 to most recent kernels including 2.6.16.20 and 2.6.17-rc6.   
The system has dual single-core Xeons with hyperthreading on.   The 
application is the LDM system from UCAR/Unidata 
(http://www.unidata.ucar.edu/software/ldm).   This system requests 
weather data from a variety of systems using RPC calls over a reserved 
TCP port (388), puts them into a memory mapped queue file, and then 
sends the data out to a variety of downstream requesting systems, again 
using RPC calls.  When the load is heavy, the 2.6.16.20 kernel falls way 
behind with the data ingestion.  The 2.6.11.12 kernel does not.   I have 
tried an experiment with a 2.6.17-rc6 system where it just does the 
ingestion, and not the downstream distribution, and it is able to keep 
up.   I would really appreciate any pointers as to where the problem may 
be and how to diagnose it.  I have attached the config files from both 
kernels and the sysctl.conf file I am using.   I have also included the 
output from netstat -s on the 2.6.16.20 system during a time when it 
was having problems.




(added netdev)

A quick grep indicates that it isn't using TCP_NODELAY - we've had problems
with that in the past.

Perhaps a tcpdump of the net traffic will help to determine what's going on.



--
 Dr. Harry Edmon                E-MAIL: [EMAIL PROTECTED]
 206-543-0547   [EMAIL PROTECTED]
 Dept of Atmospheric Sciences   FAX:206-543-0308
 University of Washington, Box 351640, Seattle, WA 98195-1640