[Bug 1698454] Re: TCP/IP Throughput Limitation

2017-08-15 Thread Brandon Bates
In light of my previous findings I did a dual-boot install of 15.10
and 16.04.  I put both installs on the 4.3.6 mainline kernel and found
that 15.10 was fine and 16.04 was not.  I then compared the installed
packages and tried to isolate the culprit.  After uninstalling several
and rebooting, my performance went back up to normal.  Long story
short, uninstalling libnuma1 (which also uninstalls irqbalance) solved
the problem after a reboot.  FYI, simply killing the irqbalance process
did not solve the problem.  I haven't taken time to troubleshoot
further, but it looks like something in NUMA/interrupt processing is
taking too long.  Reinstalling irqbalance/libnuma1 brings the problem
back.  This is a single Xeon CPU on a dual-CPU board.
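
For anyone wanting to check how a NIC's interrupts are spread across
cores with and without irqbalance installed, here is a minimal sketch
in Python.  It only sums the per-CPU counters from text in the
/proc/interrupts layout; the sample data and the interface name "eth0"
are made up for illustration and do not come from the machine in this
report.

```python
def irq_counts(text, needle):
    """Sum per-CPU interrupt counts for lines whose text contains needle."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()              # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        if needle not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ label ("45:"); the next len(cpus) fields
        # are the per-CPU counters for that IRQ.
        for i, count in enumerate(fields[1:1 + len(cpus)]):
            totals[i] += int(count)
    return dict(zip(cpus, totals))


# Made-up sample in the /proc/interrupts layout ("eth0" is hypothetical).
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  45:    1200000          0          0          0   PCI-MSI 1048576-edge      eth0-TxRx-0
  46:          0        500          0          0   PCI-MSI 1048577-edge      eth0-TxRx-1
  47:         10          0          0          0   PCI-MSI 1048578-edge      eth0
"""

if __name__ == "__main__":
    for cpu, total in irq_counts(SAMPLE, "eth0").items():
        print(cpu, total)
```

On a real system the text would come from open("/proc/interrupts").read()
and the needle would be the actual interface name.  Comparing the totals
before and after removing irqbalance would show whether the queue
interrupts end up on different cores.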

** Changed in: irqbalance (Ubuntu)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1698454

Title:
  TCP/IP Throughput Limitation

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/irqbalance/+bug/1698454/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1698454] Re: TCP/IP Throughput Limitation

2017-08-15 Thread Brandon Bates
** Also affects: irqbalance (Ubuntu)
   Importance: Undecided
   Status: New

** Package changed: linux (Ubuntu) => numactl (Ubuntu)



[Bug 1698454] Re: TCP/IP Throughput Limitation

2017-07-21 Thread Brandon Bates
Testing was with mainline builds in both cases, so that should rule out
the kernel.



[Bug 1698454] Re: TCP/IP Throughput Limitation

2017-07-21 Thread Brandon Bates
My initial conclusion that something changed in the kernel was
incorrect.  I've since tested 17.04 with kernels all the way back to
4.2.8 and it still fails.  I've also tested 15.10 with kernels up to
4.3.6 and it works correctly.  So something in Ubuntu's implementation
changed between 15.10 and 16.04 that is causing this bug.

** Tags removed: kernel-bug-exists-upstream
** Tags added: xenial yakkety

** Description changed:

+ UPDATE:
+ My initial conclusion that something changed in the kernel was
+ incorrect.  I've since tested 17.04 with kernels all the way back to
+ 4.2.8 and it still fails.  I've also tested 15.10 with kernels up to
+ 4.3.6 and it works correctly.  So something in Ubuntu's implementation
+ changed between 15.10 and 16.04 that is causing this bug.



[Bug 1698454] Re: TCP/IP Throughput Limitation

2017-07-11 Thread Brandon Bates
Is this something I should try to report elsewhere as a kernel bug?  I
haven't tried a distro other than Ubuntu yet.



[Bug 1698454] Re: TCP/IP Throughput Limitation

2017-06-30 Thread Brandon Bates
Tested with mainline v4.12-rc7; the bug still exists.  Added the tag
kernel-bug-exists-upstream and changed the status to Confirmed.

** Tags added: kernel-bug-exists-upstream

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed



Re: [Bug 1698454] Re: TCP/IP Throughput Limitation

2017-06-20 Thread Brandon Bates
I will do that, but it won't be until the end of next week, when I have
access to the machine again.

> On Jun 19, 2017, at 8:01 AM, Joseph Salisbury wrote:
> 
> Would it be possible for you to test the latest upstream kernel? Refer
> to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest
> v4.12 kernel[0].
> 
> If this bug is fixed in the mainline kernel, please add the following
> tag 'kernel-fixed-upstream'.
> 
> If the mainline kernel does not fix this bug, please add the tag:
> 'kernel-bug-exists-upstream'.
> 
> Once testing of the upstream kernel is complete, please mark this bug as
> "Confirmed".
> 
> 
> Thanks in advance.
> 
> [0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12-rc5



[Bug 1698454] [NEW] TCP/IP Throughput Limitation

2017-06-16 Thread Brandon Bates
Public bug reported:

I'm having a severe TCP receive problem with a new Ubuntu 16 server and
an Intel 82599ES 10-Gigabit SFP+ (X520 series) card when a Windows 10
machine is the sender. With a single iperf stream, receive throughput
holds steady at 1 to 2 Gbit/s, while send reaches 9.42 Gbit/s.
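
The 9.42 figure is close to the ceiling one would expect for TCP
payload on 10GbE with a standard 1500-byte MTU.  A back-of-the-envelope
sketch in Python, assuming IPv4, no VLAN tag, and TCP timestamps
enabled (12 bytes of TCP options); none of these settings are stated in
this report, and the function name is only illustrative:

```python
def tcp_goodput_gbps(line_rate_gbps=10.0, mtu=1500, tcp_options=12):
    """Estimate the maximum TCP payload rate on an Ethernet link."""
    # Bytes on the wire per frame beyond the IP packet itself:
    # preamble + SFD (8), Ethernet header (14), FCS (4),
    # inter-frame gap (12).
    wire_overhead = 8 + 14 + 4 + 12
    # Payload left after the IPv4 header (20), the TCP base header (20)
    # and any TCP options.
    payload = mtu - 20 - 20 - tcp_options
    return line_rate_gbps * payload / (mtu + wire_overhead)


if __name__ == "__main__":
    print(round(tcp_goodput_gbps(), 2))               # timestamps on
    print(round(tcp_goodput_gbps(tcp_options=0), 2))  # no TCP options
```

With these assumptions the estimate comes out at roughly 9.41 Gbit/s,
so the send direction is effectively at line rate and only the receive
direction is falling short.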

Here is what I have found:

-Running 8 parallel streams in iperf gets me the full 9.42 Gbit/s in
aggregate.

-Tried a few Windows 10 machines, with the same repeatable results.

-Same results through a switch or with a direct connection.

-Linux to Linux: no problems.

-Tested with clean installs of 16.04.2 with the 4.4.0-66 kernel (the
latest the upgrade will give me). Also tried 17.04 with kernels up to
4.10.0-20-generic: same problem. Tried 4.11.0-041100rc8-generic and the
problem seemed to go away for a bit but came back, so I think that
might be a red herring (see the first "Interesting" note below).

-14.04 and 15.10 with kernels up to 4.2.8-040208-generic give 9.42/9.42
(works fine; I couldn't get 4.3 to install on 15.10).

-14.04 with the latest ixgbe driver (5.0.4) still works fine, so it
does not seem to be a driver version issue.

-Tried swapping the card for another of the same model/chipset, with
exactly the same result.

-Enabling large receive offload raises throughput from a steady
1 Gbit/s to a steady 2.14 Gbit/s.

-Disabling SACK got me 75% of the way back to full speed, but it was
very unstable (didn't hold at a steady speed).

-Using a different brand of card in the same server (but not the same
slot), a Mellanox InfiniBand card running in Ethernet mode, gives
9.42/8.5 (seems to work fine; I think the drop from 9.42 to 8.5 is a
limit of the Windows machine).

-Interesting: When swapping between the Intel and Mellanox 10GbE cards
(rebooting the server in between, but not the Windows machine, and
keeping the same IP on the server), the performance does not change
immediately. Going from Intel to Mellanox, the first test holds around
1 or 2 Gbit/s, then after that it jumps up to a steady 8.5. Similarly,
going from Mellanox to Intel, the first 1 or 2 seconds hit 8.x, then
throughput drops by half or more, and within 3 or 4 seconds it is back
to 1 Gbit/s and stays there for each subsequent iperf test.

-Interesting: When capturing packets with Wireshark on the Windows
machine, performance comes up to 8.5! (No, ditching Windows isn't an
option here, unfortunately. :) So it seems that something in the way
the two TCP stacks interact, without a buffer in between, causes issues
when the Intel driver is in use.

-Port bonding makes no difference (nor should it for a single stream).

-Tried rolling the Intel driver back to the 3.2.1 version that is on
14.04, but it was too old to compile.

-I suspect a kernel TCP/IP implementation change somewhere between 4.2
and 4.4 has caused it to not play nicely with Windows' stack.  Based on
the delayed performance change, I'm thinking something is interfering
with flow control, the TCP congestion algorithm, or the TCP window. I
tried turning various TCP options off, tweaking some values, changing
congestion algorithms, toggling hardware flow control, and comparing
sysctl settings from the 14.04 machine against this machine, to no
avail.
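
The sysctl comparison mentioned in the last item can be automated.  A
minimal sketch in Python, assuming the output of `sysctl -a` from each
machine has been saved as text; the function names and the sample
values below are hypothetical and are not taken from the machines in
this report:

```python
def parse_sysctl(text):
    """Parse `sysctl -a` style output ("key = value" lines) into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:                          # skip lines without "="
            settings[key.strip()] = value.strip()
    return settings


def diff_sysctl(a, b):
    """Return {key: (value_a, value_b)} for keys that differ or are missing."""
    pa, pb = parse_sysctl(a), parse_sysctl(b)
    return {
        key: (pa.get(key), pb.get(key))
        for key in sorted(set(pa) | set(pb))
        if pa.get(key) != pb.get(key)
    }


# Hypothetical excerpts for illustration only.
U14 = """\
net.ipv4.tcp_sack = 1
net.ipv4.tcp_congestion_control = cubic
net.core.rmem_max = 212992
"""
U16 = """\
net.ipv4.tcp_sack = 1
net.ipv4.tcp_congestion_control = cubic
net.core.rmem_max = 4194304
net.ipv4.tcp_autocorking = 1
"""

if __name__ == "__main__":
    for key, (old, new) in diff_sysctl(U14, U16).items():
        print(f"{key}: {old} -> {new}")
```

A key that exists on only one machine shows up with None on the other
side, which makes settings introduced by a newer kernel easy to spot.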

** Affects: ubuntu
 Importance: Undecided
 Status: New
