Re: ixg(4) performances

2014-09-03 Thread Bert Kiers
On Sat, Aug 30, 2014 at 10:24:52AM +0100, Justin Cormack wrote:
 On Sat, Aug 30, 2014 at 8:22 AM, Thor Lancelot Simon t...@panix.com wrote:
  On Fri, Aug 29, 2014 at 12:22:31PM -0400, Terry Moore wrote:
 
  Is the ixg in an expansion slot or integrated onto the main board?
 
  If you know where to get a mainboard with an integrated ixg, I wouldn't
  mind hearing about it.
 
 They are starting to appear, eg
 http://www.supermicro.co.uk/products/motherboard/Xeon/C600/X9SRH-7TF.cfm

We have some SuperMicro X9DRW with 10 GbE Intel NICs on board.  NetBSD
current does not configure them.

NetBSD 6.1 says:

vendor 0x8086 product 0x1528 (ethernet network, revision 0x01) at pci1 dev 0 
function 0 not configured

Complete messages: http://netbsd.itsx.net/hw/x9drw.dmesg

NetBSD current from today also does not configure them.

(I boot from a USB stick with a current kernel and 6.1 userland. It wants
to mount sda0 but there are only dk0 and dk1, so I end up with a read-only
disk.  I have to sort that out before I can save kernel messages.)

Grtnx,
-- 
B*E*R*T


Re: ixg(4) performances

2014-09-03 Thread Bert Kiers
On Wed, Sep 03, 2014 at 04:11:29PM +0200, Bert Kiers wrote:
 NetBSD 6.1 says:
 
 vendor 0x8086 product 0x1528 (ethernet network, revision 0x01) at pci1 dev 0 
 function 0 not configured
 
 Complete messages: http://netbsd.itsx.net/hw/x9drw.dmesg
 
 NetBSD current from today also does not configure them.

Btw, FreeBSD 9.3-RELEASE says:

ix0: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15 port 
0x8020-0x803f mem 0xde20-0xde3f,0xde404000-0xde407fff irq 26 at device 
0.0 on pci1
ix0: Using MSIX interrupts with 9 vectors
ix0: Ethernet address: 00:25:90:f9:49:20
ix0: PCI Express Bus: Speed 5.0GT/s Width x8

-- 
B*E*R*T


Re: ixg(4) performances

2014-09-03 Thread Emmanuel Dreyfus
On Wed, Sep 03, 2014 at 04:11:29PM +0200, Bert Kiers wrote:
 NetBSD 6.1 says:
 vendor 0x8086 product 0x1528 (ethernet network, revision 0x01) at pci1 dev 0 
 function 0 not configured

In src/sys/dev/pci/ixgbe/ we know about product IDs 0x1529 and 0x152A but 
not 0x1528. But this can probably be easily borrowed from FreeBSD:
http://svnweb.freebsd.org/base/head/sys/dev/ixgbe/

They call it IXGBE_DEV_ID_X540T. You can try adding this to ixgbe_type.h:
#define IXGBE_DEV_ID_82599_X540T 0x1528

Then in ixgbe.c add an IXGBE_DEV_ID_82599_X540T line to 
ixgbe_vendor_info_array[]

In ixgbe_82599.c you need a case IXGBE_DEV_ID_82599_X540T
next to case IXGBE_DEV_ID_82599_BACKPLANE_FCOE if media
is indeed backplane. Otherwise add it at the appropriate place
in the switch statement.

And finally you need to add a case IXGBE_DEV_ID_82599_X540T
next to case IXGBE_DEV_ID_82599_BACKPLANE_FCOE in ixgbe_api.c
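
For illustration only: the ixgbe_vendor_info_array[] step above amounts to
adding one entry to an ID table that the attach code scans. The sketch below
is standalone C with hypothetical names (id_entry, known_ids, id_is_known);
it is not the driver's actual data structure, only the idea behind it.

```c
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL	0x8086
#define DEV_ID_X540T	0x1528	/* the ID dmesg reports as "not configured" */

/* Hypothetical table entry: one PCI vendor/product pair. */
struct id_entry {
	uint16_t vendor;
	uint16_t device;
};

/* Hypothetical table: the two IDs already known, plus the new one. */
static const struct id_entry known_ids[] = {
	{ VENDOR_INTEL, 0x1529 },
	{ VENDOR_INTEL, 0x152A },
	{ VENDOR_INTEL, DEV_ID_X540T },
};

/* Return 1 if the vendor/device pair is in the table, 0 otherwise. */
int
id_is_known(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(known_ids) / sizeof(known_ids[0]); i++) {
		if (known_ids[i].vendor == vendor &&
		    known_ids[i].device == device)
			return 1;
	}
	return 0;
}
```

Matching the ID only makes the driver attach; whether the chip then works
depends on the per-chip code paths mentioned above.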

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: ixg(4) performances

2014-09-03 Thread Masanobu SAITOH

On 2014/09/04 0:40, Emmanuel Dreyfus wrote:

On Wed, Sep 03, 2014 at 04:11:29PM +0200, Bert Kiers wrote:

NetBSD 6.1 says:
vendor 0x8086 product 0x1528 (ethernet network, revision 0x01) at pci1 dev 0 
function 0 not configured


In src/sys/dev/pci/ixgbe/ we know about producct Id 0x1529 and 0x152A but
not 0x1528. But this can probably be easily borrowed from FreeBSD:
http://svnweb.freebsd.org/base/head/sys/dev/ixgbe/

They call it IXGBE_DEV_ID_X540T, You can try to add in ixgbe_type.h:
#define IXGBE_DEV_ID_82599_X540T 0x1528

Then in ixgbe.c add a IXGBE_DEV_ID_82599_X540T line in
ixgbe_vendor_info_array[]

In ixgbe_82599.c you need a case IXGBE_DEV_ID_82599_X540T
next to case IXGBE_DEV_ID_82599_BACKPLANE_FCOE if media
is indeed backplane. Otherwise add it at the appropriate place
in the switch statement.

And finally you need to add a case IXGBE_DEV_ID_82599_X540T
next to case IXGBE_DEV_ID_82599_BACKPLANE_FCOE in ixgbe_api.c


Our ixg(4) driver doesn't support X520. At least there is no
file ixgbe/ixgbe_x540.c of FreeBSD.


              Bus     FreeBSD   NetBSD
 82597        PCI-X   ixgb      dge
 82598        PCIe    ixgbe     ixg
 82599(X520)  PCIe    ixgbe     ixg
 X540         PCIe    ixgbe     (ixg)(not yet)


--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: ixg(4) performances

2014-09-03 Thread Masanobu SAITOH

On 2014/09/04 11:24, Masanobu SAITOH wrote:

On 2014/09/04 0:40, Emmanuel Dreyfus wrote:

On Wed, Sep 03, 2014 at 04:11:29PM +0200, Bert Kiers wrote:

NetBSD 6.1 says:
vendor 0x8086 product 0x1528 (ethernet network, revision 0x01) at pci1 dev 0 
function 0 not configured


In src/sys/dev/pci/ixgbe/ we know about producct Id 0x1529 and 0x152A but
not 0x1528. But this can probably be easily borrowed from FreeBSD:
http://svnweb.freebsd.org/base/head/sys/dev/ixgbe/

They call it IXGBE_DEV_ID_X540T, You can try to add in ixgbe_type.h:
#define IXGBE_DEV_ID_82599_X540T 0x1528

Then in ixgbe.c add a IXGBE_DEV_ID_82599_X540T line in
ixgbe_vendor_info_array[]

In ixgbe_82599.c you need a case IXGBE_DEV_ID_82599_X540T
next to case IXGBE_DEV_ID_82599_BACKPLANE_FCOE if media
is indeed backplane. Otherwise add it at the appropriate place
in the switch statement.

And finally you need to add a case IXGBE_DEV_ID_82599_X540T
next to case IXGBE_DEV_ID_82599_BACKPLANE_FCOE in ixgbe_api.c


Our ixg(4) driver doesn't support X520. At least there is no


s/X520/X540/


file ixgbe/ixgbe_x540.c of FreeBSD.


               Bus     FreeBSD   NetBSD
  82597        PCI-X   ixgb      dge
  82598        PCIe    ixgbe     ixg
  82599(X520)  PCIe    ixgbe     ixg
  X540         PCIe    ixgbe     (ixg)(not yet)





--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: ixg(4) performances

2014-08-30 Thread Thor Lancelot Simon
On Fri, Aug 29, 2014 at 12:22:31PM -0400, Terry Moore wrote:
 
 Is the ixg in an expansion slot or integrated onto the main board?

If you know where to get a mainboard with an integrated ixg, I wouldn't
mind hearing about it.

Thor


Re: ixg(4) performances

2014-08-30 Thread Justin Cormack
On Sat, Aug 30, 2014 at 8:22 AM, Thor Lancelot Simon t...@panix.com wrote:
 On Fri, Aug 29, 2014 at 12:22:31PM -0400, Terry Moore wrote:

 Is the ixg in an expansion slot or integrated onto the main board?

 If you know where to get a mainboard with an integrated ixg, I wouldn't
 mind hearing about it.

They are starting to appear, eg
http://www.supermicro.co.uk/products/motherboard/Xeon/C600/X9SRH-7TF.cfm

Justin


Re: ixg(4) performances

2014-08-30 Thread Matthias Drochner

On Fri, 29 Aug 2014 15:51:14 +
Emmanuel Dreyfus m...@netbsd.org wrote:
 I found this, but the result does not make sense: negotiated > max ...

 Link Capabilities Register (0xAC): 0x00027482
 bits 3:0   Supported Link speed:  0010 = 5 GbE and 2.5 GbE speed supported
 bits 9:4   Max link width: 001000 = x4

Wrong -- this means x8.

 bits 14:12 L0s exit latency: 101 = 1 µs - 2 µs
 bits 17:15 L1 Exit latency: 011 = 4 µs - 8 µs

 Link Status Register (0xB2): 0x1081
 bits 3:0   Current Link speed:  0001 = 2.5 GbE PCIe link
 bits 9:4   Negotiated link width: 001000 = x8

So it makes sense.

best regards
Matthias






Re: ixg(4) performances

2014-08-30 Thread Emmanuel Dreyfus
Matthias Drochner m.droch...@fz-juelich.de wrote:

  Link Capabilities Register (0xAC): 0x00027482
  bits 3:0   Supported Link speed:  0010 = 5 GbE and 2.5 GbE speed supported
  bits 9:4   Max link width: 001000 = x4
 
 Wrong -- this means x8.
 
  bits 14:12 L0s exit latency: 101 = 1 µs - 2 µs
  bits 17:15 L1 Exit latency: 011 = 4 µs - 8 µs
 
  Link Status Register (0xB2): 0x1081
  bits 3:0   Current Link speed:  0001 = 2.5 GbE PCIe link
  bits 9:4   Negotiated link width: 001000 = x8
 
 So it makes sense.

Right, hence the ethernet board can do 5 GbE x 8, but something in front
of it can only do 2.5 GbE x 8.

But 2.5 GbE x 8 means 20 Gb/s, which is much larger than the 2.7 Gb/s I
get.
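
As a rough check of that arithmetic, here is a small sketch. It assumes the
8b/10b line encoding used by PCIe Gen1 and Gen2, which is not mentioned in
this thread, and it ignores packet and protocol overhead. The function names
are made up for the example.

```c
#include <stdint.h>

/* Raw line rate of a link in Mb/s: per-lane rate (MT/s) times lane count. */
uint32_t
pcie_raw_mbps(uint32_t mtps_per_lane, uint32_t lanes)
{
	return mtps_per_lane * lanes;
}

/*
 * Data rate in Mb/s after 8b/10b encoding (assumed, Gen1/Gen2 only):
 * 8 data bits are carried for every 10 bits on the wire.
 */
uint32_t
pcie_data_mbps(uint32_t mtps_per_lane, uint32_t lanes)
{
	return pcie_raw_mbps(mtps_per_lane, lanes) / 10 * 8;
}
```

With 2500 MT/s and 8 lanes this gives 20000 Mb/s raw and 16000 Mb/s of data,
still well above the 2.7 Gb/s observed.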

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


RE: ixg(4) performances

2014-08-29 Thread Terry Moore
 
 -Original Message-
 From: tech-kern-ow...@netbsd.org [mailto:tech-kern-ow...@netbsd.org] On
 Behalf Of Emmanuel Dreyfus
 Sent: Thursday, August 28, 2014 23:55
 To: Terry Moore; 'Christos Zoulas'
 Cc: tech-kern@netbsd.org
 Subject: Re: ixg(4) performances
 
 Terry Moore t...@mcci.com wrote:
 
  There are several possibilities, all revolving about differences
  between the blog poster's base system and yorus.
 
 Do I have a way to investigate for appropriate PCI setup? Here is what
 dmesg says about it:
 
 pci0 at mainbus0 bus 0: configuration mode 1
 pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
 ppb4 at pci0 dev 14 function 0: vendor 0x10de product 0x005d (rev. 0xa3)
 ppb4: PCI Express 1.0 Root Port of PCI-E Root Complex
 pci5 at ppb4 bus 5
 pci5: i/o space, memory space enabled, rd/line, wr/inv ok
 ixg1 at pci5 dev 0 function 1: Intel(R) PRO/10GbE PCI-Express Network
 Driver, Version - 2.3.10

I don't do PCIe on NetBSD (these days we use NetBSD exclusively as VM
guests), so I don't know what tools are available. Normally when doing this
kind of thing I poke around with a debugger or an equivalent of pcictl.

The dmesg output tells us that your ixg is directly connected to an Nvidia
root complex. So there are no bridges, but this might be a relevant
difference from the benchmark system. It's more common to be connected to an
Intel southbridge chip of some kind.

Next step would be to check the documentation on, and the configuration of,
the root complex -- it must also be configured for 4K read ahead (because
the read will launch from the ixg, will be buffered in the root complex,
forwarded to the memory controller, and then the answers will come back).

(PCIe is very fast at the bit transfer level, but pretty slow in terms of
read transfers per second. Read transfer latency is on the order of 2.5
microseconds / operation. This is why 4K transfers are so important in this
application.)
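
To put numbers on that, a minimal sketch. It assumes a single read request in
flight at a time, which is a simplification (real devices can keep several
requests outstanding), and it uses the 2.5 microsecond figure quoted above.
The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Throughput in Mb/s when each read of `bytes` bytes costs `latency_ns`
 * nanoseconds and only one read is outstanding (assumption).
 * bits per nanosecond equals Gb/s, so multiply by 1000 for Mb/s.
 */
uint64_t
read_mbps(uint64_t bytes, uint64_t latency_ns)
{
	return bytes * 8 * 1000 / latency_ns;
}
```

Under those assumptions, 512-byte reads top out near 1.6 Gb/s while 4096-byte
reads allow about 13 Gb/s, which is the gap that makes the read size matter.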

Anyway, there are multiple vendors involved (Intel, Nvidia, your BIOS maker,
because the BIOS is responsible for setting things like the read size to the
maximum across the bus -- I'm speaking loosely, but basically config
software has to set things up because the individual devices don't have
enough knowledge). So generally that may explain things. 

Still, you should check whether you have the right number of the right
generation of PCIe lanes connected to the ixg. If you look at the manual,
normally there's an obscure register that tells you how many lanes are
connected, and what generation. On the motherboards we use, each slot is
different, and it's not always obvious how the slots differ. Rather than
depending on documentation and the good intentions of the motherboard
developers, I always feel better looking at what the problem chip in
question thinks about number of lanes and speeds.

Hope this helps,
--Terry




Re: ixg(4) performances

2014-08-29 Thread Emmanuel Dreyfus
On Fri, Aug 29, 2014 at 08:48:51AM -0400, Terry Moore wrote:
 Still, you should check whether you have the right number of the right
 generation of PCIe lanes connected to the ixg. 

I found this, but the result does not make sense: negotiated > max ...

Link Capabilities Register (0xAC): 0x00027482
bits 3:0   Supported Link speed:  0010 = 5 GbE and 2.5 GbE speed supported
bits 9:4   Max link width: 001000 = x4
bits 14:12 L0s exit latency: 101 = 1 µs - 2 µs
bits 17:15 L1 Exit latency: 011 = 4 µs - 8 µs

Link Status Register (0xB2): 0x1081
bits 3:0   Current Link speed:  0001 = 2.5 GbE PCIe link
bits 9:4   Negotiated link width: 001000 = x8


-- 
Emmanuel Dreyfus
m...@netbsd.org


RE: ixg(4) performances

2014-08-29 Thread Terry Moore
 On Friday, August 29, 2014 11:51, Emmanuel Dreyfus wrote:
 
 On Fri, Aug 29, 2014 at 08:48:51AM -0400, Terry Moore wrote:
  Still, you should check whether you have the right number of the right
  generation of PCIe lanes connected to the ixg.
 
 I found this, but the result does not make sense: negotiated > max ...
 
 Link Capabilities Register (0xAC): 0x00027482
 bits 3:0   Supported Link speed:  0010 = 5 GbE and 2.5 GbE speed supported
 bits 9:4   Max link width: 001000 = x4
 bits 14:12 L0s exit latency: 101 = 1 µs - 2 µs
 bits 17:15 L1 Exit latency: 011 = 4 µs - 8 µs
 
 Link Status Register (0xB2): 0x1081
 bits 3:0   Current Link speed:  0001 = 2.5 GbE PCIe link
 bits 9:4   Negotiated link width: 001000 = x8

I think there's a typo in the docs. In the PCIe spec, it says (for Link
Capabilities Register, table 7-15): 001000b is x8.  

But it's running at gen1. I strongly suspect that the benchmark case was
gen2 (since the ixg is capable of it).
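
The two fields discussed here can be pulled out of the raw register values
with a shift and a mask. This is a small sketch using the bit positions
quoted above (speed in bits 3:0, width in bits 9:4); the function names are
invented for the example.

```c
#include <stdint.h>

/* Link speed code, bits 3:0 (1 = 2.5 GT/s, 2 = 5 GT/s). */
unsigned
link_speed_code(uint32_t reg)
{
	return reg & 0xf;
}

/* Link width in lanes, bits 9:4 (001000b = 8 lanes). */
unsigned
link_width(uint32_t reg)
{
	return (reg >> 4) & 0x3f;
}
```

Applied to the values in this thread, 0x00027482 decodes to speed code 2 and
width 8, and 0x1081 decodes to speed code 1 and width 8.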

Is the ixg in an expansion slot or integrated onto the main board?

--Terry  



Re: ixg(4) performances

2014-08-29 Thread Emmanuel Dreyfus
Terry Moore t...@mcci.com wrote:

 But it's running at gen1. I strongly suspect that the benchmark case was
 gen2 (since the ixg is capable of it).

Gen1 vs Gen2 is 2.5 Gb/s vs 5 Gb/s?

 Is the ixg in an expansion slot or integrated onto the main board?

In a slot.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: ixg(4) performances

2014-08-29 Thread Hisashi T Fujinaka

On Fri, 29 Aug 2014, Emmanuel Dreyfus wrote:


Terry Moore t...@mcci.com wrote:


But it's running at gen1. I strongly suspect that the benchmark case was
gen2 (since the ixg is capable of it).


Gen1 vs Gen2 is 2.5 Gb/s vs 5 Gb/s?


Gen 1 is capable of only 2.5GT/s (gigatransfers per second). Gen 2 is
capable of up to 5, but isn't guaranteed to be 5. Depending on how
chatty the device is on the PCIe bus, I think 2.5GT/s is enough for
something much closer to line rate than you're getting.

--
Hisashi T Fujinaka - ht...@twofifty.com
BSEE + BSChem + BAEnglish + MSCS + $2.50 = coffee


Re: ixg(4) performances

2014-08-28 Thread Emmanuel Dreyfus
On Tue, Aug 26, 2014 at 12:57:37PM +, Christos Zoulas wrote:
 I also found this page that tackles the same problem on Linux:
 http://dak1n1.com/blog/7-performance-tuning-intel-10gbe

It seems that page describes a slightly different model.
Intel 82599 datasheet is available here:
http://www.intel.fr/content/www/fr/fr/ethernet-controllers/82599-10-gbe-controller-datasheet.html

No reference to MMRBC in this document, but I understand Max Read Request
Size is the same thing. Page 765 tells us about register A8, bits 12-14
that should be set to 100.
pcictl /dev/pci5 read -d 0 -f 1 0x18 tells me the value 0x00092810

I tried this command:
pcictl /dev/pci5 write -d 0 -f 1 0x18 0x00094810 

Further pcictl read suggests it works as the new value is returned.
However it gives no performance improvement. This means that I 
misunderstood what this register is about, or how to change it (byte order?).
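
For reference, the edit above changes only bits 14:12 of the 32-bit value.
The sketch below shows that read-modify-write step and decodes the field,
assuming the value is a PCIe Device Control register and that the field uses
the standard Max_Read_Request_Size encoding (128 bytes shifted left by the
field value). Both are assumptions here, and the function names are made up.

```c
#include <stdint.h>

#define MRRS_SHIFT	12
#define MRRS_MASK	(0x7u << MRRS_SHIFT)	/* bits 14:12 */

/* Replace bits 14:12 of reg with code, leaving all other bits untouched. */
uint32_t
set_mrrs(uint32_t reg, uint32_t code)
{
	return (reg & ~MRRS_MASK) | ((code << MRRS_SHIFT) & MRRS_MASK);
}

/* Decode bits 14:12 as a size in bytes (assumed encoding: 128 << code). */
uint32_t
mrrs_bytes(uint32_t reg)
{
	return 128u << ((reg & MRRS_MASK) >> MRRS_SHIFT);
}
```

With that encoding, 0x00092810 reads as 512 bytes and 0x00094810 (field set
to 100) as 2048 bytes; 4096 bytes would be field value 101.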

Or the performance is constrained by something unrelated. In the blog 
post cited above, the poster achieved more than 5 Gb/s before touching
MMRBC, while I am stuck at 2.7 Gb/s. Any new ideas are welcome.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: ixg(4) performances

2014-08-28 Thread Emmanuel Dreyfus
On Tue, Aug 26, 2014 at 04:40:25PM +, Taylor R Campbell wrote:
 New version with some changes suggested by wiz@.

Does anyone object to this change being committed and pulled up to 
netbsd-7?

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: ixg(4) performances

2014-08-28 Thread Stephan
What is your test setup? Do you have 2 identical boxes?

Does it perform better e.g. on Linux or FreeBSD? If so, you could
check how the config registers get set by that particular OS.

2014-08-28 9:26 GMT+02:00 Emmanuel Dreyfus m...@netbsd.org:
 On Tue, Aug 26, 2014 at 12:57:37PM +, Christos Zoulas wrote:
 I also found this page that tackles the same problem on Linux:
 http://dak1n1.com/blog/7-performance-tuning-intel-10gbe

 It seems that page describe a slightly different model.
 Intel 82599 datasheet is available here:
 http://www.intel.fr/content/www/fr/fr/ethernet-controllers/82599-10-gbe-controller-datasheet.html

 No reference to MMRBC in this document, but I understand Max Read Request
 Size is the same thing. Page 765 tells us about register A8, bits 12-14
 that should be set to 100.
 pcictl /dev/pci5 read -d 0 -f 1 0x18 tells me the value 0x00092810

 I tried this command:
 pcictl /dev/pci5 write -d 0 -f 1 0x18 0x00094810

 Further pcictl read suggests it works as the new value is returned.
 However it gives no performance improvement. This means that I
 misunderstood what this register is about, or how to change it (byte order?).

 Or the performance are constrained by something unrelated. In the blog
 post cited above, the poster acheived more than 5 Gb/s before touching
 MMRBC, while I am stuck at 2,7 GB/s. Any new idea welcome.

 --
 Emmanuel Dreyfus
 m...@netbsd.org


RE: ixg(4) performances

2014-08-28 Thread Terry Moore
 
 Or the performance are constrained by something unrelated. In the blog
 post cited above, the poster acheived more than 5 Gb/s before touching
 MMRBC, while I am stuck at 2,7 GB/s. Any new idea welcome.

The blog post refers to PCI-X; I'm more familiar with PCIe, but the concepts
are similar.

There are several possibilities, all revolving around differences between the
blog poster's base system and yours.

1) the test case is using a platform that has better PCI performance (in the
PCIe world this could be: Gen3 versus Gen2 support in the slot being used;
more lanes in the slot being used)

2) the test case has a root complex with a PCI controller with better
performance than the one in your system; 

3) the test case system has a different PCI configuration, in particular
different bridging.  For example, a PCI bridge or switch on your platform
can change basic capabilities compared to the reference.

4) related to 3: one of the bridges on your system (between ixg and root
complex) is not configured for 4K reads, and so the setting on the ixg board
won't help [whereas this wasn't the case on the blog system].

5) related to 4: one of the bridges in your system (between ixg and root
complex) is not capable of 4K reads... (see 4). 

And of course you have to consider:

6) the writer has something else different than you have, for example
silicon rev, BIOS, PCI-X where you have PCIe, etc.
7) the problem is completely unrelated to PCIe.

You're in a tough situation, experimentally, because you can't take a
working (5 Gbps) system and directly compare to the non-working (2.7 Gbps)
situation.

--Terry



Re: ixg(4) performances

2014-08-28 Thread Christos Zoulas
In article 20140828072832.gi8...@homeworld.netbsd.org,
Emmanuel Dreyfus  m...@netbsd.org wrote:
On Tue, Aug 26, 2014 at 04:40:25PM +, Taylor R Campbell wrote:
 New version with some changes suggested by wiz@.

Anyone has objection to this change being committed and pulled up to 
netbsd-7?

Not me.

christos



Re: ixg(4) performances

2014-08-28 Thread Hisashi T Fujinaka

On Thu, 28 Aug 2014, Emmanuel Dreyfus wrote:


On Tue, Aug 26, 2014 at 12:57:37PM +, Christos Zoulas wrote:

I also found this page that tackles the same problem on Linux:
http://dak1n1.com/blog/7-performance-tuning-intel-10gbe


It seems that page describe a slightly different model.
Intel 82599 datasheet is available here:
http://www.intel.fr/content/www/fr/fr/ethernet-controllers/82599-10-gbe-controller-datasheet.html

No reference to MMRBC in this document, but I understand Max Read Request
Size is the same thing. Page 765 tells us about register A8, bits 12-14
that should be set to 100.
pcictl /dev/pci5 read -d 0 -f 1 0x18 tells me the value 0x00092810

I tried this command:
pcictl /dev/pci5 write -d 0 -f 1 0x18 0x00094810

Further pcictl read suggests it works as the new value is returned.
However it gives no performance improvement. This means that I
misunderstood what this register is about, or how to change it (byte order?).

Or the performance are constrained by something unrelated. In the blog
post cited above, the poster acheived more than 5 Gb/s before touching
MMRBC, while I am stuck at 2,7 GB/s. Any new idea welcome.


Isn't your PCIe slot constrained? I thought I remembered that you're
only getting 2.5GT/s and I forget what test you're running.

--
Hisashi T Fujinaka - ht...@twofifty.com
BSEE + BSChem + BAEnglish + MSCS + $2.50 = coffee


Re: ixg(4) performances

2014-08-28 Thread Emmanuel Dreyfus
On Thu, Aug 28, 2014 at 08:37:06AM -0700, Hisashi T Fujinaka wrote:
 Isn't your PCIe slot constrained? I thought I remembered that you're
 only getting 2.5GT/s and I forget what test you're running.

I use netperf, and I now get 2.7 Gb/s.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: ixg(4) performances

2014-08-28 Thread Emmanuel Dreyfus
Terry Moore t...@mcci.com wrote:

 There are several possibilities, all revolving about differences between the
 blog poster's base system and yorus.

Do I have a way to investigate for appropriate PCI setup? Here is what
dmesg says about it:

pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
ppb4 at pci0 dev 14 function 0: vendor 0x10de product 0x005d (rev. 0xa3)
ppb4: PCI Express 1.0 Root Port of PCI-E Root Complex
pci5 at ppb4 bus 5
pci5: i/o space, memory space enabled, rd/line, wr/inv ok
ixg1 at pci5 dev 0 function 1: Intel(R) PRO/10GbE PCI-Express Network
Driver, Version - 2.3.10

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: ixg(4) performances

2014-08-27 Thread Emmanuel Dreyfus
On Tue, Aug 26, 2014 at 04:40:25PM +, Taylor R Campbell wrote:
 How about the attached patch?  I've been sitting on this for months.

Both changes seem fine, but the board does not behave as described by the
Linux crowd. At 0xe6 there is a null value where we should have 0x22, 
and attempts to change it do not seem to have any effect.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: ixg(4) performances

2014-08-26 Thread Emmanuel Dreyfus
On Tue, Aug 26, 2014 at 12:57:37PM +, Christos Zoulas wrote:
 ftp://ftp.supermicro.com/CDR-C2_1.20_for_Intel_C2_platform/Intel/LAN/v15.5/PROXGB/DOCS/SERVER/prform10.htm#Setting_MMRBC

Right, but NetBSD has no tool like Linux's setpci to tweak MMRBC, and if
the BIOS has no setting for it, NetBSD is screwed.

I see dev/pci/pciio.h has a PCI_IOC_CFGREAD / PCI_IOC_CFGWRITE ioctl,
does that mean Linux's setpci can be easily reproduced?

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: ixg(4) performances

2014-08-26 Thread Christos Zoulas
On Aug 26,  2:23pm, m...@netbsd.org (Emmanuel Dreyfus) wrote:
-- Subject: Re: ixg(4) performances

| On Tue, Aug 26, 2014 at 12:57:37PM +, Christos Zoulas wrote:
|  
ftp://ftp.supermicro.com/CDR-C2_1.20_for_Intel_C2_platform/Intel/LAN/v15.5/PROXGB/DOCS/SERVER/prform10.htm#Setting_MMRBC
| 
| Right, but NetBSD has no tool like Linux's setpci to tweak MMRBC, and if
| the BIOS has no setting for it, NetBSD is screwed.
| 
| I see dev/pci/pciio.h  has a PCI_IOC_CFGREAD / PCI_IOC_CFGWRITE ioctl,
| does that means Linux's setpci can be easily reproduced?

I would probably extend pcictl with cfgread and cfgwrite commands.

christos


Re: ixg(4) performances

2014-08-26 Thread Christos Zoulas
On Aug 26,  2:42pm, m...@netbsd.org (Emmanuel Dreyfus) wrote:
-- Subject: Re: ixg(4) performances

| On Tue, Aug 26, 2014 at 10:25:52AM -0400, Christos Zoulas wrote:
|  I would probably extend pcictl with cfgread and cfgwrite commands.
| 
| Sure, once it works I can do that, but a first attempt just
| gets EINVAL, any idea what can be wrong?
| 
| int fd;
| struct  pciio_bdf_cfgreg pbcr;
| 
| if ((fd = open("/dev/pci5", O_RDWR, 0)) == -1)
|         err(EX_OSERR, "open /dev/pci5 failed");
| 
| pbcr.bus = 5;
| pbcr.device = 0;
| pbcr.function = 0;
| pbcr.cfgreg.reg = 0xe6b;
| pbcr.cfgreg.val = 0x2e;

I think in the example that was 0xe6. I think the .b means byte access
(I am guessing). I think that we are only doing word accesses, thus
we probably need to read, mask modify write the byte. I have not
verified any of that, these are guesses... Look at the pcictl source
code.
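
A sketch of that read, mask, modify, write idea, under the assumption that
only aligned 32-bit accesses are available and that configuration space is
little-endian (byte offset 0xe6 is bits 23:16 of the word at 0xe4). The
helper names are hypothetical; the actual read and write of the register are
left out.

```c
#include <stdint.h>

/* Round a byte offset down to the 32-bit register that contains it. */
uint32_t
aligned_reg(uint32_t byte_off)
{
	return byte_off & ~0x3u;
}

/* Return dword with the byte at byte_off replaced by val. */
uint32_t
merge_byte(uint32_t dword, uint32_t byte_off, uint8_t val)
{
	unsigned shift = (byte_off & 0x3u) * 8;

	return (dword & ~(0xffu << shift)) | ((uint32_t)val << shift);
}
```

For the setpci-style "e6.b=2e" this means reading the word at 0xe4, calling
merge_byte(word, 0xe6, 0x2e), and writing the result back to 0xe4.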

| 
| if (ioctl(fd, PCI_IOC_BDF_CFGWRITE, &pbcr) == -1)
|         err(EX_OSERR, "ioctl failed");
| 
| Inside the kernel, the only EINVAL is here:
| if (bdfr->bus > 255 || bdfr->device >= sc->sc_maxndevs ||
|     bdfr->function > 7)
|         return EINVAL;
| 
| -- 
| Emmanuel Dreyfus
| m...@netbsd.org
-- End of excerpt from Emmanuel Dreyfus




Re: ixg(4) performances

2014-08-26 Thread Taylor R Campbell
   Date: Tue, 26 Aug 2014 10:25:52 -0400
   From: chris...@zoulas.com (Christos Zoulas)

   On Aug 26,  2:23pm, m...@netbsd.org (Emmanuel Dreyfus) wrote:
   -- Subject: Re: ixg(4) performances

   | I see dev/pci/pciio.h  has a PCI_IOC_CFGREAD / PCI_IOC_CFGWRITE ioctl,
   | does that means Linux's setpci can be easily reproduced?

   I would probably extend pcictl with cfgread and cfgwrite commands.

How about the attached patch?  I've been sitting on this for months.
Index: usr.sbin/pcictl/pcictl.8
===
RCS file: /cvsroot/src/usr.sbin/pcictl/pcictl.8,v
retrieving revision 1.10
diff -p -u -r1.10 pcictl.8
--- usr.sbin/pcictl/pcictl.8	25 Feb 2011 21:40:48 -0000	1.10
+++ usr.sbin/pcictl/pcictl.8	26 Aug 2014 15:38:55 -0000
@@ -33,7 +33,7 @@
 .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 .\" POSSIBILITY OF SUCH DAMAGE.
 .\"
-.Dd February 25, 2011
+.Dd June 12, 2014
 .Dt PCICTL 8
 .Os
 .Sh NAME
@@ -79,6 +79,31 @@ at the specified bus, device, and functi
 If the bus is not specified, it defaults to the bus number of the
 PCI bus specified on the command line.
 If the function is not specified, it defaults to 0.
+.Pp
+.Nm read
+.Op Fl b Ar bus
+.Fl d Ar device
+.Op Fl f Ar function
+.Ar reg
+.Pp
+Read the specified 32-bit aligned PCI configuration register and print
+it in hexadecimal to standard output.
+If the bus is not specified, it defaults to the bus number of the
+PCI bus specified on the command line.
+If the function is not specified, it defaults to 0.
+.Pp
+.Nm write
+.Op Fl b Ar bus
+.Fl d Ar device
+.Op Fl f Ar function
+.Ar reg
+.Ar value
+.Pp
+Write the specified value to the specified 32-bit aligned PCI
+configuration register.
+If the bus is not specified, it defaults to the bus number of the
+PCI bus specified on the command line.
+If the function is not specified, it defaults to 0.
 .Sh FILES
 .Pa /dev/pci*
 - PCI bus device nodes
Index: usr.sbin/pcictl/pcictl.c
===
RCS file: /cvsroot/src/usr.sbin/pcictl/pcictl.c,v
retrieving revision 1.18
diff -p -u -r1.18 pcictl.c
--- usr.sbin/pcictl/pcictl.c	30 Aug 2011 20:08:38 -0000	1.18
+++ usr.sbin/pcictl/pcictl.c	26 Aug 2014 15:38:55 -0000
@@ -76,6 +76,8 @@ static int	print_numbers = 0;
 
 static void	cmd_list(int, char *[]);
 static void	cmd_dump(int, char *[]);
+static void	cmd_read(int, char *[]);
+static void	cmd_write(int, char *[]);
 
 static const struct command commands[] = {
 	{ "list",
@@ -88,10 +90,21 @@ static const struct command commands[] =
 	  cmd_dump,
 	  O_RDONLY },
 
+	{ "read",
+	  "[-b bus] -d device [-f function] reg",
+	  cmd_read,
+	  O_RDONLY },
+
+	{ "write",
+	  "[-b bus] -d device [-f function] reg value",
+	  cmd_write,
+	  O_WRONLY },
+
 	{ 0, 0, 0, 0 },
 };
 
 static int	parse_bdf(const char *);
+static u_int	parse_reg(const char *);
 
 static void	scan_pci(int, int, int, void (*)(u_int, u_int, u_int));
 
 
@@ -230,6 +243,87 @@ cmd_dump(int argc, char *argv[])
scan_pci(bus, dev, func, scan_pci_dump);
 }
 
+static void
+cmd_read(int argc, char *argv[])
+{
+	int bus, dev, func;
+	u_int reg;
+	pcireg_t value;
+	int ch;
+
+	bus = pci_businfo.busno;
+	func = 0;
+	dev = -1;
+
+	while ((ch = getopt(argc, argv, "b:d:f:")) != -1) {
+		switch (ch) {
+		case 'b':
+			bus = parse_bdf(optarg);
+			break;
+		case 'd':
+			dev = parse_bdf(optarg);
+			break;
+		case 'f':
+			func = parse_bdf(optarg);
+			break;
+		default:
+			usage();
+		}
+	}
+	argv += optind;
+	argc -= optind;
+
+	if (argc != 1)
+		usage();
+	reg = parse_reg(argv[0]);
+	if (pcibus_conf_read(pcifd, bus, dev, func, reg, &value) == -1)
+		err(EXIT_FAILURE, "pcibus_conf_read"
+		    "(bus %d dev %d func %d reg %u)", bus, dev, func, reg);
+	if (printf("%08x\n", value) < 0)
+		err(EXIT_FAILURE, "printf");
+}
+
+static void
+cmd_write(int argc, char *argv[])
+{
+	int bus, dev, func;
+	u_int reg;
+	pcireg_t value;
+	int ch;
+
+	bus = pci_businfo.busno;
+	func = 0;
+	dev = -1;
+
+	while ((ch = getopt(argc, argv, "b:d:f:")) != -1) {
+		switch (ch) {
+		case 'b':
+			bus = parse_bdf(optarg);
+			break;
+		case 'd':
+			dev = parse_bdf(optarg);
+			break;
+		case 'f':
+			func = parse_bdf(optarg);
+			break;
+		default

Re: ixg(4) performances

2014-08-26 Thread Emmanuel Dreyfus
On Tue, Aug 26, 2014 at 11:13:50AM -0400, Christos Zoulas wrote:
 I think in the example that was 0xe6. I think the .b means byte access
 (I am guessing). 

Yes, I came to that conclusion reading pciutils sources. I discovered
they also had a man page explaining that :-)

 I think that we are only doing word accesses, thus
 we probably need to read, mask modify write the byte. I have not
 verified any of that, these are guesses... Look at the pcictl source
 code.

I tried writing to register 0xe4, but when reading it back it is still 0. 

if (pcibus_conf_read(fd, 5, 0, 1, 0x00e4, &val) != 0)
        err(EX_OSERR, "pcibus_conf_read failed");

printf("reg = 0x00e4,  val = 0x%08x\n", val);

/* replace the byte at offset 0xe6 (bits 23:16 of the word at 0xe4) */
val = (val & 0xff00ffff) | 0x002e0000;

if (pcibus_conf_write(fd, 5, 0, 1, 0x00e4, val) != 0)
        err(EX_OSERR, "pcibus_conf_write failed");


-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: ixg(4) performances

2014-08-26 Thread Taylor R Campbell
   Date: Tue, 26 Aug 2014 15:40:41 +
   From: Taylor R Campbell riastr...@netbsd.org

   How about the attached patch?  I've been sitting on this for months.

New version with some changes suggested by wiz@.
Index: usr.sbin/pcictl/pcictl.8
===
RCS file: /cvsroot/src/usr.sbin/pcictl/pcictl.8,v
retrieving revision 1.11
diff -p -u -r1.11 pcictl.8
--- usr.sbin/pcictl/pcictl.8	26 Aug 2014 16:21:15 -0000	1.11
+++ usr.sbin/pcictl/pcictl.8	26 Aug 2014 16:38:36 -0000
@@ -33,7 +33,7 @@
 .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 .\" POSSIBILITY OF SUCH DAMAGE.
 .\"
-.Dd February 25, 2011
+.Dd June 12, 2014
 .Dt PCICTL 8
 .Os
 .Sh NAME
@@ -79,6 +79,31 @@ at the specified bus, device, and functi
 If the bus is not specified, it defaults to the bus number of the
 PCI bus specified on the command line.
 If the function is not specified, it defaults to 0.
+.Pp
+.Cm read
+.Op Fl b Ar bus
+.Fl d Ar device
+.Op Fl f Ar function
+.Ar reg
+.Pp
+Read the specified 32-bit aligned PCI configuration register and print
+it in hexadecimal to standard output.
+If the bus is not specified, it defaults to the bus number of the
+PCI bus specified on the command line.
+If the function is not specified, it defaults to 0.
+.Pp
+.Cm write
+.Op Fl b Ar bus
+.Fl d Ar device
+.Op Fl f Ar function
+.Ar reg
+.Ar value
+.Pp
+Write the specified value to the specified 32-bit aligned PCI
+configuration register.
+If the bus is not specified, it defaults to the bus number of the
+PCI bus specified on the command line.
+If the function is not specified, it defaults to 0.
 .Sh FILES
 .Pa /dev/pci*
 - PCI bus device nodes
Index: usr.sbin/pcictl/pcictl.c
===
RCS file: /cvsroot/src/usr.sbin/pcictl/pcictl.c,v
retrieving revision 1.18
diff -p -u -r1.18 pcictl.c
--- usr.sbin/pcictl/pcictl.c	30 Aug 2011 20:08:38 -0000	1.18
+++ usr.sbin/pcictl/pcictl.c	26 Aug 2014 16:38:36 -0000
@@ -76,6 +76,8 @@ static int	print_numbers = 0;
 
 static void	cmd_list(int, char *[]);
 static void	cmd_dump(int, char *[]);
+static void	cmd_read(int, char *[]);
+static void	cmd_write(int, char *[]);
 
 static const struct command commands[] = {
 	{ "list",
@@ -88,10 +90,21 @@ static const struct command commands[] =
 	  cmd_dump,
 	  O_RDONLY },
 
+	{ "read",
+	  "[-b bus] -d device [-f function] reg",
+	  cmd_read,
+	  O_RDONLY },
+
+	{ "write",
+	  "[-b bus] -d device [-f function] reg value",
+	  cmd_write,
+	  O_WRONLY },
+
 	{ 0, 0, 0, 0 },
 };
 
 static int	parse_bdf(const char *);
+static u_int	parse_reg(const char *);
 
 static void	scan_pci(int, int, int, void (*)(u_int, u_int, u_int));
 
 
@@ -230,6 +243,91 @@ cmd_dump(int argc, char *argv[])
 	scan_pci(bus, dev, func, scan_pci_dump);
 }
 
+static void
+cmd_read(int argc, char *argv[])
+{
+	int bus, dev, func;
+	u_int reg;
+	pcireg_t value;
+	int ch;
+
+	bus = pci_businfo.busno;
+	func = 0;
+	dev = -1;
+
+	while ((ch = getopt(argc, argv, "b:d:f:")) != -1) {
+		switch (ch) {
+		case 'b':
+			bus = parse_bdf(optarg);
+			break;
+		case 'd':
+			dev = parse_bdf(optarg);
+			break;
+		case 'f':
+			func = parse_bdf(optarg);
+			break;
+		default:
+			usage();
+		}
+	}
+	argv += optind;
+	argc -= optind;
+
+	if (argc != 1)
+		usage();
+	if (dev == -1)
+		errx(EXIT_FAILURE, "read: must specify a device number");
+	reg = parse_reg(argv[0]);
+	if (pcibus_conf_read(pcifd, bus, dev, func, reg, &value) == -1)
+		err(EXIT_FAILURE, "pcibus_conf_read"
+		    "(bus %d dev %d func %d reg %u)", bus, dev, func, reg);
+	if (printf("%08x\n", value) < 0)
+		err(EXIT_FAILURE, "printf");
+}
+
+static void
+cmd_write(int argc, char *argv[])
+{
+	int bus, dev, func;
+	u_int reg;
+	pcireg_t value;
+	int ch;
+
+	bus = pci_businfo.busno;
+	func = 0;
+	dev = -1;
+
+	while ((ch = getopt(argc, argv, "b:d:f:")) != -1) {
+		switch (ch) {
+		case 'b':
+			bus = parse_bdf(optarg);
+			break;
+		case 'd':
+			dev = parse_bdf(optarg);
+			break;
+		case 'f':
+			func = parse_bdf(optarg);
+			break;
+		default:
+			usage();
+		}
+	}
+	argv += optind;
+	argc -= optind;
+
+	if (argc != 2)
+		usage();
+	if (dev == -1)
+ 

Re: ixg(4) performances

2014-08-26 Thread Taylor R Campbell
   Date: Tue, 26 Aug 2014 14:42:55 +
   From: Emmanuel Dreyfus m...@netbsd.org

   On Tue, Aug 26, 2014 at 10:25:52AM -0400, Christos Zoulas wrote:
I would probably extend pcictl with cfgread and cfgwrite commands.

   Sure, once it works I can do that, but a first attempt just
   gets EINVAL, any idea what can be wrong?
   ...
   pbcr.bus = 5;
   pbcr.device = 0;
   pbcr.function = 0;
   pbcr.cfgreg.reg = 0xe6b;
   pbcr.cfgreg.val = 0x2e;

Can't do unaligned register reads/writes.  If you need other than
32-bit access, you need to select subwords for reads or do R/M/W for
writes.
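
A minimal sketch in C of what "select subwords for reads or do R/M/W for
writes" can look like.  The fake_cfg array and the cfg_* helper names are
hypothetical stand-ins for a device's configuration space and its 32-bit
accessors; they are not part of pcictl or the kernel.  The sketch assumes
only that the underlying access is 32-bit and aligned, and that
configuration space is little-endian.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one function's 4096-byte configuration space. */
static uint32_t fake_cfg[4096 / 4];

/* Aligned 32-bit read: the only kind of read the real interface allows. */
static uint32_t
cfg_read32(unsigned reg)
{
	assert((reg & 3) == 0 && reg < 4096);
	return fake_cfg[reg / 4];
}

/* Aligned 32-bit write. */
static void
cfg_write32(unsigned reg, uint32_t val)
{
	assert((reg & 3) == 0 && reg < 4096);
	fake_cfg[reg / 4] = val;
}

/* Byte read: fetch the containing dword and select the byte lane. */
static uint8_t
cfg_read8(unsigned reg)
{
	unsigned shift = (reg & 3) * 8;

	return (uint8_t)(cfg_read32(reg & ~3u) >> shift);
}

/* Byte write: read the containing dword, replace one lane, write it back. */
static void
cfg_write8(unsigned reg, uint8_t val)
{
	unsigned aligned = reg & ~3u;
	unsigned shift = (reg & 3) * 8;
	uint32_t word = cfg_read32(aligned);

	word &= ~((uint32_t)0xff << shift);
	word |= (uint32_t)val << shift;
	cfg_write32(aligned, word);
}
```

With these helpers, a one-byte write to 0xe6b becomes a read of the dword
at 0xe68, a replacement of bits 31:24, and a write back.  The write back
also rewrites the three neighbouring bytes with the values just read,
which is not always harmless on registers with write-one-to-clear bits.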

   Inside the kernel, the only EINVAL is here:
   if (bdfr->bus > 255 || bdfr->device >= sc->sc_maxndevs ||
       bdfr->function > 7)
           return EINVAL;

Old kernel sources?  I added a check recently for 32-bit alignment --
without which you'd hit a kassert or hardware trap shortly afterward.
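
For illustration only, a sketch of the two checks together: the range
check quoted above plus a 32-bit alignment check.  This is not the actual
kernel source; struct cfg_request, check_cfg_request and the maxndevs
parameter are hypothetical names chosen for the example.

```c
#include <errno.h>

/* Hypothetical request: bus/device/function plus register offset. */
struct cfg_request {
	unsigned bus;
	unsigned device;
	unsigned function;
	unsigned reg;
};

/*
 * Return 0 if the request is acceptable, EINVAL otherwise.
 * maxndevs plays the role of sc->sc_maxndevs in the quoted code.
 */
static int
check_cfg_request(const struct cfg_request *r, unsigned maxndevs)
{
	/* Range check, as in the quoted kernel fragment. */
	if (r->bus > 255 || r->device >= maxndevs || r->function > 7)
		return EINVAL;
	/* Alignment check: register offset must be a multiple of 4. */
	if ((r->reg & 3) != 0)
		return EINVAL;
	return 0;
}
```

Under this sketch the request above (bus 5, device 0, function 0, reg
0xe6b) fails on the alignment check alone, while reg 0xe68 passes.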


Re: ixg(4) performances

2014-08-26 Thread David Young
On Tue, Aug 26, 2014 at 10:25:52AM -0400, Christos Zoulas wrote:
 On Aug 26,  2:23pm, m...@netbsd.org (Emmanuel Dreyfus) wrote:
 -- Subject: Re: ixg(4) performances
 
 | On Tue, Aug 26, 2014 at 12:57:37PM +, Christos Zoulas wrote:
 |  
 ftp://ftp.supermicro.com/CDR-C2_1.20_for_Intel_C2_platform/Intel/LAN/v15.5/PROXGB/DOCS/SERVER/prform10.htm#Setting_MMRBC
 | 
 | Right, but NetBSD has no tool like Linux's setpci to tweak MMRBC, and if
 | the BIOS has no setting for it, NetBSD is screwed.
 | 
 | I see dev/pci/pciio.h  has a PCI_IOC_CFGREAD / PCI_IOC_CFGWRITE ioctl,
 | does that mean Linux's setpci can be easily reproduced?
 
 I would probably extend pcictl with cfgread and cfgwrite commands.

Emmanuel,

Most (all?) configuration registers are read/write.  Have you read the
MMRBC and found that it's improperly configured?

Are you sure that you don't have to program the MMRBC at every bus
bridge between the NIC and RAM?  I'm not too familiar with PCI Express,
so I really don't know.

Have you verified the information at
http://dak1n1.com/blog/7-performance-tuning-intel-10gbe with the 82599
manual?  I have tried to corroborate the information both with my PCI
Express book and with the 82599 manual, but I cannot make a match.
PCI-X != PCI Express; maybe ixgb != ixgbe?  (It sure looks like they're
writing about an 82599, but maybe they don't know what they're writing
about!)
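
To make the PCI-X vs. PCI Express distinction concrete, here is a small C
sketch of how the two "maximum read" fields are encoded, as I understand
the respective specifications: PCI-X keeps MMRBC in bits 3:2 of the PCI-X
command register, while PCI Express keeps Max_Read_Request_Size in bits
14:12 of the Device Control register.  The function names are
hypothetical, and nothing here says which capability offset a given NIC
uses; that has to come from the device's datasheet.

```c
#include <stdint.h>

/*
 * PCI-X command register, bits 3:2 (MMRBC):
 * 0 -> 512, 1 -> 1024, 2 -> 2048, 3 -> 4096 bytes.
 */
static unsigned
pcix_mmrbc_bytes(uint16_t pcix_cmd)
{
	return 512u << ((pcix_cmd >> 2) & 0x3);
}

/*
 * PCI Express Device Control register, bits 14:12
 * (Max_Read_Request_Size): 0 -> 128 ... 5 -> 4096 bytes.
 * Encodings 6 and 7 are reserved; return 0 for those.
 */
static unsigned
pcie_mrrs_bytes(uint16_t devctl)
{
	unsigned field = (devctl >> 12) & 0x7;

	return field > 5 ? 0 : 128u << field;
}
```

The two fields live in different capability structures and use different
encodings, so a value that is right for one bus type is meaningless when
written at the same offset on the other.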


Finally, adding cfgread/cfgwrite commands to pcictl seems like a step in
the wrong direction.  I know that this is UNIX and we're duty-bound to
give everyone enough rope, but may we reconsider our assisted-suicide
policy just this one time? :-)

How well has blindly poking configuration registers worked for us in
the past?  I can think of a couple of instances where a knowledgeable
developer thought that they were writing a helpful value to a useful
register and getting a desirable result, but in the end it turned out to
be a no-op.  In one case, it was an Atheros WLAN adapter where somebody
added to Linux some code that wrote to a mysterious PCI configuration
register, and then some of the *BSDs copied it.  In the other case, I
think that somebody used pci_conf_write() to write a magic value to a
USB host controller register that wasn't on a 32-bit boundary.  ISTR
that some incorrect value was written, instead.

Dave

-- 
David Young
dyo...@pobox.com    Urbana, IL    (217) 721-9981


re: ixg(4) performances

2014-08-26 Thread matthew green

 Finally, adding cfgread/cfgwrite commands to pcictl seems like a step in
 the wrong direction.  I know that this is UNIX and we're duty-bound to
 give everyone enough rope, but may we reconsider our assisted-suicide
 policy just this one time? :-)
 
 How well has blindly poking configuration registers worked for us in
 the past?  I can think of a couple of instances where a knowledgeable
 developer thought that they were writing a helpful value to a useful
 register and getting a desirable result, but in the end it turned out to
 be a no-op.  In one case, it was an Atheros WLAN adapter where somebody
 added to Linux some code that wrote to a mysterious PCI configuration
 register, and then some of the *BSDs copied it.  In the other case, I
 think that somebody used pci_conf_write() to write a magic value to a
 USB host controller register that wasn't on a 32-bit boundary.  ISTR
 that some incorrect value was written, instead.

pciutils' setpci utility has exposed this for lots of systems for
years.  i don't see any value in keeping pcictl from being as usable
as other tools, and as you say, this is unix - rope and all.


.mrg.


Re: ixg(4) performances

2014-08-26 Thread Hisashi T Fujinaka

On Tue, 26 Aug 2014, David Young wrote:


How well has blindly poking configuration registers worked for us in
the past?


Well, with the part he's using (the 82599, I think) it shouldn't be that
blind. The datasheet has all the registers listed, which is the case for
most of Intel's Ethernet controllers.

--
Hisashi T Fujinaka - ht...@twofifty.com
BSEE(6/86) + BSChem(3/95) + BAEnglish(8/95) + MSCS(8/03) + $2.50 = latte


Re: ixg(4) performances

2014-08-26 Thread Thor Lancelot Simon
On Tue, Aug 26, 2014 at 12:17:28PM +, Emmanuel Dreyfus wrote:
 Hi
 
 ixgb(4) has poor performances, even on latest -current. Here is the
 dmesg output:
 ixg1 at pci5 dev 0 function 1: Intel(R) PRO/10GbE PCI-Express Network Driver, 
 Version - 2.3.10
 ixg1: clearing prefetchable bit
 ixg1: interrupting at ioapic0 pin 9
 ixg1: PCI Express Bus: Speed 2.5Gb/s Width x8
 
 The interface is configured with:
 ifconfig ixg1 mtu 9000 tso4 ip4csum tcp4csum-tx udp4csum-tx

MTU 9000 considered harmful.  Use something that fits in 8K with the headers.
It's a minor piece of the puzzle but nonetheless, it's a piece.

Thor


Re: ixg(4) performances

2014-08-26 Thread Thor Lancelot Simon
On Tue, Aug 26, 2014 at 07:03:06PM -0700, Jonathan Stone wrote:
 Thor,
 
 The NetBSD  TCP stack can't handle 8K payload by page-flipping the payload 
 and prepending an mbuf for XDR/NFS/TCP/IP headers? Or is the issue the extra 
 page-mapping for the prepended mbuf?

The issue is allocating the extra page for a milligram of data.  It is almost
always a lose.  Better to choose the MTU so that the whole packet fits neatly
in 8192 bytes.
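
A rough C sketch of the "extra page" arithmetic.  The assumptions are
mine and are only for illustration: 4096-byte pages, a 14-byte Ethernet
header in front of the MTU-sized payload, and no VLAN tag or FCS stored
in the buffer.  The names are hypothetical and this does not describe how
any particular driver allocates its receive buffers.

```c
#include <stddef.h>

/* Assumed link-level header size (plain Ethernet, no VLAN tag). */
enum { LINK_HDR_BYTES = 14 };

/* Bytes the whole frame occupies in memory for a given MTU. */
static size_t
frame_bytes(size_t mtu)
{
	return mtu + LINK_HDR_BYTES;
}

/* Number of pages needed to hold that many bytes (rounding up). */
static size_t
pages_needed(size_t bytes, size_t page_size)
{
	return (bytes + page_size - 1) / page_size;
}

/* Largest MTU whose frame still fits in a buffer of buf_bytes. */
static size_t
max_mtu_for_buffer(size_t buf_bytes, size_t link_hdr)
{
	return buf_bytes > link_hdr ? buf_bytes - link_hdr : 0;
}
```

Under these assumptions an MTU of 9000 gives a 9014-byte frame, which
needs three 4K pages, the third holding only 822 bytes; an MTU of 8178 or
less keeps the frame within two.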

It is helpful to understand where MTU 9000 came from: SGI was trying to
optimise UDP NFS performance, for NFSv2 with 8K maximum RPC size, on
systems that had 16K pages.  You can't fit two of that kind of NFS request
in a 16K page, so you might as well allocate something a little bigger than
8K but that happens to leave your memory allocator some useful-sized chunks
to hand out to other callers.

I am a little hazy on the details, but I believe they ended up at MTU 9024
which is 8K + 768 + 64 (leaving a bunch of handy power-of-2 split sizes
as residuals: 4096 + 2048 + 1024 + 128 + 64) which just made no sense to
anyone else so everyone _else_ picked random sizes around 9000 that happened
to work for their hardware.  But at the end of the day, if you do not have
16K pages or are not optimizing for 8K NFSv2 requests on UDP, an MTU that
fits in 8K is almost always better.
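
The arithmetic above can be checked with a few lines of C.  This is only
a sketch of the sums quoted in this message (a 16K page minus 8K + 768 +
64); the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	PAGE_16K = 16384,
	MTU_SGI  = 8192 + 768 + 64	/* the MTU described above */
};

/*
 * Split n into its power-of-two components, largest first.
 * Stores at most max entries in out[] and returns how many were stored.
 */
static size_t
pow2_split(uint32_t n, uint32_t out[], size_t max)
{
	size_t k = 0;

	for (uint32_t bit = UINT32_C(1) << 31; bit != 0; bit >>= 1) {
		if ((n & bit) != 0 && k < max)
			out[k++] = bit;
	}
	return k;
}
```

Splitting PAGE_16K - MTU_SGI (7360 bytes) this way yields exactly the
residual sizes listed: 4096, 2048, 1024, 128 and 64.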

Thor


Re: ixg(4) performances

2014-08-26 Thread Taylor R Campbell
   Date: Tue, 26 Aug 2014 12:44:43 -0500
   From: David Young dyo...@pobox.com

   Finally, adding cfgread/cfgwrite commands to pcictl seems like a step in
   the wrong direction.  I know that this is UNIX and we're duty-bound to
   give everyone enough rope, but may we reconsider our assisted-suicide
   policy just this one time? :-)

It's certainly wrong to rely on pcictl to read and write config
registers, but it's useful as a debugging tool and for driver
development -- just like the rest of pcictl.


Re: ixg(4) performances

2014-08-26 Thread Jonathan Stone
Thor,

The NetBSD  TCP stack can't handle 8K payload by page-flipping the payload and 
prepending an mbuf for XDR/NFS/TCP/IP headers? Or is the issue the extra 
page-mapping for the prepended mbuf?


On Tue, 8/26/14, Thor Lancelot Simon t...@panix.com wrote:

 Subject: Re: ixg(4) performances
 To: Emmanuel Dreyfus m...@netbsd.org
 Cc: tech-kern@netbsd.org
 Date: Tuesday, August 26, 2014, 6:56 PM
 
[...]
 
 MTU 9000 considered harmful.  Use something that fits in 8K with the headers.
 It's a minor piece of the puzzle but nonetheless, it's a piece.
 
 Thor



Re: ixg(4) performances

2014-08-26 Thread Emmanuel Dreyfus
Thor Lancelot Simon t...@panix.com wrote:

 MTU 9000 considered harmful.  Use something that fits in 8K with the headers.
 It's a minor piece of the puzzle but nonetheless, it's a piece.

mtu 8192 or 8000 does not cause any improvement over mtu 9000.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org