Re: [E1000-devel] Excessive frame dropping on 82574L

2009-12-28 Thread Richard Scobie
On 12/23/2009 11:27 AM, Brandeburg, Jesse wrote:
> Hi Richard, I would like you to try to increase the value programmed to the 
> lower 16 bits of the PBA register.
>
> find this code in netdev.c
> static struct e1000_info e1000_82574_info = {
>     .mac                = e1000_82574,
>     .flags              = FLAG_HAS_HW_VLAN_FILTER
> #ifdef CONFIG_E1000E_MSIX
>                           | FLAG_HAS_MSIX
> #endif
>                           | FLAG_HAS_JUMBO_FRAMES
>                           | FLAG_HAS_WOL
>                           | FLAG_APME_IN_CTRL3
>                           | FLAG_RX_CSUM_ENABLED
>                           | FLAG_HAS_SMART_POWER_DOWN
>                           | FLAG_HAS_AMT
>                           | FLAG_HAS_CTRLEXT_ON_LOAD,
>     .pba                = 20,
>     .max_hw_frame_size  = DEFAULT_JUMBO,
>     .init_ops           = e1000_init_function_pointers_82571,
>     .get_variants       = e1000_get_variants_82571,
> };
>
> change the above value for .pba to 36, if that doesn't work right change it 
> to 32.
>
> you'll be able to confirm your change worked with ethregs | grep PBA
>
> It's currently 0x00140014, i.e. 0x14 == 20 for both transmit (top 16 bits)
> and receive (lower 16 bits).
>
> This will nearly double your rx space in the on chip fifo, which should help.
>
> Please also double-check that you have the latest BIOS available for your system.

Hi Jesse,

This seems to have done the trick. After 200GB rsynced, only one packet
has been dropped, whereas before, several tens of thousands would have been.

Regards,

Richard
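
For anyone checking the same change: the value printed by ethregs | grep PBA
can be split into its two 16-bit halves, following the layout Jesse describes
above (transmit in the top 16 bits, receive in the lower 16). The sketch below
is only illustrative. The function names are made up for this note, and it
assumes both fields are expressed in KB.

```shell
#!/bin/sh
# Split a 32-bit PBA register value into its two 16-bit fields.
# Layout as described in the thread: top 16 bits = transmit,
# lower 16 bits = receive. Units assumed to be KB.
# pba_tx_kb / pba_rx_kb are hypothetical helper names.
pba_tx_kb() { echo $(( ($1 >> 16) & 0xFFFF )); }
pba_rx_kb() { echo $(( $1 & 0xFFFF )); }

pba=0x00140014   # value reported in the thread before the change
echo "tx=$(pba_tx_kb "$pba")KB rx=$(pba_rx_kb "$pba")KB"   # prints tx=20KB rx=20KB
```

After rebuilding with the larger .pba value, the lower half of the register
should read back as the new receive allocation.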

--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev 
___
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel


Re: [E1000-devel] Excessive frame dropping on 82574L

2009-12-22 Thread Richard Scobie
Hi Jesse,

Jesse Brandeburg wrote:
> On Mon, 2009-12-21 at 18:53 -0700, Richard Scobie wrote:
> 
>>I have a low-end server (Core 2 Duo 2.8, 4GB) used for backups with rsync
>>over an 82574L interface. Kernel 2.6.30.9-102.fc11.x86_64 (e1000e
>>0.3.3.4-k4). It is using MSI-X interrupts.
>>
>>It's suffering somewhat due to dropping frames:
>>
>>RX packets:294914332 errors:0 dropped:95203 overruns:0 frame:0
>>TX packets:355842341 errors:0 dropped:0 overruns:0 carrier:0
>>
>>and ethtool shows rx_missed_errors: 95203.
>>
>>Googling shows these are caused by the RX FIFO filling up.
> 
> 
> Hi Richard, can you give the whole ethtool -S output?  depending on the
> value of rx_no_buffer_count, you may be able to do something.

  rx_packets: 325509977
  tx_packets: 360014787
  rx_bytes: 441158607444
  tx_bytes: 424925363743
  rx_broadcast: 181969
  tx_broadcast: 954
  rx_multicast: 5725
  tx_multicast: 7
  rx_errors: 0
  tx_errors: 0
  tx_dropped: 0
  multicast: 5725
  collisions: 0
  rx_length_errors: 0
  rx_over_errors: 0
  rx_crc_errors: 0
  rx_frame_errors: 0
  rx_no_buffer_count: 0
  rx_missed_errors: 103969
  tx_aborted_errors: 0
  tx_carrier_errors: 0
  tx_fifo_errors: 0
  tx_heartbeat_errors: 0
  tx_window_errors: 0
  tx_abort_late_coll: 0
  tx_deferred_ok: 0
  tx_single_coll_ok: 0
  tx_multi_coll_ok: 0
  tx_timeout_count: 0
  tx_restart_queue: 0
  rx_long_length_errors: 0
  rx_short_length_errors: 0
  rx_align_errors: 0
  tx_tcp_seg_good: 22181224
  tx_tcp_seg_failed: 0
  rx_flow_control_xon: 0
  rx_flow_control_xoff: 0
  tx_flow_control_xon: 17013029
  tx_flow_control_xoff: 17012765
  rx_long_byte_count: 441158607444
  rx_csum_offload_good: 325459865
  rx_csum_offload_errors: 0
  rx_header_split: 0
  alloc_rx_buff_failed: 0
  tx_smbus: 0
  rx_smbus: 0
  dropped_smbus: 0
  rx_dma_failed: 0
  tx_dma_failed: 0


> The other thing to send is the output of lspci -vvv for your system, I'm
> curious if ASPM is enabled for the ethernet port or its upstream port.

05:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
 Subsystem: Intel Corporation Gigabit CT Desktop Adapter
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- [...] SERR- [...]

> The other thing we may be able to do is provide a patch to enable GRO if
> at all possible (which should help significantly if it is not already
> enabled,) you can check with ethtool -k ethX, but I guess it may already
> be on.

No, it appears to be off by default:

Offload parameters for eth0:
Cannot get device flags: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off


> Is flow control enabled to your switch?  Are you using jumbo frames?

As you can see from the ethtool -S output above there are a large number
of tx_flow_control actuations.

Initially flow control was disabled and when I noticed the amount of
dropped packets, I enabled it and that stopped further packet loss.

However given the amount of flow control activity for the relatively
short time I enabled it, throughput is probably suffering, so I'd prefer
to leave it off, as this workload should be ethernet constrained as it
is feeding a large, fast RAID.

> There was a fifo (flow control) configuration issue in several versions
> of the e1000e driver in the kernel.  If that was the case disabling flow
> control might help you, ethtool -A ethX autoneg off rx off tx off
> 
> ethtool -G ethX rx 4096 will max out the number of rx descriptors.
> 
> you also may benefit from decreasing the interrupt rate using
> ethtool -C ethX rx-usecs 125 (8000 interrupts per second) because you're
> not doing a latency sensitive workload

I'll give it a try.

> Please also provide /proc/interrupts and ethtool -e ethX, and if you are

CPU0   CPU1
   0:226481   IO-APIC-edge  timer
   1:  0  2   IO-APIC-edge  i8042
   8: 26 26   IO-APIC-edge  rtc0
   9:  0  0   IO-APIC-fasteoi   acpi
  12:  3  1   IO-APIC-edge  i8042
  16:  0  0   IO-APIC-fasteoi   pata_marvell
  17:   1694 410198   IO-APIC-fasteoi   sata_sil24
  28:   46793552  22812   PCI-MSI-edge  ahci
  29:   44433942  6   PCI-MSI-edge  eth0-rx-0
  30: 15   58772791   PCI-MSI-edge  eth0-tx-0
  31:  97668  3   PCI-MSI-edge  eth0
  32: 48 48   PCI-MSI-edge  ioc0
NMI:  0  0   Non-maskable interrupts
LOC:   27702671   26695738   Local timer interrup

Re: [E1000-devel] Excessive frame dropping on 82574L

2009-12-22 Thread Jesse Brandeburg
On Mon, 2009-12-21 at 18:53 -0700, Richard Scobie wrote:
> I have a low-end server (Core 2 Duo 2.8, 4GB) used for backups with rsync
> over an 82574L interface. Kernel 2.6.30.9-102.fc11.x86_64 (e1000e
> 0.3.3.4-k4). It is using MSI-X interrupts.
> 
> It's suffering somewhat due to dropping frames:
> 
> RX packets:294914332 errors:0 dropped:95203 overruns:0 frame:0
> TX packets:355842341 errors:0 dropped:0 overruns:0 carrier:0
> 
> and ethtool shows rx_missed_errors: 95203.
> 
> Googling shows these are caused by the RX FIFO filling up.

Hi Richard, can you give the whole ethtool -S output?  depending on the
value of rx_no_buffer_count, you may be able to do something.

The other thing to send is the output of lspci -vvv for your system, I'm
curious if ASPM is enabled for the ethernet port or its upstream port.

The other thing we may be able to do is provide a patch to enable GRO if
at all possible (which should help significantly if it is not already
enabled,) you can check with ethtool -k ethX, but I guess it may already
be on.

Is flow control enabled to your switch?  Are you using jumbo frames?
There was a fifo (flow control) configuration issue in several versions
of the e1000e driver in the kernel.  If that was the case disabling flow
control might help you, ethtool -A ethX autoneg off rx off tx off

ethtool -G ethX rx 4096 will max out the number of rx descriptors.

you also may benefit from decreasing the interrupt rate using
ethtool -C ethX rx-usecs 125 (8000 interrupts per second) because you're
not doing a latency sensitive workload
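
The 8000 figure follows from the interval itself: with rx-usecs treated as
the minimum gap in microseconds between interrupts (my understanding of how
this setting behaves for values like 125), the ceiling on the rate is
1,000,000 divided by that value. A small sketch of the arithmetic, with a
helper name that is purely illustrative and not part of ethtool:

```shell
#!/bin/sh
# Upper bound on interrupts per second for a given rx-usecs interval.
# usecs_to_max_irqs is a hypothetical helper, not an ethtool feature.
usecs_to_max_irqs() { echo $(( 1000000 / $1 )); }

usecs_to_max_irqs 125   # prints 8000
```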

Please also provide /proc/interrupts and ethtool -e ethX, and if you are
feeling gung-ho, the output of the ethregs utility available at
sourceforge (you'll have to build it) in the Register Dump utility
section.
-- 
Jesse Brandeburg
This email sent via Evolution, powered by Linux




[E1000-devel] Excessive frame dropping on 82574L

2009-12-21 Thread Richard Scobie
Hi,

I have a low-end server (Core 2 Duo 2.8, 4GB) used for backups with rsync
over an 82574L interface. Kernel 2.6.30.9-102.fc11.x86_64 (e1000e
0.3.3.4-k4). It is using MSI-X interrupts.

It's suffering somewhat due to dropping frames:

RX packets:294914332 errors:0 dropped:95203 overruns:0 frame:0
TX packets:355842341 errors:0 dropped:0 overruns:0 carrier:0

and ethtool shows rx_missed_errors: 95203.

Googling shows these are caused by the RX FIFO filling up.
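
For scale, the counters above work out to a few hundred drops per million
received packets. A minimal sketch of that arithmetic in shell, integer-only
and assuming the shell does 64-bit arithmetic (the function name is just for
illustration):

```shell
#!/bin/sh
# Dropped packets per million received, using integer arithmetic.
# Needs 64-bit shell arithmetic: the intermediate product exceeds 2^32.
# drop_ppm is a hypothetical helper name.
drop_ppm() { echo $(( $1 * 1000000 / $2 )); }

drop_ppm 95203 294914332   # prints 322, i.e. roughly 0.03% of RX packets
```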

Are there any "knobs" I can twiddle to stop this? There seem to be no 
buffer related module parameters.

Regards,

Richard



