Re: [5/5] e1000e: Avoid receiver overrun interrupt bursts

2017-10-24 Thread Lennart Sorensen
On Tue, Sep 19, 2017 at 09:41:02PM +0200, Benjamin Poirier wrote:
> On 2017/09/19 12:38, Philip Prindeville wrote:
> > Hi.
> > 
> > We’ve been running this patchset (all 5) for about as long as they’ve been 
> > under review… about 2 months.  And in a burn-in lab with heavy traffic.
> > 
> > We’ve not seen a single link-flap in hundreds of hours of saturated traffic.
> > 
> > Would love to see some resolution soon on this as we don’t want to ship a 
> > release with unsanctioned patches.
> > 
> > Is there an estimate on when that might be?
> 
> The patches have been added to Jeff Kirsher's next-queue tree. I guess
> they will be submitted for v4.15 which might be released in early
> 2018...
> http://phb-crystal-ball.org/

And then they will be submitted to linux-stable so this long standing
regression can be fixed, right?

-- 
Len Sorensen


Re: [Intel-wired-lan] [PATCH 4/5] e1000e: Separate signaling for link check/link up

2017-08-02 Thread Lennart Sorensen
On Wed, Aug 02, 2017 at 02:28:07PM +0300, Neftin, Sasha wrote:
> On 7/21/2017 21:36, Benjamin Poirier wrote:
> > Lennart reported the following race condition:
> > 
> > \ e1000_watchdog_task
> >  \ e1000e_has_link
> >  \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
> >  /* link is up */
> >  mac->get_link_status = false;
> > 
> >  /* interrupt */
> >  \ e1000_msix_other
> >  hw->mac.get_link_status = true;
> > 
> >  link_active = !hw->mac.get_link_status
> >  /* link_active is false, wrongly */
> > 
> > This problem arises because the single flag get_link_status is used to
> > signal two different states: link status needs checking and link status is
> > down.
> > 
> > Avoid the problem by using the return value of .check_for_link to signal
> > the link status to e1000e_has_link().
> > 
> > Reported-by: Lennart Sorensen <lsore...@csclub.uwaterloo.ca>
> > Signed-off-by: Benjamin Poirier <bpoir...@suse.com>
> > ---
> >   drivers/net/ethernet/intel/e1000e/mac.c| 11 ---
> >   drivers/net/ethernet/intel/e1000e/netdev.c |  2 +-
> >   2 files changed, 9 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/e1000e/mac.c 
> > b/drivers/net/ethernet/intel/e1000e/mac.c
> > index b322011ec282..f457c5703d0c 100644
> > --- a/drivers/net/ethernet/intel/e1000e/mac.c
> > +++ b/drivers/net/ethernet/intel/e1000e/mac.c
> > @@ -410,6 +410,9 @@ void e1000e_clear_hw_cntrs_base(struct e1000_hw *hw)
> >*  Checks to see of the link status of the hardware has changed.  If a
> >*  change in link status has been detected, then we read the PHY 
> > registers
> >*  to get the current speed/duplex if link exists.
> > + *
> > + *  Returns a negative error code (-E1000_ERR_*) or 0 (link down) or 1 
> > (link
> > + *  up).
> >**/
> >   s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
> >   {
> > @@ -423,7 +426,7 @@ s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
> >  * Change or Rx Sequence Error interrupt.
> >  */
> > if (!mac->get_link_status)
> > -   return 0;
> > +   return 1;
> > /* First we want to see if the MII Status Register reports
> >  * link.  If so, then we want to get the current speed/duplex
> > @@ -461,10 +464,12 @@ s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
> >  * different link partner.
> >  */
> > ret_val = e1000e_config_fc_after_link_up(hw);
> > -   if (ret_val)
> > +   if (ret_val) {
> > e_dbg("Error configuring flow control\n");
> > +   return ret_val;
> > +   }
> > -   return ret_val;
> > +   return 1;
> >   }
> >   /**
> > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
> > b/drivers/net/ethernet/intel/e1000e/netdev.c
> > index fc6a1db2..5a8ab1136566 100644
> > --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> > @@ -5081,7 +5081,7 @@ static bool e1000e_has_link(struct e1000_adapter 
> > *adapter)
> > case e1000_media_type_copper:
> > if (hw->mac.get_link_status) {
> > ret_val = hw->mac.ops.check_for_link(hw);
> > -   link_active = !hw->mac.get_link_status;
> > +   link_active = ret_val > 0;
> > } else {
> > link_active = true;
> > }
> 
> Hello Benjamin,
> 
> Will this patch fix any serious problem with link indication? Is it
> necessary? Can we consider your patch series without 4/5 part?

Without this patch, you have the race condition that can make the
watchdog_task mistakenly think the link is down when it isn't, and then
it resets the adapter, which does make the link go down.

So it is rather catastrophic for the interface.

The other patch to the interrupt handling should make it never get hit,
but the issue does still exist if not fixed and I wouldn't rule out that
it could possibly still happen even with the other fix in place.
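
To make the window concrete, here is a minimal userspace sketch of the
interleaving, not the driver code.  The fake_* names and the irq_in_window
parameter are hypothetical stand-ins for hw->mac.get_link_status,
e1000_msix_other and check_for_link; the only thing taken from the patch
is the idea of returning 1 for link up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared flag hw->mac.get_link_status. */
struct fake_mac {
	bool get_link_status;	/* true: link state needs (re)checking */
};

/* Models the Other interrupt handler: it asks for a link recheck. */
static void fake_other_irq(struct fake_mac *mac)
{
	mac->get_link_status = true;
}

/* Models check_for_link(): clears the flag when the link is up and
 * reports the result through the return value (1 = up, 0 = down). */
static int fake_check_for_link(struct fake_mac *mac, bool phy_link_up)
{
	if (!phy_link_up)
		return 0;
	mac->get_link_status = false;
	return 1;
}

/* Old scheme: read the shared flag after the check has finished. */
static bool has_link_from_flag(struct fake_mac *mac, bool irq_in_window)
{
	fake_check_for_link(mac, true);
	if (irq_in_window)
		fake_other_irq(mac);	/* lands between check and read */
	return !mac->get_link_status;
}

/* Patched scheme: use the return value, which the interrupt cannot alter. */
static bool has_link_from_retval(struct fake_mac *mac, bool irq_in_window)
{
	int ret_val = fake_check_for_link(mac, true);

	if (irq_in_window)
		fake_other_irq(mac);
	return ret_val > 0;
}

/* Replays the interleavings; the asserts document the outcomes. */
static void replay_race(void)
{
	struct fake_mac mac = { .get_link_status = true };

	assert(has_link_from_flag(&mac, false));	/* no interrupt: correct */
	mac.get_link_status = true;
	assert(!has_link_from_flag(&mac, true));	/* link wrongly seen as down */
	mac.get_link_status = true;
	assert(has_link_from_retval(&mac, true));	/* still reported as up */
}
```

Calling replay_race() from a test program runs all three cases.  In the
patched variant the flag is still set by the interrupt, so the next
watchdog run rechecks the link, but the current answer stays correct.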

-- 
Len Sorensen


Re: commit 16ecba59 breaks 82574L under heavy load.

2017-07-21 Thread Lennart Sorensen
On Thu, Jul 20, 2017 at 04:44:55PM -0700, Benjamin Poirier wrote:
> Could you please test the following patch and let me know if it:
> 1) reduces the interrupt rate of the Other msi-x vector
> 2) avoids the link flaps
> or
> 3) logs some dmesg warnings of the form "Other interrupt with unhandled [...]"
> In this case, please paste icr values printed.

By the way, while we are fixing e1000e, I just noticed that
if you are blasting the port with traffic when it comes up,
you risk getting a transmit queue time out, because the queue
is started before the carrier is up.  ixgbe already fixed that in
cdc04dcce0598fead6029a2f95e95a4d2ea419c2.  igb has the same problem (which
goes away by moving the queue start to the watchdog after carrier_on,
I just haven't got around to sending that patch yet).

I am going to try moving the queue start to the watchdog and try it again.

Trace looked like this:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x1f9/0x200
NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Modules linked in: dpi_drv(PO) ccu_util(PO) ipv4_mb(PO) 
l2bridge_config_util(PO) l2_config_util(PO) route_config_util(PO) 
qos_config_util(PO) sysapp_common(PO) chantry_fwd_eng_2800_config(PO) 
shim_module(PO) sadb_cc(PO) ipsecXformer(PO) libeCrypto(PO) ipmatch_cc(PO) 
l2h_cc(PO) ndproxy_cc(PO) arpint_cc(PO) portinfo_cc(PO) chantryqos_cc(PO) 
redirector_cc(PO) ix_ph(PO) fpm_core_cc(PO) pulse_cc(PO) vnstt_cc(PO) 
vnsap_cc(PO) fm_cc(PO) rutm_cc(PO) mutm_cc(PO) ethernet_tx_cc(PO) stkdrv_cc(PO) 
l2bridge_cc(PO) events_util(PO) sched_cc(PO) qm_cc(PO) ipv4_cc(PO) wred_cc(PO) 
tc_meter_cc(PO) dscp_classifier_cc(PO) classifier_6t_cc(PO) ent586_cc(PO) 
dev_cc_arp(PO) chantry_fwd_eng_2800_tables(PO) ether_arp_lib(PO) rtmv4_lib(PO) 
lkup_lib(PO) l2tm_lib(PO) fragmentation_lib(PO) properties_lib(PO) 
msg_support_lib(PO)
 utilities_lib(PO) cci_lib(PO) rm_lib(PO) libossl(O) vip(O) 
productSpec_x86_dp(PO) e1000e
CPU: 0 PID: 0 Comm: swapper/0 Tainted: P           O    4.9.24 #20
Hardware name: Supermicro X7SPA-HF/X7SPA-HF, BIOS 1.2a   06/23/12  
  811cef1b 88007fc03e88 
 81037ade  88007fc03ed8 0001
  0082 0001 81037b4c
Call Trace:
  
 [] ? dump_stack+0x46/0x5b
 [] ? __warn+0xbe/0xe0
 [] ? warn_slowpath_fmt+0x4c/0x50
 [] ? mod_timer+0xf2/0x150
 [] ? dev_watchdog+0x1f9/0x200
 [] ? dev_graft_qdisc+0x70/0x70
 [] ? call_timer_fn.isra.26+0x11/0x80
 [] ? run_timer_softirq+0x128/0x150
 [] ? __do_softirq+0xeb/0x1f0
 [] ? irq_exit+0x55/0x60
 [] ? smp_apic_timer_interrupt+0x39/0x50
 [] ? apic_timer_interrupt+0x7c/0x90
  
 [] ? mwait_idle+0x51/0x80
 [] ? cpu_startup_entry+0xa7/0x130
 [] ? start_kernel+0x306/0x30e
---[ end trace ee759b7a56e1110b ]---

-- 
Len Sorensen


Re: [PATCH 4/5] e1000e: Separate signaling for link check/link up

2017-07-21 Thread Lennart Sorensen
On Fri, Jul 21, 2017 at 11:36:26AM -0700, Benjamin Poirier wrote:
> Lennart reported the following race condition:
> 
> \ e1000_watchdog_task
> \ e1000e_has_link
> \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
> /* link is up */
> mac->get_link_status = false;
> 
> /* interrupt */
> \ e1000_msix_other
> hw->mac.get_link_status = true;
> 
> link_active = !hw->mac.get_link_status
> /* link_active is false, wrongly */
> 
> This problem arises because the single flag get_link_status is used to
> signal two different states: link status needs checking and link status is
> down.
> 
> Avoid the problem by using the return value of .check_for_link to signal
> the link status to e1000e_has_link().
> 
> Reported-by: Lennart Sorensen <lsore...@csclub.uwaterloo.ca>
> Signed-off-by: Benjamin Poirier <bpoir...@suse.com>

This too seems potentially -stable worthy, although with patch 5, the
problem becomes much much less likely to occur.

-- 
Len Sorensen


Re: [PATCH 5/5] e1000e: Avoid receiver overrun interrupt bursts

2017-07-21 Thread Lennart Sorensen
On Fri, Jul 21, 2017 at 11:36:27AM -0700, Benjamin Poirier wrote:
> When e1000e_poll() is not fast enough to keep up with incoming traffic, the
> adapter (when operating in msix mode) raises the Other interrupt to signal
> Receiver Overrun.
> 
> This is a double problem because 1) at the moment e1000_msix_other()
> assumes that it is only called in case of Link Status Change and 2) if the
> condition persists, the interrupt is repeatedly raised again in quick
> succession.
> 
> Ideally we would configure the Other interrupt to not be raised in case of
> receiver overrun but this doesn't seem possible on this adapter. Instead,
> we handle the first part of the problem by reverting to the practice of
> reading ICR in the other interrupt handler, like before commit 16ecba59bc33
> ("e1000e: Do not read ICR in Other interrupt"). Thanks to commit
> 0a8047ac68e5 ("e1000e: Fix msi-x interrupt automask") which cleared IAME
> from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts
> anymore. We handle the second part of the problem by not re-enabling the
> Other interrupt right away when there is overrun. Instead, we wait until
> traffic subsides, napi polling mode is exited and interrupts are
> re-enabled.
> 
> Reported-by: Lennart Sorensen <lsore...@csclub.uwaterloo.ca>
> Fixes: 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt")
> Signed-off-by: Benjamin Poirier <bpoir...@suse.com>

Any chance of this fix hitting -stable?  After all adapter reset under
load is not nice.
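
For anyone following along, the policy described in the commit message
above can be sketched roughly as below.  This is a userspace model under
my own assumptions, not the patch itself: the sk_* names are hypothetical,
and the two ICR bit values are quoted from memory of the driver headers,
so check them before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ICR_LSC 0x00000004u	/* Link Status Change (assumed value) */
#define SK_ICR_RXO 0x00000040u	/* Receiver Overrun (assumed value) */

struct sk_adapter {
	bool other_irq_enabled;		/* models the Other bit in the mask */
	bool link_check_pending;	/* models requesting a link recheck */
};

/* Other interrupt: act only on what ICR reports, and leave the vector
 * masked when the cause was a receiver overrun. */
static void sk_other_irq(struct sk_adapter *a, uint32_t icr)
{
	assert(a != NULL);
	a->other_irq_enabled = false;	/* masked while being handled */

	if (icr & SK_ICR_LSC)
		a->link_check_pending = true;

	if (!(icr & SK_ICR_RXO))
		a->other_irq_enabled = true;	/* no overrun: re-enable now */
}

/* Called when NAPI polling ends because traffic has subsided. */
static void sk_napi_poll_done(struct sk_adapter *a)
{
	assert(a != NULL);
	a->other_irq_enabled = true;
}
```

With this shape an overrun costs one interrupt per burst rather than
thousands per second, and a link recheck is only requested when LSC is
actually set.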

-- 
Len Sorensen


Re: commit 16ecba59 breaks 82574L under heavy load.

2017-07-21 Thread Lennart Sorensen
On Fri, Jul 21, 2017 at 11:27:09AM -0400, Lennart Sorensen wrote:
> On Thu, Jul 20, 2017 at 04:44:55PM -0700, Benjamin Poirier wrote:
> > Could you please test the following patch and let me know if it:
> > 1) reduces the interrupt rate of the Other msi-x vector
> > 2) avoids the link flaps
> > or
> > 3) logs some dmesg warnings of the form "Other interrupt with unhandled 
> > [...]"
> > In this case, please paste icr values printed.
> 
> I will give it a try.

So test looks excellent.  Seems to only get interrupts when link state
actually changes now.

> Another odd behaviour I see is that the driver will hang in
> napi_synchronize on shutdown if there is traffic at the time (at least
> I think that's the trigger, maybe the trigger is if there has been an
> overload of traffic and the backlog in napi was used).
> 
> From doing some searching, this seems to be a problem that has plagued
> some people for years with this driver.
> 
> I am having trouble figuring out exactly what napi_synchronize is waiting
> for and who is supposed to toggle the flag it is waiting on.  The flag
> appears to work backwards from what I would have expected it to do.
> I see lots of places that can set the bit, but only napi_enable seems
> to clear it again, and I don't see how that would get called for all
> the places that potentially set the bit.

I just realized NAPI_STATE_SCHED and NAPIF_STATE_SCHED are the same
thing and I need to look at both of those.
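
In case it saves someone else the same confusion: one name is a bit
index and the other is the mask for that same bit.  A small sketch of the
relationship follows, using made-up SKETCH_* names rather than the
kernel's definitions, and assuming the bit sits at index 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit index, the form taken by test_bit()/set_bit() style helpers. */
enum { SKETCH_STATE_SCHED = 0 };

/* Mask for the same bit, the form used when updating the whole word. */
enum { SKETCH_STATEF_SCHED = 1 << SKETCH_STATE_SCHED };

/* Test the bit by index. */
static bool sched_test_by_index(unsigned long state)
{
	return (state >> SKETCH_STATE_SCHED) & 1UL;
}

/* Test the bit by mask. */
static bool sched_test_by_mask(unsigned long state)
{
	return (state & SKETCH_STATEF_SCHED) != 0;
}

/* Both views must agree for every state word. */
static void check_views_agree(void)
{
	for (unsigned long state = 0; state < 8; state++)
		assert(sched_test_by_index(state) == sched_test_by_mask(state));
}
```

So when searching for who sets or clears the bit that napi_synchronize
waits on, both spellings have to be included in the search.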

Still something seems odd in some corner case where napi gets stuck and
you can't close the port anymore due to napi_synchronize never being
able to finish.  Some traffic pattern causes that SCHED state bit to
get into the wrong state and nothing ever clears it.  Even managed to
see it get stuck so it never passed traffic again and hung on shutdown.
The napi poll was never called again.

-- 
Len Sorensen


Re: commit 16ecba59 breaks 82574L under heavy load.

2017-07-21 Thread Lennart Sorensen
On Thu, Jul 20, 2017 at 04:44:55PM -0700, Benjamin Poirier wrote:
> Could you please test the following patch and let me know if it:
> 1) reduces the interrupt rate of the Other msi-x vector
> 2) avoids the link flaps
> or
> 3) logs some dmesg warnings of the form "Other interrupt with unhandled [...]"
> In this case, please paste icr values printed.

I will give it a try.

Another odd behaviour I see is that the driver will hang in
napi_synchronize on shutdown if there is traffic at the time (at least
I think that's the trigger, maybe the trigger is if there has been an
overload of traffic and the backlog in napi was used).

From doing some searching, this seems to be a problem that has plagued
some people for years with this driver.

I am having trouble figuring out exactly what napi_synchronize is waiting
for and who is supposed to toggle the flag it is waiting on.  The flag
appears to work backwards from what I would have expected it to do.
I see lots of places that can set the bit, but only napi_enable seems
to clear it again, and I don't see how that would get called for all
the places that potentially set the bit.

-- 
Len Sorensen


Re: commit 16ecba59 breaks 82574L under heavy load.

2017-07-20 Thread Lennart Sorensen
On Wed, Jul 19, 2017 at 05:07:47PM -0700, Benjamin Poirier wrote:
> Are you sure about this? In my testing, while triggering the overrun
> with the msleep, I read ICR when entering e1000_msix_other() and RXO is
> consistently set.

I had thousands of calls to e1000_msix_other where the only bit set
was OTHER.

I don't know if the cause is overruns, it just seems plausible.

> I'm working on a patch that uses that fact to handle the situation and
> limit the interrupt.

Excellent.

Running in MSI mode rather than MSI-X seems to not have the problem of
unexpected interrupts, but has other issues (such as losing the IRQ
affinity setting if you do ifconfig down;ifconfig up on the interface,
which does not happen in MSI-X's case.)  That's rather annoying as you
can't set the affinity before bringing up the interface which is rather
undesirable.

-- 
Len Sorensen


Re: commit 16ecba59 breaks 82574L under heavy load.

2017-07-19 Thread Lennart Sorensen
On Tue, Jul 18, 2017 at 04:14:35PM -0700, Benjamin Poirier wrote:
> Thanks for the detailed analysis.
> 
> Referring to the original discussion around this patch series, it seemed like
> the IMS bit for a condition had to be set for the Other interrupt to be raised
> for that condition.
> 
> https://lkml.org/lkml/2015/11/4/683
> 
> In this case however, E1000_ICR_RXT0 is not set in IMS so Other shouldn't be
> raised for Receiver Overrun. Apparently something is going on...
> 
> I can reproduce the spurious Other interrupts with a simple mdelay()
> With the debugging patch at the end of the mail I see stuff like this
> while blasting with udp frames:
>   <idle>-0 [086] d.h1 15338.742675: e1000_msix_other: got Other interrupt, count 15127
><...>-54504 [086] d.h. 15338.742724: e1000_msix_other: got Other interrupt, count 1
><...>-54504 [086] d.h. 15338.742774: e1000_msix_other: got Other interrupt, count 1
><...>-54504 [086] d.h. 15338.742824: e1000_msix_other: got Other interrupt, count 1
>   <idle>-0 [086] d.h1 15340.745123: e1000_msix_other: got Other interrupt, count 27584
><...>-54504 [086] d.h. 15340.745172: e1000_msix_other: got Other interrupt, count 1
><...>-54504 [086] d.h. 15340.745222: e1000_msix_other: got Other interrupt, count 1
><...>-54504 [086] d.h. 15340.745272: e1000_msix_other: got Other interrupt, count 1
> 
> > hence sets the flag that (unfortunately) means both link is down and link
> > state should be checked.  Since this now happens 3000 times per second,
> > the chances of it happening while the watchdog_task is checking the link
> > state becomes pretty high, and if it does happen to coincide, then the
> > watchdog_task will reset the adapter, which causes a real loss of link.
> 
> Through which path does watchdog_task reset the adapter? I didn't
> reproduce that.

The other interrupt happens and sets get_link_status to true.  At some
point the watchdog_task runs on some core and calls e1000e_has_link,
which then calls check_for_link to find out the current link status.
While e1000e_check_for_copper_link is checking the link state and
after updating get_link_status to false to indicate link is up, another
interrupt occurs and another core handles it and changes get_link_status
to true again.  So by the time e1000e_has_link goes to determine the
return value, get_link_status has changed back again so now it returns
link down, and as a result the watchdog_task calls reset, because we
have packets in the transmit queue (we were busy forwarding over 10
packets per second when it happened).

Running on an Atom D525 which isn't very fast and uses hyperthreading
might have something to do with how the scheduling manages to trigger
this race condition.  On a faster CPU you very likely would be done
checking the link state quickly enough that the interrupt handler rarely
gets a chance to interfere.  Also we have the irq affinity set so the
RX/TX of one port is handled by one CPU, the RX/TX of the other port
by a different CPU and the Other interrupts and other tasks (like the
watchdog) are handled by the last two CPUs.

Either making the current link state its own bool and keeping its meaning
away from get_link_status, or making the interrupt handler only change
get_link_status when LSC is actually present makes the problem go away.
Having two meanings to get_link_status (both link state needs checking
and what the link state is) causes issues.  After all it is using a bool
to store 3 values: Link is up, link needs checking but is up and link
needs checking but is down.  Of course the last two states are rather
quantum, in that you don't know which it is until you check.

> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
> b/drivers/net/ethernet/intel/e1000e/netdev.c
> index b3679728caac..689ad76d0d12 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -46,6 +46,8 @@
>  
>  #include "e1000.h"
>  
> +DEFINE_RATELIMIT_STATE(e1000e_ratelimit_state, 2 * HZ, 4);
> +
>  #define DRV_EXTRAVERSION "-k"
>  
>  #define DRV_VERSION "3.2.6" DRV_EXTRAVERSION
> @@ -937,6 +939,8 @@ static bool e1000_clean_rx_irq(struct e1000_ring 
> *rx_ring, int *work_done,
>   bool cleaned = false;
>   unsigned int total_rx_bytes = 0, total_rx_packets = 0;
>  
> + mdelay(10);
> +
>   i = rx_ring->next_to_clean;
>   rx_desc = E1000_RX_DESC_EXT(*rx_ring, i);
>   staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
> @@ -1067,6 +1071,13 @@ static bool e1000_clean_rx_irq(struct e1000_ring 
> *rx_ring, int *work_done,
>  
>   adapter->total_rx_bytes += total_rx_bytes;
>   adapter->total_rx_packets += total_rx_packets;
> +
> + if (__ratelimit(&e1000e_ratelimit_state)) {
> + static unsigned int max;
> + max = max(max, total_rx_packets);
> + trace_printk("received %u max %u\n", 

commit 16ecba59 breaks 82574L under heavy load.

2017-07-18 Thread Lennart Sorensen
Commit 16ecba59bc333d6282ee057fb02339f77a880beb has apparently broken
at least the 82574L under heavy load (as in load heavy enough to cause
packet drops).  In this case, when running in MSI-X mode, the Other
Causes interrupt fires about 3000 times per second, but not due to link
state changes.  Unfortunately this commit changed the driver to assume
that the Other Causes interrupt can only mean link state change and
hence sets the flag that (unfortunately) means both link is down and link
state should be checked.  Since this now happens 3000 times per second,
the chances of it happening while the watchdog_task is checking the link
state becomes pretty high, and if it does happen to coincide, then the
watchdog_task will reset the adapter, which causes a real loss of link.

Reverting the commit makes everything work fine again (of course packets
are still dropped, but at least the link stays up, the adapter isn't
reset, and most packets make it through).

I tried checking what the bits in the ICR actually were under these
conditions, and it would appear that the only bit set is 24 (the Other
Causes interrupt bit).  So I don't know what the real cause is although
rx buffer overrun would be my guess, and in fact I see nothing in the
datasheet indicating that you can actually disable the rx buffer overrun
from generating an interrupt.

Prior to this commit, the interrupt handler explicitly checked that the
interrupt was caused by a link state change and only then did it trigger
a recheck which worked fine and did not cause incorrect adapter resets,
although it of course still had lots of undesired interrupts to deal with.

Of course ideally there would be a way to make these 3000 pointless
interrupts per second not happen, but unless there is a way to determine
that, I think this commit needs reverting, since it apparently causes
link failures on actual hardware that exists.

The ports are onboard intel 82574L on a Supermicro X7SPA-HF-D525 with
1.2a BIOS (upgrading to 1.2b to check if it makes a difference is not
an option unfortunately).

-- 
Len Sorensen


Re: e1000e on Thinkpad x60: gigabit not available due to "SmartSpeed"

2016-09-02 Thread Lennart Sorensen
On Thu, Sep 01, 2016 at 02:58:13PM -0700, Greg wrote:
> On Thu, 2016-09-01 at 22:14 +0200, Pavel Machek wrote:
> > Hi!
> > 
> > I have trouble getting 1000mbit out of my ethernet card.
> > 
> > I tried direct connection between two PCs with different cables, and
> > no luck.
> > 
> > Today I tried connection to 1000mbit switch, and no luck, either. (Two
> > cables, one was cat6, both short).
> > 
> > My computer sees 1000mbit being advertised by the other side, but does
> > not advertise 1000mbit, "Link Speed was downgraded by SmartSpeed".
> 
> Check your cables?
> 
> https://vmxp.wordpress.com/2015/01/06/1gbe-intel-nic-throttled-to-100mbit-by-smartspeed/

Of course if it isn't the cable, then it could even be a broken pin in
the port.  As far as I can tell, anything that causes one of the 3rd
or 4th pairs of wires to not work will degrade to 100Mbit on just the
first 2 pairs of wires and give that message.  Some badly implemented
switches can also cause it of course.

-- 
Len Sorensen


Re: CVE-2014-9900 fix is not upstream

2016-08-24 Thread Lennart Sorensen
On Tue, Aug 23, 2016 at 10:25:45PM +0100, Al Viro wrote:
> Sadly, sizeof is what we use when copying that sucker to userland.  So these
> padding bits in the end would've leaked, true enough, and the case is somewhat
> weaker.  And any normal architecture will have those, but then any such
> architecture will have no more trouble zeroing a 32bit value than 16bit one.

Hmm, good point.  Too bad I don't see a compiler option of "zero all
padding in structs".  Certainly generating the code should not really
be that different.

I see someone did request it 2 years ago:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63479

-- 
Len Sorensen


Re: CVE-2014-9900 fix is not upstream

2016-08-23 Thread Lennart Sorensen
On Tue, Aug 23, 2016 at 01:34:05PM -0700, Joe Perches wrote:
> On Tue, 2016-08-23 at 21:09 +0100, Al Viro wrote:
> > On Tue, Aug 23, 2016 at 11:24:06AM -0700, David Miller wrote:
> > ... and then we can file a bug report against the sodding compiler.  Note
> > that
> > struct ethtool_wolinfo {
> > __u32   cmd;
> > __u32   supported;
> > __u32   wolopts;
> > __u8sopass[SOPASS_MAX]; // 6, actually
> > };
> > is not going to *have* padding.  Not on anything even remotely sane.
> > If array of 6 char as member of a struct requires 64bit alignment on some
> > architecture, I would really like some of what the designers of that ABI
> > must have been smoking.
> 
> try this on x86-64
> 
> $ pahole -C ethtool_wolinfo vmlinux
> struct ethtool_wolinfo {
>   __u32  cmd;  /* 0 4 */
>   __u32  supported;/* 4 4 */
>   __u32  wolopts;  /* 8 4 */
>   __u8   sopass[6];/*12 6 */
> 
>   /* size: 20, cachelines: 1, members: 4 */
>   /* padding: 2 */
>   /* last cacheline: 20 bytes */
> };

That would be padding after the structure elements.

I think what was meant is that it won't add padding in the middle of the
structure due to alignment, ie it isn't doing:

struct ethtool_wolinfo {
__u32  cmd;  /* 0 4 */
__u32  supported;/* 4 4 */
__u32  wolopts;  /* 8 4 */
<4 bytes padding here>
__u8   sopass[6];/*16 6 */
};

which would have 4 bytes of padding in the middle between wolopts
and sopass.

I would not think it is the compiler's job to worry about what is after
your structure elements, since you shouldn't be going there.
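
The layout question can be checked directly with offsetof and sizeof.
Below is a small sketch using a lookalike struct (wolinfo_sketch and
SKETCH_SOPASS_MAX are my own names, not the kernel's), together with the
usual way to keep the tail bytes from carrying stale data when the whole
struct is copied out.  The numbers in the comments assume a common ABI
such as x86-64, matching the pahole output quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SOPASS_MAX 6

/* Same member layout as the struct under discussion. */
struct wolinfo_sketch {
	uint32_t cmd;
	uint32_t supported;
	uint32_t wolopts;
	uint8_t  sopass[SKETCH_SOPASS_MAX];
};

/* Bytes added after the last member to round the size up to the
 * struct's alignment (2 on x86-64: 20 - (12 + 6)). */
static size_t tail_padding(void)
{
	return sizeof(struct wolinfo_sketch) -
	       (offsetof(struct wolinfo_sketch, sopass) + SKETCH_SOPASS_MAX);
}

/* Zero the whole object, padding included, before filling in members,
 * so that copying sizeof(*w) bytes cannot expose old memory contents. */
static void wolinfo_init(struct wolinfo_sketch *w, uint32_t cmd)
{
	assert(w != NULL);
	memset(w, 0, sizeof(*w));
	w->cmd = cmd;
}
```

Note that there is no padding before sopass (it sits at offset 12), only
after it, which is the distinction being made above.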

-- 
Len Sorensen


Re: [PATCH] net: ethernet: ti: cpdma: switch to use genalloc

2016-06-24 Thread Lennart Sorensen
On Fri, Jun 24, 2016 at 07:58:32PM +0300, Grygorii Strashko wrote:
> Oh. nice :( So, seems, I'd need to send v3. Right?
> By the way, this code hasn't been introduced by this patch - I've
> just moved whole function from one place to another.

Well since it is moving I would think that was a handy time to fix the
coding style violation too, since it got noticed.

That leaves just one place in that file violating that part of the coding
style (the other is in cpdma_chan_dump).

Somehow it wasn't spotted when the code was put in back in 2010, and since
they were wrapped lines, they don't stand out quite as much visually.

-- 
Len Sorensen


Re: [PATCH] net: ethernet: ti: cpdma: switch to use genalloc

2016-06-24 Thread Lennart Sorensen
On Fri, Jun 24, 2016 at 11:35:15AM +0530, Mugunthan V N wrote:
> >> +static void cpdma_desc_pool_destroy(struct cpdma_desc_pool *pool)
> >> +{
> >> +if (!pool)
> >> +return;
> >> +
> >> +WARN_ON(pool->used_desc);
> >> +if (pool->cpumap) {
> >> +dma_free_coherent(pool->dev, pool->mem_size, pool->cpumap,
> >> +  pool->phys);
> >> +} else {
> >> +iounmap(pool->iomap);
> >> +}
> >> +}
> >> +
> > single if, brackets?
> 
> if() has multiple line statement, so brackets are must.

It is line wrapped, it is still one statement.  And you can't argue the
else being multiple lines, although the style does require using brackets
for the else if the if required them.

Style says "Do not unnecessarily use braces where a single statement will do."
It says statement, not line.  A multiline wrapped statement is still
one statement.

I may personally hate the lack of brackets, but style wise it seems very
clear that the linux kernel only uses brackets when required, which is
only when there is more than one statement.  I prefer what you did,
but not as much as I prefer consistency.
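
For illustration, this is roughly what the function looks like with the
braces dropped.  It is a userspace sketch only: pool_sketch, the fake_*
helpers and the counters are hypothetical stand-ins for the cpdma pool,
dma_free_coherent() and iounmap(), so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Counters so a test can see which branch ran. */
static int coherent_frees, io_unmaps;

/* Hypothetical stand-in for dma_free_coherent(). */
static void fake_free_coherent(void *dev, size_t size, void *cpu,
			       unsigned long phys)
{
	(void)dev;
	(void)size;
	(void)phys;
	assert(cpu != NULL);
	coherent_frees++;
}

/* Hypothetical stand-in for iounmap(). */
static void fake_iounmap(void *iomap)
{
	(void)iomap;
	io_unmaps++;
}

struct pool_sketch {
	void *dev;
	size_t mem_size;
	void *cpumap;
	void *iomap;
	unsigned long phys;
};

static void pool_destroy_sketch(struct pool_sketch *pool)
{
	if (!pool)
		return;

	/* One statement per branch, even though the first is wrapped
	 * over two lines, so neither branch takes braces. */
	if (pool->cpumap)
		fake_free_coherent(pool->dev, pool->mem_size, pool->cpumap,
				   pool->phys);
	else
		fake_iounmap(pool->iomap);
}
```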

-- 
Len Sorensen


Re: [PATCH] ti: Remove no longer used functions and prototypes in the files, cpsw_ale.c and cpsw_ale.h

2015-05-29 Thread Lennart Sorensen
On Fri, May 29, 2015 at 12:31:57PM -0400, Nicholas Krause wrote:
 This removes the function, cpsw_ale_flush and its prototype from the
 files cpsw_ale.c and cpsw_ale.h due to having no more callers. Finally
 we also remove the functions, cpsw_ale_set_vlan_entry, 
 cpsw_ale_flush_ucast and cpsw_ale_add_ucast and their prototypes
 due to their only caller being removed with the removal of the
 function, cpsw_ale.c respectfully.
 
 Signed-off-by: Nicholas Krause <xerofo...@gmail.com>
 ---
  drivers/net/ethernet/ti/cpsw_ale.c | 162 
 -
  drivers/net/ethernet/ti/cpsw_ale.h |   3 -
  2 files changed, 165 deletions(-)
 
 diff --git a/drivers/net/ethernet/ti/cpsw_ale.c 
 b/drivers/net/ethernet/ti/cpsw_ale.c
 index 6e927b4..b360dc8 100644
 --- a/drivers/net/ethernet/ti/cpsw_ale.c
 +++ b/drivers/net/ethernet/ti/cpsw_ale.c
 @@ -147,27 +147,6 @@ static int cpsw_ale_write(struct cpsw_ale *ale, int idx, 
 u32 *ale_entry)
   return idx;
  }
  
 -static int cpsw_ale_match_addr(struct cpsw_ale *ale, u8 *addr, u16 vid)
 -{
 - u32 ale_entry[ALE_ENTRY_WORDS];
 - int type, idx;
 -
 - for (idx = 0; idx < ale->params.ale_entries; idx++) {
 - u8 entry_addr[6];
 -
 - cpsw_ale_read(ale, idx, ale_entry);
 - type = cpsw_ale_get_entry_type(ale_entry);
 - if (type != ALE_TYPE_ADDR && type != ALE_TYPE_VLAN_ADDR)
 - continue;
 - if (cpsw_ale_get_vlan_id(ale_entry) != vid)
 - continue;
 - cpsw_ale_get_addr(ale_entry, entry_addr);
 - if (ether_addr_equal(entry_addr, addr))
 - return idx;
 - }
 - return -ENOENT;
 -}
 -
  static int cpsw_ale_match_vlan(struct cpsw_ale *ale, u16 vid)
  {
   u32 ale_entry[ALE_ENTRY_WORDS];
 @@ -268,147 +247,6 @@ int cpsw_ale_flush_multicast(struct cpsw_ale *ale, int 
 port_mask, int vid)
  }
  EXPORT_SYMBOL_GPL(cpsw_ale_flush_multicast);
  
 -static void cpsw_ale_flush_ucast(struct cpsw_ale *ale, u32 *ale_entry,
 -  int port_mask)
 -{
 - int port;
 -
 - port = cpsw_ale_get_port_num(ale_entry);
 - if ((BIT(port) & port_mask) == 0)
 - return; /* ports dont intersect, not interested */
 - cpsw_ale_set_entry_type(ale_entry, ALE_TYPE_FREE);
 -}
 -
 -int cpsw_ale_flush(struct cpsw_ale *ale, int port_mask)
 -{
 - u32 ale_entry[ALE_ENTRY_WORDS];
 - int ret, idx;
 -
 - for (idx = 0; idx < ale->params.ale_entries; idx++) {
 - cpsw_ale_read(ale, idx, ale_entry);
 - ret = cpsw_ale_get_entry_type(ale_entry);
 - if (ret != ALE_TYPE_ADDR && ret != ALE_TYPE_VLAN_ADDR)
 - continue;
 -
 - if (cpsw_ale_get_mcast(ale_entry))
 - cpsw_ale_flush_mcast(ale, ale_entry, port_mask);
 - else
 - cpsw_ale_flush_ucast(ale, ale_entry, port_mask);
 -
 - cpsw_ale_write(ale, idx, ale_entry);
 - }
 - return 0;
 -}
 -EXPORT_SYMBOL_GPL(cpsw_ale_flush);
 -
 -static inline void cpsw_ale_set_vlan_entry_type(u32 *ale_entry,
 - int flags, u16 vid)
 -{
 - if (flags & ALE_VLAN) {
 - cpsw_ale_set_entry_type(ale_entry, ALE_TYPE_VLAN_ADDR);
 - cpsw_ale_set_vlan_id(ale_entry, vid);
 - } else {
 - cpsw_ale_set_entry_type(ale_entry, ALE_TYPE_ADDR);
 - }
 -}
 -
 -int cpsw_ale_add_ucast(struct cpsw_ale *ale, u8 *addr, int port,
 -int flags, u16 vid)
 -{
 - u32 ale_entry[ALE_ENTRY_WORDS] = {0, 0, 0};
 - int idx;
 -
 - cpsw_ale_set_vlan_entry_type(ale_entry, flags, vid);
 -
 - cpsw_ale_set_addr(ale_entry, addr);
 - cpsw_ale_set_ucast_type(ale_entry, ALE_UCAST_PERSISTANT);
 - cpsw_ale_set_secure(ale_entry, (flags & ALE_SECURE) ? 1 : 0);
 - cpsw_ale_set_blocked(ale_entry, (flags & ALE_BLOCKED) ? 1 : 0);
 - cpsw_ale_set_port_num(ale_entry, port);
 -
 - idx = cpsw_ale_match_addr(ale, addr, (flags & ALE_VLAN) ? vid : 0);
 - if (idx < 0)
 - idx = cpsw_ale_match_free(ale);
 - if (idx < 0)
 - idx = cpsw_ale_find_ageable(ale);
 - if (idx < 0)
 - return -ENOMEM;
 -
 - cpsw_ale_write(ale, idx, ale_entry);
 - return 0;
 -}
 -EXPORT_SYMBOL_GPL(cpsw_ale_add_ucast);
 -
 -int cpsw_ale_del_ucast(struct cpsw_ale *ale, u8 *addr, int port,
 -int flags, u16 vid)
 -{
 - u32 ale_entry[ALE_ENTRY_WORDS] = {0, 0, 0};
 - int idx;
 -
 - idx = cpsw_ale_match_addr(ale, addr, (flags & ALE_VLAN) ? vid : 0);
 - if (idx < 0)
 - return -ENOENT;
 -
 - cpsw_ale_set_entry_type(ale_entry, ALE_TYPE_FREE);
 - cpsw_ale_write(ale, idx, ale_entry);
 - return 0;
 -}
 -EXPORT_SYMBOL_GPL(cpsw_ale_del_ucast);
 -
 -int cpsw_ale_add_mcast(struct cpsw_ale *ale, u8 *addr, int port_mask,
 -int flags, u16 vid, int mcast_state)
 -{
 - u32 

Re: No idea about shaping trough many pc

2008-01-10 Thread Lennart Sorensen
On Thu, Jan 10, 2008 at 12:06:35PM +0300, Badalian Vyacheslav wrote:
 Hello all.
 I try more then 2 month resolve problem witch my shaping.  Maybe you can 
 help for me?
 
 Sheme:
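                  +---------------+
              +---| Shaping PC 1  |---+
             /    +---------------+    \
 +-------+  /     +---------------+     \  +-------+
 | Cisco |-+------| Shaping PC N  |------+-| CISCO |
 +-------+  \     +---------------+     /  +-------+
             \    +---------------+    /
              +---| Shaping PC 20 |---+
                  +---------------+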
 
 Network - Over 10k users. Total bandwidth to the Internet is more than 1 Gbps.
 All computers run BGP with multipath turned on.
 The Cisco can't do load sharing per packet (which would resolve all my
 problems =((( ). Only by DST IP, SRC IP, or +Level4.
 Ok. Each user must get a speed of 1 Mbps.
 Let's look at the variants:
 1. Create rules per user = (1 Mbps / N computers). If the user uses N
 connections all is great, but if he uses 1 connection his speed is 1 Mbps / N,
 which does not look good. All would be great if the Cisco could do PER PACKET
 load sharing =(
 2. Create rules per user = 1 Mbps. If the user uses 1 connection all is great,
 but if he uses N connections his speed is much more than the intended limit =(
 
 Why do I use 20 PCs? Because 1 PC normally forwards 100-150 Mbps... when it
 has 100% CPU usage in software interrupts...

I have managed forwarding of 600Mbps using about 15% CPU load on a
500MHz Geode LX, using 4 100Mbit pcnet32 interfaces and a small tweak to
how the NAPI is implemented on it.  Adding traffic shaping and such to
the processing would certainly increase the CPU load, but hopefully not
by much.  The reason I didn't get more than 600Mbps was that the PCI bus
is now full.

 Any idea how to resolve this problem?
 
 In my dreams (feature request to netdev ;) ):
 Take a PC - title: MASTER TC.  All 20 PCs synchronize statistics with the
 MASTER and have common rules and statistics. Then I use variant 2 and will be
 happy... but is that not realistic? =(
 Maybe there are other variants?

Well, not sure about synchronizing and all that.  I still think if I can
manage 600Mbps forwarding rate using a slow poke Geode then a modern CPU
like a Q6600 with a number of PCIe gig ports should be able to do quite
a lot.

The tweak I did was to add a timer to the driver that I can activate
whenever I finish emptying the receive queue.  When the timer expires it
adds the port back to the NAPI queue, and when it is called again the
poll will either process whatever packets arrived during the delay, or
it will actually unmask the IRQ and go back to IRQ mode.  The delay I
use is 1 jiffy, and I run with 1000HZ and set the queues to 256 packets,
since 1ms at 100Mbps can provide at most about 200 packets (64-byte worst
case).  I simply check whenever I empty the queue how many packets I
just processed.  If greater than 0, I enable the timer to expire on the
next jiffy and leave the port masked after removing port from napi
polling, and if it was 0 then I must have been called again after the
timer expired and still had no packets to process in which case I unmask
the IRQ and don't enable the timer.  I had to change the HZ to 1000
since at 250 or 100 I wouldn't be able to handle the worst case number
of packets (the pcnet32 has a maximum of 512 packets in a queue).
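
To make the arithmetic and the poll decision above concrete, here is a
minimal userspace sketch in C.  It is not the pcnet32 driver code:
frames_per_tick and count_irqs_deferred are hypothetical names, the
per-tick arrival array is an invented model, and framing overhead
(preamble, inter-frame gap) is ignored, as in the rough 200 packet
estimate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Worst-case number of frames that can arrive during one timer tick,
 * ignoring preamble and inter-frame gap (assumption for simplicity). */
unsigned long frames_per_tick(unsigned long link_bps, unsigned long hz,
                              unsigned long bytes_per_frame)
{
    return link_bps / hz / (bytes_per_frame * 8UL);
}

/*
 * Hypothetical model of the deferred unmask: arrivals[i] is the number of
 * frames that arrive during tick i.  Returns how many hardware IRQs are
 * taken when the IRQ is only unmasked after a tick that had no frames.
 */
unsigned int count_irqs_deferred(const unsigned int *arrivals, size_t nticks)
{
    bool irq_masked = false;
    unsigned int irqs = 0;

    for (size_t i = 0; i < nticks; i++) {
        if (!irq_masked) {
            if (arrivals[i] == 0)
                continue;       /* idle tick, IRQ stays enabled */
            irqs++;             /* first frame of a burst raises the IRQ */
            irq_masked = true;  /* queue emptied, timer armed, IRQ left masked */
        } else if (arrivals[i] == 0) {
            irq_masked = false; /* timer fired with nothing to do: unmask */
        }
        /* otherwise: timer fired, frames processed, timer armed again */
    }
    return irqs;
}
```

With 64-byte frames at 100Mbps the first function gives 195 frames per
tick at HZ=1000 and 781 at HZ=250, which is more than a 512 entry queue
can hold.  For the arrival pattern {5, 3, 0, 0, 4, 0} the model takes
two IRQs, one at the start of each burst.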

With NAPI the normal behaviour is that whenever you empty the receive
queue, you reenable IRQs, but it doesn't take that fast a CPU to
actually empty the queue all the time and then you end up with the
overhead for masking IRQs every time you receive packets, process them,
and then the overhead of unmasking the IRQ, only to get an IRQ for the
next packet within a fraction of a millisecond.  With the delay until
the next jiffy for unmasking the IRQ you end up causing a potential lag
on processing packets of up to 1ms, although on average less than that,
but the IRQ load drops dramatically and the overhead of managing the IRQ
masking and the IRQ handler goes away.  In the case of this system the
CPU load dropped from 90% at 500Mbps to 15% at 600Mbps, and the
interrupt rate dropped from one IRQ every couple of packets, to one IRQ
at the start of each burst of packets.

I believe some gigabit Ethernet ports and most 10Gig ports have the ability
to do delayed IRQ where they wait for a certain number of packets before
generating an IRQ, which is pretty much what I tried to emulate with my
tweak and it sure works amazingly well.

--
Len Sorensen
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] pcnet32: fix non-napi packet reception

2007-10-18 Thread Lennart Sorensen
On Wed, Oct 17, 2007 at 05:04:01PM -0700, Don Fry wrote:
 I have no objections myself.  It has been slowly moving that direction.
 First with the napi implementation, default off, labeled experimental.
 Then removing experimental and then making the default on.
 
 If any other user of the pcnet32 has objections, now is the time to
 speak loudly!

I have used NAPI only on the pcnet32 for quite a while now.  In fact I
think a few of my local patches would break if I disabled NAPI.

__
Len Sorensen


Re: iproute2: resend of patches from Debian.

2007-10-11 Thread Lennart Sorensen
On Thu, Oct 11, 2007 at 08:25:32PM +0200, Andreas Henriksson wrote:
 Patch from debian iproute package.
 
 diff -urNad iproute-20060323~/ip/iplink.c iproute-20060323/ip/iplink.c
 --- iproute-20060323~/ip/iplink.c 2006-03-22 00:57:50.0 +0100
 +++ iproute-20060323/ip/iplink.c  2006-09-08 21:07:14.0 +0200
 @@ -384,6 +384,10 @@
   }
  
   if (newname && strcmp(dev, newname)) {
 + if (strlen(newname) == 0) {
 + printf("\"\" is not valid device identifier\n",dev);
 + return -1;
 + }
   if (do_changename(dev, newname) < 0)
   return -1;
   dev = newname;

Isn't that printf missing somewhere for the 'dev' argument to go?
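
The printf in the quoted patch passes dev, but its format string has no
conversion to consume it.  A minimal sketch of one possible fix,
assuming the intent was to print the offending name; format_invalid_name
is a hypothetical helper for illustration, not part of iproute2:

```c
#include <stdio.h>

/*
 * Hypothetical helper, not iproute2 code: builds the message with a %s
 * conversion so that the name argument is actually consumed.
 */
int format_invalid_name(char *buf, size_t len, const char *name)
{
    return snprintf(buf, len, "\"%s\" is not valid device identifier\n", name);
}
```

Since the check only triggers when newname is empty, the other obvious
option is to keep the literal "" in the message and drop the unused
argument instead.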

--
Len Sorensen


Re: [PATCH 1/2] bnx2: factor out gzip unpacker

2007-09-24 Thread Lennart Sorensen
On Fri, Sep 21, 2007 at 11:37:52PM +0100, Denys Vlasenko wrote:
 But I compile net/* into bzImage. I like netbooting :)

Isn't it possible to netboot with an initramfs image?  I am pretty sure
I have seen some systems do exactly that.

--
Len Sorensen


Re: Odd behaviour of proxy_arp (although I solved part of it and think I figured out what stupid thing it is doing)

2007-07-26 Thread Lennart Sorensen
On Mon, Jul 23, 2007 at 04:36:22PM -0400, Lennart Sorensen wrote:
 I have been seeing some occasional strange behavior when using
 proxy_arp.  I have a router running with an ADSL PPPoE link to the
 Internet, and an Ethernet link to a local network.  It has proxy_arp
 enabled on the internal Ethernet port since I sometimes have ipsec
 tunnels running where I use proxy_arp to proxy for the IP assigned to
 the other end of the tunnel so that local machines can find and reach
 it.  I run two independent subnets on the local network (one with fixed
 IPs for my machines here, and another with DHCP addresses for guest
 machines that visit occasionally just to give them Internet access).
 
 I run 10.0.0.0/8 and 192.168.254.0/24 on the local network with the
 router having an IP in each subnet.
 
 The strangeness that occurs is that once in a while there is a 10
 second period where the system will answer all arp requests for all IPs
 on the local network, with its own MAC address, which is clearly wrong
 since it doesn't have any of those IP addresses.  It seems to happen
 every couple of days or so on average, although not at any specific
 time.  One day it happened at 11:32:30 to 11:32:39, and a few days later
 it happened at 12:08:38 to 12:08:48.  If I disable proxy_arp, it never
 happens at all, but then I lose the ability to do what I have proxy_arp
 enabled for in the first place.

It turns out the reason for the 10 seconds or so, was to do with running
VRRP and how I handle routes in that situation, so I fixed that.  It
still doesn't solve the annoyance below.

 Related to that problem, there is also the annoyance that any IP that
 isn't part of either of the two subnets the router belongs to, have arp
 requests answered by the router all the time, which it also should not
 be answering, since it doesn't actually have a clue what those IP
 addresses belong to and certainly has no idea where it should forward to
 to reach them.  I occasionally have other random subnets in use on the
 network for running local test networks separate from everything else.
 It would be great if the kernel would keep its nose out of those subnets
 too.
 
 So far I have seen this behavior with 2.6.8, 2.6.16, and 2.6.18 (being
 the kernels I have run on this router).
 
 So have I misunderstood something about what proxy_arp is supposed to
 do, or is proxy_arp in the kernel simply broken, or is it perhaps
 mis-designed?  Are there some tuning parameters that could perhaps make
 it actually do what one would expect it to be doing?

So I found out part of the problem.

If ip forwarding is enabled (and why ever would it not be) and proxy_arp
is enabled, and you have a default route set, then the kernel will
answer arp requests for any IP address that it doesn't think is local to
a given interface.  So if the system has an ip of 10.0.0.254/8 on eth1,
and an arp request arrives for 192.168.1.1 it will answer the arp
request with the MAC of eth1 just because it thinks it could forward
the packet through the default route.
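
The behaviour described above can be modelled in a few lines of
userspace C.  This is only a sketch of the decision as described here,
not the kernel's code: the route table, lookup, proxy_arp_answers and
the interface numbers are all hypothetical, and forwarding and proxy_arp
are assumed to be enabled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order. */
#define IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

struct route {
    uint32_t prefix;   /* network address, host byte order */
    unsigned int len;  /* prefix length, 0 for the default route */
    int ifindex;       /* outgoing interface (hypothetical numbering) */
};

static uint32_t mask_of(unsigned int len)
{
    return len == 0 ? 0 : 0xffffffffu << (32 - len);
}

/* Longest-prefix match; returns NULL when no route covers addr. */
const struct route *lookup(const struct route *tbl, size_t n, uint32_t addr)
{
    const struct route *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if ((addr & mask_of(tbl[i].len)) != tbl[i].prefix)
            continue;
        if (best == NULL || tbl[i].len > best->len)
            best = &tbl[i];
    }
    return best;
}

/* Model: answer when the target is routable via some other interface. */
bool proxy_arp_answers(const struct route *tbl, size_t n,
                       uint32_t target, int in_ifindex)
{
    const struct route *rt = lookup(tbl, n, target);

    return rt != NULL && rt->ifindex != in_ifindex;
}
```

With 10.0.0.0/8 on interface 1 and a default route via interface 2, a
request for 192.168.1.1 arriving on interface 1 is answered in this
model, while one for 10.1.2.3 is not.  Remove the default route and the
192.168.1.1 request is no longer answered either.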

Now the reason for having proxy_arp enabled in the first place, is to
allow ipsec connections to use 10.x.x.x/8 addresses for the remote
client, to make them appear local.  This of course does not mean I want
to screw up life for people doing a small test on the local network with
192.168.x.x addresses on their own devices.

So the question is, can one make the kernel only answer arp requests for
target IPs that belong to a given network interface?  It looks like
arp_filter or arp_ignore should do such a thing, but no matter what I
set that to, it still answers all arp requests for IPs it doesn't think
are local to the network as far as I can tell (I was doing arping for
random addresses from a client on the network, although the source IP
would have been considered local so maybe that doesn't count).  I do see
arp requests show up occasionally from other clients on the network
which get answered by the router when they shouldn't have though, so I
don't think the source IP has anything to do with it.  Perhaps there is
a mistake in the filter/ignore code making it not filter requests
correctly.

Of course overall I am finding it hard to understand when anyone would
ever want to answer arp requests for any IP that shouldn't exist on the
network the request comes from and which the router doesn't explicitly
have an arp entry for (such as the ipsec case).  I especially don't see
when you would ever want the default route to be considered for matching
to determine if it could forward the packet and should answer the arp
request.

Is it incorrect to enable proxy_arp in the case where I want to answer
arp requests for IPs that should be local but which I have to tunnel to
reach?  Is there a better solution?

--
Len Sorensen


Odd behaviour of proxy_arp

2007-07-23 Thread Lennart Sorensen
I have been seeing some occasional strange behavior when using
proxy_arp.  I have a router running with an ADSL PPPoE link to the
Internet, and an Ethernet link to a local network.  It has proxy_arp
enabled on the internal Ethernet port since I sometimes have ipsec
tunnels running where I use proxy_arp to proxy for the IP assigned to
the other end of the tunnel so that local machines can find and reach
it.  I run two independent subnets on the local network (one with fixed
IPs for my machines here, and another with DHCP addresses for guest
machines that visit occasionally just to give them Internet access).

I run 10.0.0.0/8 and 192.168.254.0/24 on the local network with the
router having an IP in each subnet.

The strangeness that occurs is that once in a while there is a 10
second period where the system will answer all arp requests for all IPs
on the local network, with its own MAC address, which is clearly wrong
since it doesn't have any of those IP addresses.  It seems to happen
every couple of days or so on average, although not at any specific
time.  One day it happened at 11:32:30 to 11:32:39, and a few days later
it happened at 12:08:38 to 12:08:48.  If I disable proxy_arp, it never
happens at all, but then I lose the ability to do what I have proxy_arp
enabled for in the first place.

Related to that problem, there is also the annoyance that any IP that
isn't part of either of the two subnets the router belongs to, have arp
requests answered by the router all the time, which it also should not
be answering, since it doesn't actually have a clue what those IP
addresses belong to and certainly has no idea where it should forward to
to reach them.  I occasionally have other random subnets in use on the
network for running local test networks separate from everything else.
It would be great if the kernel would keep its nose out of those subnets
too.

So far I have seen this behavior with 2.6.8, 2.6.16, and 2.6.18 (being
the kernels I have run on this router).

So have I misunderstood something about what proxy_arp is supposed to
do, or is proxy_arp in the kernel simply broken, or is it perhaps
mis-designed?  Are there some tuning parameters that could perhaps make
it actually do what one would expect it to be doing?

--
Len Sorensen


Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-06-04 Thread Lennart Sorensen
On Mon, May 07, 2007 at 01:45:11PM -0400, Lennart Sorensen wrote:
 Hmm, I thought I saw it on two systems already, but I should go try that
 again.

Hmm, still haven't figured this out.  I just saw this one this morning:

BUG: soft lockup detected on CPU#0!
 [c0103fc4] dump_stack+0x24/0x30
 [c013d36e] softlockup_tick+0x7e/0xc0
 [c011eb23] update_process_times+0x33/0x80
 [c01062c9] timer_interrupt+0x39/0x80
 [c013d6fd] handle_IRQ_event+0x3d/0x70
 [c013da59] __do_IRQ+0xa9/0x150
 [c0104e55] do_IRQ+0x25/0x60
 [c010313a] common_interrupt+0x1a/0x20
 [c013d6d8] handle_IRQ_event+0x18/0x70
 [c013da59] __do_IRQ+0xa9/0x150
 [c0104e55] do_IRQ+0x25/0x60
 [c010313a] common_interrupt+0x1a/0x20
 [c0119cda] __do_softirq+0x3a/0xa0
 [c0119d6d] do_softirq+0x2d/0x30
 [c0119fb7] irq_exit+0x37/0x40
 [c0104e5a] do_IRQ+0x2a/0x60
 [c010313a] common_interrupt+0x1a/0x20
 [c013dcee] setup_irq+0xce/0x1e0
 [c013de97] request_irq+0x97/0xb0
 [d0851f9d] pcnet32_open+0x4d/0x3d0 [pcnet32]
 [c023a4f9] dev_open+0x39/0x80
 [c0238cea] dev_change_flags+0xfa/0x130
 [c027eb9f] devinet_ioctl+0x4ff/0x6f0
 [c022dab1] sock_ioctl+0xf1/0x1f0
 [c017413c] do_ioctl+0x2c/0x80
 [c01741e2] vfs_ioctl+0x52/0x2f0
 [c01744ef] sys_ioctl+0x6f/0x80
 [c0102ef7] syscall_call+0x7/0xb
 [b7f41d04] 0xb7f41d04

And it is happening on multiple systems.  I am starting to wonder if it
is a bug in the soft lockup detection.  Maybe it really isn't locked up
but just momentarily appears to be.  I will try turning off the soft
lockup detection and see what happens.

--
Len Sorensen


Re: [PCNET32] Lock solid with netconsole

2007-05-28 Thread Lennart Sorensen
On Mon, May 28, 2007 at 05:25:51PM +0200, Emmanuel Fust? wrote:
  Any difference if you disable the debug messages in the pcnet32
  driver and you apply the patch below ?
  
  diff --git a/drivers/net/pcnet32.c b/drivers/net/pcnet32.c
  index 9c171a7..be4513f 100644
  --- a/drivers/net/pcnet32.c
  +++ b/drivers/net/pcnet32.c
  @@ -2556,11 +2556,12 @@ pcnet32_interrupt(int irq, void *dev_id)
  unsigned long ioaddr;
  u16 csr0;
  int boguscnt = max_interrupt_work;
  +   unsigned long flags;
   
  ioaddr = dev->base_addr;
  lp = netdev_priv(dev);
   
  -   spin_lock(&lp->lock);
  +   spin_lock_irqsave(&lp->lock, flags);
   
  csr0 = lp->a.read_csr(ioaddr, CSR0);
  while ((csr0 & 0x8f00) && --boguscnt >= 0) {
  @@ -2632,7 +2633,7 @@ pcnet32_interrupt(int irq, void *dev_id)
  printk(KERN_DEBUG "%s: exiting interrupt, csr0=%#4.4x.\n",
     dev->name, lp->a.read_csr(ioaddr, CSR0));
   
  -   spin_unlock(&lp->lock);
  +   spin_unlock_irqrestore(&lp->lock, flags);
   
  return IRQ_HANDLED;
   }
  
 Hi,
 Tested under very high console activity and it no longer freezes.

Hmm, I have been seeing lockups too and asked about doing something
almost exactly the same as this recently, but was told that it shouldn't
need irqs disabled at this point.  Well if it makes netconsole more
stable, I think I will try adding it too and see if it makes the problems
go away for good (my problem only happens at random and can be days
between it happening).

--
Len Sorensen


Re: Questions about IPsec and Netfilter

2007-05-10 Thread Lennart Sorensen
On Thu, May 10, 2007 at 10:36:14AM -0400, Alan Stern wrote:
 I've got a few questions about the relationship between the IPsec 
 implementation and Netfilter.
 
 Q1: At what points during packet processing do the IPsec transformations 
 occur?  In particular, which netfilter hooks do they come before and 
 after?  And likewise, which routing operations do they come before and 
 after?

Are you using netkey or klips?

 Q2: When a packet using IPsec tunnel mode is encapsulated or 
 de-encapsulated, does the newly-formed packet return to some earlier point 
 in the stack for further netfilter processing or routing?  What about 
 transport mode?

As far as I can tell the encrypted packet goes in the INPUT chain, then
is decrypted and goes back in either INPUT or FORWARD depending on the
unencrypted source/destination.  Well for netkey anyhow.  klips goes in
the INPUT chain and then is decrypted and then comes in the ipsecX
interface either on INPUT or FORWARD chains.

 Q3: How can iptables rules determine whether they are dealing with a 
 packet which has been de-encapsulated from (or encapsulated within) an 
 IPsec wrapper?

If using netkey, and 2.6.16 or newer, then the policy tag will be ipsec
if it was decrypted from an ipsec tunnel.  I recently had to upgrade to
shorewall 3.x to deal with that when I went to using netkey and 2.6.18
kernel together.  With klips the packets from an ipsec tunnel arrive on
the ipsecX interface after being decrypted so you can recognize them
that way.

 Q4: Is it true that NAT-Traversal isn't implemented for transport mode?

No idea.

 In RFC 2401 (Security Architecture for the Internet Protocol), section 5
 includes this text:
 
As mentioned in Section 4.4.1 The Security Policy Database (SPD),
the SPD must be consulted during the processing of all traffic
(INBOUND and OUTBOUND), including non-IPsec traffic.  If no policy is
found in the SPD that matches the packet (for either inbound or
outbound traffic), the packet MUST be discarded.
 
 But on Linux systems, by default the SPD is normally empty (as shown by
 setkey -DP) and all packets are allowed to pass unhindered.
 
 Q5: Isn't this a violation of the RFC?  Or is there some implicit policy 
 entry which accepts all packets without applying any security association?
 
 Thanks for any answers.  I may think up more questions later...

--
Len Sorensen


Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-05-07 Thread Lennart Sorensen
On Fri, May 04, 2007 at 03:02:36PM -0400, Lennart Sorensen wrote:
 Well I don't know, but something is going wrong and causing the soft
 lock up.  I must admit I am surprised if an interrupt can occur while
 handling an interrupt, but then again maybe that is supposed to be
 allowed.

I tried building a kernel where the only change was enabling the spin
lock debugging.  It doesn't fail, while without spin lock debugging it
seemed to fail very frequently.  Darn!  I hate when debugging hides
the problem.

--
Len Sorensen


Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-05-07 Thread Lennart Sorensen
On Mon, May 07, 2007 at 04:48:37PM +0200, Frederik Deweerdt wrote:
 Can you try running on another Geode LX system, just to rule out a
 hardware problem on your board?

Hmm, I thought I saw it on two systems already, but I should go try that
again.

--
Len Sorensen


Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-05-04 Thread Lennart Sorensen
On Thu, May 03, 2007 at 04:31:43PM -0400, Lennart Sorensen wrote:
 I have had this happen a few times recently and was wondering if anyone
 has an idea what could be going on:
 
 BUG: soft lockup detected on CPU#0!
  [c0103fc4] dump_stack+0x24/0x30
  [c013d71e] softlockup_tick+0x7e/0xc0
  [c011eb23] update_process_times+0x33/0x80
  [c01062c9] timer_interrupt+0x39/0x80
  [c013daad] handle_IRQ_event+0x3d/0x70
  [c013de09] __do_IRQ+0xa9/0x150
  [c0104e55] do_IRQ+0x25/0x60
  [c010313a] common_interrupt+0x1a/0x20
  [d084e00c] pcnet32_dwio_read_csr+0xc/0x20 [pcnet32]
  [d084e9d2] pcnet32_interrupt+0x42/0x2b0 [pcnet32]
  [c013daad] handle_IRQ_event+0x3d/0x70
  [c013de09] __do_IRQ+0xa9/0x150
  [c0104e55] do_IRQ+0x25/0x60
  [c010313a] common_interrupt+0x1a/0x20
  [c013da88] handle_IRQ_event+0x18/0x70
  [c013de09] __do_IRQ+0xa9/0x150
  [c0104e55] do_IRQ+0x25/0x60
  [c010313a] common_interrupt+0x1a/0x20
  [5791] 0x5791
 
 This is on a system running a Geode LX at 500MHz, using 2.6.18 based
 kernel (specifically a slightly modified debian 4.0 Etch kernel).
 
 I am really wondering where do I go looking for the cause of this.  The
 same kernel running on a Geode SC1200 (GX1) does not appear to do this.
 
 If I knew what the error meant I would have a better idea how to debug
 it and fix it.

I looked at the pcnet32_interrupt function and where it calls
pcnet32_dwio_read_csr and saw this:

2550 /* The PCNET32 interrupt handler. */
2551 static irqreturn_t
2552 pcnet32_interrupt(int irq, void *dev_id)
2553 {
2554     struct net_device *dev = dev_id;
2555     struct pcnet32_private *lp;
2556     unsigned long ioaddr;
2557     u16 csr0;
2558     int boguscnt = max_interrupt_work;
2559
2560     ioaddr = dev->base_addr;
2561     lp = netdev_priv(dev);
2562
2563     spin_lock(&lp->lock);
2564
2565     csr0 = lp->a.read_csr(ioaddr, CSR0);
2566     while ((csr0 & 0x8f00) && --boguscnt >= 0) {
2567         if (csr0 == 0xffff) {
2568             break;  /* PCMCIA remove happened */

So I wonder, what happens if an interrupt occurs, and since one of the
devices on that interrupt is the pcnet32, it grabs the port lock, goes
to read CSR0, and then another interrupt occurs on the same IRQ line
(I run with PREEMPT enabled if that matters) and the pcnet32 interrupt
handler is called again, but since the port is already locked it has to
wait, causing the CPU to be locked up.

Should line 2563 be a spin_lock_irqsave instead along with the
appropriate unlock later?

--
Len Sorensen


Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-05-04 Thread Lennart Sorensen
On Fri, May 04, 2007 at 04:33:26PM +0200, Frederik Deweerdt wrote:
 On Fri, May 04, 2007 at 10:10:24AM -0400, Lennart Sorensen wrote:
  On Thu, May 03, 2007 at 04:31:43PM -0400, Lennart Sorensen wrote:
 [...]
  Should line 2563 be a spin_lock_irqsave instead along with the
  appropriate unlock later?
 IIRC, when you enable lockdep, it will complain about spinlocks used in
 an invalid context.

What is lockdep and how do I enable it?

I enabled SPINLOCK_DEBUG and am going to try that kernel now (except it
hit the bug before I could even log in and install the kernel this time,
so another reboot first).

--
Len Sorensen


Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-05-04 Thread Lennart Sorensen
On Fri, May 04, 2007 at 05:34:38PM +0200, Frederik Deweerdt wrote:
 For the what part, see Documentation/lockdep-design.txt. You'll enable
 it with SPINLOCK_DEBUG, indeed.

Well I hope to see it hit the BUG again soon then to see what it has to
say.

--
Len Sorensen


Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-05-04 Thread Lennart Sorensen
On Fri, May 04, 2007 at 11:40:09AM -0400, Lennart Sorensen wrote:
 On Fri, May 04, 2007 at 05:34:38PM +0200, Frederik Deweerdt wrote:
  For the what part, see Documentation/lockdep-design.txt. You'll enable
  it with SPINLOCK_DEBUG, indeed.
 
 Well I hope to see it hit the BUG again soon then to see what it has to
 say.

Well I didn't see anything for a while with SPINLOCK_DEBUG enabled
(maybe I didn't wait long enough).  So I tried changing it to
spin_lock_irqsave, and that didn't go well.  I got this as the result
now:

onfiguring network interfaces...eth1: link up, 100Mbps, full-duplex
BUG: spinlock recursion on CPU#0, ifconfig/962
 lock: cf7a3304, .magic: dead4ead, .owner: ifconfig/962, .owner_cpu: 0
 [c0104024] dump_stack+0x24/0x30
 [c01e3947] _raw_spin_lock+0x137/0x140
 [c02981ec] _spin_lock_irqsave+0x1c/0x30
 [d084eb86] pcnet32_interrupt+0x216/0x290 [pcnet32]
 [c013b95d] handle_IRQ_event+0x3d/0x70
 [c013ba2c] __do_IRQ+0x9c/0x120
 [c0105025] do_IRQ+0x25/0x60
 [c010316a] common_interrupt+0x1a/0x20
 [c011927a] __do_softirq+0x3a/0xa0
 [c011930d] do_softirq+0x2d/0x30
 [c0119557] irq_exit+0x37/0x40
 [c010502a] do_IRQ+0x2a/0x60
 [c010316a] common_interrupt+0x1a/0x20
 [c02983c0] _spin_unlock_irqrestore+0x10/0x40
 [d08517ea] pcnet32_open+0x27a/0x390 [pcnet32]
 [c02343e9] dev_open+0x39/0x80
 [c0232b5a] dev_change_flags+0xfa/0x130
 [c0277b7f] devinet_ioctl+0x4ff/0x6f0
 [c0227b24] sock_ioctl+0xf4/0x1f0
 [c017027c] do_ioctl+0x2c/0x80
 [c0170322] vfs_ioctl+0x52/0x2f0
 [c017062f] sys_ioctl+0x6f/0x80
 [c0102f27] syscall_call+0x7/0xb
 [b7eebd04] 0xb7eebd04
BUG: spinlock lockup on CPU#0, ifconfig/962, cf7a3304
 [c0104024] dump_stack+0x24/0x30
 [c01e391f] _raw_spin_lock+0x10f/0x140
 [c02981ec] _spin_lock_irqsave+0x1c/0x30
 [d084eb86] pcnet32_interrupt+0x216/0x290 [pcnet32]
 [c013b95d] handle_IRQ_event+0x3d/0x70
 [c013ba2c] __do_IRQ+0x9c/0x120
 [c0105025] do_IRQ+0x25/0x60
 [c010316a] common_interrupt+0x1a/0x20
 [c011927a] __do_softirq+0x3a/0xa0
 [c011930d] do_softirq+0x2d/0x30
 [c0119557] irq_exit+0x37/0x40
 [c010502a] do_IRQ+0x2a/0x60
 [c010316a] common_interrupt+0x1a/0x20
 [c02983c0] _spin_unlock_irqrestore+0x10/0x40
 [d08517ea] pcnet32_open+0x27a/0x390 [pcnet32]
 [c02343e9] dev_open+0x39/0x80
 [c0232b5a] dev_change_flags+0xfa/0x130
 [c0277b7f] devinet_ioctl+0x4ff/0x6f0
 [c0227b24] sock_ioctl+0xf4/0x1f0
 [c017027c] do_ioctl+0x2c/0x80
 [c0170322] vfs_ioctl+0x52/0x2f0
 [c017062f] sys_ioctl+0x6f/0x80
 [c0102f27] syscall_call+0x7/0xb
 [b7eebd04] 0xb7eebd04

Obviously that wasn't so good.

--
Len Sorensen


Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-05-04 Thread Lennart Sorensen
On Fri, May 04, 2007 at 01:44:56PM -0400, Lennart Sorensen wrote:
 On Fri, May 04, 2007 at 11:40:09AM -0400, Lennart Sorensen wrote:
  On Fri, May 04, 2007 at 05:34:38PM +0200, Frederik Deweerdt wrote:
   For the what part, see Documentation/lockdep-design.txt. You'll enable
   it with SPINLOCK_DEBUG, indeed.
  
  Well I hope to see it hit the BUG again soon then to see what it has to
  say.
 
 Well I didn't see anything for a while with SPINLOCK_DEBUG enabled
 (maybe I didn't wait long enough).  So I tried changing it to
 spin_lock_irqsave, and that didn't go well.  I got this as the result
 now:
 
 [...]
 
 Obviously that wasn't so good.

Never mind.  I am obviously an idiot today placing spin_lock_irqsave both
in place of spin_lock and spin_unlock.  Yeah that will work well.  Now
to try with spin_lock_irqrestore or whatever it is called.
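
The mistake described here, taking the lock again where it should have
been released, is the kind of thing the "spinlock recursion" check
reports.  A toy userspace model in C, assuming a debug lock that
records its owner; dbg_lock, dbg_lock_acquire and dbg_lock_release are
hypothetical names, not kernel APIs.  In the driver the real pair is
spin_lock_irqsave() and spin_unlock_irqrestore().

```c
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical debug lock that remembers which CPU holds it. */
struct dbg_lock {
    bool locked;
    int owner_cpu;
};

enum lock_result { LOCK_OK, LOCK_RECURSION, LOCK_CONTENDED };

/* Try to take the lock; report recursion when the holder asks again. */
enum lock_result dbg_lock_acquire(struct dbg_lock *l, int cpu)
{
    if (l->locked)
        return l->owner_cpu == cpu ? LOCK_RECURSION : LOCK_CONTENDED;
    l->locked = true;
    l->owner_cpu = cpu;
    return LOCK_OK;
}

/* Release the lock and forget the owner. */
void dbg_lock_release(struct dbg_lock *l)
{
    l->locked = false;
    l->owner_cpu = NO_OWNER;
}
```

Acquiring twice on CPU 0 without a release in between returns
LOCK_RECURSION in this model.  A real spinlock without the debug check
would simply spin forever at that point.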

--
Len Sorensen


Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-05-04 Thread Lennart Sorensen
On Fri, May 04, 2007 at 11:24:33AM -0700, Don Fry wrote:
 All instances of obtaining the lock in pcnet32 are done as
 spin_lock_irqsave except the interrupt handler itself.  The interrupt mask
 needs to be saved everywhere else, but the interrupt handler is known not
 to need to save the flags.
 
 If the lock is held and the same CPU tries to get the lock again, it will
 wait a very long time ;-(.  I believe the locking is fine for a
 non-preemptable kernel, but I have little experience with a preemptable
 kernel.
 
 When does a preemptable kernel allow interrupts to occur?

I have no idea actually.

 Is there a bug in this particular architectures locking code?

On i386?  I hope not.

 From looking at preempt-locking.txt the driver has (1) no per-cpu data,
 (2) 'CPU state protection' should be fine,
 (3) the 'lock is acquired and released by the same task'.
 I don't see a problem unless I am misunderstanding something.

Well I don't know, but something is going wrong and causing the soft
lock up.  I must admit I am surprised if an interrupt can occur while
handling an interrupt, but then again maybe that is supposed to be
allowed.

--
Len Sorensen


Re: [RFT] e100 driver on ARM

2007-04-27 Thread Lennart Sorensen
On Thu, Apr 26, 2007 at 09:19:34AM -0700, H. Peter Anvin wrote:
 Why wouldn't that be permitted?  It, in fact, happens all the time (the
 host bridge withdraws the GNT# line and raises STOP#, which does a
 Termination With Data of the bus transfer.)  This is a normal event and
 if you can't handle it you won't work with many host bridges at all.

Well there must have been something else wrong then.  Certainly I saw
data corruption on a rtl8139.  No problems with the same hardware using
a geode SC1200, so I have no idea.  I liked the speed of the PXA255 a
lot better than the slow poke SC1200.

--
Len Sorensen


Re: [patch 09/11] forcedeth: improve NAPI logic

2007-04-27 Thread Lennart Sorensen
On Thu, Apr 26, 2007 at 10:53:04AM -0400, Ayaz Abdulla wrote:
 Ok. In that case, the patch needs to be improved.
 
 The following needs to be done when NAPI is enabled:
 - remove the tx handling within the ISRs
 - mask off the tx interrupts within the ISRs that handle tx processing
 - re-enable tx interrupts within the NAPI handler
 - add tx handling within the NAPI handler (this patch covers it)

I thought a number of drivers handled tx from napi while receives were
happening, but went to plain interrupts if no receives were happening.
Maybe I misread the code (I have mainly dealt with pcnet32 so far).
Certainly for gigabit I would think napi all the time would be much more
efficient.

--
Len Sorensen


Re: [RFT] e100 driver on ARM

2007-04-17 Thread Lennart Sorensen
On Mon, Apr 16, 2007 at 11:07:36AM -0400, David Acker wrote:
 Lennart Sorensen wrote:
 Which PCI host controller are you using with the PXA255?  We tried using
 a PXA255 based system with a PCI controller a couple of years ago and
 had to change to a different cpu in the end due to the PCI controller
 simply not being valid PCI.  The PXA255 wasn't designed for PCI, and I
 get the impression that none of the PCI companion chips for it do a good
 enough job to actually add it correctly.
 
 
 Sorry for the delay in responding...my wife and I just had twins!
 
 We are using the IT8152G RISC-to-PCI companion chip.

Well the IT8152G+PXA255 combination used on the SBC we tried a couple of
years ago did not work.  The PCI bus had errors and the SBC maker gave
up trying to fix it.  We switched to a Geode SC1200 based board instead
which works fine PCI wise.

My suspicion (although it is only that) is that the PXA255 trying to
access memory may cause interruptions in PCI bus master transfers, which
is of course not permitted by the PCI spec (at least the way I read it).
We tried it with RTL8139, AMD 972 (both ethernet) as well as a number of
T1/E1 and DDS wan cards from sangoma.  The wan cards had the most issues
with it (the drivers and hardware would get out of sync due to PCI bus
problems), while the ethernet just had occasional packet corruption.  I
will certainly never consider using a PXA + ITE pci controller
combination ever again.  Too bad since the performance of the PXA is
amazing.  The PXA chips are not designed to speak to PCI, and the ITE
companion chip doesn't quite do the job of pretending it was.  I would
expect problems if you do pci bus master transfers and/or any kind of
PCI bus traffic load.

--
Len Sorensen


Re: two gateways with one NIC

2007-04-09 Thread Lennart Sorensen
On Sun, Apr 08, 2007 at 08:29:07PM +0100, W Agtail wrote:
 This is what I'm trying to achieve with the following iptables/iproute2
 configuration on both web servers:
 
 iptables -t mangle -A PREROUTING -p tcp --dport 8088 -i eth0 -j LOG
 --log-prefix "fwmark 1: "
 iptables -t mangle -A PREROUTING -p tcp --dport 8089 -i eth0 -j LOG
 --log-prefix "fwmark 2: "
 
 iptables -t mangle -A PREROUTING -p tcp --dport 8088 -i eth0 -j MARK
 --set-mark 1
 iptables -t mangle -A PREROUTING -p tcp --dport 8089 -i eth0 -j MARK
 --set-mark 2

You are supposed to mangle things _coming_ from port 8088 and 8089.
After all it is the replies you are trying to affect, not the requests.
So it should be the --sport not --dport.  And of course outbound not
incoming on eth0.

 iptables -t mangle -A PREROUTING -m mark --mark 1 -j LOG --log-prefix
 "marked 1: "
 iptables -t mangle -A PREROUTING -m mark --mark 2 -j LOG --log-prefix
 "marked 2: "
 
 ip route add table 1 default via 10.18.35.11 dev eth0 # GW1
 ip route add table 2 default via 10.18.35.21 dev eth0 # GW2
 
 ip rule add fwmark 1 table 1
 ip rule add fwmark 2 table 2
 
 On web2, the default gw is set to gw2 and in /var/log/messages, I can
 see packets appear to be marked. However, for some reason, 8088 is still
 routing back via gw2 (default gw) rather than being routed via gw1,
 which I'm trying to do with the above ip rules etc.
 
 Is the above the correct syntax? or I guess I could totally be missing
 the plot?
 
 Many thanks for your time on this one.

Hope that helps.

--
Len Sorensen


Re: two gateways with one NIC

2007-04-09 Thread Lennart Sorensen
On Mon, Apr 09, 2007 at 06:13:50PM +0200, Patrick McHardy wrote:
 As the name suggests, POSTROUTING comes after routing, so marking
 packets there doesn't affect routing. Use PREROUTING for forwarded
 traffic and OUTPUT for locally generated traffic.

I didn't even notice that had been changed.  It used to say PREROUTING
when it was for --dport, and all I suggested changing was --dport to
--sport and change the -o part (probably to nothing at all really since
routing hasn't been decided yet).

Yes it absolutely has to be done PREROUTING.

--
Len Sorensen


Re: two gateways with one NIC

2007-04-09 Thread Lennart Sorensen
On Mon, Apr 09, 2007 at 06:02:23PM +0100, W Agtail wrote:
 Thanks Patrick for your comments too.
 It seems that you can't mix PREROUTING with --sport or -o.
 I've also changed the ip rule tables to higher numbers, so I now have:

I thought you could have --sport, but NOT -o.  No need for -o of course.

 iptables -t mangle -A PREROUTING -p tcp --dport 8088 -i eth0 -j LOG
 --log-prefix "fwmark 1: "
 iptables -t mangle -A PREROUTING -p tcp --dport 8089 -i eth0 -j LOG
 --log-prefix "fwmark 2: "
 
 iptables -t mangle -A PREROUTING -p tcp --dport 8088 -i eth0 -j MARK
 --set-mark 1
 iptables -t mangle -A PREROUTING -p tcp --dport 8089 -i eth0 -j MARK
 --set-mark 2
 iptables -t mangle -A PREROUTING -m mark --mark 1 -j LOG --log-prefix
 "marked 1: "
 iptables -t mangle -A PREROUTING -m mark --mark 2 -j LOG --log-prefix
 "marked 2: "

The thing is that the destination port will NEVER be 8088 for the
outgoing packets from apache.  The source port will be.

Try this:

iptables -t mangle -A PREROUTING -p tcp --sport 8088 -j LOG --log-prefix "fwmark 1: "
iptables -t mangle -A PREROUTING -p tcp --sport 8089 -j LOG --log-prefix "fwmark 2: "
iptables -t mangle -A PREROUTING -p tcp --sport 8088 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -p tcp --sport 8089 -j MARK --set-mark 2
iptables -t mangle -A PREROUTING -m mark --mark 1 -j LOG --log-prefix "marked 1: "
iptables -t mangle -A PREROUTING -m mark --mark 2 -j LOG --log-prefix "marked 2: "

 ip route add table 8088 default via 10.18.35.11 dev eth0
 ip route add table 8089 default via 10.18.35.21 dev eth0
 
 ip rule add fwmark 1 table 8088
 ip rule add fwmark 2 table 8089
 
 
 # Confirmation of syntax:
 iptables -t mangle --list -v -n
 Chain PREROUTING (policy ACCEPT 5921 packets, 403K bytes)
  pkts bytes target prot opt in out source
 destination
18   984 LOGtcp  --  eth0   *   0.0.0.0/0
 0.0.0.0/0   tcp dpt:8088 LOG flags 0 level 4 prefix `fwmark 1: '
 0 0 LOGtcp  --  eth0   *   0.0.0.0/0
 0.0.0.0/0   tcp dpt:8089 LOG flags 0 level 4 prefix `fwmark 2: '
18   984 MARK   tcp  --  eth0   *   0.0.0.0/0
 0.0.0.0/0   tcp dpt:8088 MARK set 0x1
 0 0 MARK   tcp  --  eth0   *   0.0.0.0/0
 0.0.0.0/0   tcp dpt:8089 MARK set 0x2
18   984 LOGall  --  *  *   0.0.0.0/0
 0.0.0.0/0   MARK match 0x1 LOG flags 0 level 4 prefix `marked 1:
 '
 0 0 LOGall  --  *  *   0.0.0.0/0
 0.0.0.0/0   MARK match 0x2 LOG flags 0 level 4 prefix `marked 2:
 '
 
 
 ip rule list
 0:  from all lookup local
 32764:  from all fwmark 0x2 lookup 8089
 32765:  from all fwmark 0x1 lookup 8088
 32766:  from all lookup main
 32767:  from all lookup default
 
 ip route list table 8088; ip route list table 8089
 default via 10.18.35.11 dev eth0
 default via 10.18.35.21 dev eth0
 
 This is what I see in web2's /var/log/messages:
 Apr  9 06:46:58 web2-fc6 kernel: fwmark 1: IN=eth0 OUT=
 MAC=00:0c:29:d1:08:48:00:0c:29:49:04:9f:08:00 SRC=192.168.0.241
 DST=10.18.35.52 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=42359 DF PROTO=TCP
 SPT=33321 DPT=8088 WINDOW=5840 RES=0x00 SYN URGP=0
 
 Apr  9 06:46:58 web2-fc6 kernel: marked 1: IN=eth0 OUT=
 MAC=00:0c:29:d1:08:48:00:0c:29:49:04:9f:08:00 SRC=192.168.0.241
 DST=10.18.35.52 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=42359 DF PROTO=TCP
 SPT=33321 DPT=8088 WINDOW=5840 RES=0x00 SYN URGP=0
 
 As you can see, packets appear to be marked.
 But here's a tcpdump on gw2's eth1:
 
 07:20:35.004205 192.168.0.241.59438 > 10.18.35.52.8088: S
 221760494:221760494(0) win 5840 <mss 1460,sackOK,timestamp 1320423
 0,nop,wscale 6> (DF)
 07:20:35.013144 10.18.35.52.8088 > 192.168.0.241.59438: S
 2705868365:2705868365(0) ack 221760495 win 5792 <mss
 1460,sackOK,timestamp 2191014 1320423,nop,wscale 1> (DF)
 07:20:35.021857 192.168.0.241.59438 > 10.18.35.52.8088: R
 221760495:221760495(0) win 0 (DF)
 07:20:38.069688 192.168.0.241.59438 > 10.18.35.52.8088: S
 221760494:221760494(0) win 5840 <mss 1460,sackOK,timestamp 1321173
 0,nop,wscale 6> (DF)
 07:20:38.069695 10.18.35.52.8088 > 192.168.0.241.59438: S
 2706988830:2706988830(0) ack 221760495 win 5792 <mss
 1460,sackOK,timestamp 2192135 1321173,nop,wscale 1> (DF)
 07:20:38.071232 192.168.0.241.59438 > 10.18.35.52.8088: R
 221760495:221760495(0) win 0 (DF)
 
 So, traffic is being returned via gw2, rather than gw1 :(

They are marked I guess, but much too late.

--
Len Sorensen


Re: two gateways with one NIC

2007-04-09 Thread Lennart Sorensen
On Mon, Apr 09, 2007 at 07:05:31PM +0100, W Agtail wrote:
 Nice one, but unfortunately still doesn't work.
 I'm now not seeing any marked messages in /var/log/messages and traffic
 still going via gw2 for port 8088.

What does 'iptables -v -t mangle -L' show at the moment?  Have you been
flushing it between attempts to make sure you don't have conflicting
rules?

--
Len Sorensen


Re: two gateways with one NIC

2007-04-09 Thread Lennart Sorensen
On Mon, Apr 09, 2007 at 07:24:07PM +0100, W Agtail wrote:
 Yup, I've been flushing iptables each time. This is what we have atm:
 iptables -n -v -t mangle -L
 
 Chain PREROUTING (policy ACCEPT 12656 packets, 2518K bytes)
  pkts bytes target prot opt in out source
 destination
 0 0 LOGtcp  --  *  *   0.0.0.0/0
 0.0.0.0/0   tcp spt:8088 LOG flags 0 level 4 prefix `fwmark 1: '
 0 0 LOGtcp  --  *  *   0.0.0.0/0
 0.0.0.0/0   tcp spt:8089 LOG flags 0 level 4 prefix `fwmark 2: '
 0 0 MARK   tcp  --  *  *   0.0.0.0/0
 0.0.0.0/0   tcp spt:8088 MARK set 0x1
 0 0 MARK   tcp  --  *  *   0.0.0.0/0
 0.0.0.0/0   tcp spt:8089 MARK set 0x2
 0 0 LOGall  --  *  *   0.0.0.0/0
 0.0.0.0/0   MARK match 0x1 LOG flags 0 level 4 prefix `marked 1:
 '
 0 0 LOGall  --  *  *   0.0.0.0/0
 0.0.0.0/0   MARK match 0x2 LOG flags 0 level 4 prefix `marked 2:
 '
 
 Chain INPUT (policy ACCEPT 10664 packets, 2438K bytes)
  pkts bytes target prot opt in out source
 destination
 
 Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target prot opt in out source
 destination
 
 Chain OUTPUT (policy ACCEPT 6311 packets, 896K bytes)
  pkts bytes target prot opt in out source
 destination
 
 Chain POSTROUTING (policy ACCEPT 6311 packets, 896K bytes)
  pkts bytes target prot opt in out source
 destination

Odd how the packet count on those mangle table entries is 0.  It seems
like it is never even getting to there.  Do you need a rule in the
output chain telling it to send some packets to the mangle table?  That
doesn't make sense either though.

--
Len Sorensen


Re: two gateways with one NIC

2007-04-08 Thread Lennart Sorensen
On Sun, Apr 08, 2007 at 04:35:53AM +0100, W Agtail wrote:
 Hope you can help.
 
 I have the following setup using LVS (Linux Virtual Servers):
 
 LAN192.168.0.0/24-  = CLIENTS
 |   |
 |   |
 LVS1LVS2
  vip1: 192.168.0.111 vip2: 192.168.0.121
  eth0: 192.168.0.110 eth0: 192.168.0.120
  eth1: 10.18.35.10   eth1: 10.18.35.20
  gw1:  10.18.35.11   gw2:  10.18.35.21
 |   |
 |   |
 LAN10.18.35.0/24-
 |   |
 |   |
 Apache WEB1 10.18.35.51:8088   WEB2 10.18.35.52:8088
 Apache WEB1 10.18.35.51:8089   WEB2 10.18.35.52:8089
 
 
 ### LVS ###
 The two LVS servers have a VIP and a GW.
 LVS1 & LVS2 have ip_forward set to 1.
 
 LVS1 has the following iptables:
 iptables -t nat -A PREROUTING  -i eth0 -j DNAT --to 192.168.0.111
 iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 192.168.0.111
 with ipvsadm forwarding vip1:8088 to web1:8088 & web2:8088
 
 LVS2 has the following iptables:
 iptables -t nat -A PREROUTING  -i eth0 -j DNAT --to 192.168.0.121
 iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 192.168.0.121
 with ipvsadm forwarding vip2:8089 to web1:8089 & web2:8089
 
 ### WEB ###
 The two Web servers have 2 virtual web servers listening on ports 8088 &
 8089 and have the following iptables & iproute2 config:
 iptables -t mangle -A PREROUTING -p tcp --dport 8088 -i eth0 -j MARK
 --set-mark 1
 iptables -t mangle -A PREROUTING -p tcp --dport 8089 -i eth0 -j MARK
 --set-mark 2
 
 ip route add table 1 default via 10.18.35.11 dev eth0
 ip route add table 2 default via 10.18.35.21 dev eth0
 
 ip rule add fwmark 1 table 1
 ip rule add fwmark 2 table 2
 
 WEB1's default GW is set to gw1.
 WEB2's default GW is set to gw2.
 
 CLIENTS should be able to connect to vip1:8088 and vip2:8089
 
 ### MY PROBLEM ###
 
 If i set WEB2's default GW to gw1, everything works as expected (as I
 now only have one GW).
 But when trying to set WEB2's default GW to gw2, things don't work.
 For example, if i was to run: curl vip1:8088 from a CLIENT, I would be
 able to connect to web1:8088 via LVS OK, but unable to connect to
 web2:8088 should LVS take me to web2.
 
 Its as though the iptables/ip route settings are not working as they
 should.
 
 Any ideas what I'm doing wrong?
 Many thanks, W Agtail.

Well, given I am not sure what you are trying to do, I will take a guess.
I think you are trying to have redundant load balancers and multiple web
servers behind those two load balancers.  Here is how I would do it:

LAN192.168.0.0/24-  = CLIENTS
|   |
|   |
LVS1LVS2
 vrrp: 192.168.0.110 (linked)vrrp: 192.168.0.110 (linked)
 eth0: 192.168.0.111 eth0: 192.168.0.112

 eth1: 10.18.35.11   eth1: 10.18.35.12
 vrrp: 10.18.35.10 (master)  vrrp: 10.18.35.10 (slave)
|   |
|   |
LAN10.18.35.0/24-
|   |
|   |
Apache WEB1 10.18.35.51:8088   WEB2 10.18.35.52:8088
Apache WEB1 10.18.35.51:8089   WEB2 10.18.35.52:8089

So using VRRP to have a shared virtual IP between the two load
balancers, any client can connect to 192.168.0.110 and be sent through
to one of the web servers.  The server side interface also has a VRRP
virtual IP shared between the two load balancers, which is linked to the
other virtual IP, so that if the link goes down on one side of the load
balancer, it will automatically drop the virtual IP on both sides to let
the slave machine take over control of the IP.  To the clients this
should be pretty transparent since they don't need to know the IP
changed, other than the momentary change in mac address (letting vrrp
play with the mac address just causes a terrible mess in my experience,
and I have had much better luck by simply changing IPs and letting the
clients relearn the new mac).

keepalived's vrrp works very well (Hmm, actually I think I made some
fixes to it, which I don't remember if I sent back upstream yet.  I
should check that tomorrow).

You could run multiple vrrps per interface if you want to somehow have
one be the master of one IP and the other the master of another to allow
different traffic to use each load balancer by default, but everything
going through one in case of a failure.

--
Len Sorensen

Re: two gateways with one NIC

2007-04-08 Thread Lennart Sorensen
On Sun, Apr 08, 2007 at 05:10:15PM +0100, W Agtail wrote:
 Hi, and thanks very much for your response. Your guess sounds spot on. 
 
 As you've mentioned, using one sync group works quite well and gives you
 an active/passive LVS cluster (not sure of correct terminology here -
 sorry), thus all traffic goes via LVS1, leaving LVS2 not doing much
 unless LVS1 fails.
 
 I thought it would be a cool idea to setup two sync groups to ultimately
 handle several Apache instances on the two Apache servers. This way,
 both LVS servers would be used in a kind of active/active fashion and
 would be a master/slave to each other. For example, vip1 & gw1 could
 possibly end up on LVS2 with vip2 & gw2.
 
 The challenge though in having two sync groups, with two GWs. I would
 like all traffic coming through vip1 to be returned via gw1 and all
 traffic coming through vip2 to be returned via gw2.
 
 I am using keepalived (v1.1.13) with two sync groups. One with vip1 &
 gw1 and another with vip2 & gw2. Port 8088 will always come through
 vip1/gw1, load balancing to web1:8088 and web2:8088. Port 8089 will
 always come through vip2/gw2, load balancing to web1:8089 and web2:8089.
 
 Web1's default gw is set to gw1 and web2's default gw is set to gw2. But
 this is causing issues when, say, vip1:8088 gets forwarded through gw1 to
 web2:8088 and doesn't get back via gw2. To get round this, I need
 something like iproute2 on web2 to send all 8088 traffic back through
 gw1.

You have to set up both web servers to use the same gateway.  You can
setup an alternate routing table and tag packets from the apache on port
8089 to use the other gateway IP instead, but any traffic handled by
LVS1 _must_ be returned through LVS1.  So both web servers have to have
identical configuration (which is also much simpler to maintain).

You can use iptables to tag packets matching the source port of 8089 and
have ip route route all packets with that specific tag using an
alternate routing table, which will then use the other LVS.

So if you have two VRRP groups, you have port 8088 return via the regular
default gateway, which is the first group's IP, and you use tagging to
send all port 8089 packets through the second VRRP IP.  If an LVS
fails, both VRRP groups end up on the working LVS and everything still
works, but while both work, one LVS handles one port, and the other the
other port.  Of course routing packets is hardly a lot of work, so it
may not really be worth the bother to do anything extra with two groups.
You really have to configure both web servers identically though in
terms of routes.
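
The decision each web server has to make for a reply can be sketched as a
toy model in userspace C.  The names reply_mark(), gateway_for_mark() and
toy_rule are hypothetical, and the real work is done by the iptables mangle
rules together with ip rule and ip route, not by code like this.  It only
shows the idea: tag by source port, then let the tag pick the gateway.

```c
#include <stddef.h>
#include <stdint.h>

#define MARK_NONE 0u    /* untagged: use the normal default gateway */
#define MARK_ALT  2u    /* tagged: use the alternate routing table */

/* One policy-routing rule: packets carrying fwmark go via gateway. */
struct toy_rule {
    uint32_t fwmark;
    const char *gateway;
};

/* Tag replies by TCP source port: only port 8089 gets the alternate mark. */
uint32_t reply_mark(uint16_t sport)
{
    return sport == 8089 ? MARK_ALT : MARK_NONE;
}

/* The first rule whose fwmark matches wins; otherwise use default_gw. */
const char *gateway_for_mark(const struct toy_rule *rules, size_t n,
                             uint32_t mark, const char *default_gw)
{
    for (size_t i = 0; i < n; i++)
        if (rules[i].fwmark == mark)
            return rules[i].gateway;
    return default_gw;
}
```

With the same single rule for mark 2 on both web servers, port 8088
replies fall through to the default gateway and port 8089 replies take the
other one, no matter which web server handled the request.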

 Hope this makes a little more sense to what I'm trying to achieve?
 Thanks again.

--
Len Sorensen


Re: [RFT] e100 driver on ARM

2007-03-29 Thread Lennart Sorensen
On Thu, Mar 29, 2007 at 01:17:38AM -0400, David Acker wrote:
 I have a pxa255 based system with PCI added to it.  The e100 would have 
 memory corruption in its receive buffers detected by slab debugging 
 unless I put in the patch to use the S-bit.
 
 Here is a link to the patch posting:
 http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc3/2.6.20-rc3-mm1/broken-out/git-netdev-all.patch
 Search for e100.c.
 
 http://www-gatago.com/linux/kernel/15457063.html - This discussion seems 
 to hit the issue.
 
 There appears to be a race on the cache line where the EL bit and the 
 next packet info live. In my case the hardware appeared to write to a 
 free packet.  The S-bit seems to make the hardware stop and spin on the 
 bit, while the EL bit seems to let the hardware try to use that packet.
 
 This race would occur less often when the receive buffer chain is always 
 refilled before the hardware can use them up.  On our 400 Mhz Xscale, we 
 can use up all 256 buffers if the PCI bus has another busy device on it. 
  In our case it is an 802.11g miniPCI card and our software was routing 
 all ethernet packets to the wireless interface and vice versa while TCP 
 streams were running accross these connections.

Which PCI host controller are you using with the PXA255?  We tried using
a PXA255 based system with a PCI controller a couple of years ago and
had to change to a different cpu in the end due to the PCI controller
simply not being valid PCI.  The PXA255 wasn't designed for PCI, and I
get the impression that none of the PCI companion chips for it do a good
enough job to actually add it correctly.

--
Len Sorensen


Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

2007-03-19 Thread Lennart Sorensen
On Sat, Mar 17, 2007 at 10:08:10PM +0900, takada wrote:
 I tested some patterns. just X86_OOSTORE was effective. WBINVD is needless.
 
 --- arch/i386/Kconfig.cpu~2007-02-05 03:44:54.0 +0900
 +++ arch/i386/Kconfig.cpu 2007-02-17 21:25:52.0 +0900
 @@ -322,7 +322,7 @@ config X86_USE_3DNOW
  
  config X86_OOSTORE
   bool
 - depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR
 + depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR || MGEODEGX1
   default y
  
  config X86_TSC

Well that is exactly what I did for the Geode SC1200 (a GX1 based
design) as well and it certainly improved things a lot for me as well.

--
Len Sorensen


Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

2007-03-15 Thread Lennart Sorensen
On Thu, Mar 15, 2007 at 02:39:39PM +0900, takada wrote:
 Hiroshi Miura posted `Geode out-of-order store enables' patch in Jun, 2003.
 There is http://lkml.org/lkml/2003/6/5/57 .
 OOSTORE was enabled at this point in time. It seems to have disappeared 
 somewhere.

I believe the patch was rejected as 'not required' since the data sheet
is not very clear on that feature.

 BTW, I use MediaGX with kernel 2.6.20(and 2.6.20.3) and suspend2. When I 
 resume 
 the PC and use the PC Card modem, PC is hungup. However, PC isn't hung up when
 I apply a WBINVD patch.
 I can't understand it whether there is problem in resume of suspend2 or 
 MediaGX
 or both. Many drivers lack support for resume on my PC.

Which patch are you referring to?

--
Len Sorensen


Re: [PATCH 1/2] pcnet32: only allocate init_block dma consistent

2007-03-07 Thread Lennart Sorensen
On Tue, Mar 06, 2007 at 07:39:21PM -0800, Michael K. Edwards wrote:
 On 3/6/07, Ralf Baechle [EMAIL PROTECTED] wrote:
 This small change btw. delivers about ~ 3% extra performance on a very
 slow test system.
 
 Has this change been tested / benchmarked under VMWare?  pcnet32 is
 the (default?) virtual device presented by VMWare Workstation, and
 that's probably a large fraction of its use in the field these days.
 But then Don probably already knows that.  :-)

Unless you install vmware tools in which case you use vmxnet instead
which of course performs better since it knows it isn't talking to real
hardware.

I am currently about to try what this patch does to the performance of
our system (266MHz Geode SC1200 with 4 pcnet32's).

--
Len Sorensen


Re: Re: Strange connection slowdown on pcnet32

2007-02-20 Thread Lennart Sorensen
On Mon, Feb 19, 2007 at 06:59:16PM -0500, Lennart Sorensen wrote:
 I am also noticing the receive error count going up, and the source is
 this code:
 
 if (status & 0x01)  /* Only count a general error at the */
     lp->stats.rx_errors++;  /* end of a packet. */
 
 It appears this means I am receiving a frame marked with End Of Packet
 but without Start of Packet.  I have no idea how that happens, but it
 shouldn't be able to make the driver and MAC stop processing the receive
 ring.

Well the packets actually have both start and end marked, but also have
overflow marked, so the cpu simply isn't keeping up it seems (It is
taking about 100% of the cpu to push through 6500KB/s).  Certainly the
CONFIG_X86_OOSTORE makes a major difference, although I am still not
sure why.  Simply skipping ahead one or two receive descriptors when the
current one is marked as owned by the MAC but the one a few ahead is
owned by the CPU allows it to continue receiving when it happens.  I
really want to find out why it happens though, although I am not sure
how to go about doing that.
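
The skip-ahead workaround amounts to something like the following userspace
sketch.  The function name next_cpu_owned() and the flat array of status
words are made up for illustration, and 0x8000 is taken to be the "owned by
the MAC" bit as discussed earlier in this thread.  It is not the actual
driver change.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_OWN_MAC 0x8000u  /* status bit: descriptor still owned by the MAC */

/*
 * Starting at ring index cur, return the index of the first descriptor
 * owned by the CPU, looking at most max_skip entries past cur.
 * Returns -1 if every descriptor in that window is still owned by the MAC.
 */
long next_cpu_owned(const uint16_t *status, size_t ring_size,
                    size_t cur, size_t max_skip)
{
    for (size_t skip = 0; skip <= max_skip && skip < ring_size; skip++) {
        size_t idx = (cur + skip) % ring_size;   /* wrap around the ring */
        if (!(status[idx] & RX_OWN_MAC))
            return (long)idx;
    }
    return -1;
}
```

This only papers over the symptom.  The skipped descriptor still has to be
dealt with later, and it says nothing about why its status looks stale.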

--
Len Sorensen


Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

2007-02-20 Thread Lennart Sorensen
On Tue, Feb 20, 2007 at 08:34:13PM +0900, takada wrote:
 I posted with 2.6.20 + enabled X86_OOSTORE.
 The clflush size line is in /proc/cpuinfo, but clflush is not in the flags line.
 
 BTW, can we use WBINVD instruction? I tested compile only.
 Do you know a method to change dynamically without #ifdef when it works
 with MediaGX/GeodeGX.
 
 diff -Narup a/include/asm-i386/io.h b/include/asm-i386/io.h
 --- a/include/asm-i386/io.h   2007-02-20 16:23:25.0 +0900
 +++ b/include/asm-i386/io.h   2007-02-20 17:07:14.0 +0900
 @@ -232,7 +232,19 @@ static inline void memcpy_toio(volatile 
   *   2. Accidentally out of order processors (PPro errata #51)
   */
   
 -#if defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)
 +#ifdef CONFIG_MGEODEGX1
 +
 +static inline void dma_flush_cache(void)
 +{
 + __asm__ __volatile__ ("wbinvd": : :"memory");
 +}
 +
 +#define dma_cache_inv(_start,_size)  dma_flush_cache()
 +#define dma_cache_wback(_start,_size)dma_flush_cache()
 +#define dma_cache_wback_inv(_start,_size)dma_flush_cache()
 +#define flush_write_buffers()
 +
 +#elif defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)
  
  static inline void flush_write_buffers(void)
  {
 -

Well it is starting to look like it isn't a caching issue, but more
likely an issue of which order writes are performed in.  I think the MAC
might be seeing the ownership bit change before the rest of the
descriptor, which shouldn't happen.  With X86_OOSTORE, wmb() is called
between setting the fields in the descriptor and setting the ownership
bit to the MAC.  I still have to investigate a bit more to find out for
sure, but that could certainly explain why X86_OOSTORE makes the problem
become much less frequent.  It doesn't completely eliminate it though.
Of course maybe there are two different problems with the same symptoms.
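
The ordering I have in mind can be sketched in userspace C11.  This is an
analogy only: toy_rx_desc and give_to_mac() are made-up names, and C11
atomics order memory between threads, not between a CPU and a PCI device.
In the driver the equivalent is the wmb() between writing the descriptor
fields and writing the ownership bit.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RX_OWN_MAC 0x8000u

/* Hypothetical receive descriptor; the field names are illustrative only. */
struct toy_rx_desc {
    uint32_t base;              /* buffer address */
    int16_t buf_length;         /* buffer length */
    _Atomic uint16_t status;    /* the ownership bit lives here */
};

/*
 * Fill in the descriptor first, then hand it over.  The release store
 * means a thread that reads status with acquire ordering and sees the
 * ownership bit is also guaranteed to see base and buf_length.
 */
void give_to_mac(struct toy_rx_desc *d, uint32_t base, int16_t buf_length)
{
    d->base = base;
    d->buf_length = buf_length;
    atomic_store_explicit(&d->status, RX_OWN_MAC, memory_order_release);
}
```

Without the barrier the ownership write is free to become visible first,
which is the failure I suspect here.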

--
Len Sorensen


Re: Re: Strange connection slowdown on pcnet32

2007-02-19 Thread Lennart Sorensen
On Fri, Feb 16, 2007 at 04:01:57PM -0500, Lennart Sorensen wrote:
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: pcnet32_poll: pcnet32_rx() got 16 packets
 eth1: base: 0x05215812 status: 0310 next-status: 0310
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: netif_receive_skb(skb)
 eth1: pcnet32_poll: pcnet32_rx() got 16 packets
 eth1: base: 0x04c51812 status: 8000 next-status: 0310
 eth1: pcnet32_poll: pcnet32_rx() got 0 packets
 eth1: interrupt  csr0=0x6f3 new csr=0x33, csr3=0x.
 eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
 eth1: base: 0x04c51812 status: 8000 next-status: 0310
 eth1: pcnet32_poll: pcnet32_rx() got 0 packets
 eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
 eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
 eth1: base: 0x04c51812 status: 8000 next-status: 0310
 eth1: pcnet32_poll: pcnet32_rx() got 0 packets
 eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
 eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
 eth1: base: 0x04c51812 status: 8000 next-status: 0310
 eth1: pcnet32_poll: pcnet32_rx() got 0 packets
 
 So somehow it ends up that when it reads the status of the descriptor at
 address 0x04c51812, it sees the status as 0x8000 (which means owned by
 the MAC I believe), even though the next descriptor in the ring has a
 sensible status, indicating that the descriptor is ready to be handled
 by the driver.  Since the descriptor isn't ready, we exit without
 handling anything and NAPI reschedules us the next time we get an
 interrupt, and after some random number of tries, we finally see the
 right status and handle the packet, along with a bunch of other packets
 waiting in the descriptor ring.  Then we seem to hit the exact same
 descriptor address again, with the same problem in the status we read,
 and again we are stuck for a while, until finally we see the right
 status, and another pile of packets get handled, and we again hit the
 same descriptor address and get stuck.

I have been poking at things with firescope to see if the MAC is
actually writing to system memory or not.

The entry that it gets stuck on is _always_ entry 0 in the rx_ring.
There do not appear to be any exceptions to this.

Here is my firescope (slightly modified for this purpose) dump of the
rx_ring of eth1:

Descriptor:Address: /--base---\ /buf\ /sta\ /-message-\ /reserved-\
          :       : |         | |len| |tus| | length  | |         |
RXdesc[00]:6694000: 12 18 5f 05 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[01]:6694010: 12 78 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[02]:6694020: 12 a0 52 06 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[03]:6694030: 12 f8 c2 04 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[04]:6694040: 12 70 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[05]:6694050: 12 e8 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[06]:6694060: 12 e0 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[07]:6694070: 12 e8 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[08]:6694080: 12 e0 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[09]:6694090: 12 d8 d1 05 fa f9 40 03 46 00 00 00 00 00 00 00
RXdesc[10]:66940a0: 12 d0 d1 05 fa f9 40 03 4e 00 00 00 00 00 00 00
RXdesc[11]:66940b0: 12 d8 02 05 fa f9 10 03 40 00 00 00 00 00 00 00
RXdesc[12]:66940c0: 12 d0 02 05 fa f9 40 03 46 00 00 00 00 00 00 00
RXdesc[13]:66940d0: 12 38 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[14]:66940e0: 12 30 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[15]:66940f0: 12 78 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[16]:6694100: 12 a0 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[17]:6694110: 12 b0 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[18]:6694120: 12 b8 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[19]:6694130: 12 70 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[20]:6694140: 12 f8 56 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[21]:6694150: 12 c8 29 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[22]:6694160: 12 20 03 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[23]:6694170: 12 60 4c 05 fa f9 00 80 87 05 00 00 00 00 00 00
RXdesc[24]:6694180: 12 98 53 05 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[25]:6694190: 12 b0 cc 04 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[26]:66941a0: 12 a8 3f 05 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[27]:66941b0: 12 58 e8 04 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[28]:66941c0: 12 b0 4d 06 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[29]:66941d0: 12 38 ef 04 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[30]:66941e0: 12 98 1f 05 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[31]:66941f0: 12 28 f1 04 fa f9 00 80 40 00 00 00 00 00 00 00

Re: Re: Strange connection slowdown on pcnet32

2007-02-19 Thread Lennart Sorensen
On Mon, Feb 19, 2007 at 03:11:36PM -0500, Lennart Sorensen wrote:
 I have been poking at things with firescope to see if the MAC is
 actually writing to system memory or not.
 
 The entry that it gets stuck on is _always_ entry 0 in the rx_ring.
 There do not appear to be any exceptions to this.
 
 Here is my firescope (slightly modified for this purpose) dump of the
 rx_ring of eth1:
 
 Descriptor:Address: /--base---\ /buf\ /sta\ /-message-\ /reserved-\
           :       : |         | |len| |tus| | length  | |         |
 RXdesc[00]:6694000: 12 18 5f 05 fa f9 00 80 40 00 00 00 00 00 00 00
 RXdesc[01]:6694010: 12 78 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
 RXdesc[02]:6694020: 12 a0 52 06 fa f9 40 03 ee 05 00 00 00 00 00 00
 RXdesc[03]:6694030: 12 f8 c2 04 fa f9 40 03 ee 05 00 00 00 00 00 00
 RXdesc[04]:6694040: 12 70 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
 RXdesc[05]:6694050: 12 e8 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
 RXdesc[06]:6694060: 12 e0 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
 RXdesc[07]:6694070: 12 e8 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
 RXdesc[08]:6694080: 12 e0 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
 RXdesc[09]:6694090: 12 d8 d1 05 fa f9 40 03 46 00 00 00 00 00 00 00
 RXdesc[10]:66940a0: 12 d0 d1 05 fa f9 40 03 4e 00 00 00 00 00 00 00
 RXdesc[11]:66940b0: 12 d8 02 05 fa f9 10 03 40 00 00 00 00 00 00 00
 RXdesc[12]:66940c0: 12 d0 02 05 fa f9 40 03 46 00 00 00 00 00 00 00
 RXdesc[13]:66940d0: 12 38 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[14]:66940e0: 12 30 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[15]:66940f0: 12 78 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[16]:6694100: 12 a0 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[17]:6694110: 12 b0 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[18]:6694120: 12 b8 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[19]:6694130: 12 70 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[20]:6694140: 12 f8 56 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[21]:6694150: 12 c8 29 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[22]:6694160: 12 20 03 05 fa f9 00 80 ee 05 00 00 00 00 00 00
 RXdesc[23]:6694170: 12 60 4c 05 fa f9 00 80 87 05 00 00 00 00 00 00
 RXdesc[24]:6694180: 12 98 53 05 fa f9 00 80 40 00 00 00 00 00 00 00
 RXdesc[25]:6694190: 12 b0 cc 04 fa f9 00 80 40 00 00 00 00 00 00 00
 RXdesc[26]:66941a0: 12 a8 3f 05 fa f9 00 80 40 00 00 00 00 00 00 00
 RXdesc[27]:66941b0: 12 58 e8 04 fa f9 00 80 40 00 00 00 00 00 00 00
 RXdesc[28]:66941c0: 12 b0 4d 06 fa f9 00 80 40 00 00 00 00 00 00 00
 RXdesc[29]:66941d0: 12 38 ef 04 fa f9 00 80 40 00 00 00 00 00 00 00
 RXdesc[30]:66941e0: 12 98 1f 05 fa f9 00 80 40 00 00 00 00 00 00 00
 RXdesc[31]:66941f0: 12 28 f1 04 fa f9 00 80 40 00 00 00 00 00 00 00
 
 I only ever see entry 0 with status bytes 00 80 (0x8000, which means owned by the MAC),
 and this is while the driver is checking entry 0 every time it tries to
 check for any waiting packets.
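
For anyone reading the dump by hand, here is a minimal sketch of how one
16-byte row decodes, assuming the field order given in the column header
(base, buffer length, status, message length, reserved) and little-endian
storage.  The struct and function names are made up for illustration and
are not taken from the driver.

```c
#include <stdint.h>

/* One 16-byte receive descriptor, in the order shown by the dump header. */
struct rx_desc {
    uint32_t base;        /* bus address of the receive buffer */
    int16_t  buf_length;  /* stored as a negative two's-complement count */
    uint16_t status;      /* top bit set: owned by the MAC */
    uint32_t msg_length;  /* length of the received frame */
    uint32_t reserved;
};

/* Read little-endian values from a byte array. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one row of the dump (16 raw bytes) into its fields. */
static struct rx_desc decode_rx_desc(const uint8_t raw[16])
{
    struct rx_desc d;

    d.base       = get_le32(raw);
    d.buf_length = (int16_t)get_le16(raw + 4);
    d.status     = get_le16(raw + 6);
    d.msg_length = get_le32(raw + 8);
    d.reserved   = get_le32(raw + 12);
    return d;
}

/* Non-zero when the descriptor still belongs to the MAC. */
static int mac_owns(const struct rx_desc *d)
{
    return (d->status & 0x8000) != 0;
}
```

Fed the bytes of RXdesc[00], this gives status 0x8000 (top bit set, so
owned by the MAC) and a buffer length of -1542.  RXdesc[01] gives status
0x0340 and a message length of 0x05ee (1518 bytes).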
 
 Running tcpdump while pinging gives the interesting result that some
 packets are arriving out of order making it seem like the driver is
 processing the packets out of order.  Perhaps the driver is wrong to be
 looking at entry 0, and should be looking at entry 1 and is hence stuck
 until the whole receive ring has been filled again?
 
 15:06:04.112812 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 1
 15:06:05.119799 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 2
 15:06:05.120159 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 2
 15:06:05.127045 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 1
 15:06:06.119862 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 3
 15:06:07.119921 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 4
 15:06:08.119994 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 5
 15:06:08.426400 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 3
 15:06:08.427915 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 4
 15:06:08.429033 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 5
 15:06:09.120053 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 6
 15:06:10.120109 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 7
 15:06:10.705332 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 6
 15:06:10.707258 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 7
 15:06:11.120175 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 8
 15:06:12.120233 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 9
 15:06:13.120297 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 10
 15:06:14.120359 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 11
 15:06:14.120737 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 11
 15:06:14.127064 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 8
 15:06:14.127700 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 9
 15:06:14.128268 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 10
 15:06:15.120426 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 12
 15:06

Re: Re: Strange connection slowdown on pcnet32

2007-02-19 Thread Lennart Sorensen
On Mon, Feb 19, 2007 at 05:18:45PM -0500, Lennart Sorensen wrote:
 On Mon, Feb 19, 2007 at 03:11:36PM -0500, Lennart Sorensen wrote:

Re: Re: Strange connection slowdown on pcnet32

2007-02-19 Thread Lennart Sorensen
On Mon, Feb 19, 2007 at 05:29:20PM -0500, Lennart Sorensen wrote:
 I just noticed, it seems almost all these problems occur right at the
 start of transfers when the tcp window size is still being worked out
 for the connection speed, and I am seeing the error count go up in
 ifconfig for the port when it happens too.  Is it possible for an error
 to get flagged in a receive descriptor without the owner bit being
 updated?

It seems the problem actually occurs when the receive descriptor ring
is full.  This seems to leave one (or sometimes more) descriptors that
claim to be owned by the MAC sitting at the head of the receive ring as
far as the driver is concerned.  I see some note in the
driver about an SP3G chipset sometimes causing this.  How would one
identify this and clear such descriptors out of the way?  Getting stuck
until the next time the MAC gets around to the descriptor and overwrites
it is not good, since it causes delays, and out of order packets.
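
One way to at least spot this condition is sketched below: count how many
MAC-owned descriptors sit at the head of the ring in front of a completed
one.  This is only a detection helper over a plain array of status words.
The name stale_head_count and the ring representation are hypothetical,
and whether such descriptors can safely be skipped or reclaimed is not
something this sketch settles.

```c
#include <stddef.h>
#include <stdint.h>

#define DESC_OWN 0x8000u  /* set: descriptor belongs to the MAC */

/*
 * Returns how many consecutive MAC-owned descriptors sit at the head of
 * the ring in front of at least one completed (driver-owned) descriptor.
 * Returns 0 when the head itself is driver-owned, or when the whole ring
 * is MAC-owned (the normal idle state).  ring_size must be a power of two.
 */
static size_t stale_head_count(const uint16_t *status, size_t ring_size,
                               size_t head)
{
    size_t mask = ring_size - 1;
    size_t n = 0;

    while (n < ring_size && (status[(head + n) & mask] & DESC_OWN))
        n++;

    return (n == ring_size) ? 0 : n;
}
```

A non-zero result only says that completed descriptors are waiting behind
a MAC-owned head; it does not say why.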

--
Len Sorensen
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re: Strange connection slowdown on pcnet32

2007-02-19 Thread Lennart Sorensen
On Mon, Feb 19, 2007 at 06:45:48PM -0500, Lennart Sorensen wrote:
 It seems the problem actually occurs when the receive descriptor ring
 is full.  This seems to leave one (or sometimes more) descriptors that
 claim to be owned by the MAC sitting at the head of the receive ring as
 far as the driver is concerned.  I see some note in the
 driver about an SP3G chipset sometimes causing this.  How would one
 identify this and clear such descriptors out of the way?  Getting stuck
 until the next time the MAC gets around to the descriptor and overwrites
 it is not good, since it causes delays, and out of order packets.

I am also noticing the receive error count going up, and the source is
this code:

if (status & 0x01)              /* Only count a general error at the */
    lp->stats.rx_errors++;      /* end of a packet. */

It appears this means I am receiving a frame marked with End Of Packet
but without Start of Packet.  I have no idea how that happens, but it
shouldn't be able to make the driver and MAC stop processing the receive
ring.
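
For reference, a small sketch of that test in isolation.  It assumes the
snippet sits inside a check that the high status byte is not exactly 0x03
(start and end of packet both set, no error bits); that enclosing check is
an assumption about the surrounding driver code and is not shown above.
The macro and function names are illustrative.

```c
#include <stdint.h>

/* Bits of the high status byte (status >> 8) used in this sketch. */
#define RX_HI_ENP 0x01u  /* end of packet */
#define RX_HI_STP 0x02u  /* start of packet */
#define RX_HI_OWN 0x80u  /* descriptor still owned by the MAC */

/*
 * Returns 1 if a descriptor with this 16-bit status would bump rx_errors
 * under the assumed enclosing check, 0 otherwise.
 */
static int counts_as_rx_error(uint16_t desc_status)
{
    unsigned hi = desc_status >> 8;

    if (hi & RX_HI_OWN)
        return 0;                   /* not ours to look at yet */
    if (hi == (RX_HI_STP | RX_HI_ENP))
        return 0;                   /* clean frame in a single buffer */
    return (hi & RX_HI_ENP) != 0;   /* counted only at end of packet */
}
```

Under that assumption the counter goes up for an end-of-packet descriptor
that lacks start-of-packet (high byte 0x01), but also for a complete frame
that carries any error bit, so the rising count alone may not distinguish
the two cases.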

--
Len Sorensen


Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

2007-02-19 Thread Lennart Sorensen
On Sat, Feb 17, 2007 at 11:11:13PM +0900, takada wrote:
 Does that mean it doesn't help even when set_cx86_reorder() isn't called?
 That function disables reordering for 0x4000: to 0x:.
 Does pcnet32 access anything outside the above range?
 
 --- arch/i386/Kconfig.cpu~2007-02-05 03:44:54.0 +0900
 +++ arch/i386/Kconfig.cpu 2007-02-17 21:25:52.0 +0900
 @@ -322,7 +322,7 @@ config X86_USE_3DNOW
  
  config X86_OOSTORE
   bool
 - depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR
 + depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR || MGEODEGX1
   default y
  
  config X86_TSC

Well, it turns out that enabling OOSTORE doesn't eliminate the problem,
but it does make it go from occurring within seconds to occurring within
many hours.  I am off to investigate some more.

Does anyone know if there is any way to flush a cache line of the cpu to
force rereading system memory for a given address or address range?

--
Len Sorensen


Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

2007-02-19 Thread Lennart Sorensen
On Mon, Feb 19, 2007 at 11:48:27AM -0800, Roland Dreier wrote:
   Does anyone know if there is any way to flush a cache line of the cpu to
   force rereading system memory for a given address or address range?
 
 There is the clflush instruction, but not all x86 CPUs support it.
 You need to check the CPUID flag to know for sure (/proc/cpuinfo will
 show a clflush flag if it is supported).

Well, I will check for that.  Of course it is still possible that it is
actually the network chip screwing up somehow.

--
Len Sorensen


Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

2007-02-19 Thread Lennart Sorensen
On Tue, Feb 20, 2007 at 08:56:39AM +0900, takada wrote:
 /proc/cpuinfo with MediaGXm :
 
 processor : 0
 vendor_id : CyrixInstead
 cpu family: 5
 model : 5
 model name: Cyrix MediaGXtm MMXtm Enhanced
 stepping  : 2
 cpu MHz   : 199.750
 cache size: 16 KB
 fdiv_bug  : no
 hlt_bug   : no
 f00f_bug  : no
 coma_bug  : no
 fpu   : yes
 fpu_exception : yes
 cpuid level   : 2
 wp: yes
 flags : fpu tsc msr cx8 cmov mmx cxmmx
 bogomips  : 401.00
 clflush size  : 32

Hmm with 2.6.18 I am seeing:

processor   : 0
vendor_id   : CyrixInstead
cpu family  : 5
model   : 9
model name  : Geode(TM) Integrated Processor by National Semi
stepping: 1
cpu MHz : 266.648
cache size  : 16 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 2
wp  : yes
flags   : fpu tsc msr cx8 cmov mmx cxmmx
bogomips: 534.50

Similar, but the last line isn't there.  It looks like 2.6.18 doesn't
actually have code to print that information though.
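
Since the flags line is the place to look, here is a minimal sketch of
checking such a line for a given flag as a whole word (so that mmx does
not match inside cxmmx).  The function name is illustrative, and the input
is assumed to be the text after the colon of the flags line.

```c
#include <string.h>

/* Returns 1 if `flag` appears as a whole whitespace-separated word in
 * `flags`, 0 otherwise. */
static int cpu_flags_contain(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return 0;

    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[len] == '\0' || p[len] == ' ' ||
                   p[len] == '\t' || p[len] == '\n';

        if (starts && ends)
            return 1;
        p += len;   /* keep searching after this partial match */
    }
    return 0;
}
```

Neither flags line quoted above contains clflush, so this returns 0 for
both of them.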

--
Len Sorensen


Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

2007-02-17 Thread Lennart Sorensen
On Sat, Feb 17, 2007 at 11:11:13PM +0900, takada wrote:
 Does that mean it doesn't help even when set_cx86_reorder() isn't called?
 That function disables reordering for 0x4000: to 0x:.
 Does pcnet32 access anything outside the above range?

No, it is accessing system memory by DMA to transfer frames.  Since the
system has 128MB of RAM, the addresses are probably all in the first 128MB
range.

I tried changing cyrix.c to explicitly set the serialize bit (0x8000 in
PCR0) rather than explicitly clearing it as is done now.  It didn't make a
difference.  I tried reversing the memory bypass setting, which also did
nothing.  Enabling CONFIG_X86_OOSTORE and recompiling, however, does make
a difference.

 --- arch/i386/Kconfig.cpu~2007-02-05 03:44:54.0 +0900
 +++ arch/i386/Kconfig.cpu 2007-02-17 21:25:52.0 +0900
 @@ -322,7 +322,7 @@ config X86_USE_3DNOW
  
  config X86_OOSTORE
   bool
 - depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR
 + depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR || MGEODEGX1
   default y
  
  config X86_TSC

I did:
depends on ((MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR) || MGEODEGX1
since I wasn't sure of the precedence in the Kconfig files.
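
The two possible groupings do give different results, which is why the
parentheses are worth having.  A minimal sketch, with the three Winchip
options collapsed into one boolean and with hypothetical function names:

```c
#include <stdbool.h>

/* Grouping used in my version: (winchip && MTRR) || GX1. */
static bool oostore_and_first(bool winchip, bool mtrr, bool geodegx1)
{
    return (winchip && mtrr) || geodegx1;
}

/* The other possible reading: winchip && (MTRR || GX1). */
static bool oostore_or_first(bool winchip, bool mtrr, bool geodegx1)
{
    return winchip && (mtrr || geodegx1);
}

/* Counts the input combinations on which the two groupings disagree. */
static int count_disagreements(void)
{
    int n = 0;

    for (int bits = 0; bits < 8; bits++) {
        bool w = (bits & 1) != 0;
        bool m = (bits & 2) != 0;
        bool g = (bits & 4) != 0;

        if (oostore_and_first(w, m, g) != oostore_or_first(w, m, g))
            n++;
    }
    return n;
}
```

The two readings disagree exactly when MGEODEGX1 is set and no Winchip
option is, which is the case that matters here.  As far as I know Kconfig
gives && higher precedence than ||, as C does, so the unparenthesised line
should already parse as the first form; treat that as an assumption and
check the kconfig-language documentation.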

--
Len Sorensen


Re: Strange connection slowdown on pcnet32

2007-02-16 Thread Lennart Sorensen
On Thu, Feb 15, 2007 at 05:50:30PM -0500, Lennart Sorensen wrote:
 I have encountered a strange behaviour with the pcnet32.
 
 I am transferring data from a server to a client routing it through my
 router.  The router has 2 ethernet ports, both of which are amd 972
 chips (pcnet32).  The transfer has so far been either http or ftp (both
 see the same problem).  I transfer lots of data, and after a while (I
 have seen anywhere from 200 to 700MB or so) the speed suddenly drops to
 less than 1KB/s.  If I ping from the router to the server, the ping
 requests go out normally (seen by tcpdump on the server) every second,
 but on the router the replies are not seen by the kernel for multiple
 seconds.  Sometimes I will see 3 ping replies together, sometimes 5 or
 even 10.  The turn around times will show 10500, 9500, 8500, ..., 500ms
 for the packets received in a batch.  ifconfig on the router shows the
 packet receive counts showing up in lumps, just as ping does, and
 tcpdump on the interface on the router.
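
The turnaround pattern is what one would expect if the replies sit
somewhere and are then released together.  A minimal sketch of the
arithmetic, assuming requests are sent at a fixed interval and every reply
reaches the stack at the same instant (the names are illustrative):

```c
/*
 * Fills rtt_ms[i] with the turnaround time of request i, where request i
 * is sent at i * interval_ms and every reply is handed to the network
 * stack at release_ms (both measured from the first request).
 */
static void batched_rtts(int n, int interval_ms, int release_ms, int *rtt_ms)
{
    for (int i = 0; i < n; i++)
        rtt_ms[i] = release_ms - i * interval_ms;
}
```

With a 1000 ms interval and a release 10500 ms after the first request,
eleven pings give 10500, 9500, 8500 and so on down to 500 ms.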
 
 Doing ifconfig down and up on the port connecting to the server makes
 the problem clear and it can handle another pile of data before the
 problem reappears.
 
 The CPU on the router is not fast enough to ensure there won't ever be
 dropped packets at 100Mbps.  When I force the port to the server to
 10Mbps I have no problems at all.
 
 Replacing the port to the server with an rtl8139 doesn't show any
 problems at 100Mbps, although the transfer rate drops from 6500KBps to
 4000KBps compared to using the pcnet32.
 
 Kernel used so far is 2.6.16 and 2.6.18.
 
 I have a tulip card I intend to try with as well just to see if it
 affects anything other than the pcnet32.
 
 Does anyone have any hints as to what part of the code to look at for
 changes made by doing ifconfig eth1 down; ifconfig eth1 up?  Any ideas
 as to what could make the reception of packets suddenly get very very
 slow?
 
 On one pass where I was running tcpdump on the router, I saw a wrap of
 the sequence number right before the problem occurred, but that has not
 been the case every time as far as I can tell, so I am not sure if that
 is related to the problem at all.

I have run some tests using 2.6.8 now, and so far it hasn't failed.

Still investigating...

--
Len Sorensen


Re: Strange connection slowdown on pcnet32

2007-02-16 Thread Lennart Sorensen
On Fri, Feb 16, 2007 at 09:35:54AM -0500, Lennart Sorensen wrote:
 I have run some tests using 2.6.8 now, and so far it hasn't failed.
 
 Still investigating...

And 5 minutes later 2.6.8 failed the same way too.  Maybe I will go back
to 2.4 and check.

--
Len Sorensen


Re: Re: Strange connection slowdown on pcnet32

2007-02-16 Thread Lennart Sorensen
On Fri, Feb 16, 2007 at 10:21:24AM -0600, [EMAIL PROTECTED] wrote:
 Are there any messages in the log about timeouts, or anything else from the 
 driver? When it gets in this state, can you communicate with another system, 
 and does it have the same slow behavior?

Nope no timeouts or messages.  As far as the system looks, cpu and ram and
logs show nothing unusual.  Just very slow reception on the ethernet port
going towards the server providing the data for the transfer.  Messages do
get through eventually, but very very late (when a ping reply arrives at
the port and takes 5 to 10 seconds to make it to the network stack, then
something isn't right, at least when there is no other traffic waiting).

I did have NAPI in the driver even in 2.6.8 (I was adding that at the
time).  I am now testing with 2.6.8 without NAPI (so no mask/unmask of
receive interrupts taking place), and so far it has run for over an hour
without failing, although that doesn't prove it won't, just that it has
lasted longer.

I think I will try compiling 2.6.18 again with NAPI disabled on the
pcnet32 and see what that does.  There is a chance that something in the
NAPI implementation is breaking the chip's receive somehow although I
can't currently imagine what it could be or how.

--
Len Sorensen


Re: Re: Strange connection slowdown on pcnet32

2007-02-16 Thread Lennart Sorensen
On Fri, Feb 16, 2007 at 12:21:10PM -0500, Lennart Sorensen wrote:
 On Fri, Feb 16, 2007 at 10:21:24AM -0600, [EMAIL PROTECTED] wrote:
  Are there any messages in the log about timeouts, or anything else from the 
  driver? When it gets in this state, can you communicate with another 
  system, and does it have the same slow behavior?
 
 Nope no timeouts or messages.  As far as the system looks, cpu and ram and
 logs show nothing unusual.  Just very slow reception on the ethernet port
 going towards the server providing the data for the transfer.  Messages do
 get through eventually, but very very late (when a ping reply arrives at
 the port and takes 5 to 10 seconds to make it to the network stack, then
 something isn't right, at least when there is no other traffic waiting).
 
 I did have NAPI in the driver even in 2.6.8 (I was adding that at the
 time).  I am now testing with 2.6.8 without NAPI (so no mask/unmask of
 receive interrupts taking place), and so far it has run for over an hour
 without failing, although that doesn't prove it won't, just that it has
 lasted longer.
 
 I think I will try compiling 2.6.18 again with NAPI disabled on the
 pcnet32 and see what that does.  There is a chance that something in the
 NAPI implementation is breaking the chip's receive somehow although I
 can't currently imagine what it could be or how.

So I have determined that when the port gets stuck/slow it is hitting
this problem:

(in pcnet32_rx):
while (quota > npackets && (short)le16_to_cpu(rxp->status) >= 0) {
    if (netif_msg_intr(lp))
        printk(KERN_DEBUG "%s: pcnet32_rx npackets %d\n",
               dev->name, npackets);
    pcnet32_rx_entry(dev, lp, rxp, entry);
    npackets += 1;
    /*
     * The docs say that the buffer length isn't touched, but Andrew
     * Boyd of QNX reports that some revs of the 79C965 clear it.
     */
    rxp->buf_length = le16_to_cpu(2 - PKT_BUF_SZ);
    wmb();  /* Make sure owner changes after others are visible */
    rxp->status = le16_to_cpu(0x8000);
    entry = (++lp->cur_rx) & lp->rx_mod_mask;
    rxp = &lp->rx_ring[entry];
}

Unfortunately rxp->status reads as 0x8000 for a long time, and then
eventually changes to 0x0310 at which point the receive happens.  Until
that happens, the poll is called about once per second and each time
returns that 0 packets were received but that more packets are waiting.
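
To illustrate why a single MAC-owned descriptor at the head blocks
everything behind it, here is a small user-space model of the loop above.
It works on a plain array of status words instead of the real ring, omits
the packet handling, and all names are illustrative.

```c
#include <stdint.h>

/*
 * Walks the ring the way the loop above does: it stops at the first
 * descriptor whose status has the top bit set (negative when viewed as a
 * signed 16-bit value), or when the quota is used up.  Each handled
 * descriptor is handed back to the MAC.  ring_size must be a power of two.
 * Returns the number of descriptors handled.
 */
static int rx_poll_sim(uint16_t *status, unsigned ring_size,
                       unsigned *cur_rx, int quota)
{
    unsigned mask = ring_size - 1;
    unsigned entry = *cur_rx & mask;
    int npackets = 0;

    while (quota > npackets && (int16_t)status[entry] >= 0) {
        npackets++;
        status[entry] = 0x8000;         /* give it back to the MAC */
        entry = ++(*cur_rx) & mask;
    }
    return npackets;
}
```

With the head at an entry whose status is 0x8000 the function returns 0 no
matter how many completed descriptors follow, which matches the "got 0
packets" lines in the log.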

I can't figure out why it would get a status of 0x8000 which means that
the MAC hasn't changed the ownership flag on the packet yet, even though
it generated a receive interrupt multiple seconds ago.  Could it be some
caching issue that makes the cpu not realize that the memory has in fact
been changed by DMA?  Any way to force a cache update for a memory
location?

The CPU is a Geode SC1200 (Geode GX1 + Companion in one).  So far I have
seen __memcpy from system ram to device memory get data out of order, so
I have no reason to believe the cpu doesn't have more stupid bugs
related to doing I/O.

--
Len Sorensen


Re: Re: Strange connection slowdown on pcnet32

2007-02-16 Thread Lennart Sorensen
On Fri, Feb 16, 2007 at 03:23:00PM -0500, Lennart Sorensen wrote:
 So I have determined that when the port gets stuck/slow it is hitting
 this problem:
 
 (in pcnet32_rx):
 while (quota > npackets && (short)le16_to_cpu(rxp->status) >= 0) {
     if (netif_msg_intr(lp))
         printk(KERN_DEBUG "%s: pcnet32_rx npackets %d\n",
                dev->name, npackets);
     pcnet32_rx_entry(dev, lp, rxp, entry);
     npackets += 1;
     /*
      * The docs say that the buffer length isn't touched, but Andrew
      * Boyd of QNX reports that some revs of the 79C965 clear it.
      */
     rxp->buf_length = le16_to_cpu(2 - PKT_BUF_SZ);
     wmb();  /* Make sure owner changes after others are visible */
     rxp->status = le16_to_cpu(0x8000);
     entry = (++lp->cur_rx) & lp->rx_mod_mask;
     rxp = &lp->rx_ring[entry];
 }
 
 Unfortunately rxp->status reads as 0x8000 for a long time, and then
 eventually changes to 0x0310 at which point the receive happens.  Until
 that happens, the poll is called about once per second and each time
 returns that 0 packets were received but that more packets are waiting.
 
 I can't figure out why it would get a status of 0x8000 which means that
 the MAC hasn't changed the ownership flag on the packet yet, even though
 it generated a receive interrupt multiple seconds ago.  Could it be some
 caching issue that makes the cpu not realize that the memory has in fact
 been changed by DMA?  Any way to force a cache update for a memory
 location?
 
 The CPU is a Geode SC1200 (Geode GX1 + Companion in one).  So far I have
 seen __memcpy from system ram to device memory get data out of order, so
 I have no reason to believe the cpu doesn't have more stupid bugs
 related to doing I/O.

It seems whenever it gets stuck, it is always the same descriptor it is
stuck on.  Here is my current log:

eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next-status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt
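
The csr0 values in the log are easier to follow once split into bits. The following is only a sketch: the bit names are assumed from the usual PCnet CSR0 layout in the AMD data sheets, and the table and the `csr0_decode` helper are hypothetical, not taken from pcnet32.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* CSR0 bit names, assumed from the PCnet data sheet (hypothetical table,
 * not taken from pcnet32.c). Listed from bit 15 down to bit 0. */
struct csr0_bit {
	uint16_t mask;
	const char *name;
};

static const struct csr0_bit csr0_bits[] = {
	{ 0x8000, "ERR"  }, { 0x4000, "BABL" }, { 0x2000, "CERR" },
	{ 0x1000, "MISS" }, { 0x0800, "MERR" }, { 0x0400, "RINT" },
	{ 0x0200, "TINT" }, { 0x0100, "IDON" }, { 0x0080, "INTR" },
	{ 0x0040, "IENA" }, { 0x0020, "RXON" }, { 0x0010, "TXON" },
	{ 0x0008, "TDMD" }, { 0x0004, "STOP" }, { 0x0002, "STRT" },
	{ 0x0001, "INIT" },
};

/* Fill buf with the space-separated names of the bits set in csr0. */
static char *csr0_decode(uint16_t csr0, char *buf, size_t len)
{
	size_t i;

	if (len == 0)
		return buf;
	buf[0] = '\0';
	for (i = 0; i < sizeof(csr0_bits) / sizeof(csr0_bits[0]); i++) {
		if (!(csr0 & csr0_bits[i].mask))
			continue;
		if (buf[0] != '\0')
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, csr0_bits[i].name, len - strlen(buf) - 1);
	}
	return buf;
}
```

Under those assumptions, 0x4f3 decodes to RINT INTR IENA RXON TXON STRT INIT and 0x33 to RXON TXON STRT INIT, so each logged interrupt would be a receive interrupt that is acknowledged on the way out.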

Re: Re: Strange connection slowdown on pcnet32

2007-02-16 Thread Lennart Sorensen
On Fri, Feb 16, 2007 at 04:01:57PM -0500, Lennart Sorensen wrote:
> It seems whenever it gets stuck, it is always the same descriptor it is
> stuck on.  Here is my current log:
>
> eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next-status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets

MediaGX/GeodeGX1 requires X86_OOSTORE. (Was: Re: Strange connection slowdown on pcnet32)

2007-02-16 Thread Lennart Sorensen
On Fri, Feb 16, 2007 at 05:27:28PM -0500, Lennart Sorensen wrote:
> On Fri, Feb 16, 2007 at 04:01:57PM -0500, Lennart Sorensen wrote:
> > It seems whenever it gets stuck, it is always the same descriptor it is
> > stuck on.  Here is my current log:
> >
> > eth1: interrupt  csr0=0x4f3 new csr=0x33, csr3=0x.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: 8000 next-status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets

Re: MediaGX/GeodeGX1 requires X86_OOSTORE. (Was: Re: Strange connection slowdown on pcnet32)

2007-02-16 Thread Lennart Sorensen
On Fri, Feb 16, 2007 at 05:48:24PM -0500, Lennart Sorensen wrote:
> Well so far it really looks like enabling OOSTORE on the Geode
> SC1200/GX1 really does make a difference.  A bit of searching seems to
> indicate the person that originally submitted the patch that enabled
> load/store reordering on the MediaGX/Geode thought it might need OOSTORE,
> but was convinced by others it didn't.  Looks like it really does need
> it.  The failure that occurred before within a few seconds of starting a
> large transfer no longer happens, and all I did was enable
> CONFIG_X86_OOSTORE, recompile pcnet32.ko and load the new module on
> the running system.  Moving back to the pcnet32.ko built without OOSTORE
> enabled hits the failure again within seconds, until ifconfig eth1
> down/up reinitializes its descriptor ring, after which it survives
> another bit of transfer and then fails again.

Well forcing load/store serialization on the CPU doesn't help, and disabling
memory bypass doesn't help.  Enabling X86_OOSTORE does help.  What a
stupid CPU design.
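
To make the ordering issue concrete, here is a minimal userspace sketch of handing a receive descriptor back to the NIC. It is an analogue only: the struct is a simplified stand-in rather than the pcnet32 descriptor layout, the names are hypothetical, and C11 release/acquire ordering stands in for the kernel's wmb(), which (as an assumption about the i386 headers of that era) is only a real store fence when CONFIG_X86_OOSTORE is set.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DESC_OWN 0x8000	/* assumed: set while the NIC owns the descriptor */

/* Simplified stand-in for one receive descriptor (hypothetical layout). */
struct rx_desc {
	uint32_t base;			/* bus address of the buffer */
	int16_t buf_length;		/* negative buffer length */
	_Atomic uint16_t status;	/* ownership and result bits */
};

/* Refill a descriptor and hand it to the (simulated) NIC. */
static void rx_desc_give_to_nic(struct rx_desc *d, uint32_t base,
				int16_t neg_len)
{
	d->base = base;
	d->buf_length = neg_len;
	/* Release store: base and buf_length must be visible before OWN. */
	atomic_store_explicit(&d->status, DESC_OWN, memory_order_release);
}

/* True when the host may look at the descriptor contents. */
static int rx_desc_owned_by_host(struct rx_desc *d)
{
	uint16_t s = atomic_load_explicit(&d->status, memory_order_acquire);

	return !(s & DESC_OWN);
}
```

If the status store were allowed to pass the other two, the consumer could see OWN with stale contents, which is the kind of symptom a descriptor stuck at status 8000 next to a completed one would be consistent with.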

So far nothing has managed to fix the __memcpy_toio in the jsm driver
getting data out of order when sending on an Exar PCI UART chip.  Only
calling memcpy with one byte at a time seems to work there.  Works fine
on every other CPU of course.  What else am I going to discover is wrong
with this CPU?
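
The byte-at-a-time workaround can be sketched in plain C as below. This is a userspace illustration with a hypothetical helper name; it is not the jsm code. In a driver the loop body would normally be a writeb() on the mapped window, and note that volatile only stops the compiler from merging or reordering the stores, it says nothing about what the CPU does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy n bytes into a device-like window strictly one byte per store,
 * in ascending address order.
 */
static void copy_to_window_bytewise(volatile uint8_t *dst,
				    const uint8_t *src, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i];	/* one volatile store per byte */
}
```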

--
Len Sorensen
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Strange connection slowdown on pcnet32

2007-02-15 Thread Lennart Sorensen
I have encountered a strange behaviour with the pcnet32.

I am transferring data from a server to a client, routing it through my
router.  The router has 2 ethernet ports, both of which are AMD 972
chips (pcnet32).  The transfer has so far been either http or ftp (both
see the same problem).  I transfer lots of data, and after a while (I
have seen anywhere from 200 to 700MB or so) the speed suddenly drops to
less than 1KB/s.  If I ping from the router to the server, the ping
requests go out normally (seen by tcpdump on the server) every second,
but on the router the replies are not seen by the kernel for multiple
seconds.  Sometimes I will see 3 ping replies together, sometimes 5 or
even 10.  The turnaround times will show 10500, 9500, 8500, ..., 500ms
for the packets received in a batch.  ifconfig on the router shows the
receive packet counts going up in lumps, just as ping does, and so does
tcpdump on the interface on the router.
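
The 10500, 9500, ..., 500ms pattern is what one would expect if replies to pings sent one second apart sit in the receive ring and are all handed to the kernel at the same moment. A small sketch of that arithmetic, with a hypothetical helper name and the real network round trip ignored:

```c
#include <stddef.h>

/*
 * Round-trip time in ms reported for each of n pings sent interval_ms
 * apart (the first at t = 0) when every reply is only delivered to the
 * kernel at t = deliver_ms.
 */
static void batched_rtts_ms(long deliver_ms, long interval_ms, size_t n,
			    long *rtt_out)
{
	size_t i;

	for (i = 0; i < n; i++)
		rtt_out[i] = deliver_ms - (long)i * interval_ms;
}
```

With deliver_ms = 10500, interval_ms = 1000 and n = 11 this yields 10500 for the oldest ping down to 500 for the newest, matching the batch described above.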

Doing ifconfig down and up on the port connecting to the server makes
the problem clear and it can handle another pile of data before the
problem reappears.

The CPU on the router is not fast enough to ensure there won't ever be
dropped packets at 100Mbps.  When I force the port to the server to
10Mbps I have no problems at all.

Replacing the port to the server with an rtl8139 doesn't show any
problems at 100Mbps, although the transfer rate drops from 6500KBps to
4000KBps compared to using the pcnet32.

Kernel used so far is 2.6.16 and 2.6.18.

I have a tulip card I intend to try with as well just to see if it
affects anything other than the pcnet32.

Does anyone have any hints as to what part of the code to look at for
changes made by doing ifconfig eth1 down; ifconfig eth1 up?  Any ideas
as to what could make the reception of packets suddenly get very very
slow?

On one pass where I was running tcpdump on the router, I saw a wrap of
the sequence number right before the problem occurred, but that has not
been the case every time as far as I can tell, so I am not sure if that
is related to the problem at all.

--
Len Sorensen


Re: [PATCH 04/26] rt2x00: EEPROM 93Cx6

2006-12-13 Thread Lennart Sorensen
On Wed, Dec 13, 2006 at 05:47:41PM +0100, Ivo van Doorn wrote:
> Do you need to actually write data to the eeprom chip?
> Currently the module does not support writing to the eeprom,
> this is something I could add (the original Ralink code, on which this
> module is based, also contains the code to write to the EEPROM).

I am going to use it to write the custom pci vendor ID to the eeprom, so
yes I intend to write to it.  The code appears as if it has the ability
to write to the eeprom but I didn't look at all of it carefully yet.  I
don't actually have any need to read it back, although I intend to do so
to verify the contents.
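
For reference, a rough sketch of how a 93Cx6 read or write command word is put together before being clocked out. The opcode values (with the start bit folded in) and the address widths are assumptions taken from the 93C46/93C66 data sheets, and the macro and function names are hypothetical, not the eeprom_93cx6 module's API. A real write also needs a write-enable command first and 16 data bits after the command.

```c
#include <stdint.h>

/* Opcodes with the leading start bit included (assumed from data sheets). */
#define EEPROM_OP_WRITE		0x5	/* 1 01 */
#define EEPROM_OP_READ		0x6	/* 1 10 */
#define EEPROM_OPCODE_BITS	3

/* Address widths in bits for 16-bit organisation (assumed). */
#define EEPROM_ADDR_BITS_93C46	6
#define EEPROM_ADDR_BITS_93C66	8

/* Command word: opcode in the high bits, word address in the low bits. */
static uint32_t eeprom_command(unsigned int opcode, unsigned int addr_bits,
			       unsigned int word)
{
	uint32_t addr_mask = (1u << addr_bits) - 1;

	return ((uint32_t)opcode << addr_bits) | (word & addr_mask);
}

/* Number of bits to shift out for one command. */
static unsigned int eeprom_command_bits(unsigned int addr_bits)
{
	return EEPROM_OPCODE_BITS + addr_bits;
}
```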

--
Len Sorensen


Re: [PATCH 0/2] d80211, rt2x00: fixes

2006-12-13 Thread Lennart Sorensen
On Wed, Dec 13, 2006 at 06:00:35PM +0100, Jiri Benc wrote:
> John, in addition to the previous pull request, please also apply the
> following two fixes.

What is the state of the rt2x00 driver by now?  I have been playing
around with an rt2500 based card, with some success but not enough for
me to switch over from wired ethernet yet on my machine.  I used to get
lots of hard lockups, but with the latest cvs snapshot in Debian's
rt2x00-source package, it no longer seems to lock up.  It also now works
with WPA without using wpa_supplicant (Yay!  Good work.).  It does
however very frequently pause the transfer, and then after a while (20
or 30 seconds probably) it will start moving data again and my transfer
will continue.  Is this considered normal for now?  My card happens to
be a Linksys WMP54G version 4.0.  At least pauses beat crashes.  It's
going the right way for a work in progress.

I guess I should go read the bug tracking system and try out newer cvs
versions. :)

--
Len Sorensen


Re: [PATCH] eeprom_93cx6: Add write support

2006-12-13 Thread Lennart Sorensen
On Wed, Dec 13, 2006 at 07:56:50PM +0100, Ivo van Doorn wrote:
> This patch adds support for writing to the eeprom,
> this also moves some duplicate code into separate functions.
>
> Signed-off-by Ivo van Doorn [EMAIL PROTECTED]

Thank you.  I will have a try with that to see if I can get that to work
with the jsm driver.  Too bad the serial drivers don't have any
geteeprom/seteeprom standard ioctls the way ethtool does for network
devices.

--
Len Sorensen


Re: [PATCH 0/2] d80211, rt2x00: fixes

2006-12-13 Thread Lennart Sorensen
On Wed, Dec 13, 2006 at 12:38:43PM -0500, Dan Williams wrote:
> How, by private ioctls?  That's just wrong; I believe you still need to
> go through the 4-way handshake to get the right keying information even
> if you use PSK, which means you still need the supplicant, right?

All I did was add this to /etc/network/interfaces:

iface wlan0 inet static
    address 192.168.1.51
    network 192.168.1.0
    netmask 255.255.255.0
    gateway 192.168.1.254
    broadcast 192.168.1.255
    pre-up ifconfig wlan0 up
    pre-up iwpriv wlan0 set AuthMode=WPAPSK
    pre-up iwpriv wlan0 set EncrypType=TKIP
    pre-up iwconfig wlan0 essid USR8054
    pre-up iwpriv wlan0 set WPAPSK=My WPA passphrase...

It seems to work, although I guess I could be wrong.  It was what I
found in the documentation for the rt2x00 driver for doing WPA.  It
looks nothing like the wpa_supplicant stuff I used to have with an older
version of the driver.  My understanding was that the rt2x00 driver
and/or d80211 stack took care of it now.

--
Len Sorensen


Re: [PATCH 0/2] d80211, rt2x00: fixes

2006-12-13 Thread Lennart Sorensen
On Wed, Dec 13, 2006 at 06:56:57PM +0100, Ivo van Doorn wrote:
> rt2x00 completely uses the dscape stack, so I am not sure how he is
> managing this wpa without wpa_supplicant with rt2x00.
> Lennart, are you using rt2x00 or the legacy rt2500 driver?

rt2x00 with dscape stack.

--
Len Sorensen


Re: [PATCH 0/2] d80211, rt2x00: fixes

2006-12-13 Thread Lennart Sorensen
On Wed, Dec 13, 2006 at 06:49:07PM +0100, Ivo van Doorn wrote:
> Well results seem to vary between users. Recently users have started
> reporting panics and freezes with rt2x00. I have not yet traced that problem
> to the source, because the panics I have received don't contain any rt2x00
> or d80211 functions. But the presence of the rt2x00 module is the
> important factor in reproducing the crash. :(
> Others however seem to have more success with rt2x00, master mode
> seems to work with reasonable speed. Association with managed mode
> is still very short-lived. People who manage to get associated are being
> kicked from the AP quite quickly. (This could be because d80211 is not
> sending NULL-frames every once in a while).

I should do some more testing and submit a report of how it is behaving.
Is there anything specific worth checking if it misbehaves?  (So far
misbehaving seems to be pausing the network transmissions for a short
period and then resuming.)

> But since results vary much between users, I can still describe the rt2x00
> state as an experimental driver

Well every once in a while I load a new version and see how it is.
Eventually I hope it will work perfectly, and I can move my mythtv box
into the living room with the TV using wireless, rather than sitting in
the basement next to the router machine using wired ethernet.  Running
ethernet cable through the wall when the basement is all finished just
seems like too much work.  I got an rt based card because I knew it was
being worked on.  I know someday it will simply work, which is good
enough for me.

--
Len Sorensen


Re: [PATCH 0/2] d80211, rt2x00: fixes

2006-12-13 Thread Lennart Sorensen
On Wed, Dec 13, 2006 at 10:28:15PM +0100, Ivo Van Doorn wrote:
> That is definitely the rt2500 legacy driver and _not_ the rt2x00 driver.

Yeah I just noticed that a few minutes ago.  I had been trying out both
to see how they worked, and I left the old module loaded by accident.

> Correct, that is why those iwpriv commands are the clear evidence
> you are not using rt2x00 but rt2500 legacy. Check which driver is loaded:
> rt2500 means legacy
> rt2500pci means rt2x00.

Yep I am now poking with the wpa_supplicant again, getting other
interesting messages from it.

I will try to give a report on how my system behaves with the rt2x00
driver soon since apparently some of my testing was with the wrong
driver.  Oops. :)

--
Len Sorensen


Simultanious transmits seems to cause hang on pcnet32

2006-07-18 Thread Lennart Sorensen
I am currently doing some testing on my system and managing to totally
hang the system (so that the watchdog has to come along and reboot it).

The setup is this:
I have a PLX PCI-PCI bridge with 4 79C972 chips behind it, each running
100baseTX.  I am transmitting traffic from a smartbits test system from
port 1 to port 3 and back, and from port 2 to port 4 and back.  I am
running 500 packets/second with 60 byte packets each way.

If I start the traffic on all 4 ports at the same time, I get less than
100 packets received back at the smartbits on each port, and then the
linux kernel is hung.  No response to anything I have tried.  The
watchdog then reboots the system.

If I start traffic on less than 4 ports, and then add the remaining
ports a second or so later, then it runs just fine and keeps up with the
traffic.

I tried making the traffic all flow out eth0 (an rtl8139 port) instead
of out the pcnet32 ports, and then there is no problem, so I think there
is some problem when multiple ports try to start transmitting at the
same time.

So far it has failed with 2.6.8 and 2.6.16 and with 2.6.17's pcnet32
with the napi patches applied.

I noticed that sometime between 2.6.4 and 2.6.8, the TxDone interrupts
were removed entirely, whereas they used to be sent every once in a
while.  I am not sure if this is making a difference yet.

I tried increasing the ring sizes to their maximum setting of 9/9 rather
than the current default of 4/5, and that didn't make any difference
either.
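
For readers unfamiliar with the 4/5 and 9/9 figures: assuming they are log2 ring sizes (as the PCNET32_LOG_*_BUFFERS style constants in the driver suggest), the descriptor counts work out as in this small sketch. The function names are hypothetical.

```c
/* Number of descriptors in a ring whose size is given as a log2 value. */
static unsigned int ring_entries(unsigned int log2_size)
{
	return 1u << log2_size;
}

/* Next ring index, wrapping with a mask as power-of-two rings allow. */
static unsigned int ring_next(unsigned int index, unsigned int log2_size)
{
	return (index + 1) & (ring_entries(log2_size) - 1);
}
```

On that reading the default would be 16 transmit and 32 receive descriptors, and the maximum 512 of each.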

Does anyone have a suggestion for how to go about debugging this issue?
So far I am very confused.

I tried turning on lots of debugging in pcnet32, but that seems to slow
the system down enough (printing debug messages on the serial console)
that it only manages to transmit 10 packets per port per second, at
which point it doesn't lock up.  Reducing the test setting from 500
60-byte packets/second to 100 makes the problem disappear as well.
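
As a rough sanity check on how light this load is, the offered bit rate per port and direction can be worked out as below. This is a back-of-the-envelope sketch with a hypothetical helper; it counts only the frame bytes given above and ignores preamble, CRC and inter-frame gap.

```c
/* Offered load in bits per second for pps frames of frame_bytes each. */
static unsigned long offered_bps(unsigned long pps, unsigned long frame_bytes)
{
	return pps * frame_bytes * 8UL;
}
```

500 packets/second of 60 bytes is 240000 bits/s, and 100 packets/second is 48000 bits/s, both a tiny fraction of a 100Mbit link, which suggests the hang depends on timing rather than on throughput.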

So I am open to suggestions to try.  I really don't know how to go
about debugging this when it makes the kernel lock up.  It makes me think
it is getting stuck somewhere with interrupts disabled, but I can't see
anything in the transmit code that looks like that could happen.

--
Len Sorensen


Re: Simultanious transmits seems to cause hang on pcnet32

2006-07-18 Thread Lennart Sorensen
On Tue, Jul 18, 2006 at 10:57:47AM -0700, Don Fry wrote:
> I don't know what a 'smartbits test system' is or how it works.  Could
> you please briefly explain what it is and does?

It is a network test system built by Spirent (www.spirentcom.com).

It is mainly a layer 2 test system (you configure what you want the
ethernet packets to look like, what rate you want them sent at, and what
fields to change and by how much on each packet sent out).  We have it
configured to generate packets from 192.168.1.2 to 192.168.3.2 (and vice
versa), with the ip of the router with the pcnet32 chips in it, set as
the gateway.  The packets are simply an ethernet packet with the IPv4
header with the source and destination IP filled in, along with the
other required fields and the checksum, and then the data part of the
packet filled with 0s in this case.

> Is the rtl8139 on the same PCI bus?

The 8139 is on the primary PCI bus, the 972s are behind the pci bridge.
The 8139 driver is normally not even loaded.

> Is there a version of the pcnet32 driver that does work?  Is this a
> stock driver or do you have modifications made as well?

I haven't found one that works yet.  The only changes I have made are to
initialize the PHY and set the MAC address, since we don't have an eeprom
connected to the 972s.  I was thinking of going and trying with 2.4.27
or something around there, to see if an older driver behaves differently.

> The ltint or TxDone interrupt deferral code was removed in May 2004,
> 2.6.7 timeframe.  Every transmit packet causes an interrupt, rather than
> just occasionally.

Hmm, the way I read the code, it looked like setting the status to 8300
made no packet generate the interrupt, and setting it to 9300 made a
packet generate an interrupt.  I guess I read it backwards.  That
wouldn't surprise me. :)
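
The two status values differ in a single bit. The sketch below names the bits as assumed from the PCnet data sheet's transmit descriptor layout; the macro and function names are hypothetical. Bit 12 is taken here to act as LTINT (interrupt after this transmit) only when last-transmit-interrupt mode is enabled in the chip; with that mode off, every completed transmit would raise an interrupt regardless.

```c
#include <stdint.h>

/* Transmit descriptor status bits (assumed, not copied from pcnet32.c). */
#define TMD_OWN		0x8000	/* descriptor owned by the NIC */
#define TMD_LTINT	0x1000	/* request TxDone interrupt for this frame */
#define TMD_STP		0x0200	/* start of packet */
#define TMD_ENP		0x0100	/* end of packet */

/* True if this status asks for a TxDone interrupt in LTINT mode. */
static int tx_status_requests_interrupt(uint16_t status)
{
	return (status & TMD_LTINT) != 0;
}
```

Under those assumptions 0x8300 is OWN, STP and ENP, and 0x9300 is the same plus LTINT.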

> Does reducing the ring size make any difference?  Or tx large/rx small,
> or vice-versa?

I don't know.  I can try that.

> Is there any way to see what is happening on the PCI bus where the
> pcnet32 devices are connected?  Or see what is happening on the master
> side of the pci-to-pci bridge?  Do the chips share any interrupt lines
> or do they all have dedicated irq's?

We have two interrupts for the PCI bus, irq10 and 11.  eth1 and 3 share
one, and eth2 and 4 share the other.

> Is this an SMP or UP system?

Single amd geode SCx200 266MHz.

I have also considered building with PREEMPT off, to see if that makes a
difference, not that there are really any user space processes doing
anything on the system.

--
Len Sorensen


Re: [RFT PATCH] pcnet32: NAPI support

2006-06-29 Thread Lennart Sorensen
On Wed, Jun 28, 2006 at 09:55:41AM -0700, Don Fry wrote:
> Yes, I saw the debug statements when creating the email and was too lazy
> to remove them and create a new patch.  The patch needs to be broken up
> into functional pieces anyway, so since it has passed all of my testing
> as well, I will start on that...

So it might make 2.6.18 then? :)

I just updated the driver in my 2.6.16 kernel to the 2.6.17 version and
applied your patch, and then added my own weird stuff that no one else
will want.

Len Sorensen


Re: [RFT PATCH] pcnet32: NAPI support

2006-06-28 Thread Lennart Sorensen
On Fri, Jun 23, 2006 at 02:32:12PM -0700, Don Fry wrote:
> This set of changes combines the work done by Len Sorensen and myself to
> add compile time support for NAPI for the pcnet32 driver.  I have tested
> it on ia32 and ppc64 hardware with various versions of the pcnet32
> adapter.  I have also made a few changes requested by Jon Mason, but the
> substitution of the many magic numbers in the driver is not yet done.
>
> If no-one encounters any problems when testing this, I will break up the
> several changes into proper patches and submit them next week.

Well so far this is working for me.  It is a somewhat different layout
of the interrupt handler so it took me a bit of work to get the features
I need patched in, but in the end I ended up with simpler code as a
result, so I am quite happy with the new layout.  The driver works on
everything I have to try it on so far.

> Signed-off-by:  Don Fry [EMAIL PROTECTED]
>
>
> --- linux-2.6.17/drivers/net/orig.Kconfig 2006-06-15 11:49:39.0 -0700
> +++ linux-2.6.17/drivers/net/Kconfig  2006-06-22 15:44:52.0 -0700
> @@ -1272,6 +1272,23 @@ config PCNET32
>  <file:Documentation/networking/net-modules.txt>. The module
>  will be called pcnet32.
>  
> +config PCNET32_NAPI
> + bool "Use RX polling (NAPI) (EXPERIMENTAL)"
> + depends on PCNET32 && EXPERIMENTAL
> + help
> +   NAPI is a new driver API designed to reduce CPU and interrupt load
> +   when the driver is receiving lots of packets from the card. It is
> +   still somewhat experimental and thus not yet enabled by default.
> +
> +   If your estimated Rx load is 10kpps or more, or if the card will be
> +   deployed on potentially unfriendly networks (e.g. in a firewall),
> +   then say Y here.
> +
> +   See <file:Documentation/networking/NAPI_HOWTO.txt> for more
> +   information.
> +
> +   If in doubt, say N.
> +
>  config AMD8111_ETH
>   tristate "AMD 8111 (new PCI lance) support"
>   depends on NET_PCI && PCI
>
> --- linux-2.6.17/drivers/net/orig.pcnet32.c   Sat Jun 17 18:49:35 2006
> +++ linux-2.6.17/drivers/net/pcnet32.c        Fri Jun 23 13:13:02 2006
> @@ -21,9 +21,15 @@
>   *
>   */
>  
> +#include <linux/config.h>
> +
>  #define DRV_NAME "pcnet32"
> -#define DRV_VERSION  "1.32"
> -#define DRV_RELDATE  "18.Mar.2006"
> +#ifdef CONFIG_PCNET32_NAPI
> +#define DRV_VERSION  "1.33-NAPI"
> +#else
> +#define DRV_VERSION  "1.33"
> +#endif
> +#define DRV_RELDATE  "23.Jun.2006"
>  #define PFX  DRV_NAME ": "
>  
>  static const char *const version =
> @@ -58,18 +64,15 @@ static const char *const version =
>   * PCI device identifiers for "new style" Linux PCI Device Drivers
>   */
>  static struct pci_device_id pcnet32_pci_tbl[] = {
> - { PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_LANCE_HOME,
> -   PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
> - { PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_LANCE,
> -   PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_LANCE_HOME), },
> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_LANCE), },
>  
>   /*
>    * Adapters that were sold with IBM's RS/6000 or pSeries hardware have
>    * the incorrect vendor id.
>    */
> - { PCI_VENDOR_ID_TRIDENT, PCI_DEVICE_ID_AMD_LANCE,
> -   PCI_ANY_ID, PCI_ANY_ID,
> -   PCI_CLASS_NETWORK_ETHERNET << 8, 0x00, 0},
> + { PCI_DEVICE(PCI_VENDOR_ID_TRIDENT, PCI_DEVICE_ID_AMD_LANCE),
> +   .class = (PCI_CLASS_NETWORK_ETHERNET << 8), .class_mask = 0x00, },
>  
>   { } /* terminate list */
>  };
 @@ -277,13 +280,14 @@ struct pcnet32_private {
   u32 phymask;
  };
  
 -static void pcnet32_probe_vlbus(void);
  static int pcnet32_probe_pci(struct pci_dev *, const struct pci_device_id *);
  static int pcnet32_probe1(unsigned long, int, struct pci_dev *);
  static int pcnet32_open(struct net_device *);
  static int pcnet32_init_ring(struct net_device *);
  static int pcnet32_start_xmit(struct sk_buff *, struct net_device *);
 -static int pcnet32_rx(struct net_device *);
 +#ifdef CONFIG_PCNET32_NAPI
 +static int pcnet32_poll(struct net_device *dev, int *budget);
 +#endif
  static void pcnet32_tx_timeout(struct net_device *dev);
  static irqreturn_t pcnet32_interrupt(int, void *, struct pt_regs *);
  static int pcnet32_close(struct net_device *);
 @@ -425,6 +429,235 @@ static struct pcnet32_access pcnet32_dwi
   .reset = pcnet32_dwio_reset
  };
  
 +static void pcnet32_netif_stop(struct net_device *dev)
 +{
 + dev->trans_start = jiffies;
 + netif_poll_disable(dev);
 + netif_tx_disable(dev);
 +}
 +
 +static void pcnet32_netif_start(struct net_device *dev)
 +{
 + netif_wake_queue(dev);
 + netif_poll_enable(dev);
 +}
 +
 +/*
 + * Allocate space for the new sized tx ring.
 + * Free old resources
 + * Save new resources.
 + * Any failure keeps old resources.
 + * Must be called with lp->lock held.
 + */
 +static void pcnet32_realloc_tx_ring(struct net_device 

Re: [RFT] pcnet32 NAPI changes

2006-06-20 Thread Lennart Sorensen
On Tue, Jun 20, 2006 at 08:53:55AM -0500, Jon Mason wrote:
> The amount of polls per received packet is very low, thus removing the
> benefit of NAPI.  A compile time option would allow those users who know
> better to DTRT.

Well I know on the slow poke system I run on, with the napi polling, the
system can process packets, and get work done, and not fall over and die
from handling interrupts.  Without it, even 70Mbit of data on a single
port will flood the system with packet overruns to the point the
watchdog times out and the system reboots.  So even if polling
is slightly less efficient with little traffic, it is certainly a lot
more efficient and safer when there is suddenly a lot more traffic.
Maybe it should be a module option, so that you can pick what you want.
Heck it could be a per port option even. :)

> Yup, but the "everyone else is doing it" argument never worked with my
> parents. All it takes is one brave soul to determine the reasoning
> behind the magic numbers and convert them into #define's.  Shouldn't be
> more than one day's work.

Is this a magic number in your opinion?

lp->a.write_csr(ioaddr, 0, 0x0002);  /* Set STRT bit */

I guess one could do
#define CSR0_RST 0x0001
#define CSR0_STRT 0x0002
#define CSR0_STOP 0x0004
etc...

and then
lp->a.write_csr(ioaddr, 0, CSR0_STRT); /* Set STRT bit */

Does that help?  I am not sure.  I think the comment behind it is
plenty.

Len Sorensen
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFT] pcnet32 NAPI changes

2006-06-20 Thread Lennart Sorensen
On Tue, Jun 20, 2006 at 11:05:04AM -0500, Jon Mason wrote:
> The point of my comment was CPU utilization.
>
> It appears that a bug is trying to be fixed by adding NAPI. This
> sounds a bit hackish to me, and could hide the root cause of the
> problem. So I'm not sure that is the best idea, but I will defer to
> the maintainer.

No it isn't a bug.  If the hardware generates enough interrupts to keep
the cpu at 100% handling them, starving user space (since interrupts
have high priority compared to just running user code of course), then
the watchdog daemon which of course runs in user space will never run
and hence the watchdog hardware times out and resets the system, as it
is designed to do.  There is no bug, just a problem of too many
interrupts generated by the network hardware.  NAPI eliminates the
receive interrupts when the system is busy, solving the problem at its
root cause.

> But your example is just one instance.  Here is one without a comment:
>
> lp->a.write_csr(ioaddr, 4, 0x0915);

Hmm.  0x0915 =  1001 0001 0101 =
*Auto Pad Transmit (bit 11).  Enabled auto padding of packets.
*Missed Frame Counter Overflow Mask (bit 8):  Masks out interrupts on
 overflow of the missed frame counter.
*Receive Collision Counter Overflow Mask (bit 4):  Masks out interrupts on
 overflow of the receive collision counter.
*Transmit Start Mask (bit 2):  Masks out interrupts on start of
 transmit.

So every CSR has a different meaning for all its bits.  Defining each
one, and combining all of them could make a lot of the code really
messy.  Perhaps more comments on those places would be clearer.

> What is it doing?  Is it still needed?  Can it be done anywhere else?
> Who knows, because it is magic.  The 4 can be defined as CSR0_STOP, per
> your example above, but what does value 0x0915 do?

No, the 4 has a different meaning in CSR4.  It means stop in CSR0.  In
CSR4 it means Transmit Start Mask.  It masks interrupts on transmit
start.  I think the value is wrong, since my data sheet says bit 0 and 1
are reserved and should be written as 0.  0x0915 would write bit 0 as a
1 which violates the data sheet of the 972 at least.
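
The breakdown of 0x0915 above, and the point about reserved bits, can
be checked mechanically.  What follows is only a sketch: the macro
names are hypothetical (they are not in the driver), and the bit
positions and the "bits 0 and 1 are reserved" rule are taken from the
description above rather than from the data sheet itself.

```c
#include <stdint.h>

/* Hypothetical names for the CSR4 bits described in the message above.
 * Positions are the ones quoted there, not copied from pcnet32.c. */
#define CSR4_APAD_XMT  (1u << 11) /* Auto Pad Transmit */
#define CSR4_MFCOM     (1u << 8)  /* Missed Frame Counter Overflow Mask */
#define CSR4_RCVCCOM   (1u << 4)  /* Receive Collision Counter Overflow Mask */
#define CSR4_TXSTRTM   (1u << 2)  /* Transmit Start Mask */
#define CSR4_RESERVED  0x0003u    /* bits 0 and 1, assumed reserved (972) */

/* Return the part of a CSR4 value covered by the four named masks. */
static uint16_t csr4_named_bits(uint16_t value)
{
	return value & (CSR4_APAD_XMT | CSR4_MFCOM | CSR4_RCVCCOM | CSR4_TXSTRTM);
}

/* Return the part of a CSR4 value that lands in the reserved bits. */
static uint16_t csr4_reserved_bits(uint16_t value)
{
	return value & CSR4_RESERVED;
}
```

With these, csr4_named_bits(0x0915) accounts for 0x0914 and
csr4_reserved_bits(0x0915) is nonzero, which is the stray bit 0
mentioned above.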

> My point was that there are certain parts of the code which are
> non-intuitive and should be commented and there are others which a
> good descriptive value would be nice.

Well I agree the code could get a bit better.  I did think overall that
the code was rather nice actually.

Len Sorensen


Re: [RFT] pcnet32 NAPI changes

2006-06-19 Thread Lennart Sorensen
On Fri, Jun 16, 2006 at 12:11:54PM -0700, Don Fry wrote:
> This patch is a collection of changes to pcnet32 which does the
> following:
>
> - Fix section mismatch warning.
> - fix set_ringparam to correctly handle memory allocation failures
> - fix off-by-one in get_ringparam.
> - cleanup at end of loopback_test when not up.
> - Add NAPI to driver, fixing set_ringparam and loopback_test to work
>   correctly with poll.
> - for multicast, do not reset the chip unless cannot enter suspend mode
>   to avoid race with poll.
>
> The set_ringparam code is larger than I would prefer, but it will not
> leave null pointers around for the code to stumble over when memory
> allocation fails.  If anyone has a better idea, please let me know.
>
> Some complexity could be avoided by allocating memory for the maximum
> number of tx and rx buffers at probe time.  Requiring 14k for the tx
> ring and arrays, and another 14k for rx; instead of about 10k total for
> the default sizes.

So 28k vs 10k?  Why are these adjustable if it makes that little
difference?  Is there any advantage to making them smaller?

> It is NAPI only, unlike Len Sorensen's version which allows for compile
> time selection.  Some drivers are NAPI only, others have compile
> options.  Which is preferred?

I just figured making it an option was less intrusive, although I can't
imagine a good reason for not wanting to use the NAPI version at all
times.  I certainly know I intend to use it that way.

> I have tested these changes with a 79C971, 973, 976, and 978 on a ppc64
> machine, and 970A, 972, 973, 975, and 976 on an x86 machine.
>
> I have not tested these changes with VMware or Xen.

I will give it a try with our system and see how it runs.

Len Sorensen


Re: [RFT] pcnet32 NAPI changes

2006-06-19 Thread Lennart Sorensen
On Mon, Jun 19, 2006 at 03:41:40PM -0500, Jon Mason wrote:
> I believe it is preferred to be a compile option for non-gigabit
> drivers, given that it will be eating a lot of cycles for infrequent
> packets (especially for the 10Mb).  I believe there was a thread about
> this last year when e100 was having NAPI problems.

How does NAPI eat cycles?  It goes back to interrupt mode when the queue
is empty, and only on RX interrupt does it turn on polling again.
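
To make that mode switching concrete, here is a toy userspace model.
It is not kernel code and uses none of the real NAPI calls; every name
in it is hypothetical.  It only illustrates the claim that a burst of
packets costs one receive interrupt, and that interrupts come back on
once the ring has been drained.

```c
#include <stdbool.h>

/* Toy model of one receive path.  All names here are hypothetical. */
struct rx_model {
	int  pending;        /* packets waiting in the receive ring */
	bool rx_irq_enabled; /* true: interrupt mode, false: polling mode */
	int  irqs_taken;     /* receive interrupts actually serviced */
	int  polls_run;      /* poll passes actually executed */
};

/* A packet arrives.  It only costs an interrupt in interrupt mode;
 * that interrupt masks further RX interrupts and switches to polling. */
static void model_packet_arrives(struct rx_model *m)
{
	m->pending++;
	if (m->rx_irq_enabled) {
		m->irqs_taken++;
		m->rx_irq_enabled = false;
	}
}

/* One poll pass: handle at most 'budget' packets.  If the ring is
 * drained, go back to interrupt mode.  Returns packets handled. */
static int model_poll(struct rx_model *m, int budget)
{
	int done = 0;

	if (m->rx_irq_enabled)
		return 0; /* nothing scheduled while in interrupt mode */

	m->polls_run++;
	while (m->pending > 0 && done < budget) {
		m->pending--;
		done++;
	}
	if (m->pending == 0)
		m->rx_irq_enabled = true;
	return done;
}
```

In this model, 100 back-to-back arrivals followed by two polls with a
budget of 64 take one interrupt in total, and a further poll on the
idle ring does no work at all.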

It is certainly possible that there are bugs in a NAPI conversion, which
I guess could be a reason to have the option to stick with the old
method, although then again not having the option ensures the bugs get
found sooner.

> A general nit.  There are A LOT of magic numbers in the code, most
> existing prior to this patch.  The driver would benefit from a little
> clean-up.
>
> Also nothing to do with this patch, but I noticed it when the code was
> moved.  A comment about why the following is necessary might be nice:
> lp->rx_ring[i].buf_length = le16_to_cpu(2 - PKT_BUF_SZ);

I suspect many drivers are in need of some cleanup.
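
On the buf_length line quoted above, one possible reading, offered as
an assumption and not something stated in this thread: the descriptor
field holds the buffer length as a 16-bit two's complement negative,
and the 2 accounts for two bytes reserved at the front of the buffer
for alignment.  The sketch below shows only that arithmetic; the value
of PKT_BUF_SZ and the name RX_ALIGN are illustrative.

```c
#include <stdint.h>

#define PKT_BUF_SZ 1544 /* assumed value, for illustration only */
#define RX_ALIGN   2    /* hypothetical name for the reserved 2 bytes */

/* Encode a usable buffer length as a 16-bit two's complement negative,
 * which is what the descriptor field is assumed to expect. */
static uint16_t encode_buf_length(int usable_len)
{
	return (uint16_t)(-usable_len);
}

/* Recover the usable length from the encoded field. */
static int decode_buf_length(uint16_t field)
{
	return (int)(uint16_t)(0u - (unsigned)field);
}
```

Under those assumptions, encode_buf_length(PKT_BUF_SZ - RX_ALIGN) gives
the same 16-bit pattern as (uint16_t)(2 - PKT_BUF_SZ), so the
expression is just "minus the usable length" written in one step.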

Len Sorensen


Re: Firewall question

2006-06-09 Thread Lennart Sorensen
On Fri, Jun 09, 2006 at 05:43:24AM +0200, Andi Kleen wrote:
> No one out on the internet, but it would be trivial for someone outside
> his house. All his traffic will be on a long unsecured cable.
>
> That is why I would never bridge home ethernet traffic onto a DSL line.

Hmm, traffic sent between his machines would not go over the DSL since
the MAC address doesn't match the DSL modem (I would think so at
least).  It would be a mess if the DSL modem tried to forward all
traffic on an ethernet segment (well it doesn't have the bandwidth for
sure).  Maybe I am incorrectly assuming the DSL modem only forwards the
PPPoE traffic being sent at it.  I could see broadcast traffic being
forwarded, although arps and such are generally not that interesting.

Len Sorensen


Re: [PATCH] pcnet32 driver NAPI support

2006-06-08 Thread Lennart Sorensen
On Wed, Jun 07, 2006 at 03:32:45PM -0700, Don Fry wrote:
> One other problem I ran into.  I applied the patch but it will not
> compile because rl_active is never defined.  I have worked around it but

Doh!  I thought I cleaned up all my weird code from my own version.
Because of the platform I work with having 4 pcnet32 ports, and a slow
poke 266MHz geode, we can't handle full traffic load, so to keep the
system responsive we pause processing receives when we pass a certain
number of packets per second.  rl_active is part of that.  I meant to
remove all of it, apparently I didn't read every line of my patch
carefully enough. :(

Well at least this ought to clean up my work a bit.

Len Sorensen


Re: Firewall question

2006-06-08 Thread Lennart Sorensen
On Thu, Jun 08, 2006 at 11:57:12AM -0700, Alex Davis wrote:
> The scenario:
> I have a DSL modem in pass through (bridge) mode. The linux firewall/router
> has a single ethernet card.  It is running pppoe. This gives two interfaces:
> eth0 and ppp0. The firewall is running iptables. There are several machines
> behind the firewall.
>
> Problem:
> I've been told that if someone whose public IP address is on the same
> network subnet as mine were to get my mac address, (s)he could bypass
> the firewall and talk directly to the machines behind it.
>
> Is this true?

Well the DSL modem only transfers whatever data the ISP end sends to it,
which in your case is just PPP packets (LCC or LCP I think).  No one out
on the internet would be able to send ethernet data over the DSL link,
so the only way to send data to another machine on your network (that
the DSL modem is connected to physically) is if you have other machines
on your local network which are also running PPPoE and listening for
that traffic.

So the worst thing I can see happening is that someone on your local
network could potentially take over your PPPoE session, but that's about
it.  I just can't see anything else that could happen.  I used to run
exactly the setup you describe before I had to drop the DSL connection
(I moved).

Len Sorensen


Re: [PATCH] pcnet32 driver NAPI support

2006-06-07 Thread Lennart Sorensen
On Wed, Jun 07, 2006 at 11:20:40AM -0700, Don Fry wrote:
> I am also working on a NAPI version of the pcnet32 driver for many of
> the same reasons, and will compare what you have with my own
> implementation.  I probably won't be able to do much until Friday.
>
> Just a couple of comments.  I am adding netdev@vger.kernel.org to the cc
> list, as most network driver discussion is done here rather than lkml.
> linux-kernel (and linux-net) should be deleted in future replies.

I must have picked the wrong place to cc.

> The 2.6.17-rc6 would be the correct source to patch against.  Since this
> is an enhancement it will not come out till 2.6.18.

I thought so.  That is why I did it against both 2.6.17-rc6 and 2.6.16
(since I use it with 2.6.16).

> I would not change the driver name from pcnet32 to pcnet32napi, but I
> would change the version from 1.32 to 1.33NAPI or something like that.

Hmm, perhaps.  I just wanted something that made it obvious in dmesg
which driver I was running.  I see tulip actually does put it in the
version instead.  I don't remember where I got the driver name change
idea from.

> Some areas of concern that you may have addressed already, I have not
> scanned your changes yet, are what happens if the ring size is changed
> without bringing down the interface (via ethtool), or if the loopback
> test is run in a similar fashion, or a tx timeout occurs.

The same thing as if it was done before enabling napi.  From a few
messages I have seen, it doesn't work right now, and it won't work any
better with my changes.  I have never tried changing the ring size on
the fly, so I don't know.

It appears that the port is stopped before the ring size change is done,
although I can't really tell how it handles things if the queue is not
empty when it stops the port.  Does it try to handle anything left in
the ring first or does it just toss those packets? (That I would
consider wrong).

> The lp->lock MUST be held whenever accessing the csr or bcr registers as
> this is a multi-step process, and has been the source of problems in the
> past.  Even on UP systems.

Hmm, I just followed what appeared to be in pcnet32_rx and how tulip and
a few other drivers had done their napi conversions.  It certainly works
for me the way I did it.  Haven't seen any lockups yet.  I do see that I
am not holding the lock when I acknowledge IRQs in pcnet32_poll, which
pcnet32_rx doesn't need to worry about since it is called from the
interrupt handler which already holds the lock.  That should be fixed
then.

So I can do:
// Clear RX interrupts
spin_lock(&lp->lock);
lp->a.write_csr (ioaddr, 0, 0x1400);
spin_unlock(&lp->lock);
That part seems simple enough to protect.

Is this safe without holding the lock?
} while(lp->a.read_csr (ioaddr, 0) & 0x1400);
Not sure how to wrap a lock around that one without holding the lock for
way too long.

perhaps:
spin_lock(&lp->lock);
state=lp->a.read_csr (ioaddr, 0) & 0x1400;
spin_unlock(&lp->lock);
} while(state);
Does that seem more reasonable?

Len Sorensen


Re: pcnet32 devices with incorrect trident vendor ID

2006-01-12 Thread Lennart Sorensen
On Thu, Jan 12, 2006 at 08:49:42PM +, Daniel Drake wrote:
> On the subject of pcnet32 and the invalid vendor ID, you may find this
> interesting:
>
> http://forums.gentoo.org/viewtopic-t-420013-highlight-trident.html
>
> The user saw the correct vendor ID (AMD) in 2.4, but when upgrading to
> 2.6, it changed to Trident.
>
> I guess this is still likely to be a hardware bug, but it demonstrates
> that the Linux PCI layer has something to do with it (even if it is just
> triggering it somehow).

Perhaps there is a significant difference in the pcnet32.c files between
the two versions.  I also remember that there is some powerpc specific
code in there related to MAC address detection.  There are certainly
differences in 2.4 and 2.6's version of the driver, maybe something is
broken in the newer one when run on powerpc.  I don't run gentoo and
have no idea how to get a hold of the source of pcnet32.c from each of
those two.

It does seem odd that only the pcnet32 has the pci ID change, but at the
same time, somehow the driver is recognizing it and loading at boot
time, so the ID can't be wrong at that time.  Does the ID get mangled as
part of what makes the MAC addresses read incorrectly on your 2.6.14?
The 2.4 system shows all the cards overriding the MAC based on the PROM,
which I believe is what the driver code should do on a powerpc system.
On 2.6 that appears to only happen on one of the cards.  At least
that device (pci 01:01) appears to agree on what the MAC should be in
both cases.

Perhaps the lspci being wrong is just a side effect of the real problem.
Maybe the driver is broken and messing things up.

Len Sorensen