On Wed, May 14, 2008 at 10:25:48AM +0300, Eli Cohen wrote:
> On Tue, 2008-05-13 at 18:21 -0700, [EMAIL PROTECTED] wrote:
> > We're getting panics like this one on big clusters:
> >
> > skb_over_panic: text:ffffffff8821f32e len:160 put:100 head:ffff810372b0f000
> > data:ffff810372b0f01c tail:ffff810372b0f0bc end:ffff810372b0f080 dev:ib0
>
> RX SKBs are large enough to contain 100 bytes... this looks like
> corruption.
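For reference, the pointer values in the quoted panic line can be checked with a little arithmetic. The sketch below is plain user-space C, not kernel code; the struct and helper names are hypothetical and exist only for this illustration. It assumes that len and tail are reported after the put has been applied, which is consistent with tail - data equalling len in the report.

```c
#include <assert.h>
#include <stdint.h>

/* Fields copied from the skb_over_panic line quoted above.
 * The struct itself is a hypothetical container for this sketch. */
struct skb_report {
    uint64_t head, data, tail, end;
    uint64_t len;   /* len as printed (assumed to include the put) */
    uint64_t put;   /* size of the put that tripped the check */
};

static const struct skb_report quoted_report = {
    .head = 0xffff810372b0f000ULL,
    .data = 0xffff810372b0f01cULL,
    .tail = 0xffff810372b0f0bcULL,
    .end  = 0xffff810372b0f080ULL,
    .len  = 160,
    .put  = 100,
};

/* Total size of the linear buffer, head to end. */
static inline uint64_t buffer_size(const struct skb_report *r)
{
    return r->end - r->head;
}

/* Bytes reserved in front of the packet data. */
static inline uint64_t headroom(const struct skb_report *r)
{
    return r->data - r->head;
}

/* Length the skb already had before the failing put. */
static inline uint64_t len_before_put(const struct skb_report *r)
{
    assert(r->len >= r->put);
    return r->len - r->put;
}

/* Space that was left between the old tail and end. */
static inline uint64_t tailroom_before_put(const struct skb_report *r)
{
    uint64_t old_tail = r->tail - r->put;

    assert(r->end >= old_tail);
    return r->end - old_tail;
}

/* How far the new tail runs past end (0 if it does not). */
static inline uint64_t overrun(const struct skb_report *r)
{
    return r->tail > r->end ? r->tail - r->end : 0;
}
```

With the quoted values, the buffer is 128 bytes with 28 bytes of headroom, so there is room for exactly 100 bytes after data. The skb already had a length of 60 before the put, leaving only 40 bytes of tailroom, so a put of 100 runs 60 bytes past end.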

Exactly.

> Can you give more information on OS, kernel version, OFED
> version.

SUSE Linux Enterprise Server 10 SP1 (x86_64) - Kernel 2.6.16.46-0.12-smp
OFED 1.3 GA

> .....
> From NAPI_HOWTO.txt, although the file has been removed but I think the
> statement is still valid:
>
> -Guarantee: Only one CPU at any time can call dev->poll(); this is
> because only one CPU can pick the initial interrupt and hence the
> initial netif_rx_schedule(dev);
>

Yes, you're correct.  I missed the use of the __LINK_STATE_RX_SCHED bit
in __netif_rx_schedule_prep()/netif_rx_complete() that serializes this.
(Roland also pointed this out to me.)

--
Arthur
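The serialization described above can be modelled in user space as an atomic test-and-set on a single state bit. This is only a sketch of the idea using C11 atomics, not the kernel implementation: the struct, the bit name, and the model_* functions are hypothetical stand-ins for dev->state, __LINK_STATE_RX_SCHED, __netif_rx_schedule_prep() and netif_rx_complete().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state word. */
struct model_dev {
    atomic_ulong state;
};

/* Hypothetical bit number standing in for __LINK_STATE_RX_SCHED. */
enum { MODEL_RX_SCHED = 0 };

/* Returns true only for the caller that flips the bit from 0 to 1.
 * Any other caller sees the bit already set and must not schedule
 * or run the poll routine. */
static inline bool model_rx_schedule_prep(struct model_dev *d)
{
    unsigned long bit = 1UL << MODEL_RX_SCHED;
    unsigned long old = atomic_fetch_or(&d->state, bit);

    return (old & bit) == 0;
}

/* Called by the current owner when polling is finished. Clearing the
 * bit lets the next interrupt win model_rx_schedule_prep() again. */
static inline void model_rx_complete(struct model_dev *d)
{
    unsigned long bit = 1UL << MODEL_RX_SCHED;
    unsigned long old = atomic_fetch_and(&d->state, ~bit);

    /* Completing without owning the bit would be a caller bug. */
    assert(old & bit);
}
```

In this model, a second CPU that takes an interrupt while the bit is set gets false from model_rx_schedule_prep() and never enters the poll path, so at most one poller is active per device until model_rx_complete() runs.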
