On Tue, Dec 08, 2009 at 04:37:39PM -0700, Tantilov, Emil S wrote:
> Neil Horman wrote:
> > Hey all-
> >     I was tracking down a memory corruptor lately in which, with
> > DEBUG_SLAB enabled we were getting several redzone violoations, which
> > was always followed by an oops in the rx receive path when using the
> > e1000e driver.  Looking at the cores, the slabs that had their
> > redzone violated were always adjacent to a size-2048 slab which was
> > allocated by __netdev_alloc_skb.  Doing some instrumentation led me
> > to see the following.  When the e1000e driver sets up its rctl
> > register, which defines the length of each of the buffers in the rx
> > ring, we do this: 
> > 
> > switch (adapter->rx_buffer_len) {
> >         case 256:
> >                 rctl |= E1000_RCTL_SZ_256;
> >                 rctl &= ~E1000_RCTL_BSEX;
> >                 break;
> >         case 512:
> >                 rctl |= E1000_RCTL_SZ_512;
> >                 rctl &= ~E1000_RCTL_BSEX;
> >                 break;
> >         case 1024:
> >                 rctl |= E1000_RCTL_SZ_1024;
> >                 rctl &= ~E1000_RCTL_BSEX;
> >                 break;
> >         case 2048:
> >         default:
> >                 rctl |= E1000_RCTL_SZ_2048;
> >                 rctl &= ~E1000_RCTL_BSEX;
> >                 break;
> >         case 4096:
> >                 rctl |= E1000_RCTL_SZ_4096;
> >                 break;
> >         case 8192:
> >                 rctl |= E1000_RCTL_SZ_8192;
> >                 break;
> >         case 16384:
> >                 rctl |= E1000_RCTL_SZ_16384;
> >                 break;
> >         }
> > }
> > 
> > This is problematic since rx_buffer_len is set to 1500, which causes
> > the rctl value to fall into the default case, which configures the
> > hardware to think it can dma up to 2048 bytes into each buffer,
> > despite the fact that the buffer is only 1500 bytes long.  If we
> > receive a frame that is longer than 1500 bytes, then the dma will
> > overrun the 1500 byte limit of the sk_buff's data block spilling into
> > the skb_shared_info structure (which is placed immediately after the
> > end of the skb->end pointer (or at skb->head + skb->end if
> > NET_SKBUFF_DATA_USES_OFFSET is configured).  In either case, the
> > spillover corrupts the skb_shared_info structure, and sets us up for
> > any number of subsequent corruptions/oopses/failures.  I've fixed
> > this by normalizing the rx_buffer_length value in the setup_rctl
> > function, which rounds up the length to the configured value of rctl,
> > forcing us to allocate buffers of a size that meet the hardwares
> > configuration needs.  
> > 
> > I've tested this on e1000e and confirmed that it fixes my redzone
> > violations and the observed oops.  Visual inspection indicates that
> > e1000 and ixgb also need this fix.  I've not explicitly tested them
> > though, so I've split this into three separate patches
> > 
> > Regards
> > Neil
> > 
> > Signed-off-by: Neil Horman <[email protected]>
> 
> Hi Neil,
> 
> I am trying to test the patches you submitted (thanks btw) and so far am not 
> able to reproduce the panic you described. When MTU is at 1500 RCTL.LPE (bit 
> 5) is set to 0 and the HW will not allow the reception of large packets 
> (>1522 bytes, which is what rx_buffer_len is set to). This is basically what 
> I am seeing in my tests - packets are discarded by the HW.
> 
Interestingly, the only time we could observe the errors was when SLAB_DEBUG was
configured, so my guess is timing plays an important role.  The problem also
only reproduced on hp-z200 series systems (with e1000e cards in them having pci
vendor device ids  14e4:1684), so perhaps this is restricted to certain hardware
revisions?

> Can you provide more details about the setup you used? Steps to reproduce and 
> also details (lspci -vvv, ethtool -e, ethtool -d) of the HW you used and 
> kernel config should help as well.

Basically I ran a kernel with CONFIG_SLAB_DEBUG turned on on one of the above
systms, and ran the connectathon suite on it.

I don't have access to the system at this minute but, I'll reserve it as soon as
I'm able and dig up the full lspci dump and ethtool info for you.

I will point out though, that in e1000_change_mtu for all the drivers I've
posted modifications for, this normalization of rx_buffer_len is already done,
so there seems to be some precident for doing this operation.

Thanks & Regards
Neil

> 
> Thanks,
> Emil--
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

------------------------------------------------------------------------------
Return on Information:
Google Enterprise Search pays you back
Get the facts.
http://p.sf.net/sfu/google-dev2dev
_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel

Reply via email to