Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-08 Thread Roland Dreier
On Fri, Jul 8, 2016 at 9:51 AM, Jason Gunthorpe wrote: > So, it appears, the dst and neigh can be used for all performance cases. > > For the non performance dst == null case, can we just burn cycles and > stuff the daddr in front of the packet at hardheader

Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-08 Thread Jason Gunthorpe
On Fri, Jul 08, 2016 at 07:18:11AM -0700, Roland Dreier wrote: > On Thu, Jul 7, 2016 at 4:14 PM, Jason Gunthorpe > wrote: > > We have neighbour_priv, and ndo_neigh_construct/destruct now .. > > > > At first blush that would seem to be enough to let ipoib store the

Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-08 Thread Roland Dreier
On Thu, Jul 7, 2016 at 4:14 PM, Jason Gunthorpe wrote: > We have neighbour_priv, and ndo_neigh_construct/destruct now .. > > At first blush that would seem to be enough to let ipoib store the AH > and other path information in the neigh and avoid the cb? At least

Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-07 Thread Jason Gunthorpe
On Thu, Jul 07, 2016 at 03:01:40PM -0700, Roland Dreier wrote: > The reason we moved to the cb storage is that in the past, trying to > hide some data in the actual skb buffer that we don't actually send We have neighbour_priv, and ndo_neigh_construct/destruct now .. At first blush that would

Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-07 Thread Alexander Duyck
On Thu, Jul 7, 2016 at 3:01 PM, Roland Dreier wrote:
>>> struct skb_gso_cb {
>>>         int mac_offset;
>>>         int encap_level;
>>>         __u16 csum_start;
>>> };
>
>> This is based on an out-dated version of this struct. The 4.7 RC
>> kernel has a few

Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-07 Thread Roland Dreier
>> struct skb_gso_cb {
>>         int mac_offset;
>>         int encap_level;
>>         __u16 csum_start;
>> };
> This is based on an out-dated version of this struct. The 4.7 RC
> kernel has a few more fields that were added to support local checksum
> offload for encapsulated

Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-07 Thread Alexander Duyck
On Wed, Jul 6, 2016 at 11:25 PM, Roland Dreier wrote: > On Thu, Jan 7, 2016 at 3:00 AM, Konstantin Khlebnikov > wrote: >> Or just shift GSO CB and add couple checks like >> BUILD_BUG_ON(sizeof(SKB_GSO_CB(skb)->room) < sizeof(*IPCB(skb))); > >

Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-07 Thread Roland Dreier
On Thu, Jan 7, 2016 at 3:00 AM, Konstantin Khlebnikov wrote: > Or just shift GSO CB and add couple checks like > BUILD_BUG_ON(sizeof(SKB_GSO_CB(skb)->room) < sizeof(*IPCB(skb))); Resurrecting this old thread, because the patch that ultimately went upstream (commit 9207f9d45b0a
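The "shift GSO CB" idea quoted above can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `fake_skb`, `fake_ip_cb`, `shifted_gso_cb`, the 24-byte `room` size and the helper `optlen_seen_after_shifted_gso()` are hypothetical stand-ins, not the kernel's definitions, and C11 `static_assert` plays the role that `BUILD_BUG_ON` plays in the quoted check.

```c
#include <assert.h>
#include <string.h>

#define CB_SIZE 48                      /* size of the shared scratch area */

struct fake_skb {
    unsigned char cb[CB_SIZE];          /* stands in for skb->cb */
};

/* Hypothetical stand-in for the IP control block; only optlen matters here. */
struct fake_ip_cb {
    int optlen;
    int flags;
};

/* Hypothetical GSO control block shifted behind a reserved area. */
struct shifted_gso_cb {
    unsigned char room[24];             /* reserved; 24 is an arbitrary size */
    int mac_offset;
    int encap_level;
    unsigned short csum_start;
};

/* Compile-time checks in the spirit of the quoted BUILD_BUG_ON. */
static_assert(sizeof(((struct shifted_gso_cb *)0)->room) >=
              sizeof(struct fake_ip_cb),
              "reserved room must cover the IP control block");
static_assert(sizeof(struct shifted_gso_cb) <= CB_SIZE,
              "shifted GSO control block must still fit in cb");

/*
 * Store IP state at the start of cb, let a segmentation step write its own
 * state behind the reserved room, then read the IP options length back.
 */
int optlen_seen_after_shifted_gso(int optlen, int mac_offset)
{
    struct fake_skb skb;
    struct fake_ip_cb ip = { .optlen = optlen, .flags = 0 };
    struct shifted_gso_cb gso;
    struct fake_ip_cb readback;

    memset(&skb, 0, sizeof skb);
    memcpy(skb.cb, &ip, sizeof ip);     /* IP layer fills its control block */

    memcpy(&gso, skb.cb, sizeof gso);   /* load first so room keeps IP bytes */
    gso.mac_offset = mac_offset;        /* GSO fields live past the room */
    gso.encap_level = 0;
    gso.csum_start = 0;
    memcpy(skb.cb, &gso, sizeof gso);

    memcpy(&readback, skb.cb, sizeof readback);
    return readback.optlen;             /* IP state is left untouched */
}
```

With this layout the two users of the scratch area no longer overlap, and the build fails if either structure later grows past its share.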

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-07 Thread Konstantin Khlebnikov
On Thu, Jan 7, 2016 at 2:00 PM, Konstantin Khlebnikov wrote: > On Thu, Jan 7, 2016 at 2:49 AM, Florian Westphal wrote: >> Florian Westphal wrote: >>> Thadeu Lima de Souza Cascardo wrote: >>> > On Wed, Jan 06, 2016 at

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-07 Thread Eric Dumazet
On Thu, Jan 7, 2016 at 6:38 AM, Konstantin Khlebnikov wrote: > > Also I've found strange thing: reason of expanding skb->cb from 40 to > 48 bytes in 2006 > 3e3850e989c5d2eb1aab6f0fd9257759f0f4cbc6 was that struct inet6_skb_parm does > not fit. But it is only 24 bytes. Does

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-07 Thread Eric Dumazet
On Thu, Jan 7, 2016 at 7:04 AM, Konstantin Khlebnikov wrote: > On Thu, Jan 7, 2016 at 2:59 PM, Eric Dumazet wrote: >> On Thu, Jan 7, 2016 at 6:38 AM, Konstantin Khlebnikov >> wrote: >>> >>> Also I've found strange thing: reason of

[BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-06 Thread Konstantin Khlebnikov
I've got some of these:
[84408.314676] BUG: unable to handle kernel NULL pointer dereference at (null)
[84408.317324] IP: [] put_page+0x5/0x50
[84408.319985] PGD 0
[84408.322583] Oops: [#1] SMP
[84408.325156] Modules linked in: ppp_mppe ppp_async ppp_generic slhc 8021q fuse nfsd

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-06 Thread Florian Westphal
Thadeu Lima de Souza Cascardo wrote: > On Wed, Jan 06, 2016 at 11:11:41PM +0300, Konstantin Khlebnikov wrote: > > On Wed, Jan 6, 2016 at 10:59 PM, Cong Wang wrote: > > > On Wed, Jan 6, 2016 at 11:15 AM, Konstantin Khlebnikov > >

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-06 Thread Florian Westphal
Florian Westphal wrote: > Thadeu Lima de Souza Cascardo wrote: > > On Wed, Jan 06, 2016 at 11:11:41PM +0300, Konstantin Khlebnikov wrote: [ skb_gso_segment uses skb->cb[], causes oops if ip_fragment is invoked on segmented skbs ] > > I have hit this as

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-06 Thread Cong Wang
On Wed, Jan 6, 2016 at 11:15 AM, Konstantin Khlebnikov wrote: > Looks like this happens because ip_options_fragment() relies on > correct ip options length in ip control block in skb. But in > ip_finish_output_gso() control block in segments is reused by > skb_gso_segment().
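The failure mode described in this message, two layers using the same control-block bytes for different purposes, can be reproduced in a small user-space C sketch. Everything here is an assumption made for illustration: `fake_skb`, `fake_ip_cb`, `fake_gso_cb` and `optlen_seen_after_gso()` are hypothetical stand-ins with simplified layouts, not the kernel's structures or functions.

```c
#include <assert.h>
#include <string.h>

#define CB_SIZE 48                      /* size of the shared scratch area */

struct fake_skb {
    unsigned char cb[CB_SIZE];          /* stands in for skb->cb */
};

/* Hypothetical stand-in for the IP control block; only optlen matters here. */
struct fake_ip_cb {
    int optlen;
    int flags;
};

/* Hypothetical stand-in for the GSO control block. */
struct fake_gso_cb {
    int mac_offset;
    int encap_level;
    unsigned short csum_start;
};

static_assert(sizeof(struct fake_ip_cb) <= CB_SIZE, "IP cb must fit");
static_assert(sizeof(struct fake_gso_cb) <= CB_SIZE, "GSO cb must fit");

/*
 * Store IP state at the start of cb, let a segmentation step overwrite the
 * same bytes with its own state, then read the IP options length back.
 */
int optlen_seen_after_gso(int optlen, int mac_offset)
{
    struct fake_skb skb;
    struct fake_ip_cb ip = { .optlen = optlen, .flags = 0 };
    struct fake_gso_cb gso = { .mac_offset = mac_offset };
    struct fake_ip_cb readback;

    memset(&skb, 0, sizeof skb);
    memcpy(skb.cb, &ip, sizeof ip);     /* IP layer fills its control block */
    memcpy(skb.cb, &gso, sizeof gso);   /* segmentation reuses offset 0 */
    memcpy(&readback, skb.cb, sizeof readback);
    return readback.optlen;             /* now holds mac_offset, not optlen */
}
```

In this sketch a later reader of the options length gets whatever the segmentation step left at offset 0, which mirrors how a fragmentation path could act on a bogus length.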

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-06 Thread Konstantin Khlebnikov
On Wed, Jan 6, 2016 at 10:59 PM, Cong Wang wrote: > On Wed, Jan 6, 2016 at 11:15 AM, Konstantin Khlebnikov > wrote: >> Looks like this happens because ip_options_fragment() relies on >> correct ip options length in ip control block in skb. But in >>

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-06 Thread Thadeu Lima de Souza Cascardo
On Wed, Jan 06, 2016 at 11:11:41PM +0300, Konstantin Khlebnikov wrote: > On Wed, Jan 6, 2016 at 10:59 PM, Cong Wang wrote: > > On Wed, Jan 6, 2016 at 11:15 AM, Konstantin Khlebnikov > > wrote: > >> Looks like this happens because ip_options_fragment()