Re: ix(4): align rx payloads to the end of the cluster

2019-02-25 Thread Claudio Jeker
On Mon, Feb 25, 2019 at 11:13:11AM +0100, Claudio Jeker wrote:
> On Mon, Feb 25, 2019 at 08:08:32PM +1000, David Gwynne wrote:
> > On Mon, Feb 25, 2019 at 08:44:35AM +0100, Claudio Jeker wrote:
> > > On Mon, Feb 25, 2019 at 10:49:16AM +1000, David Gwynne wrote:
> > > > the mcl2k2 pool, aka the intel mbuf cluster pool, gets set up to 
> > > > allocate
> > > > at least 2048 + 2 bytes, which gets rounded up by 64 bytes to 2112
> > > > bytes. this diff makes ix move the reception of packets to the end of
> > > > the 2112 byte allocation so there's space left at the front of the mbuf.
> > > > 
> > > > this in turn makes it more likely that an m_prepend at another point in
> > > > the system will work without an extra mbuf allocation. eg, if you're
> > > > bridging or routing between vlans and vlans on svlans somewhere else,
> > > > this will be a bit faster with this diff.
> > > > 
> > > > thoughts? ok?
> > > 
> > > I think using m_align() here may be benefitial. Since it does exactly
> > > that. Apart from that I have to agree, shifting the packet back makes a
> > > lot of sense.
> > 
> > Like this?
> 
> OK claudio@

Actually no, my bad. m_align() will align the buffer to a long word
boundary and that's not what we want. But your first diff is also not
quite right since m_len is modified by m_align and so the length is
getting < 2048. (which may not matter here but better be 100% correct).
Setting mp->m_len = mp->m_pkthdr.len = mp->m_ext.ext_size should do the
trick.

mp->m_len = mp->m_pkthdr.len = mp->m_ext.ext_size;
m_adj(mp, mp->m_ext.ext_size - sc->rx_mbuf_sz;

At least this should do what we want.

> > Index: if_ix.c
> > ===
> > RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
> > retrieving revision 1.153
> > diff -u -p -r1.153 if_ix.c
> > --- if_ix.c 21 Feb 2019 03:16:47 -  1.153
> > +++ if_ix.c 25 Feb 2019 10:06:59 -
> > @@ -2389,8 +2395,8 @@ ixgbe_get_buf(struct rx_ring *rxr, int i
> > if (!mp)
> > return (ENOBUFS);
> >  
> > +   m_align(mp, sc->rx_mbuf_sz);
> > mp->m_len = mp->m_pkthdr.len = sc->rx_mbuf_sz;
> > -   m_adj(mp, ETHER_ALIGN);
> >  
> > error = bus_dmamap_load_mbuf(rxr->rxdma.dma_tag, rxbuf->map,
> > mp, BUS_DMA_NOWAIT);
> > 
> 
> -- 
> :wq Claudio
> 

-- 
:wq Claudio



Re: ix(4): align rx payloads to the end of the cluster

2019-02-25 Thread Mark Kettenis
> Date: Mon, 25 Feb 2019 08:44:35 +0100
> From: Claudio Jeker 
> 
> On Mon, Feb 25, 2019 at 10:49:16AM +1000, David Gwynne wrote:
> > the mcl2k2 pool, aka the intel mbuf cluster pool, gets set up to allocate
> > at least 2048 + 2 bytes, which gets rounded up by 64 bytes to 2112
> > bytes. this diff makes ix move the reception of packets to the end of
> > the 2112 byte allocation so there's space left at the front of the mbuf.
> > 
> > this in turn makes it more likely that an m_prepend at another point in
> > the system will work without an extra mbuf allocation. eg, if you're
> > bridging or routing between vlans and vlans on svlans somewhere else,
> > this will be a bit faster with this diff.
> > 
> > thoughts? ok?
> 
> I think using m_align() here may be benefitial. Since it does exactly
> that. Apart from that I have to agree, shifting the packet back makes a
> lot of sense.

As long as this still guarantees that packets are still aligned
properly with the IP header on an 32-bit boundary...

Should we add a KASSERT for that here?

> > Index: dev/pci/if_ix.c
> > ===
> > RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
> > retrieving revision 1.152
> > diff -u -p -r1.152 if_ix.c
> > --- dev/pci/if_ix.c 22 Jun 2017 02:44:37 -  1.152
> > +++ dev/pci/if_ix.c 25 Feb 2019 00:40:47 -
> > @@ -2445,7 +2445,7 @@ ixgbe_get_buf(struct rx_ring *rxr, int i
> > return (ENOBUFS);
> >  
> > mp->m_len = mp->m_pkthdr.len = sc->rx_mbuf_sz;
> > -   m_adj(mp, ETHER_ALIGN);
> > +   m_adj(mp, mp->m_ext.ext_size - sc->rx_mbuf_sz);
> >  
> > error = bus_dmamap_load_mbuf(rxr->rxdma.dma_tag, rxbuf->map,
> > mp, BUS_DMA_NOWAIT);
> > 
> 
> -- 
> :wq Claudio
> 
> 



Re: ix(4): align rx payloads to the end of the cluster

2019-02-25 Thread Claudio Jeker
On Mon, Feb 25, 2019 at 08:08:32PM +1000, David Gwynne wrote:
> On Mon, Feb 25, 2019 at 08:44:35AM +0100, Claudio Jeker wrote:
> > On Mon, Feb 25, 2019 at 10:49:16AM +1000, David Gwynne wrote:
> > > the mcl2k2 pool, aka the intel mbuf cluster pool, gets set up to allocate
> > > at least 2048 + 2 bytes, which gets rounded up by 64 bytes to 2112
> > > bytes. this diff makes ix move the reception of packets to the end of
> > > the 2112 byte allocation so there's space left at the front of the mbuf.
> > > 
> > > this in turn makes it more likely that an m_prepend at another point in
> > > the system will work without an extra mbuf allocation. eg, if you're
> > > bridging or routing between vlans and vlans on svlans somewhere else,
> > > this will be a bit faster with this diff.
> > > 
> > > thoughts? ok?
> > 
> > I think using m_align() here may be benefitial. Since it does exactly
> > that. Apart from that I have to agree, shifting the packet back makes a
> > lot of sense.
> 
> Like this?

OK claudio@
 
> Index: if_ix.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
> retrieving revision 1.153
> diff -u -p -r1.153 if_ix.c
> --- if_ix.c   21 Feb 2019 03:16:47 -  1.153
> +++ if_ix.c   25 Feb 2019 10:06:59 -
> @@ -2389,8 +2395,8 @@ ixgbe_get_buf(struct rx_ring *rxr, int i
>   if (!mp)
>   return (ENOBUFS);
>  
> + m_align(mp, sc->rx_mbuf_sz);
>   mp->m_len = mp->m_pkthdr.len = sc->rx_mbuf_sz;
> - m_adj(mp, ETHER_ALIGN);
>  
>   error = bus_dmamap_load_mbuf(rxr->rxdma.dma_tag, rxbuf->map,
>   mp, BUS_DMA_NOWAIT);
> 

-- 
:wq Claudio



Re: ix(4): align rx payloads to the end of the cluster

2019-02-25 Thread David Gwynne
On Mon, Feb 25, 2019 at 08:44:35AM +0100, Claudio Jeker wrote:
> On Mon, Feb 25, 2019 at 10:49:16AM +1000, David Gwynne wrote:
> > the mcl2k2 pool, aka the intel mbuf cluster pool, gets set up to allocate
> > at least 2048 + 2 bytes, which gets rounded up by 64 bytes to 2112
> > bytes. this diff makes ix move the reception of packets to the end of
> > the 2112 byte allocation so there's space left at the front of the mbuf.
> > 
> > this in turn makes it more likely that an m_prepend at another point in
> > the system will work without an extra mbuf allocation. eg, if you're
> > bridging or routing between vlans and vlans on svlans somewhere else,
> > this will be a bit faster with this diff.
> > 
> > thoughts? ok?
> 
> I think using m_align() here may be benefitial. Since it does exactly
> that. Apart from that I have to agree, shifting the packet back makes a
> lot of sense.

Like this?

Index: if_ix.c
===
RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
retrieving revision 1.153
diff -u -p -r1.153 if_ix.c
--- if_ix.c 21 Feb 2019 03:16:47 -  1.153
+++ if_ix.c 25 Feb 2019 10:06:59 -
@@ -2389,8 +2395,8 @@ ixgbe_get_buf(struct rx_ring *rxr, int i
if (!mp)
return (ENOBUFS);
 
+   m_align(mp, sc->rx_mbuf_sz);
mp->m_len = mp->m_pkthdr.len = sc->rx_mbuf_sz;
-   m_adj(mp, ETHER_ALIGN);
 
error = bus_dmamap_load_mbuf(rxr->rxdma.dma_tag, rxbuf->map,
mp, BUS_DMA_NOWAIT);



Re: ix(4): align rx payloads to the end of the cluster

2019-02-24 Thread Claudio Jeker
On Mon, Feb 25, 2019 at 10:49:16AM +1000, David Gwynne wrote:
> the mcl2k2 pool, aka the intel mbuf cluster pool, gets set up to allocate
> at least 2048 + 2 bytes, which gets rounded up by 64 bytes to 2112
> bytes. this diff makes ix move the reception of packets to the end of
> the 2112 byte allocation so there's space left at the front of the mbuf.
> 
> this in turn makes it more likely that an m_prepend at another point in
> the system will work without an extra mbuf allocation. eg, if you're
> bridging or routing between vlans and vlans on svlans somewhere else,
> this will be a bit faster with this diff.
> 
> thoughts? ok?

I think using m_align() here may be benefitial. Since it does exactly
that. Apart from that I have to agree, shifting the packet back makes a
lot of sense.
 
> Index: dev/pci/if_ix.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
> retrieving revision 1.152
> diff -u -p -r1.152 if_ix.c
> --- dev/pci/if_ix.c   22 Jun 2017 02:44:37 -  1.152
> +++ dev/pci/if_ix.c   25 Feb 2019 00:40:47 -
> @@ -2445,7 +2445,7 @@ ixgbe_get_buf(struct rx_ring *rxr, int i
>   return (ENOBUFS);
>  
>   mp->m_len = mp->m_pkthdr.len = sc->rx_mbuf_sz;
> - m_adj(mp, ETHER_ALIGN);
> + m_adj(mp, mp->m_ext.ext_size - sc->rx_mbuf_sz);
>  
>   error = bus_dmamap_load_mbuf(rxr->rxdma.dma_tag, rxbuf->map,
>   mp, BUS_DMA_NOWAIT);
> 

-- 
:wq Claudio



ix(4): align rx payloads to the end of the cluster

2019-02-24 Thread David Gwynne
the mcl2k2 pool, aka the intel mbuf cluster pool, gets set up to allocate
at least 2048 + 2 bytes, which gets rounded up by 64 bytes to 2112
bytes. this diff makes ix move the reception of packets to the end of
the 2112 byte allocation so there's space left at the front of the mbuf.

this in turn makes it more likely that an m_prepend at another point in
the system will work without an extra mbuf allocation. eg, if you're
bridging or routing between vlans and vlans on svlans somewhere else,
this will be a bit faster with this diff.

thoughts? ok?

Index: dev/pci/if_ix.c
===
RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
retrieving revision 1.152
diff -u -p -r1.152 if_ix.c
--- dev/pci/if_ix.c 22 Jun 2017 02:44:37 -  1.152
+++ dev/pci/if_ix.c 25 Feb 2019 00:40:47 -
@@ -2445,7 +2445,7 @@ ixgbe_get_buf(struct rx_ring *rxr, int i
return (ENOBUFS);
 
mp->m_len = mp->m_pkthdr.len = sc->rx_mbuf_sz;
-   m_adj(mp, ETHER_ALIGN);
+   m_adj(mp, mp->m_ext.ext_size - sc->rx_mbuf_sz);
 
error = bus_dmamap_load_mbuf(rxr->rxdma.dma_tag, rxbuf->map,
mp, BUS_DMA_NOWAIT);