Andrew Gallatin wrote:
> Garrett D'Amore writes:
>  > Andrew Gallatin wrote:
>  > > Garrett D'Amore writes:
>  > >  > The problem here is that the only reason to lower the MTU is to deal 
>  > >  > with cases where Path MTU discovery fails.  For example, lowering the 
>  > >  > MTU because your upstream provider doesn't properly deal with frames 
>  > >  > larger than a PPP size or somesuch.
>  > >  > 
>  > >  > It's frustrating that these cases still exist, but they do.  In
>  > >  > general, I agree that lowering the MTU should not be necessary.  And indeed,
>  > >  > frankly nobody should need to touch the values provided by the media 
>  > >  > drivers when everything works properly.
>  > >
>  > > You may want to touch the values in order to reduce memory usage if
>  > > you know you cannot use jumbo frames.  Since most drivers manage their
>  > > own receive buffers, this can add up.  For example, my 10GbE driver,
>  > > depending on load, may allocate up to a (tunable) maximum of 4096
>  > > receive buffers.  The difference between 4096 1500-byte and 9000-byte
>  > > frames is nearly 30MB.
>  > >
>  > > It would be nice if the driver could be notified that the MTU is
>  > > changing so that it can re-allocate appropriately sized receive
>  > > buffers.  Every other *nix that I've worked with does this.
>  > >   
>  > 
>  > Okay, fair enough. :-)
>  > 
>  > Btw, I am *hopeful* that one day in the future Nemo will provide buffer 
>  > management on behalf of drivers.  This will address some of the 
>  > long-standing races with "loan-up", and free drivers from making poor 
>  > decisions as to when to bcopy or use loan up.  (Or maybe just allocate a 
>  > new DMA or DVMA buffer....)
>
> Or maybe just fix the IOMMU problem..
>
> The main reason drivers have to do any of this loaning or bcopying
> nonsense is because translating a kernel virtual to a DMA address on
> IOMMU infected systems is so horribly expensive.  The one (only?)
> thing MacOSX got right in its network buffer management is that it
> pre-enters all network buffers into the IOMMU(s), so that obtaining a
> DMA address is just a simple table lookup, without any hardware
> interaction.  
>   

But some Sun drivers do this as well... hence dvma_reserve().

The problem, as I understand it, is that even this requires buffers to 
be reused.  For packets that are loaned up in the stack, there is no 
guarantee that they will be returned in a timely fashion to the driver.  
So we still wind up seeing the cost of bcopy come up from time to time.

Of course, in general, the stack does return large buffers back to 
userland ... it is most likely to "hang on" to smaller packets, which 
may be better served by a bcopy anyway.
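
As a rough sketch of the trade-offs discussed in this thread (not taken from
any actual driver): the first helper reproduces Drew's buffer-pool arithmetic,
and the second shows one way a receive path might choose between bcopy and
loan-up.  The copy threshold, the low-water mark, and every name below are
hypothetical, picked only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cutoff: frames at or below this size are always copied. */
#define	RX_COPY_THRESHOLD	256

enum rx_action { RX_COPY, RX_LOAN };

/* Total bytes held by a pool of nbufs receive buffers of buf_size bytes. */
static uint64_t
rx_pool_bytes(uint32_t nbufs, uint32_t buf_size)
{
	return ((uint64_t)nbufs * buf_size);
}

/*
 * Decide how to hand a received frame to the stack.
 *
 * Small frames are copied: the stack is the most likely to hold on to
 * them, and the copy is cheap.  Large frames are loaned up, unless the
 * pool of pre-mapped buffers is running low because earlier loans have
 * not come back yet, in which case we fall back to copying.
 */
static enum rx_action
rx_choose_action(size_t frame_len, uint32_t free_bufs, uint32_t low_water)
{
	if (frame_len <= RX_COPY_THRESHOLD)
		return (RX_COPY);
	if (free_bufs <= low_water)
		return (RX_COPY);
	return (RX_LOAN);
}
```

With 4096 buffers, rx_pool_bytes(4096, 9000) - rx_pool_bytes(4096, 1500)
comes to 30,720,000 bytes, which matches the "nearly 30MB" figure quoted
above.  The low-water fallback is the case where the cost of bcopy shows up
even though the buffers are pre-mapped.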

    -- Garrett
> Drew
>   

