Kacheong Poon wrote:
> Erik Nordmark wrote:
>
>   
>> I wasn't just concerned about the complexity in the driver - I am 
>> concerned about the total system complexity caused by the MDT 
>> implementation.  The amount of code that needs to know about M_MULTIDATA 
>> is scary, and in many cases there are different code paths to deal with 
>> those which makes understanding, supporting, and bug fixing much more 
>> complex.
>>     
>
>
> I think one reason for the above is that we must be
> backward compatible, hence we need to keep the good
> old path forever.  The sad truth is that we will
> always be limited by the existing mblk construct if
> we cannot accept different code paths.  Note that I
> am not promoting multiple code paths.
>   

However, MDT is *not* a public API, so if we could ever get the Cassini 
driver updated to something a bit more modern (like GLDv3!), then we 
*could* eliminate MDT.  This is one of the reasons I tried to champion 
the effort to port Cassini to GLDv3.  (And yes, I'm still bitter about 
the fact that NSN was so 100% totally closed-minded about even the 
*possibility* of entertaining a GLDv3 port.   I still have about 80% of 
the GLDv3 conversion work done -- it's probably another couple of 
man-weeks to finish it, but I don't think it will ever be picked up.)

Cleaning up MDT is one of the *major* benefits that effort would bring, 
and the *entire* networking stack would benefit.  (Recall that at the 
time I was really trying to improve the PPS numbers for Solaris.)

>
>   
>> Architecturally it makes more sense to have everything above GLD just 
>> view everything as TCP LSO.  In the case where the hardware doesn't 
>> handle LSO, it is quite efficient to convert the LSO format to an "MDT 
>> format".  By this I mean turning LSO's 'one TCP/IP header, one large 
>> payload' into 'multiple TCP/IP headers, separate payloads but on the 
>> same pages'.  That means you'd get the performance benefit of doing 
>> DMA/IOMMU setup for the single large payload and page with N TCP/IP 
>> headers.
>>     
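
The conversion Erik describes above can be sketched roughly like this -- a 
minimal user-space sketch, where seg_desc_t and lso_to_segs() are 
hypothetical names I've made up for illustration, not anything in the 
stack.  The point is that each segment descriptor points back into the 
original large payload, so no payload copies happen and one DMA/IOMMU 
binding can cover the whole region:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-segment descriptor: one per wire packet, with the
 * payload slice still living in the original LSO buffer's pages. */
typedef struct {
	uint32_t	tcp_seq;	/* seq number for this segment's header */
	const uint8_t	*data;		/* slice into the original payload */
	size_t		len;		/* slice length (<= mss) */
} seg_desc_t;

/*
 * Convert an LSO-style send (one header template + one large payload)
 * into MDT-style descriptors: N headers, payload slices on the same
 * pages.  Returns the number of descriptors produced.
 */
static size_t
lso_to_segs(uint32_t start_seq, const uint8_t *payload, size_t total,
    size_t mss, seg_desc_t *out, size_t max_segs)
{
	size_t n = 0, off = 0;

	while (off < total && n < max_segs) {
		size_t chunk = (total - off < mss) ? total - off : mss;

		out[n].tcp_seq = start_seq + (uint32_t)off;
		out[n].data = payload + off;	/* no copy: same pages */
		out[n].len = chunk;
		n++;
		off += chunk;
	}
	return (n);
}
```

In a real driver the per-segment headers would be built into one page 
holding all N TCP/IP headers, adjusting sequence number, length, and 
checksum fields per segment.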
>
>
> As Jim stated, the question is whether we want to do
> the above given the already known problems.  For example,
> suppose TCP wants to do better PMTUd and wants to change
> the segment size on the fly.  In order to recover faster
> in case PMTU has not changed, it decides to send alternate
> small and big segments.  I think the above GLD LSO scheme
> will not allow this easily.  TCP will need to do multiple
> sends just like today.  And I guess the above GLD LSO
> scheme still won't solve the issues I gave in my previous
> email.  So maybe we can just do the simple thing and forget
> about this GLD LSO thingy.  And just make the code path
> simple and quick enough.
>   

The one thing I'll add is that, based on my own analysis, far and away 
the longest portion of the code path is actually in the device drivers 
(from the point the driver's send routine is called).  Simplifying the 
code in the device drivers is likely to yield the biggest improvement.  
That said, if there were a way to amortize DMA setup, teardown, and 
buffer management (especially to get it outside of the locks used by the 
driver), I think it may be worth doing.  TCP could still do the header 
creation, but if it premapped a large segment for the driver, then at 
least for ordinary-size packets there would likely be a significant win.
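
To illustrate the amortization idea, here's a minimal user-space sketch 
of a premapped transmit-buffer pool -- tx_pool_t and the fake cookie 
values are purely hypothetical; in a real driver the one-time setup is 
where the ddi_dma_alloc_handle()/bind work would happen, so the per-packet 
hot path never touches DMA mapping at all:

```c
#include <stddef.h>
#include <stdint.h>

#define	POOL_BUFS	4
#define	BUF_SIZE	2048

typedef struct {
	uint8_t		mem[BUF_SIZE];	/* stand-in for a premapped region */
	uint64_t	cookie;		/* stand-in for the DMA address */
	int		busy;
} tx_buf_t;

typedef struct {
	tx_buf_t	bufs[POOL_BUFS];
} tx_pool_t;

/* One-time setup, outside the hot path and outside the driver's tx lock. */
static void
pool_init(tx_pool_t *p)
{
	for (int i = 0; i < POOL_BUFS; i++) {
		p->bufs[i].cookie = 0x1000u + (uint64_t)i * BUF_SIZE;
		p->bufs[i].busy = 0;
	}
}

/* Hot path: grab an already-mapped buffer; no per-packet DMA setup. */
static tx_buf_t *
pool_get(tx_pool_t *p)
{
	for (int i = 0; i < POOL_BUFS; i++) {
		if (!p->bufs[i].busy) {
			p->bufs[i].busy = 1;
			return (&p->bufs[i]);
		}
	}
	return (NULL);	/* exhausted; caller falls back to bind-per-packet */
}

static void
pool_put(tx_buf_t *b)
{
	b->busy = 0;
}
```

The real complexity, of course, is in sizing the pool and in handling the 
fallback path when it runs dry -- which is exactly the kind of thing that 
needs to stay out of the driver's locks.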

The challenge is to find a way to do this, that is not so invasive to 
the stack.  I'm not entirely sure how to achieve that.  But MDT as it is 
implemented today is *not* that way.

    -- Garrett
_______________________________________________
networking-discuss mailing list
[email protected]
