On 03/12/10 09:29 PM, Mahesh.Vardhamanaiah at Emulex.Com wrote:
>
> In Tx we use bcopy mode if the packet/fragment size is less than 512
> bytes, and direct DMA for other sizes.
>
> In Rx we use bcopy if the packet size is less than 128 bytes, and the
> preallocated, premapped driver buffer pool for other packet sizes.
>
> There is a significant difference between Tx and Rx: Tx is normally
> 6+ G but Rx is only 2+ G.
>
> Do you think using DVMA will help here?
>

I suspect your 128-byte cutoff is too low.  I'd bcopy all the way up
to ~1K, maybe even up to full 1500-byte frames.

     - Garrett
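
A minimal sketch of the size-based Rx choice discussed above, in plain C. It is not taken from the driver in this thread and does not use the GLDv3 API: `RX_COPY_THRESHOLD`, `rx_choose_path` and `rx_copy_frame` are hypothetical names, and the 1024-byte value is only the "~1K" suggestion turned into a number. In a real Solaris driver the copy path would typically be allocb(9F) plus bcopy, and the loan path would wrap the premapped buffer with something like desballoc(9F).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tunable: frames at or below this size are copied. */
#define RX_COPY_THRESHOLD 1024

typedef enum { RX_PATH_COPY, RX_PATH_LOAN } rx_path_t;

/* Pick the Rx path for a frame of pkt_len bytes. */
static rx_path_t
rx_choose_path(size_t pkt_len, size_t copy_threshold)
{
	return (pkt_len <= copy_threshold ? RX_PATH_COPY : RX_PATH_LOAN);
}

/*
 * Copy path: duplicate the frame into a separate buffer so the premapped
 * DMA buffer can be reposted to the hardware right away.
 * Returns the number of bytes copied, or 0 if dst is too small.
 */
static size_t
rx_copy_frame(void *dst, size_t dst_len, const void *src, size_t pkt_len)
{
	if (pkt_len > dst_len)
		return (0);
	memcpy(dst, src, pkt_len);
	return (pkt_len);
}
```

Passing the threshold as a parameter keeps it easy to try 128, 512, 1024 and 1500 bytes in turn and compare Rx throughput for each setting.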

> -Mahesh
>
> *From:* crossbow-discuss-bounces at opensolaris.org *On Behalf Of*
> Krishna Yenduri
> *Sent:* Thursday, March 11, 2010 12:45 AM
> *To:* crossbow-discuss at opensolaris.org
> *Subject:* [crossbow-discuss] Fwd: Re: [osol-code] GLDv3 NIC driver
> Performance on sparc
>
>
>
> -------- Original Message --------
>
> *Subject:* Re: [osol-code] GLDv3 NIC driver Performance on sparc
> *Date:* Wed, 10 Mar 2010 11:10:12 -0800
> *From:* Garrett D'Amore <garrett at damore.org>
> *To:* opensolaris-code at opensolaris.org
>
> On 03/10/10 10:48 AM, Mahesh wrote:
> >  Hi all,
> >  
> >    I need some help debugging a GLDv3 driver performance issue on
> > SPARC. The driver has a single Tx queue and 4 Rx queues and performs at
> > almost line rate (10G) on Sun Intel boxes, but the same driver performs
> > very badly on SPARC. The code is identical for SPARC and Intel except
> > for the byte swapping involved, since the hardware is little endian. Any
> > idea how to debug this issue?
> >  The machine I tried to benchmark is a T5440, and I have tried setting
> > ip_soft_rings_count = 16 on the T5440, but the result is the same.
> >     
>   
> What is "badly"?
>   
> Note that T5440 hardware uses individual cores which are probably quite
> a bit slower than an x86 core.  Additionally, there could be resource
> contention (caches, etc.) due to the different Niagara architecture here.
>   
> Note also that I've been told that "bcopy" performs a bit slower on
> Niagara than on other SPARC or x86 architectures -- are you using bcopy
> to copy packet data, or are you using direct DMA?  (Also, unless you
> take care, DMA setup and teardown on SPARC systems -- which use an IOMMU
> -- is quite expensive.   In order to get good performance with direct
> DMA you really have to use loan up or something like it.  It's tricky to
> get this right.)
>   
> Some other questions: what size MTU are you using?  Are you sure that
> you're hitting each of your 4 RX rings basically "equally" by using
> different streams and making sure that traffic from a single stream
> stays on the same h/w ring?
>   
> Is there a significant difference between TX and RX performance?
>   
>      - Garrett
>   
>   
>   
> >  
> >  Thanks
> >  Mahesh
> >     
>   
>    
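
The question in the quoted message about hitting all 4 Rx rings "equally" while keeping each stream on one ring can be illustrated with a small flow-hash sketch. This is only an assumption-laden model for reasoning about test traffic: the NIC's real hardware hash is not described in this thread, and `flow_key_t`, `flow_hash` and `flow_to_ring` are hypothetical names.

```c
#include <stdint.h>

#define NUM_RX_RINGS 4	/* the driver in this thread has 4 Rx queues */

/* Hypothetical flow key: the usual TCP/UDP 4-tuple. */
typedef struct flow_key {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
} flow_key_t;

/* Illustrative hash only; real hardware uses its own function. */
static uint32_t
flow_hash(const flow_key_t *k)
{
	uint32_t h = k->src_ip ^ k->dst_ip;

	h ^= ((uint32_t)k->src_port << 16) | k->dst_port;
	/* Mix the high bits down so the low bits used below vary. */
	h ^= h >> 16;
	h *= 0x45d9f3bU;
	h ^= h >> 16;
	return (h);
}

/* Same 4-tuple always maps to the same ring index in [0, nrings). */
static unsigned
flow_to_ring(const flow_key_t *k, unsigned nrings)
{
	return (flow_hash(k) % nrings);
}
```

Because the ring index depends only on the 4-tuple, a single stream never moves between rings, and a benchmark needs several distinct streams (for example, different source ports) before all rings can receive traffic.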
