On 03/12/10 09:29 PM, Mahesh.Vardhamanaiah at Emulex.Com wrote:
>
> On Tx we use bcopy mode if the packet/fragment size is less than
> 512 bytes and direct DMA for larger sizes.
>
> On Rx we use bcopy if the packet size is less than 128 bytes and
> use the preallocated, premapped driver buffer pool for other
> packet sizes.
>
> There is a significant difference between Tx and Rx: Tx is normally
> 6+ Gb/s but Rx is only 2+ Gb/s.
>
> Do you think using DVMA will help here?
>
I suspect your 128-byte Rx copy threshold is too small. I'd bcopy all
the way up to ~1K, maybe even up to full 1500-byte frames.
- Garrett
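
A minimal userland sketch of the copy-threshold decision discussed above,
assuming a hypothetical RX_COPY_THRESHOLD of 1024 bytes (the "~1K"
suggestion). The names rx_choose_path and rx_copy_small are illustrative
and not taken from any real driver; memcpy stands in for the kernel's
bcopy, and the "loan" path only represents handing the premapped buffer
upstream.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical threshold: packets shorter than this are copied. */
#define RX_COPY_THRESHOLD 1024

typedef enum { RX_PATH_COPY, RX_PATH_LOAN } rx_path_t;

/*
 * Pick the Rx path for a packet: copy it out of the DMA buffer when it is
 * shorter than the threshold, otherwise loan the premapped buffer upstream.
 */
static rx_path_t
rx_choose_path(size_t pkt_len, size_t copy_threshold)
{
    return (pkt_len < copy_threshold) ? RX_PATH_COPY : RX_PATH_LOAN;
}

/*
 * Copy a small packet into dst. Returns the number of bytes copied, or 0
 * if the packet should take the loan path (or does not fit in dst).
 */
static size_t
rx_copy_small(void *dst, size_t dst_len, const void *src, size_t pkt_len,
    size_t copy_threshold)
{
    assert(dst != NULL && src != NULL);

    if (rx_choose_path(pkt_len, copy_threshold) != RX_PATH_COPY ||
        pkt_len > dst_len)
        return 0;

    memcpy(dst, src, pkt_len);  /* stands in for bcopy() in a driver */
    return pkt_len;
}
```

Making the threshold a parameter rather than a constant would let it be
tuned per platform, which matters if copy and DMA-mapping costs differ
between SPARC and x86 as the thread suggests.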
> -Mahesh
>
> From: crossbow-discuss-bounces at opensolaris.org
> [mailto:crossbow-discuss-bounces at opensolaris.org] On Behalf Of
> Krishna Yenduri
> Sent: Thursday, March 11, 2010 12:45 AM
> To: crossbow-discuss at opensolaris.org
> Subject: [crossbow-discuss] Fwd: Re: [osol-code] GLDv3 NIC driver
> Performance on sparc
>
> -------- Original Message --------
> Subject: Re: [osol-code] GLDv3 NIC driver Performance on sparc
> Date: Wed, 10 Mar 2010 11:10:12 -0800
> From: Garrett D'Amore <garrett at damore.org>
> To: opensolaris-code at opensolaris.org
>
> On 03/10/10 10:48 AM, Mahesh wrote:
> > Hi all,
> >
> > I need some help debugging a GLDv3 driver performance issue on
> > SPARC. The driver has a single Tx queue and 4 Rx queues and performs
> > at almost line rate (10G) on Sun Intel boxes, but the same driver
> > performs very badly on SPARC. The code is identical for SPARC and
> > Intel except for the byte swapping needed because the hardware is
> > little-endian. Any idea how to debug this issue?
> > The machine I tried to benchmark is a T5440, and I have tried setting
> > ip_soft_rings_count = 16 on it, but the result is the same.
> >
>
> What is "badly"?
>
> Note that T5440 hardware uses individual cores which are probably quite
> a bit slower than an x86 core. Additionally, there could be resource
> contention (caches, etc.) due to the different Niagara architecture.
>
> Note also that I've been told that "bcopy" performs a bit slower on
> Niagara than on other SPARC or x86 architectures -- are you using bcopy
> to copy packet data, or are you using direct DMA? (Also, unless you
> take care, DMA setup and teardown on SPARC systems -- which use an IOMMU
> -- is quite expensive. In order to get good performance with direct
> DMA you really have to use loan-up or something like it. It's tricky to
> get this right.)
>
> Some other questions: what size MTU are you using? Are you sure that
> you're hitting each of your 4 RX rings basically "equally" by using
> different streams and making sure that traffic from a single stream
> stays on the same h/w ring?
>
> Is there a significant difference between TX and RX performance?
>
> - Garrett
>
>
>
> >
> > Thanks
> > Mahesh
> >
>