On 03/12/10 21:29, Mahesh.Vardhamanaiah at Emulex.Com wrote:
> On Tx we use bcopy mode if the packet/fragment size is less than 512
> bytes, and direct DMA for larger sizes. On Rx we use bcopy if the
> packet size is less than 128 bytes, and the preallocated, premapped
> driver buffer pool for larger packet sizes.
>
> There is a significant difference between Tx and Rx: Tx is normally
> 6+ Gb/s but Rx is only 2+ Gb/s.
So you have 1 Tx ring and 4 Rx rings. Have you ported your driver to the
Crossbow framework? If not, the Rx/Tx rings won't be exposed to the mac
layer, and the Rx side won't be able to take part in Crossbow features
such as polling. The Rx/Tx ring capability is exposed via the
MAC_CAPAB_RINGS capability. Also, the ip_soft_rings_cnt tunable you
mention below is no longer available in OpenSolaris.

-krgopi

> Do you think using DVMA will help here?
>
> -Mahesh
>
> From: crossbow-discuss-bounces at opensolaris.org
> [mailto:crossbow-discuss-bounces at opensolaris.org] On Behalf Of Krishna Yenduri
> Sent: Thursday, March 11, 2010 12:45 AM
> To: crossbow-discuss at opensolaris.org
> Subject: [crossbow-discuss] Fwd: Re: [osol-code] GLDv3 NIC driver Performance on sparc
>
> -------- Original Message --------
> Subject: Re: [osol-code] GLDv3 NIC driver Performance on sparc
> Date: Wed, 10 Mar 2010 11:10:12 -0800
> From: Garrett D'Amore <garrett at damore.org>
> To: opensolaris-code at opensolaris.org
>
> On 03/10/10 10:48 AM, Mahesh wrote:
>> Hi all,
>>
>> I need some help in debugging a GLDv3 driver performance issue on
>> SPARC. The driver has a single Tx queue and 4 Rx queues and performs
>> at almost line rate (10G) on Sun Intel boxes, but the same driver
>> performs very badly on SPARC. The code is identical for SPARC and
>> Intel except for the byte swapping involved, since the hardware is
>> little endian. Any idea how to debug this issue?
>>
>> The machine I tried to benchmark is a T5440, and I have tried setting
>> ip_soft_rings_cnt = 16 on the T5440, but the result is the same.
>
> What is "badly"?
>
> Note that T5440 hardware uses individual cores which are probably quite
> a bit slower than an x86 core. Additionally, there could be resource
> contention (caches, etc.) due to the different Niagara architecture here.
> Note also that I've been told that bcopy performs a bit slower on
> Niagara than on other SPARC or x86 architectures -- are you using bcopy
> to copy packet data, or are you using direct DMA? (Also, unless you
> take care, DMA setup and teardown on SPARC systems -- which use an
> IOMMU -- is quite expensive. In order to get good performance with
> direct DMA you really have to use loan-up or something like it. It's
> tricky to get this right.)
>
> Some other questions: what size MTU are you using? Are you sure that
> you're hitting each of your 4 Rx rings basically "equally", by using
> different streams and making sure that traffic from a single stream
> stays on the same h/w ring?
>
> Is there a significant difference between Tx and Rx performance?
>
> - Garrett
>
>> Thanks,
>> Mahesh
>
> _______________________________________________
> opensolaris-code mailing list
> opensolaris-code at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/opensolaris-code

_______________________________________________
crossbow-discuss mailing list
crossbow-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
