Re: Net: ucc_geth ethernet driver optimization space

2009-05-27 Thread Joakim Tjernlund


linuxppc-dev-bounces+joakim.tjernlund=transmode...@ozlabs.org wrote on 
27/05/2009 07:08:07:

 Guys,

 The ucc_geth ethernet driver has dozens of strong sync read/write
 operations, such as in_be32/16/8 and out_be32/16/8.

 All of them are sync reads/writes, which is very expensive for performance.

 For the critical path, we can replace some unnecessary in_be(x) and
 out_be(x) calls with normal memory operations, and keep only the
 necessary memory barriers.

 e.g. BD access in the interrupt handler and start_xmit.

 The BD operation only needs a memory barrier between length/buffer
 and status.

 struct buffer_descriptor {
         u16 status;
         u16 length;
         u32 buffer;
 } __attribute__ ((packed));

 struct buffer_descriptor *BD;

 BD->length = ;
 BD->buffer = ;
 wmb();
 BD->status = ;

 For powerpc, eieio is enough for 60x, and mbar 1 is enough for e500.
 Of course, we also need the memory clobber to keep the compiler from
 reordering the accesses.

 Thanks, Dave

Yes, pretty please :)

You might want to combine status and length into one u32 though:
BD->buffer = ;
wmb();
BD->stat_len = status << 16 | length;
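
The combined-word idea can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the driver's code: struct bd32, bd_pack(), bd_status(), bd_length() and bd32_publish() are hypothetical names, and the C11 release fence is a stand-in for the kernel's wmb(). It does not by itself guarantee ordering as seen by a device.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical BD layout with status and length merged into one word. */
struct bd32 {
    uint32_t stat_len;  /* status in the upper 16 bits, length in the lower 16 */
    uint32_t buffer;
};

/* Pack status and length so that a single 32-bit store publishes both. */
static inline uint32_t bd_pack(uint16_t status, uint16_t length)
{
    return ((uint32_t)status << 16) | length;
}

static inline uint16_t bd_status(uint32_t stat_len)
{
    return (uint16_t)(stat_len >> 16);
}

static inline uint16_t bd_length(uint32_t stat_len)
{
    return (uint16_t)(stat_len & 0xffffu);
}

/* Write the buffer pointer first, then publish status+length in one store. */
static void bd32_publish(volatile struct bd32 *bd, uint32_t buf,
                         uint16_t status, uint16_t len)
{
    assert(bd != NULL);
    bd->buffer = buf;
    /* Stand-in for wmb(): keeps the buffer store ahead of the status store. */
    atomic_thread_fence(memory_order_release);
    bd->stat_len = bd_pack(status, len);
}
```

With this layout only one store follows the barrier, and the length no longer needs a separate 16-bit write. The shift puts status in the upper half, which matches a big-endian struct that has status at offset 0; on a little-endian host the byte layout in memory would differ.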

  Jocke

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: Net: ucc_geth ethernet driver optimization space

2009-05-27 Thread Li Yang
On Wed, May 27, 2009 at 1:08 PM, Liu Dave-R63238 dave...@freescale.com wrote:
 Guys,

 The ucc_geth ethernet driver has dozens of strong sync read/write
 operations, such as in_be32/16/8 and out_be32/16/8.

 All of them are sync reads/writes, which is very expensive for performance.


Totally agree.  That's one of my concerns right from the beginning.

 For the critical path, we can replace some unnecessary in_be(x) and
 out_be(x) calls with normal memory operations, and keep only the
 necessary memory barriers.

 e.g. BD access in the interrupt handler and start_xmit.

 The BD operation only needs a memory barrier between length/buffer
 and status.

 struct buffer_descriptor {
         u16 status;
         u16 length;
         u32 buffer;
 } __attribute__ ((packed));

 struct buffer_descriptor *BD;

 BD->length = ;
 BD->buffer = ;
 wmb();
 BD->status = ;

The BD can reside either in memory or in a memory-mapped region, which
makes the case more complex.

MMIO accesses need to use IO accessors for the sparse checking.  We
might make use of the __raw_*() accessors, but I'm not sure if they are
suitable for non-PCI buses on powerpc.  We also need to pay
special attention to the problem described here:
http://lwn.net/Articles/198988/

- Leo

Net: ucc_geth ethernet driver optimization space

2009-05-26 Thread Liu Dave-R63238
Guys,

The ucc_geth ethernet driver has dozens of strong sync read/write
operations, such as in_be32/16/8 and out_be32/16/8.

All of them are sync reads/writes, which is very expensive for performance.

For the critical path, we can replace some unnecessary in_be(x) and
out_be(x) calls with normal memory operations, and keep only the
necessary memory barriers.

e.g. BD access in the interrupt handler and start_xmit.

The BD operation only needs a memory barrier between length/buffer
and status.

struct buffer_descriptor {
        u16 status;
        u16 length;
        u32 buffer;
} __attribute__ ((packed));

struct buffer_descriptor *BD;

BD->length = ;
BD->buffer = ;
wmb();
BD->status = ;

For powerpc, eieio is enough for 60x, and mbar 1 is enough for e500.
Of course, we also need the memory clobber to keep the compiler from
reordering the accesses.
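
The ordering described above can be sketched as user-space C. This is a minimal illustration under stated assumptions, not the ucc_geth code: bd_wmb(), bd_publish() and the BD_READY bit are hypothetical names. The empty asm with a "memory" clobber (GCC/Clang syntax) is the compiler barrier, and the C11 release fence is a stand-in for the hardware barrier (eieio or mbar 1 in the kernel's powerpc wmb()).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define BD_READY 0x8000u  /* hypothetical "ready" bit, for illustration only */

struct buffer_descriptor {
    uint16_t status;
    uint16_t length;
    uint32_t buffer;
} __attribute__((packed));

/* Hypothetical stand-in for wmb(): compiler barrier plus a release fence. */
static inline void bd_wmb(void)
{
    /* The "memory" clobber stops the compiler from moving stores across here. */
    __asm__ __volatile__("" ::: "memory");
    /* Orders the earlier stores before the later ones on the CPU side. */
    atomic_thread_fence(memory_order_release);
}

/* Fill in length and buffer with plain stores, then hand over via status. */
static void bd_publish(volatile struct buffer_descriptor *bd,
                       uint32_t buf, uint16_t len)
{
    assert(bd != NULL);
    bd->length = len;
    bd->buffer = buf;
    bd_wmb();               /* length/buffer must be visible before status */
    bd->status = BD_READY;
}
```

Only one barrier is paid per descriptor here, instead of one per accessor call. Whether a release fence is strong enough for a given device and bus is an assumption of this sketch; real driver code would use the kernel's own barrier primitives.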

Thanks, Dave

