Re: fast bcopy...

2012-05-04 Thread Luigi Rizzo
On Fri, May 04, 2012 at 09:44:15AM +1000, Andrew Reilly wrote:
> On Wed, May 02, 2012 at 08:25:57PM +0200, Luigi Rizzo wrote:
> > as part of my netmap investigations, i was looking at how expensive are memory copies, and here are a couple of findings (first one is obvious, the second one less so)

Re: fast bcopy...

2012-05-03 Thread Steven Atreju
K. Macy wrote [2012-05-03 02:58+0200]:
> It's highly chipset and processor dependent what works best.

Yes, of course. Though i was kinda, even shocked, once i've seen this first:
http://marc.info/?l=dragonfly-commits&m=132241713812022&w=2

So we don't use our assembler version for new gccs and

Re: fast bcopy...

2012-05-03 Thread Attilio Rao
2012/5/3, Steven Atreju snatr...@googlemail.com:
> K. Macy wrote [2012-05-03 02:58+0200]:
> > It's highly chipset and processor dependent what works best.
>
> Yes, of course. Though i was kinda, even shocked, once i've seen this first:
> http://marc.info/?l=dragonfly-commits&m=132241713812022&w=2
>
> So

Re: fast bcopy...

2012-05-03 Thread Gabor Kovesdan
On 03-05-2012 12:28, Steven Atreju wrote:
> Yes, of course. Though i was kinda, even shocked, once i've seen this first:
> http://marc.info/?l=dragonfly-commits&m=132241713812022&w=2

I also experimented a bit with some trivial libc functions when testing a change for memcpy (still in queue,

RE: fast bcopy...

2012-05-03 Thread rozhuk . im
guess this is a good time to thank the FreeBSD hackers for that FPU stack FILD/FISTP idea! I'll append the copy related notes of our doc/memperf.txt. Thanks, I made an implementation of fpu unwinding and mmx copy to see if they were really making a difference years ago (reimplementing

Re: fast bcopy...

2012-05-03 Thread Andrew Reilly
On Wed, May 02, 2012 at 08:25:57PM +0200, Luigi Rizzo wrote:
> as part of my netmap investigations, i was looking at how expensive are memory copies, and here are a couple of findings (first one is obvious, the second one less so)

Most C compilers (well, the ones I regularly use) inline small,

fast bcopy...

2012-05-02 Thread Luigi Rizzo
as part of my netmap investigations, i was looking at how expensive are memory copies, and here are a couple of findings (first one is obvious, the second one less so)

1. especially on 64bit machines, always use multiple of at least 8 bytes (possibly even larger units). The bcopy code in

Re: fast bcopy...

2012-05-02 Thread Alex Dupre
Luigi Rizzo wrote:
> For small blocks and multiples of 32-64 bytes, i noticed that the following is a lot faster (breaking even at about 1 KBytes)
>
> static inline void
> fast_bcopy(void *_src, void *_dst, int l)
> {
>         uint64_t *src = _src;
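
[Editor's illustration] The quoted fast_bcopy() is cut off in this archive view, so its body is not reproduced here. The following is only a minimal sketch of the general idea the post describes, copying in 64-bit words with an unrolled loop. The name copy64_sketch, the 64-byte unroll factor, and the memcpy() fallback for the tail are assumptions for illustration, not the code from the original message. It assumes non-overlapping, 8-byte-aligned buffers that hold uint64_t data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch, NOT the fast_bcopy() from the thread.
 * Copies len bytes from src_ to dst_ (bcopy argument order),
 * eight 64-bit words (64 bytes) per loop iteration.
 * Assumptions: buffers do not overlap, both are 8-byte aligned,
 * and both are accessed as arrays of uint64_t.
 */
static inline void
copy64_sketch(const void *src_, void *dst_, size_t len)
{
	const uint64_t *src = src_;
	uint64_t *dst = dst_;

	/* The word-wise loop is only valid for aligned pointers. */
	assert((uintptr_t)src_ % 8 == 0 && (uintptr_t)dst_ % 8 == 0);

	while (len >= 64) {
		dst[0] = src[0];
		dst[1] = src[1];
		dst[2] = src[2];
		dst[3] = src[3];
		dst[4] = src[4];
		dst[5] = src[5];
		dst[6] = src[6];
		dst[7] = src[7];
		src += 8;
		dst += 8;
		len -= 64;
	}

	/* Anything shorter than one 64-byte chunk goes to memcpy(). */
	if (len > 0)
		memcpy(dst, src, len);
}
```

Whether a loop like this beats the platform bcopy() depends on the CPU, the compiler, and the block size, as the replies in this thread point out, so it should be measured rather than assumed.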

Re: fast bcopy...

2012-05-02 Thread Steven Atreju
Luigi Rizzo wrote:
> 2. apparently, bcopy is not the fastest way to copy memory.

http://now.cs.berkeley.edu/Td/bcopy.html

Best Regards. Steven.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To

Re: fast bcopy...

2012-05-02 Thread K. Macy
It's highly chipset and processor dependent what works best. Intel now has non-temporal loads and stores which work much better in some cases but provide little benefit in others.

-Kip

On Wed, May 2, 2012 at 11:52 PM, Steven Atreju snatr...@googlemail.com wrote:
> Luigi Rizzo wrote:
> > 2.
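
[Editor's illustration] For readers unfamiliar with non-temporal stores, here is a minimal sketch of a copy that writes the destination with the SSE2 streaming-store intrinsic _mm_stream_si128, which hints that the stored data should bypass the cache. The function name copy_stream_sketch and the fallback policy are assumptions for illustration and do not come from the thread. It covers stores only, not non-temporal loads. On builds without SSE2, or when the destination is not 16-byte aligned or the length is not a multiple of 16, it simply calls memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical sketch: copy len bytes from src to dst (bcopy
 * argument order) using non-temporal stores where possible.
 * Assumption: the buffers do not overlap.
 */
static void
copy_stream_sketch(const void *src, void *dst, size_t len)
{
	assert(len == 0 || (src != NULL && dst != NULL));

#if defined(__SSE2__)
	/* Streaming stores need a 16-byte aligned destination. */
	if ((uintptr_t)dst % 16 == 0 && len % 16 == 0) {
		const char *s = src;
		__m128i *d = dst;

		for (size_t i = 0; i < len; i += 16) {
			/* Unaligned load, so src needs no alignment. */
			__m128i v = _mm_loadu_si128((const __m128i *)(s + i));
			_mm_stream_si128(d++, v);
		}
		/* Order the streaming stores before later stores. */
		_mm_sfence();
		return;
	}
#endif
	memcpy(dst, src, len);
}
```

As the message above says, this kind of store helps in some cases and not in others. It tends to suit large copies whose destination will not be read again soon, and can hurt when the data is reused right away.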

Re: fast bcopy...

2012-05-02 Thread Arnaud Lacombe
Hi,

On Wed, May 2, 2012 at 5:52 PM, Steven Atreju snatr...@googlemail.com wrote:
> Luigi Rizzo wrote:
> > 2. apparently, bcopy is not the fastest way to copy memory.
>
> http://now.cs.berkeley.edu/Td/bcopy.html

Pentium 166, Triton Chipset, EDO memory... ahem.

- Arnaud

> Best Regards. Steven.