Re: Is 2X faster large memcpy interesting?

Thomas Moran Thu, 26 Mar 2009 16:35:12 -0700

On 26/03/2009 20:08, Don wrote:

BTW: I tested the memcpy() code provided in AMD's 1992 optimisation
manual, and in Intel's 2007 manual. Only one of them actually gave any
benefit when run on a 2008 Intel Core2 -- which was it? (Hint: it wasn't
Intel!)
I've noticed that AMD's docs are usually greatly superior to Intel's, but
this time the difference is unbelievable.

Don, have you seen Agner Fog's memcpy() and memmove() implementations
included with the most recent versions of his manuals? In the unaligned
case they read two XMM words and shift/combine them into the target
alignment, so all loads and stores are aligned. Pretty cool.


He says (modestly):

; This method is 2 - 6 times faster than the implementations in the
; standard C libraries (MS, Gnu) when src or dest are misaligned.
; When src and dest are aligned by 16 (relative to each other) then this
; function is only slightly faster than the best standard libraries.
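
To make the shift/combine idea concrete, here is a rough scalar sketch in
portable C. It is not Agner Fog's code: it uses 64-bit words where his
version uses 16-byte XMM registers, it assumes a little-endian machine, and
the name copy_shift_combine is made up for illustration. The small
fixed-size memcpy() calls in the helpers are only a well-defined way to
spell a word load or store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Word load/store helpers; compilers turn these into single moves. */
static uint64_t load64(const unsigned char *p) { uint64_t w; memcpy(&w, p, 8); return w; }
static void store64(unsigned char *p, uint64_t w) { memcpy(p, &w, 8); }

/* Hypothetical name. Assumes little-endian byte order and non-overlapping buffers. */
void *copy_shift_combine(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: copy single bytes until the destination is 8-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 7) != 0) { *d++ = *s++; n--; }

    size_t k = (uintptr_t)s & 7;   /* source misalignment relative to dst */
    if (k == 0) {
        /* Mutually aligned: plain aligned word copy. */
        while (n >= 8) { store64(d, load64(s)); d += 8; s += 8; n -= 8; }
    } else if (n >= 16) {
        assert(((uintptr_t)d & 7) == 0);       /* stores below rely on this */
        const unsigned char *sa = s + (8 - k); /* next 8-byte aligned source address */
        unsigned rs = (unsigned)(8 * k);       /* bits to drop from the low word */
        unsigned ls = 64 - rs;                 /* bits to take from the high word */

        /* Build the first partial word from bytes, so nothing before s is read. */
        uint64_t lo = 0;
        memcpy((unsigned char *)&lo + k, s, 8 - k);

        /* Each pass: one aligned load, one shift/combine, one aligned store. */
        while (n >= 16) {
            uint64_t hi = load64(sa);
            store64(d, (lo >> rs) | (hi << ls));
            lo = hi;
            sa += 8; d += 8; s += 8; n -= 8;
        }
    }

    /* Tail: whatever is left goes byte by byte. */
    while (n > 0) { *d++ = *s++; n--; }
    return dst;
}
```

The n >= 16 bound is deliberately conservative so that every aligned load
stays inside the source buffer. A real SIMD version would do the same thing
with 16-byte registers and byte-shift instructions, which is where the
speedup in the misaligned case would come from.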
