On Tue Feb 23 07:55:26 PST 2016, kennylevin...@gmail.com wrote:
> A benchmark was supposedly made of the new duffcopy/duffzero which claimed
> significant speedup for larger copies:
> https://github.com/golang/go/commit/5cf281a9b791f0f10efd1574934cbb19ea1b33da
>
> I have no clue whether this holds true or not. My intention to reenable
> duffcopy and continue to use duffzero is mostly to avoid differences and
> ensure that the note handlers are floating point free in the future. Whether
> the duffcopy/duffzero’s current form is an actual optimization or just a
> complexity, I cannot say. A test was made in #cat-v out of annoyance where
> the result seemed to be that it was indeed faster to use MOVUPS, but I don’t
> remember the details.

the speedup reported in that commit is relative to the original asm, which
might not be as good as the best non-sse versions, and it is also for amd64.

- erik