Re: Small memcpy optimization

2012-11-10 Thread Stefan Fritsch
On Thu, 8 Nov 2012, Mark Kettenis wrote: On Tuesday 21 August 2012, Stefan Fritsch wrote: On x86, the xchg operation between reg and mem has an implicit lock prefix, i.e. it is a relatively expensive atomic operation. This is not needed here. OKs, anyone? What you say makes sense, although

Re: Small memcpy optimization

2012-11-10 Thread Mark Kettenis
Date: Sat, 10 Nov 2012 18:10:53 +0100 (CET) From: Stefan Fritsch s...@sfritsch.de On Thu, 8 Nov 2012, Mark Kettenis wrote: On Tuesday 21 August 2012, Stefan Fritsch wrote: On x86, the xchg operation between reg and mem has an implicit lock prefix, i.e. it is a relatively expensive

Re: Small memcpy optimization

2012-11-08 Thread Mark Kettenis
From: Stefan Fritsch s...@sfritsch.de Date: Thu, 1 Nov 2012 22:43:33 +0100 On Tuesday 21 August 2012, Stefan Fritsch wrote: On x86, the xchg operation between reg and mem has an implicit lock prefix, i.e. it is a relatively expensive atomic operation. This is not needed here. OKs,

Re: Small memcpy optimization

2012-11-08 Thread Ted Unangst
On Thu, Nov 01, 2012 at 22:43, Stefan Fritsch wrote: On Tuesday 21 August 2012, Stefan Fritsch wrote: On x86, the xchg operation between reg and mem has an implicit lock prefix, i.e. it is a relatively expensive atomic operation. This is not needed here. OKs, anyone? What do other

Re: Small memcpy optimization

2012-11-01 Thread Stefan Fritsch
On Tuesday 21 August 2012, Stefan Fritsch wrote: On x86, the xchg operation between reg and mem has an implicit lock prefix, i.e. it is a relatively expensive atomic operation. This is not needed here. OKs, anyone?

--- a/sys/arch/i386/i386/locore.s
+++ b/sys/arch/i386/i386/locore.s
@@

Small memcpy optimization

2012-08-21 Thread Stefan Fritsch
On x86, the xchg operation between reg and mem has an implicit lock prefix, i.e. it is a relatively expensive atomic operation. This is not needed here.

--- a/sys/arch/i386/i386/locore.s
+++ b/sys/arch/i386/i386/locore.s
@@ -802,8 +802,9 @@ ENTRY(bcopy)
  */
 ENTRY(memcpy)
 	movl
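
The patch body is cut off in this archive view, so the following is not the proposed change. It is only a minimal C sketch of the distinction the message draws: exchanging a register value with a memory location atomically versus doing the same swap with an ordinary load and store. The function names (swap_atomic, swap_plain, check_swaps_agree) are hypothetical and chosen for illustration. On x86 a compiler will typically implement atomic_exchange with xchg against memory, which is locked whether or not a lock prefix is written; the plain version has no such requirement.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Atomic swap: store 'value' into *slot and return the previous contents
 * as one indivisible operation. This is the semantics that xchg with a
 * memory operand provides on x86. */
static uint32_t swap_atomic(_Atomic uint32_t *slot, uint32_t value)
{
    return atomic_exchange(slot, value);
}

/* Plain swap: the same result through an ordinary load followed by an
 * ordinary store. Not atomic, which is fine when no other CPU or
 * interrupt handler can touch *slot in between. */
static uint32_t swap_plain(uint32_t *slot, uint32_t value)
{
    uint32_t old = *slot;
    *slot = value;
    return old;
}

/* Hypothetical self-check: in single-threaded use both variants return
 * the same old value and leave the same new value in memory. */
static void check_swaps_agree(uint32_t initial, uint32_t value)
{
    _Atomic uint32_t a;
    uint32_t p = initial;

    atomic_init(&a, initial);
    assert(swap_atomic(&a, value) == swap_plain(&p, value));
    assert(atomic_load(&a) == p);
}
```

When the swapped location is private to the running code, as the message argues is the case here, the two variants are interchangeable in effect and only the cost differs.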