Re: [PATCH] [15/58] i386: Rewrite sched_clock (cmpxchg8b)

2007-07-19 Thread Mathieu Desnoyers
* Nick Piggin ([EMAIL PROTECTED]) wrote:
> Mathieu Desnoyers wrote:
>
> >I tried it with and without the LOCK prefix on my Pentium 4.
> >
> >Locked cmpxchg8b : 90 cycles
> >Non locked cmpxchg8b: 30 cycles
> >sti: 166 cycles
> >cli: 159 cycles
> >
> >So, hrm, even if we use the locked version, it

Re: [PATCH] [15/58] i386: Rewrite sched_clock (cmpxchg8b)

2007-07-19 Thread Nick Piggin
Mathieu Desnoyers wrote:
>I tried it with and without the LOCK prefix on my Pentium 4.
>
>Locked cmpxchg8b : 90 cycles
>Non locked cmpxchg8b: 30 cycles
>sti: 166 cycles
>cli: 159 cycles
>
>So, hrm, even if we use the locked version, it is still much faster than the sti/cli. I am thoughtful about the

Re: [PATCH] [15/58] i386: Rewrite sched_clock (cmpxchg8b)

2007-07-19 Thread Mathieu Desnoyers
* Mathieu Desnoyers ([EMAIL PROTECTED]) wrote:
> I just want to rectify a detail: local_t uses type "long", which is 32
> bits on x86_32 and 64 bits on x86_64.
>
> Using a cmpxchg8b on i386 seems to require the LOCK prefix to be taken,
> so it may degrade performance too much. Therefore, you

Re: [PATCH] [15/58] i386: Rewrite sched_clock (cmpxchg8b)

2007-07-19 Thread Mathieu Desnoyers
* Mathieu Desnoyers ([EMAIL PROTECTED]) wrote:
> I just want to rectify a detail: local_t uses type long, which is 32
> bits on x86_32 and 64 bits on x86_64.
>
> Using a cmpxchg8b on i386 seems to require the LOCK prefix to be taken,
> so it may degrade performance too much. Therefore, you may

Re: [PATCH] [15/58] i386: Rewrite sched_clock (cmpxchg8b)

2007-07-19 Thread Mathieu Desnoyers
* Nick Piggin ([EMAIL PROTECTED]) wrote:
> Mathieu Desnoyers wrote:
>
> >I tried it with and without the LOCK prefix on my Pentium 4.
> >
> >Locked cmpxchg8b : 90 cycles
> >Non locked cmpxchg8b: 30 cycles
> >sti: 166 cycles
> >cli: 159 cycles
> >
> >So, hrm, even if we use the locked version, it is still much