Re: LOCK overheads (was Re: objtrm problem probably found)

1999-07-19 Thread Julian Elischer
A bit late, but some more data points. 90MHz Pentium, FreeBSD 2.2.7

mode 0   60.80 ns/loop nproc=1 lcks=EMPTY
mode 1   91.13 ns/loop nproc=1 lcks=no
mode 2   91.11 ns/loop nproc=2 lcks=no
mode 3  242.59 ns/loop nproc=1 lcks=yes
mode 4  242.69 ns/loop nproc=2 lcks=yes
mode 5  586.27 ns/loop

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-13 Thread Alan Cox
Before this thread on "cache coherence" and "memory consistency" goes any further, I'd like to suggest a time-out to read something like http://www-ece.rice.edu/~sarita/Publications/models_tutorial.ps. A lot of what I'm reading has a grain of truth but isn't quite right. This paper appeared as a

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-13 Thread Mike Smith
On Mon, Jul 12, 1999 at 10:38:03PM -0700, Mike Smith wrote: I said: than indirect function calls on some architectures: inline branched code. So you still have a global variable selecting locked/non-locked, but it's a boolean, rather than a pointer. Your atomic macros are then {
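
The macro body is cut off in this preview, so the following is only a minimal sketch of the idea described (a global boolean choosing between a locked and a non-locked read-modify-write with an inline branch), not the code from the message. The names sketch_smp_active and sketch_atomic_inc are hypothetical, and the GCC/Clang __atomic builtin stands in for a hand-written LOCK-prefixed instruction.

```c
#include <stdbool.h>

/* Hypothetical flag: would be set once at boot when a second CPU is started. */
static bool sketch_smp_active = false;

/* Increment *p, paying for a locked RMW only when the flag says SMP is live. */
static inline void sketch_atomic_inc(volatile unsigned int *p)
{
    if (sketch_smp_active) {
        /* Locked path: on x86 compilers typically emit a LOCK-prefixed add. */
        __atomic_fetch_add(p, 1u, __ATOMIC_SEQ_CST);
    } else {
        /* Uniprocessor path: plain read-modify-write, no bus lock. */
        *p += 1u;
    }
}
```

The branch tests a plain boolean, so the call target is fixed at compile time; this is the contrast with selecting the implementation through a function pointer.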

Re: LOCK overheads (was Re: objtrm problem probably found)

1999-07-13 Thread Ollivier Robert
According to Matthew Dillon:
Wow, now that *is* expensive! The K6 must be implementing it in microcode for it to be that bad.

K6-200:

244 [21:57] roberto@keltia:src/C ./locktest 0
...
empty 26.84 ns/loop
1proc 22.62 ns/loop
2proc 22.64 ns/loop
empty w/locks 17.58 ns/loop
1proc

Re: LOCK overheads (was Re: objtrm problem probably found)

1999-07-13 Thread Peter Jeremy
Matthew Dillon [EMAIL PROTECTED] wrote:
:mode 1 17.99 ns/loop nproc=1 lcks=no
:mode 3 166.33 ns/loop nproc=1 lcks=yes
...
:This is a K6-2 350. Locks are pretty expensive on them.
Wow, now that *is* expensive! The K6 must be implementing it in microcode for it to be that bad. I

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Peter Wemm
Doug Rabson wrote:
On Mon, 12 Jul 1999, Peter Jeremy wrote:
Mike Haertel [EMAIL PROTECTED] wrote:
Um. FYI on x86, even if the compiler generates the RMW form "addl $1, foo", it's not atomic. If you want it to be atomic you have to precede the opcode with a LOCK prefix 0xF0.
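
To make the point about the LOCK prefix concrete, here is a small sketch in GCC-style inline assembly showing the same add with and without the prefix. The function names add_plain and add_locked are illustrative only, and the non-x86 fallback is an assumption added so the sketch builds elsewhere.

```c
#include <stdint.h>

/* Single-instruction RMW; not atomic with respect to other CPUs. */
static inline void add_plain(volatile uint32_t *p, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("addl %1, %0" : "+m"(*p) : "ir"(v) : "cc");
#else
    *p += v;                      /* fallback for other architectures */
#endif
}

/* Same instruction with the LOCK prefix (opcode byte 0xF0) in front. */
static inline void add_locked(volatile uint32_t *p, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("lock; addl %1, %0"
                         : "+m"(*p) : "ir"(v) : "cc", "memory");
#else
    __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);   /* fallback */
#endif
}
```

Both produce the same result on a single CPU; only the locked form keeps the read-modify-write indivisible when another processor touches the same word.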

Re: objtrm problem probably found

1999-07-12 Thread Peter Jeremy
Doug Rabson [EMAIL PROTECTED] wrote: We don't need the lock prefix for the current SMP implementation. A lock prefix would be needed in a multithreaded implementation but should not be added unless the kernel is an SMP kernel otherwise UP performance would suffer. Modulo the issue of UP vs SMP

Re: objtrm problem probably found

1999-07-12 Thread Matthew Dillon
:Or (maybe more clearly):
:
:#ifdef SMP
:#define SMP_LOCK "lock; "
:#else
:#define SMP_LOCK
:#endif
:
:#define ATOMIC_ASM(type,op) \
:    __asm __volatile (SMP_LOCK op : "=m" (*(type *)p) : "ir" (v), "0" (*(type *)p))

Yes, precisely.

:I believe the API to the
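
As a rough illustration of how the quoted macro might be wrapped, the sketch below uses it inside two small functions that supply the p and v names the macro expects. The wrapper names are hypothetical, the "+m" constraint is a modern stand-in for the quoted "=m"/"0" pair, and the non-x86 fallback is an assumption that is not part of the thread.

```c
#ifdef SMP
#define SMP_LOCK "lock; "      /* SMP kernel: prepend the LOCK prefix */
#else
#define SMP_LOCK               /* UP kernel: no prefix, no lock cost  */
#endif

/* Expects variables named p (pointer) and v (value) in the enclosing scope. */
#define ATOMIC_ASM(type, op) \
    __asm__ __volatile__(SMP_LOCK op \
                         : "+m"(*(volatile type *)p) : "ir"(v) : "cc")

static inline void sketch_atomic_add_int(volatile unsigned int *p,
                                         unsigned int v)
{
#if defined(__i386__) || defined(__x86_64__)
    ATOMIC_ASM(unsigned int, "addl %1, %0");
#else
    __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);   /* portable fallback */
#endif
}

static inline void sketch_atomic_sub_int(volatile unsigned int *p,
                                         unsigned int v)
{
#if defined(__i386__) || defined(__x86_64__)
    ATOMIC_ASM(unsigned int, "subl %1, %0");
#else
    __atomic_fetch_sub(p, v, __ATOMIC_SEQ_CST);   /* portable fallback */
#endif
}
```

Building with -DSMP turns every use into a locked instruction; without it the same source compiles to the plain form, so the choice costs nothing at run time.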

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Mike Smith
Although function calls are more expensive than inline code, they aren't necessarily a lot more so, and function calls to non-locked RMW operations are certainly much cheaper than inline locked RMW operations. This is a fairly key statement in context, and an opinion here would count for a

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Matthew Dillon
:
:    Although function calls are more expensive than inline code,
:    they aren't necessarily a lot more so, and function calls to
:    non-locked RMW operations are certainly much cheaper than
:    inline locked RMW operations.
:
:This is a fairly key statement in context, and an opinion here would

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Mike Smith
:
:    Although function calls are more expensive than inline code,
:    they aren't necessarily a lot more so, and function calls to
:    non-locked RMW operations are certainly much cheaper than
:    inline locked RMW operations.
:
:This is a fairly key statement in context, and an opinion here

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Matthew Dillon
:I assumed too much in asking the question; I was specifically
:interested in indirect function calls, since this has a direct impact
:on method-style implementations.

Branch prediction caches are typically PC-sensitive. An indirect method call will never be as fast as a direct call,
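
For readers unfamiliar with the distinction being discussed, this is a minimal sketch of a "method-style" indirect call next to a direct one. The struct and function names are hypothetical and are not taken from any FreeBSD source.

```c
/* Hypothetical method table: the call target is loaded from memory. */
struct sketch_atomic_ops {
    void (*add)(volatile unsigned int *p, unsigned int v);
};

/* Uniprocessor variant: plain read-modify-write. */
static void sketch_add_up(volatile unsigned int *p, unsigned int v)
{
    *p += v;
}

/* SMP variant: locked read-modify-write via the compiler builtin. */
static void sketch_add_smp(volatile unsigned int *p, unsigned int v)
{
    __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}

static const struct sketch_atomic_ops sketch_ops_up  = { sketch_add_up };
static const struct sketch_atomic_ops sketch_ops_smp = { sketch_add_smp };

/* Selected once, e.g. at boot; every later call goes through this pointer. */
static const struct sketch_atomic_ops *sketch_ops = &sketch_ops_up;

static void sketch_bump(volatile unsigned int *counter)
{
    sketch_add_up(counter, 1u);     /* direct call: target known at link time */
    sketch_ops->add(counter, 1u);   /* indirect call: target read at run time */
}
```

The direct call has a fixed target, while the indirect call's target depends on the current value of sketch_ops, which is what makes it harder for the processor to predict.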

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Peter Jeremy
Mike Smith [EMAIL PROTECTED] wrote: Although function calls are more expensive than inline code, they aren't necessarily a lot more so, and function calls to non-locked RMW operations are certainly much cheaper than inline locked RMW operations. This is a fairly key statement in context, and

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Matthew Dillon
:
:I'm not sure there's any reason why you shouldn't. If you changed the
:semantics of a stack segment so that memory addresses below the stack
:pointer were irrelevant, you could implement a small, 0-cycle, on-chip
:stack (that overflowed into memory). I don't know whether this
:semantic

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Matthew Dillon
:
:Based on general computer architecture principles, I'd say that a lock
:prefix is likely to become more expensive[1], whilst a function call
:will become cheaper[2] over time.
:...
:
:[1] A locked instruction implies a synchronous RMW cycle. In order
:to meet write-ordering guarantees

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Andrew Reilly
On Mon, Jul 12, 1999 at 07:09:58PM -0700, Mike Smith wrote: Although function calls are more expensive than inline code, they aren't necessarily a lot more so, and function calls to non-locked RMW operations are certainly much cheaper than inline locked RMW operations. This is a fairly

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Matthew Dillon
:... I would also like to add a few more notes in regards to write pipelines. Write pipelines are not used any more, at least not long ones. The reason is simply the cache coherency issue again. Until the data is actually written into the L1 cache, it is acoherent.

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Mike Haertel
Second answer: in the real world, we're nearly always hitting the cache on stack operations associated with calls and argument passing, but not less often on operations in the procedure body. So, in ^^^ typo Urk. I meant to say "less often", delete the "not".

Re: objtrm problem probably found (was Re: Stuck in objtrm)

1999-07-12 Thread Mike Haertel
This is a fairly key statement in context, and an opinion here would count for a lot; are function calls likely to become more or less expensive in time? Ambiguous question. First answer: Assume we're hitting the cache, taking no branch mispredicts, and everything is generally going at "the