> From: "Leopold Toetsch" <[EMAIL PROTECTED]>

> Seems that we got a problem on alphas then. I can see several solutions
> to accommodate such CPUs:

From my point of view, these solutions have the following merits and
demerits:

> - use only PMCs that don't cross cache lines

+) No need for memory barriers to preserve the order of read and write
operations
+) Lock-free read access, and thus no CPU performance degradation
-) This would bloat the PMC size
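A minimal sketch of what this option could look like, assuming a C11
compiler and a 64-byte cache line (the real line size is CPU-specific).
demo_pmc, demo_pool_new() and crosses_line() are hypothetical names for
illustration, not Parrot's actual layout or API. The static_assert also
shows where the size bloat comes from.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64  /* assumption: actual line size depends on the CPU */

/* Hypothetical stand-in for a PMC header, not Parrot's real struct. */
typedef struct demo_pmc {
    alignas(CACHE_LINE) void *vtable;
    void *data;
    unsigned flags;
} demo_pmc;

/* The alignment pads the struct up to a full line: this is the bloat. */
static_assert(sizeof(demo_pmc) == CACHE_LINE,
              "demo_pmc must occupy exactly one cache line");

/* Returns nonzero if [p, p + size) spans more than one cache line. */
static int crosses_line(const void *p, size_t size)
{
    uintptr_t a = (uintptr_t)p;
    return (a / CACHE_LINE) != ((a + size - 1) / CACHE_LINE);
}

/* Allocates n line-aligned headers; release the pool with free(). */
static demo_pmc *demo_pool_new(size_t n)
{
    return aligned_alloc(CACHE_LINE, n * sizeof(demo_pmc));
}
```

Every element of such a pool starts on a line boundary, so no header
straddles two lines, at the cost of the padding in each one.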

> - insert rmb_if_needed() for all string vtables that don't lock

+) No constraint on PMC's alignment
-) In the first place, this variant is a big headache -- it's hard to say
on which architectures such code protection has to be applied. Even
compilers, as well as x86- and AMD-based hardware, do harmless reordering
to speed things up. In addition, you never know which memory ordering model
future architectures are going to choose. For example, on Itanium-based
hardware, it appears that read and write ordering is not preserved when seen
from the perspective of different processors. Alpha is not the only one
that is weird on this point.
-) It might turn out that not all OSes provide rmb()-like primitives. For
instance, KeMemoryBarrierXxx, the Windows API functions, are defined in the
DDK (Driver Development Kit) only for Windows Server 2003. So they aren't
portable across all Windows versions, and on top of that they live in the
DDK, which amounts to "It's not your business, software developer". What
Microsoft advises software developers is along the lines of "When the order
is important, use standard operating system locking mechanisms whenever
possible. The standard mechanisms have implied memory barriers"
-) Barriers incur pipeline flushes, the overhead of which increases with
pipeline length; the same applies to the following variant
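A rough sketch of the barrier variant, assuming C11 <stdatomic.h> is
available. Only the name rmb_if_needed() comes from the proposal above;
wmb_if_needed(), demo_str, demo_strpmc and the accessors are hypothetical.
Deciding per architecture when the fence can be dropped is exactly the
headache described above, so this sketch always issues it; on strongly
ordered CPUs an acquire fence typically costs no instruction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: always fence, rather than guessing which CPUs need it. */
#define rmb_if_needed() atomic_thread_fence(memory_order_acquire)
#define wmb_if_needed() atomic_thread_fence(memory_order_release)

/* Hypothetical string body and a PMC-like holder pointing at it. */
typedef struct demo_str {
    const char *buf;
    size_t len;
} demo_str;

typedef struct demo_strpmc {
    _Atomic(demo_str *) str;
} demo_strpmc;

/* Writer: the string body must be visible before the pointer to it. */
static void demo_set(demo_strpmc *p, demo_str *s)
{
    assert(p != NULL);
    wmb_if_needed();
    atomic_store_explicit(&p->str, s, memory_order_relaxed);
}

/* Lock-free reader: fence between loading the pointer and using it. */
static size_t demo_length(demo_strpmc *p)
{
    assert(p != NULL);
    demo_str *s = atomic_load_explicit(&p->str, memory_order_relaxed);
    rmb_if_needed();
    return s ? s->len : 0;
}
```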

> - lock all string vtable access

+) No need for explicit memory barriers, since the operating system's
locking mechanisms all imply them
-) Acquiring a lock is a damn waste of time: the acquiring CPU is forced to
stall until the lock is released and reaches its cache.
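For comparison, a sketch of the locking variant, assuming POSIX threads.
demo_locked_str and its accessors are hypothetical names, not Parrot code.
Note that even a plain length read has to take the lock, which is where the
stall comes from.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical string holder guarded by its own mutex. */
typedef struct demo_locked_str {
    pthread_mutex_t lock;
    const char *buf;
    size_t len;
} demo_locked_str;

/* Returns 0 on success, like pthread_mutex_init(). */
static int demo_locked_init(demo_locked_str *s)
{
    assert(s != NULL);
    s->buf = NULL;
    s->len = 0;
    return pthread_mutex_init(&s->lock, NULL);
}

/* Writer: lock and unlock imply the needed memory barriers. */
static void demo_locked_set(demo_locked_str *s, const char *buf, size_t len)
{
    pthread_mutex_lock(&s->lock);
    s->buf = buf;
    s->len = len;
    pthread_mutex_unlock(&s->lock);
}

/* Reader: pays for a lock round trip on every access. */
static size_t demo_locked_length(demo_locked_str *s)
{
    size_t len;
    pthread_mutex_lock(&s->lock);
    len = s->len;
    pthread_mutex_unlock(&s->lock);
    return len;
}
```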

> - move string pointer into PMC_EXT

+-) While a nice idea on the whole, it's a lopsided variant with respect to
the other data primitives that a PMC can hold
-) It will break the current implementation of perlstring.pmc, sarray.pmc,
scalar.pmc, and probably other code and PMC classes
-) Besides that, I expect there are more disadvantages

To sum up, I would vote for "use only PMCs that don't cross cache lines".

> leo

0x4C56
