Matt Burke <[email protected]> wrote:
> I probably should have researched memory barriers a bit more.
> I knew a little bit about them but wasn't sure if they were needed here.
> The problem may also exist for the rest of the shared memory.
For the purposes of running VMS, what really and ultimately matters is not what
actual 78x hardware does, but what VMS expects it to do. And this latter bit
can be learned from VMS sources, even if we do not have (and may be unable to
obtain) sufficient knowledge about the hardware.
A precise understanding of VMS's expectations would require a more thorough look
into the VMS sources than I have been able to manage so far; however, my hunch is
that what VMS wants is a subset of a stronger general statement:
[SS:] "When VCPU1 and VCPU2 [in a shared-memory or cache-coherent
multiprocessor] communicate via an IPI from VCPU1 to VCPU2, VCPU1 [most often]
wants to pass its view of memory to VCPU2".
VMS may well expect less than that, but we do not (yet) know exactly what... it
might take a night or more with the listings to understand the code well enough
to pin this down exactly. (Comments in the code next to BBSSI/BBCCI instructions
suggest that VMS uses them deliberately to flush and control the cache
"manually"... hence my questions in the previous message.) This weaker
requirement might also turn out to be harder to implement in practice than SS,
which is stronger but comparatively simpler.
My *hunch*, however, is that if SS were in place, VMS would most likely be happy
with it. Only code reading could prove this wrong, but I think the chance of
that is slim... I'd place money on it. So, assuming it for now:
SS is easy to implement. There is a flag that CPU1 sets and CPU2 reads, used to
pass an IPI. If this flag is protected by a lock on both the read and write
sides, that would ensure SS, since locking primitives issue memory barriers (and
compiler barriers as well -- another important thing to keep in mind).
Right now ipc_send_int uses ipc_lock/ipc_unlock, but ipc_poll_int does not. If
ipc_poll_int used locking (matching the lock in ipc_send_int), that would
provide memory barriers (along with compiler barriers) and hence SS.
Alternatively, it is possible to insert a WMB (or full MB) primitive into
ipc_send_int before setting the IPI flag, and an RMB (or full MB) primitive into
ipc_poll_int after reading the flag, but these are platform-specific. (RMB and
WMB here also imply a compiler barrier on both sides of the hardware barrier.)
VAX MP out of necessity has implementations of smp_rmb() and smp_wmb() that
include a compiler barrier (barrier(), COMPILER_BARRIER), but you do not really
want to get into this host-platform-specific (and, for x86, also host-CPU-model-
and bitness-specific) mess unless really necessary... and for the 782 it is
really not. Using locks is a much neater solution for 782 purposes, I think.
_______________________________________________
Simh mailing list
[email protected]
http://mailman.trailing-edge.com/mailman/listinfo/simh