Challenge
Implement SSE3 instructions to improve
synchronization between multiple agents. This technique is
targeted for use by system software to provide more efficient
thread-synchronization primitives.
Solution
Use the MONITOR and MWAIT instructions.
MONITOR defines an address range used to monitor
write-back stores. MWAIT is used to indicate that the
software thread is waiting for a write-back store to the address range
defined by the MONITOR instruction.
Software should know the exact length of the region that will be
monitored for writes by the MONITOR/MWAIT
instructions. Allocating and using a region smaller in length than the
triggering area for the processor could lead to false wake-ups
(resulting from writes to data variables that are incorrectly located
in the triggering area). Conversely, allocating a region greater in
length than the triggering area could lead to the processor not waking
appropriately. CPUID allows for the determination of
the exact length of the triggering area. This length has no
relationship to any cache-line size in the system, and software should
not make any assumptions to that effect. Based on the size provided by CPUID,
the OS/software should dynamically allocate structures with appropriate
padding. If correct allocation causes issues, choose not to use MONITOR/MWAIT.
While
a single length should suffice for single-cluster systems,
setting up the data layout for systems with multiple clusters will most
likely be more complicated. Depending on the mechanism implemented by
the chipset in such a system, a single monitor-line size may not
suffice.
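As a rough illustration of the CPUID step, the sketch below reads the monitor-line sizes from user-mode C. It assumes the GCC/Clang `<cpuid.h>` helper `__get_cpuid`, and relies on the architectural definitions that CPUID.01H:ECX bit 3 reports MONITOR/MWAIT support and that CPUID leaf 05H returns the smallest and largest monitor-line sizes in EAX[15:0] and EBX[15:0]. The type and function names (`monitor_line_size`, `decode_monitor_leaf`, `query_monitor_line_size`) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header; an assumption of this sketch */
#endif

/* Monitor-line sizes in bytes, as enumerated by CPUID leaf 05H. */
typedef struct {
    uint32_t smallest;  /* CPUID.05H:EAX[15:0] */
    uint32_t largest;   /* CPUID.05H:EBX[15:0] */
} monitor_line_size;

/* Decode the raw EAX/EBX values returned by CPUID leaf 05H. */
static monitor_line_size decode_monitor_leaf(uint32_t eax, uint32_t ebx)
{
    monitor_line_size s = { eax & 0xFFFFu, ebx & 0xFFFFu };
    return s;
}

/* Returns 1 and fills *out when CPUID reports MONITOR/MWAIT, 0 otherwise. */
static int query_monitor_line_size(monitor_line_size *out)
{
    assert(out != NULL);
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* CPUID.01H:ECX bit 3 is the MONITOR/MWAIT feature flag. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 3)))
        return 0;

    /* Leaf 05H enumerates the monitor-line sizes. */
    if (!__get_cpuid(5, &eax, &ebx, &ecx, &edx))
        return 0;

    *out = decode_monitor_leaf(eax, ebx);
    return 1;
#else
    return 0;   /* not an x86 build: nothing to query */
#endif
}
```

A caller would invoke `query_monitor_line_size(&s)` once at start-up and, if the two sizes differ, would most likely pad to `s.largest`. A return value of 0 is the signal to fall back to another synchronization primitive.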
Typically, software will have a set of data variables
that it monitors for writes. It will be necessary to locate these in
the monitor-triggering area. To eliminate false wake-ups due to writes
to other variables, software will need to add padding around the
monitored variables. This is referred to as the padded area.
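One way to build such a padded area is to give the monitored variable an allocation that is both aligned to and as long as the triggering area, so that no unrelated variable can share it. The sketch below assumes C11 `aligned_alloc`, a power-of-two length obtained from CPUID, and a single 32-bit trigger word. The names `monitor_region`, `monitor_region_alloc` and `monitor_region_free` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One monitored word that owns an entire triggering area. */
typedef struct {
    volatile uint32_t *trigger; /* the only variable placed inside the area */
    void *base;                 /* start of the padded allocation */
    size_t size;                /* bytes reserved: exactly one triggering area */
} monitor_region;

/*
 * Allocate a region aligned to, and padded out to, line_size bytes.
 * line_size is assumed to be the value reported by CPUID.
 * Returns 0 on success, -1 on failure.
 */
static int monitor_region_alloc(monitor_region *r, size_t line_size)
{
    assert(r != NULL);

    /* aligned_alloc needs a power-of-two alignment; reject anything else. */
    if (line_size < sizeof(uint32_t) || (line_size & (line_size - 1)) != 0)
        return -1;

    void *p = aligned_alloc(line_size, line_size);
    if (p == NULL)
        return -1;

    memset(p, 0, line_size);    /* trigger and padding both start at zero */
    r->base = p;
    r->size = line_size;
    r->trigger = (volatile uint32_t *)p;
    return 0;
}

/* Release the region and clear the descriptor. */
static void monitor_region_free(monitor_region *r)
{
    assert(r != NULL);
    free(r->base);
    r->base = NULL;
    r->trigger = NULL;
    r->size = 0;
}
```

Taking the length as a run-time parameter, rather than a compile-time constant tied to a cache-line size, keeps the layout in step with whatever CPUID reports on the running processor.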
Multiple events other than a write to the triggering address range can
cause a processor that executed MWAIT to wake up.
These include the following:
- External interrupts: NMI, SMI, INIT, BINIT, MCERR
Power-management-related
events such as Thermal Monitor, Enhanced Intel SpeedStep® technology
transitions, or chipset-driven STPCLK# assertion will not cause the
Monitor event-pending bit to be cleared. Debug traps and faults will
not cause the Monitor event-pending bit to be cleared either.
The example below shows the typical usage of MONITOR/MWAIT:
// Trigger[MONITORDATARANGE] is the memory address range that will be
// used as the trigger data range
Trigger[0] = 0;
If (Trigger[0] != TRIGERRDATAVALUE) {
    EAX = &Trigger[0]
    ECX = 0
    EDX = 0
    MONITOR EAX, ECX, EDX
    // Check again: a store may have landed before the monitor was armed
    If (Trigger[0] != TRIGERRDATAVALUE) {
        EAX = 0
        ECX = 0
        MWAIT EAX, ECX
    }
}
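Because events other than a store can end an MWAIT, callers usually wrap the check, arm, re-check, wait sequence in a loop. The C sketch below illustrates only that control flow. MONITOR and MWAIT normally require privilege level 0, so the instructions are hidden behind a hypothetical `monitor_ops` hook table; privileged code might back the hooks with the `_mm_monitor` and `_mm_mwait` intrinsics from `<pmmintrin.h>`. The `sim_*` functions are a user-mode stand-in invented for this example and do not execute either instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hooks standing in for the two instructions. */
typedef struct {
    void (*arm)(const volatile void *addr);  /* MONITOR addr, ECX = 0, EDX = 0 */
    void (*wait)(void);                      /* MWAIT EAX = 0, ECX = 0 */
} monitor_ops;

/*
 * Wait until *trigger equals expected.
 * Returns how many times the wait hook was entered.
 */
static unsigned wait_for_value(const monitor_ops *ops,
                               const volatile uint32_t *trigger,
                               uint32_t expected)
{
    unsigned waits = 0;
    assert(ops != NULL && ops->arm != NULL && ops->wait != NULL);

    while (*trigger != expected) {
        ops->arm(trigger);
        /* Re-check after arming: a store that landed between the first
         * check and MONITOR would otherwise go unnoticed. */
        if (*trigger != expected) {
            ops->wait();
            waits++;
        }
        /* The wait can end without a store (for example on an interrupt),
         * so the loop tests the value again before returning. */
    }
    return waits;
}

/* User-mode stand-in: a few spurious wake-ups, then a simulated store. */
static volatile uint32_t *sim_trigger;
static uint32_t sim_value;
static unsigned sim_spurious_left;

static void sim_arm(const volatile void *addr)
{
    (void)addr;                      /* nothing to arm in the simulation */
}

static void sim_wait(void)
{
    if (sim_spurious_left > 0)
        sim_spurious_left--;         /* woke up, but nothing was stored */
    else
        *sim_trigger = sim_value;    /* another agent's store arrives */
}
```

Keeping the instructions behind a hook table is only a convenience for exercising the loop outside the kernel; system software would more likely issue them inline.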
Source
Next Generation Intel® Processor: Software Developers Guide.