On Wednesday, May 3, 2017 at 6:08:42 AM UTC-7, Yichao Yu wrote:
>
> > address space. There are no per-process mprotect semantics (at least 
> > not in most OSs I know of). 
>
> OT, but there's now `pkey_mprotect`, which is thread local IIRC. 
>
> > - The effect of mprotect is not guaranteed to (and should never be 
> > expected to) appear atomically to all threads. The order in which 
> > protection changes apply can vary, and is virtually impossible to 
> > predict. 
>
> Yeah, it's the per-CPU TLB cache invalidation that I try to (mentally) 
> model using a mprotect_on_thread (or on-CPU) operation. 
>
> > - In practice, mprotect implementations typically impose their semantic 
> > changes by changing a memory-resident page table, followed by 
> > TLB-invalidate request signals (an interrupt, typically) to all 
> > processor (logical) cores that are executing threads in the process. 
> > Once all involved cores respond to the TLB invalidate request, the 
> > change is known to be committed, as no thread in the process can 
> > observe the pre-change page table entry state. 
>
> So I assume it's also this request that (at least effectively) flushes 
> the necessary pipeline/buffers/caches to make sure the TLB 
> invalidation is ordered wrt other memory operations on the affected 
> thread (CPU/core). 
>
> > So while I think that your more specific statements about case (2) 
> > above will hold, the (3) transitive thing between a thread that 
> > observed a fault and one that didn't is unlikely. 
>
> Thanks. Cool! That matches what I expected now based on your explanation. 
>
> > BTW, one of the interesting APIs we use for performance with the C4 
> > collector is a set of no-TLB-invalidating semantic parallels for 
> > mprotect(), mremap(), and munmap(). We separate TLB invalidation from 
> > address space mapping changes, and enforce TLB invalidation only at 
> > very coarse, explicitly requested boundaries. The collector accepts 
> > that the page table may be potentially inconsistent across most 
> > operations, and enforces consistency (via explicit TLB invalidate 
> > requests) only at points where it actually needs it. Since TLB 
> > invalidates represent the bulk of execution time cost for mprotect(), 
> > mremap(), and munmap() calls, this provides us with dramatically 
> > higher MBs-of-address-space-affected-per-second metrics. You can find 
> > some early discussion and old numbers in the C4 paper, including some 
> > reasoning for why a high map-changing rate is needed for sustaining 
> > reasonable allocation rates in collectors that perform such changes in 
> > the main/common-case compaction paths (see section 5 in the paper). 
>
> Yeah, I've read that paper before, though it was very hard for me to 
> reason about what can be expected after such a no_invalidation mprotect, 
> especially since (IIUC) the change could still be seen by another thread 
> before the batch invalidation, since the TLB entry on the other thread 
> can be evicted for other reasons. Is it similar to reasoning about a 
> relaxed memory model? 

You can think of it as a relaxed memory model, I guess. We don't care if 
the other threads see the changes or not. Until we do care. And then we 
just need to know that the changes have applied everywhere before we do 
something that needs that fact to be known. We establish it not just with 
the TLB invalidate, but with some sort of additional explicit 
synchronization that follows it. E.g. we may require a global safepoint, or 
a checkpoint (where each thread acknowledges crossing a thread-local 
safepoint), and safepoint boundaries (in executed code) adhere to some 
specific ordering rules. These ordering rules (for C4 at least) tend to not 
care much about the normal load/store ordering and visibility stuff. They 
dictate ordering between reference loads, LVB operations, and safepoint 
opportunities. E.g. "every reference load from memory will be followed by 
an LVB operation on the loaded reference value, which will occur both 
before any use of the loaded reference value, and before any subsequent 
safepoint boundary is crossed." 

 

We don't rely on the TLB behavior being strong. But we establish the 
boundaries with explicit synchronization when we need them. 

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.