On Sun, Sep 18, 2016 at 10:37:22PM +0200, Nadav Har'El wrote:
> Hi Gleb, I have a couple of questions (CCed to the OSv mailing list) about
> your OSv commit 7e38453, maybe you remember something (or will be reminded of
> something when you look at the commit).
> This commit is apparently causing
> https://github.com/cloudius-systems/osv/issues/790 so now we're trying to
> figure out how to most properly fix it. The problem is that it appears that
> the scheduler on one CPU is handed sched::thread objects from another CPU.
> These thread objects might live in mmap()ed areas, but we may have delayed
> the required TLB flush on the target CPU.
> So the questions I find myself asking, perhaps you remember:
> 1. Do you remember if this commit was an important performance advantage
> for a workload, or just an optimistic fix?
It is an optimization that exists on all OSes and is doubly important on
guest OSes, since IPIs there are much more expensive. If a workload is
mmap-heavy, you will have a lot of IPIs without the optimization.
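To make the IPI saving concrete, here is a rough single-threaded model of the idea. It is only a sketch: Cpu, flush_all_eager, flush_all_lazy and switch_to_app_thread are made-up names for illustration, not the OSv code, and real IPIs and TLB flushes are replaced by counters.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-CPU state; field names are illustrative, not OSv's.
struct Cpu {
    bool runs_app_thread = false;         // is an application thread current here?
    std::atomic<bool> lazy_flush{false};  // flush owed before the next app thread runs
    int local_flushes = 0;                // how many times this CPU flushed its TLB
};

// Eager variant: every other CPU is interrupted. Returns the number of IPIs.
std::size_t flush_all_eager(std::vector<Cpu>& cpus, std::size_t caller) {
    assert(caller < cpus.size());
    std::size_t ipis = 0;
    for (std::size_t i = 0; i < cpus.size(); ++i) {
        ++cpus[i].local_flushes;  // caller flushes locally, others in the IPI handler
        if (i != caller) {
            ++ipis;
        }
    }
    return ipis;
}

// Lazy variant: only CPUs currently running an app thread are interrupted.
// The others just get a flag that is checked on the next switch to an app thread.
std::size_t flush_all_lazy(std::vector<Cpu>& cpus, std::size_t caller) {
    assert(caller < cpus.size());
    std::size_t ipis = 0;
    for (std::size_t i = 0; i < cpus.size(); ++i) {
        if (i == caller) {
            ++cpus[i].local_flushes;  // local flush, no IPI needed
        } else if (cpus[i].runs_app_thread) {
            ++cpus[i].local_flushes;  // models the IPI handler flushing
            ++ipis;
        } else {
            cpus[i].lazy_flush.store(true);  // defer the flush
        }
    }
    return ipis;
}

// Modeled scheduler hook: pay the deferred flush before an app thread runs.
void switch_to_app_thread(Cpu& cpu) {
    if (cpu.lazy_flush.exchange(false)) {
        ++cpu.local_flushes;
    }
    cpu.runs_app_thread = true;
}
```

In this model, with four CPUs of which only one other CPU runs an app thread, the eager variant sends three IPIs per flush and the lazy one sends one. The two CPUs running system threads pay for a single local flush later, however many flushes were requested in between.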
> 2. I'm afraid the scheduler thing might only be the tip of the iceberg of
> problems caused by this lazy TLB thing. Could we have, for example (and
> this is just a hypothetical example) one thread doing a write() to disk of
> some data from an mmap'ed area, and this data is supposed to be read by a
> ZFS thread which runs on a different CPU - and because it is labeled a
> "system thread", it won't do a TLB flush before reading the mmap'ed area?
write() copies the data into the ZFS ARC, so the ZFS thread reads that copy
rather than the user's mapping.
> Why are we confident that "system threads" never need to read user's
> mmap'ed data?
If they do, this is a bug, as you discovered. They may access mmap'ed
memory, but they should do so through their own mappings. This is how it
was designed, not something that has to be this way.
> 3. This commit 7e38453 starts flush_tlb_all() with setting the
> lazy_flush_tlb flag to true, but resets it back to false when it decides to
> send an IPI. If the other CPU is right now in the scheduler we can have the
> code leave the flag at false (if the out-going thread was an app thread)
> and send an IPI which will be delayed - so the scheduler has no way of
> knowing it needs to do a TLB flush before accessing the sched::thread.
> Couldn't we leave the flag at true *in addition* to the IPI? The IPI handler
> could then zero it (if not already zero)?
If the IPI can be delayed, why can't the same bug happen without the
lazy_flush_tlb optimization at all? Thread A mmaps its stack, sends a
flush IPI which is delayed, allocates B's thread struct on the stack,
CPU 1 tries to access it -> boom.
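The delayed-IPI window, and the "leave the flag set in addition to the IPI" idea from question 3, can be modeled roughly as below. This is a single-threaded sketch under stated assumptions: TargetCpu, request_flush, scheduler_touch_thread and deliver_ipi are hypothetical names, flush_pending only plays the role of lazy_flush_tlb, and a stale TLB is reduced to a bool.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model of the CPU that receives the flush request.
struct TargetCpu {
    std::atomic<bool> flush_pending{false};  // plays the role of lazy_flush_tlb
    bool ipi_queued = false;                 // IPI sent but not yet delivered
    bool tlb_stale = false;                  // TLB may hold an outdated translation
};

// Sender side: the mapping changed, so the target's TLB is stale until it
// flushes. keep_flag == true models leaving the flag set alongside the IPI.
void request_flush(TargetCpu& cpu, bool keep_flag) {
    cpu.tlb_stale = true;
    cpu.flush_pending.store(keep_flag);
    cpu.ipi_queued = true;  // the IPI is on its way, but delayed
}

// Scheduler on the target CPU, about to touch a thread object that may live
// in freshly mapped memory. Returns true if the access is safe.
bool scheduler_touch_thread(TargetCpu& cpu) {
    if (cpu.flush_pending.exchange(false)) {
        cpu.tlb_stale = false;  // local flush before the access
    }
    return !cpu.tlb_stale;
}

// The delayed IPI finally arrives: flush unconditionally and clear the flag.
void deliver_ipi(TargetCpu& cpu) {
    assert(cpu.ipi_queued);
    cpu.ipi_queued = false;
    cpu.flush_pending.store(false);
    cpu.tlb_stale = false;
}
```

With keep_flag set to false, scheduler_touch_thread() reports an unsafe access until deliver_ipi() runs. That is the window described above, and in this model it is there whether or not the flag is used for lazy flushing. With keep_flag set to true, the scheduler flushes on its own and the late IPI becomes a redundant but harmless flush.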