On 2015-10-30 at 17:02 "'Davide Libenzi' via Akaros"
<[email protected]> wrote:
> To me, it can be a pretty big win to avoid CR3 reloads on high
> frequency event deliveries like network, for example.

Yeah, that's a good point.  One of the things we do in a couple places
is keep a cr3 around for a while even if it's not being used (or at
least I used to do this, I'd have to look around).  The kernel can use
any process's address space, so as an optimization, we can be lazy
about clearing the cr3 (and thus 'current') if we're asked to
switch_back to 0.  The tradeoff is that processes may sit around in the
DYING state for a while (the only cost is memory, I think).
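
A toy sketch of the lazy approach, for illustration only.  This is not
the Akaros code: every name below (toy_proc, toy_cpu, toy_switch_to,
toy_switch_back, toy_deliver_events, the 'lazy' flag) is hypothetical,
and a "CR3 reload" is just a counter bump.  It only shows why skipping
the clear on a switch back to 0 saves reloads for repeated deliveries
to the same process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, NOT Akaros code: all names here are hypothetical. */
struct toy_proc {
	uintptr_t cr3;              /* stand-in for the page table root */
};

struct toy_cpu {
	struct toy_proc *current;   /* whose address space is loaded (NULL = none) */
	unsigned long cr3_loads;    /* number of simulated CR3 reloads */
	int lazy;                   /* nonzero: do not clear cr3 on switch back to 0 */
};

/* Enter p's address space.  Returns the previous 'current' so the caller
 * can switch back to it later.  Reloads only if p is not already loaded. */
static struct toy_proc *toy_switch_to(struct toy_cpu *cpu, struct toy_proc *p)
{
	struct toy_proc *old = cpu->current;

	assert(p != NULL);
	if (old != p) {
		cpu->current = p;
		cpu->cr3_loads++;
	}
	return old;
}

/* Return to the address space we were in before toy_switch_to().  In lazy
 * mode, a request to go back to "no process" (0) leaves the old cr3 loaded. */
static void toy_switch_back(struct toy_cpu *cpu, struct toy_proc *old)
{
	if (!old && cpu->lazy)
		return;
	if (cpu->current != old) {
		cpu->current = old;
		cpu->cr3_loads++;
	}
}

/* Deliver n events to p and return the total number of simulated reloads. */
static unsigned long toy_deliver_events(struct toy_cpu *cpu,
                                        struct toy_proc *p, int n)
{
	for (int i = 0; i < n; i++) {
		struct toy_proc *old = toy_switch_to(cpu, p);
		/* ... the event would be written into p's mailbox here ... */
		toy_switch_back(cpu, old);
	}
	return cpu->cr3_loads;
}
```

In this model, a core that starts with no current process and delivers
n events to one process pays 2n reloads when it eagerly clears, and a
single reload when it is lazy.  The cost is that 'current' keeps
pointing at the process after the last delivery.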

Another mitigating factor is that there might not be exactly one event
for every packet.  For most of the events that we send, we're already
in the address space of the process.  Specifically, our most common
event (I think!) is the "blocked syscall has completed" event.  This is
done by a kthread that was already in the process's address space, so
the switch_to is mostly harmless in those cases.

> If we have only one major application using the NIC, we could route
> IRQs to the cores owned by the application, so that when an IRQ
> triggers an event delivery, the switch_to already finds the target
> task as "current", so not CR3 reload happen.

This is part of the long range plan for network intensive workloads.
The rule for MCPs is that there are no *unexpected* interrupts.  A
process (in theory, not in the current code) can request to have
particular IRQs serviced on its cores.  Right now, we mostly just
service them on Core 0, though the code exists to route_irqs() - we
just haven't designed or built the user interfaces right yet.

> Maybe the structure of the event queue could be made simpler?
> From the user POV, I guess this is already abstracted by APIs, right?

For an application, there is an API to extract a message from the
mailbox, and in general the address space issues don't affect them
much.  

From the kernel's perspective, one idea I had was to use uva2kva to
translate user pointers to kernel pointers.  That basically is a page
table walk in software.  The danger is that the user's address space
changes after we've walked the page tables.  The solution there is to
use some form of deferred-destruction when it comes to munmap (the
main concern is that the page isn't *reused* after being munmapped
while the kernel thinks it's still the user's memory).  As usual, I have
some notes on that too, which we can pull out if the switch_tos pop up
as a pain point for an application's performance.
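
To make the idea concrete, here is a minimal single-threaded sketch of
a software walk plus deferred destruction.  It is an assumption-laden
toy, not the real uva2kva: the two-level 10/10/12 layout, the refcount
scheme, and every toy_* name are made up for illustration, and there
is no locking or atomics.  The walk pins the page it finds, and munmap
only drops the mapping's reference, so the page cannot be reused until
the kernel drops its pin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, NOT Akaros code: names and the 10/10/12 layout are assumptions. */
#define TOY_PGSHIFT 12
#define TOY_PGSIZE  (1u << TOY_PGSHIFT)
#define TOY_NENT    1024
#define TOY_PDX(va) ((va) >> 22)
#define TOY_PTX(va) (((va) >> TOY_PGSHIFT) & (TOY_NENT - 1))

struct toy_page {
	int refcnt;             /* the mapping plus any in-flight kernel users */
	int reusable;           /* set once the last reference is dropped */
	char data[TOY_PGSIZE];
};

struct toy_pgtbl { struct toy_page *pte[TOY_NENT]; };  /* NULL = not present */
struct toy_pgdir { struct toy_pgtbl *tbl[TOY_NENT]; }; /* NULL = no table */

/* Drop one reference.  Only at zero may the page be handed out again. */
static void toy_page_put(struct toy_page *pg)
{
	assert(pg->refcnt > 0);
	if (--pg->refcnt == 0)
		pg->reusable = 1;
}

/* Map pg at uva, using pt as the second-level table for that region. */
static void toy_map(struct toy_pgdir *pd, struct toy_pgtbl *pt,
                    uint32_t uva, struct toy_page *pg)
{
	pd->tbl[TOY_PDX(uva)] = pt;
	pt->pte[TOY_PTX(uva)] = pg;
	pg->refcnt++;           /* reference held by the mapping */
}

/* Software page table walk.  On success, pins the page (stored in *pinned)
 * and returns a kernel pointer for uva.  Returns NULL if uva is unmapped. */
static char *toy_uva2kva(struct toy_pgdir *pd, uint32_t uva,
                         struct toy_page **pinned)
{
	struct toy_pgtbl *pt = pd->tbl[TOY_PDX(uva)];
	struct toy_page *pg;

	if (!pt)
		return NULL;
	pg = pt->pte[TOY_PTX(uva)];
	if (!pg)
		return NULL;
	pg->refcnt++;           /* pin: survives a concurrent munmap */
	*pinned = pg;
	return pg->data + (uva & (TOY_PGSIZE - 1));
}

/* Remove the translation.  A pinned page is not freed, only unmapped. */
static void toy_munmap(struct toy_pgdir *pd, uint32_t uva)
{
	struct toy_pgtbl *pt = pd->tbl[TOY_PDX(uva)];
	struct toy_page **slot;

	if (!pt)
		return;
	slot = &pt->pte[TOY_PTX(uva)];
	if (*slot) {
		toy_page_put(*slot);
		*slot = NULL;
	}
}
```

A caller would do the walk, use the returned kernel pointer, and then
call toy_page_put() on the pinned page.  If the user munmaps in
between, the page stays allocated until that final put.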

Barret
