[seL4] Re: User-level proofs and capability revocation

Gernot Heiser via Devel Wed, 26 Mar 2025 17:54:22 -0700

On 27 Mar 2025, at 05:22, Gernot Heiser <ger...@unsw.edu.au> wrote:

I guess, in seL4
world one structures the system like this: there is a protocol which
requires the client VM to work with the TCB along these lines:  give up
N pages from it's memory allocation back to the TCB which then "moves"
them into the driver. The TCB has to be "below" the client in the
capability chain to make sure that the client looses the ability to
revoke the pages from the driver,


That would be one way of doing it in seL4 if the client *does* own the pages.

Note that even if you use a protocol where the client supplies shared-memory 
frames directly to the server, the server can still protect itself from the 
client.

If the client unmaps the shared page, the server will trigger a page fault on 
its next access.
That invokes the server’s page-fault handler (normally part of the resource 
manager, but could be a thread inside the server).
The page-fault handler finds that the fault is due to a page shared with the 
client and takes appropriate action.
This would presumably involve revoking all further client access (i.e. revoking 
all of the client’s caps to server endpoints/notifications etc) – a clear case 
where revocation is essential to stop further requests from the mis-behaving 
client.
The fault handler would then clean up state the server holds on behalf of the 
client.
And it would recover the server by aborting whatever operation was faulted and 
re-set the server to a sane state.

In most designs, this work would be split between two handlers: The page-fault 
handler being part of a trusted resource manager (i.e. something higher up in 
the trust hierarchy) which invokes a server-internal handler for 
cleanup/recovery.

But theoretically it could be all internal to the server (which then needs to 
rely on owning its internal memory, i.e. can’t be revoked).

If the handler is external (part of a trusted resource manager) then that could 
implement transparent demand paging of the server too, it would have to do 
enough bookkeeping to be able to distinguish between faults on client-provided 
memory and faults triggered by server pages being swapped out.

Demand paging must be functionally transparent to the paged task. Meaning the 
resource manager must observe an invariant that if a page is unmapped (eg for 
swapping to backing store), a page fault triggered by that will only restart 
the paged task once that page is mapped to a frame that has the same content as 
the one that had been unmapped earlier. Then the functionality of the paged 
task is unaffected by paging.
Paging is still observable in the timing, of course, but that’s not different 
from being preempted by a higher-priority thread.


The design space is big – which is part of the challenge with seL4: it’s easy 
to do bad designs, and requires a lot of expertise to do good ones.
Which is why we’re working on sample designs.
LionsOS is a one with a static architecture, that avoids a lot of the 
complexity (and should ease verification).
But we also have a fully dynamic one in the works, but that one is at a very 
early stage.

Gernot
_______________________________________________
Devel mailing list -- devel@sel4.systems
To unsubscribe send an email to devel-leave@sel4.systems

[seL4] Re: User-level proofs and capability revocation

Reply via email to