Hi Martin,

Thank you for your answers! I'll try to answer both your emails here.

Sure, but we still need to arrive at some equation for determining a
sensible default stack size, while allowing for both small values of
"--mem" (e.g. 8MB, or even less?) and large values.

I agree. Ideally the (uni)kernel developer is aware of the absence of an unbounded stack and codes accordingly. I would assume that the required stack size scales only logarithmically with the heap size; given such a relationship, even a modest default quickly puts us on the safe side as memory grows. And with a separate stack region, the user will also be clearly notified of a stack overflow.

Practically, I wonder where very long call chains can occur. For example, in OCaml it seems List.map is not tail-recursive, to avoid reversing the list. What are they using as a limit? For reference, ulimit -s seems to be 8M on Linux x86_64, and the default pthread stack size seems to be 2M. For sufficiently large instances (> 16M) we could just fix 2M; for smaller machines (minimum 512K) we could somehow scale down to 64K?
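To make that concrete, here is a rough sketch of such a default (in C; the mem/8 factor is purely my assumption, chosen so that a 16M instance gets 2M and a 512K instance gets 64K, matching the numbers above):

    #include <stdint.h>

    /* Sketch only, not Solo5 code: default stack size scaled with total
     * memory and clamped to the floor/ceiling discussed in this thread. */
    #define STACK_MIN (64UL << 10)   /* 64K floor for tiny instances */
    #define STACK_MAX (2UL << 20)    /* 2M ceiling, like pthread's default */

    static uint64_t default_stack_size(uint64_t mem_size)
    {
        uint64_t size = mem_size / 8;
        if (size < STACK_MIN)
            size = STACK_MIN;
        if (size > STACK_MAX)
            size = STACK_MAX;
        return size;
    }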

In my mind the way to do this consistently for all targets is to make the
"desired stack size" a property attached to the *unikernel binary* and
initially set by the unikernel developer, with the libOS build system
determining the default. The *operator* should be able to override this.
By "property of the unikernel binary" I mean something that is
declarative(!) and forms part of a "manifest" that is embedded into the
binary as an ELF note.

This sounds like a good solution. Maybe fix a default desired size of 2M and allow the developer to override it in the manifest. The operator can also adjust it by tuning some option. The application should then refuse to run if the desired size cannot be provided, or if the heap/stack ratio becomes too small.
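For illustration, such a manifest entry could look roughly like this as an ELF note; the note name, type, and payload layout here are invented for the sketch and are not the actual Solo5 manifest format:

    #include <stdint.h>

    /* Hypothetical "desired stack size" manifest entry as an ELF note. */
    struct stack_note {
        uint32_t namesz;              /* standard note header (Elf64_Nhdr) */
        uint32_t descsz;
        uint32_t type;
        char     name[8];             /* "Solo5" + NUL, padded for alignment */
        uint64_t desired_stack_size;  /* the declarative manifest value */
    };

    static const struct stack_note note
        __attribute__((section(".note.solo5.stack"), aligned(4), used)) = {
        .namesz = 6,
        .descsz = sizeof(uint64_t),
        .type   = 1,
        .name   = "Solo5",
        .desired_stack_size = 2 << 20,   /* the 2M default from above */
    };

The tender (or the operator's tooling) could then read and, if permitted, override this value before laying out memory.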

I am not sure about the "dynamic part". I would either set the stack size from outside (as discussed above), or allow some kind of memconfig call which can be executed only once, right after application startup. Alternatively, you could provide munmap and munmap_done; applications that want to stay dynamic would just never call munmap_done...
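In terms of an interface, that alternative could look something like this (the signatures are assumptions for the sake of the sketch):

    #include <stddef.h>
    #include <stdint.h>

    /* Punch a hole (e.g. a guard page) into the initially mapped region. */
    int solo5_munmap(uintptr_t addr, size_t len);

    /* Permanently disable solo5_munmap(); a "static" unikernel calls this
     * right after startup, a "dynamic" one simply never does. */
    int solo5_munmap_done(void);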

I think we can forget about multiple stacks for now. Having thought about
it, even having a single separate memory region for the stack is tricky
enough:

I wonder what the difference is between multiple stacks and multiple separate memory regions. If multiple separate memory regions were available, the application could use them as stacks and switch between them. Whether that makes sense is a question for the unikernel developers.

I thought a bit more about this idea of adding a munmap call and I don't
really like it.
Maybe the goal should be extended to allow multiple different memory areas, randomly distributed in the address space to leverage ASLR. One of those [...]

ASLR is a separate topic in itself; see here for a rough plan of what needs
to be done:

https://github.com/Solo5/solo5/issues/304

In that issue you are talking about randomizing the memory location of the binary (code & data)? I agree that this would be nice to have, but is that not a separate issue from how the heap memory regions are organized? Ideally all of it should be randomized: code, data, and heap.

No, this has already been discussed before. Dynamic memory allocation is
not on the cards.

Most recently
https://github.com/Solo5/solo5/issues/335#issuecomment-472499246 and
earlier https://github.com/Solo5/solo5/issues/223.

I looked quickly at 223 and 335, so maybe I missed some things. In 223 you threw out the malloc implementation from solo5, which makes perfect sense, since malloc should be handled at the application level; fine-grained memory allocation should not be provided by solo5.

In 335 you refuse to add mmap. But there seems to be agreement that the application/unikernel has no control over the virtual memory layout; this has to be managed by solo5. Does your refusal to add mmap also apply, in a very restricted sense, to the configuration phase?

Providing just a "munmap()" and nothing else might be a simpler way to get guard pages, or it might not. Anyway, one thing at a time. As I mentioned in my other email, let's ignore multiple stacks for now and just concentrate on how a separate stack region could be done.

I agree; for me, stack/heap separation is also the more important issue for avoiding corruption.

Multiple memory regions or guarded regions are only nice to have. But instead of adding munmap calls to configure guard pages, I think it is better to add two calls, solo5_mem_alloc() and solo5_mem_lock(), which also allows randomization of the addresses returned by solo5_mem_alloc(). This is what I proposed in the GitHub PR. For now I just exposed the already existing bump allocation scheme, but this could be randomized at some point. I argue that this is NOT dynamic memory allocation, but rather configuration of the memory layout until solo5_mem_lock() is called.
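As a sketch of the shape of that API (the signatures here are my illustration of the idea, not necessarily what the PR implements):

    #include <stddef.h>
    #include <stdint.h>

    /* Hand out the next block of available memory. With page-table
     * support, the returned address may be randomized and the block
     * followed by an unmapped guard gap; without it, this is just the
     * existing bump allocator. */
    int solo5_mem_alloc(size_t size, uintptr_t *addr_out);

    /* Freeze the layout: afterwards solo5_mem_alloc() always fails, which
     * is why this is configuration rather than dynamic allocation. */
    int solo5_mem_lock(void);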

Compare option I (rough sketch in code below):

1. app is informed of heap_start, heap_size, stack_start, stack_size
2. app calls solo5_munmap on parts of (heap_start, heap_start+heap_size) for guarding
3. app calls solo5_munmap_done to finish initialization
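In code, option I might look like this (using the solo5_munmap signatures assumed earlier; heap_start/heap_size as provided at start of day):

    #include <stddef.h>
    #include <stdint.h>

    extern int solo5_munmap(uintptr_t addr, size_t len);
    extern int solo5_munmap_done(void);

    void setup_guards(uintptr_t heap_start, size_t heap_size)
    {
        /* Step 2: punch a 4K guard hole into the middle of the heap. */
        solo5_munmap(heap_start + heap_size / 2, 4096);
        /* Step 3: freeze the layout. */
        solo5_munmap_done();
    }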

versus option II (sketch below):

1. app is informed of stack_start, stack_size, mem_avail
2. app calls solo5_mem_alloc to allocate mem_avail until it is exhausted
3. app calls solo5_mem_lock to finish initialization
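And option II, with the API sketched above:

    #include <stddef.h>
    #include <stdint.h>

    extern int solo5_mem_alloc(size_t size, uintptr_t *addr_out);
    extern int solo5_mem_lock(void);

    void setup_memory(size_t mem_avail)
    {
        uintptr_t heap, aux;

        /* Step 2: carve up mem_avail; solo5 may randomize the addresses
         * and insert guard gaps between the blocks. */
        solo5_mem_alloc(mem_avail / 2, &heap);
        solo5_mem_alloc(mem_avail / 2, &aux);

        /* Step 3: freeze the layout. */
        solo5_mem_lock();
    }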

Basically both options are equivalent, but in option II more responsibility is pushed from the application to solo5. On primitive targets without the ability to manipulate the page table, there won't be any difference, and solo5_munmap would be a no-op.

However, on targets which support modifying the page table, solo5 could *automatically* guard the regions by adding gaps after each solo5_mem_alloc block. Even better, the memory block locations could also be randomized. This is not possible in option I.

You mentioned in 335 that clone+munmap allowed privilege escalation. But in both schemes, options I and II, you need mmap/munmap functionality. However, the memory configuration should be finished by calling solo5_mem_lock after initialization. You could even enforce that the API is used correctly by unlocking the network and block devices only after solo5_mem_lock has been called ;)
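That enforcement could be as simple as a flag checked by the device hypercalls; purely illustrative, with assumed names:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    static bool mem_locked = false;

    int solo5_mem_lock(void)
    {
        mem_locked = true;            /* layout is frozen from here on */
        return 0;
    }

    /* Every network/block hypercall would start with this check. */
    int hypercall_net_write(const uint8_t *buf, size_t size)
    {
        if (!mem_locked)
            return -1;                /* devices unlock only after mem_lock */
        (void)buf; (void)size;        /* real device I/O would go here */
        return 0;
    }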

Daniel
