On Thu, Nov 22, 2018 at 8:28 AM Waldemar Kozaczuk <[email protected]> wrote:
> This patch implements a naive but efficient scheme of allocating
> memory before SMP is enabled and regular malloc pools are setup.
> Before this patch every single pre-SMP allocation would
> be backed by full 4K page regardless if requested size
> was 8 or 2000 bytes.
>
> New pre-SMP allocation scheme is based on the fact that
> relatively few objects are allocated (~ 2,000) and rarely
> freed (< 20%) and most are very small (<50 bytes).
> The algorithm is as simple as finding next big enough aligned
> space in a 4K page of memory, marking its size and resetting
> offset to the next free area of the page otherwise allocating
> new 4K page. For details read comments in the code.
>
> Even though this scheme is very simplistic it is
> roughly 35 times more space efficient and allows
> us to save around 6M of memory. It was measured that the net sum of
> sizes (as is) of all requested allocations is typically around 125K
> and new scheme uses ~44 pages (176K) of memory to satisfy all requests.
>
> Fixes #270
>
> Signed-off-by: Waldemar Kozaczuk <[email protected]>
> ---
>  core/mempool.cc        | 133 ++++++++++++++++++++++++++++++++++++++++-
>  include/osv/mempool.hh |   5 ++
>  2 files changed, 136 insertions(+), 2 deletions(-)
>
> diff --git a/core/mempool.cc b/core/mempool.cc
> index 23dc67da..8c23e916 100644
> --- a/core/mempool.cc
> +++ b/core/mempool.cc
> @@ -284,6 +284,9 @@ void pool::free_same_cpu(free_object* obj, unsigned cpu_id)
>      trace_pool_free_same_cpu(this, object);
>
>      page_header* header = to_header(obj);
> +    if (!header->owner) {
> +        return;

It would be nice to have a comment here saying what this case is, i.e.,
that this is an early allocation whose free() we simply ignore.

However, I don't understand why this case can happen at all - it seems
that earlier in the code, if the owner (pool::from_object()) is null, you
don't call this function, you call something else. So why bother with this
test here? Shouldn't we just crash if this bug happens?

> +    }
>      if (!--header->nalloc && have_full_pages()) {
>          if (header->local_free) {
>              _free->erase(_free->iterator_to(*header));
> @@ -321,6 +324,9 @@ void pool::free(void* object)
>
>      free_object* obj = static_cast<free_object*>(object);
>      page_header* header = to_header(obj);
> +    if (!header->owner) {
> +        return;

Ditto.

> +    }
>      unsigned obj_cpu = header->cpu_id;
>      unsigned cur_cpu = mempool_cpuid();
>
> @@ -1432,6 +1438,113 @@ static void early_free_page(void* v)
>      auto pr = new (v) page_range(page_size);
>      free_page_range(pr);
>  }
> +//
> +// Following variables and functions implement simple
> +// early (pre-SMP) memory allocation scheme.
> +mutex early_alloc_lock;
> +// early_object_pages holds a pointer to the beginning of the current page
> +// intended to be used for next early object allocation
> +static void* early_object_page = nullptr;
> +// early_alloc_next_offset points to the 0-relative address of free
> +// memory within a page pointed by early_object_page
> +static size_t early_alloc_next_offset = 0;
> +
> +static early_page_header* to_early_page_header(void* object)
> +{
> +    return reinterpret_cast<early_page_header*>(
> +            reinterpret_cast<std::uintptr_t>(object) & ~(page_size - 1));

Looking at your code, it seems you are assuming that early_page_header
and page_header start with the same "owner" field (with early allocations
having 0 there and normal allocations pointing to the pool), but then have
different content. Is that right? Maybe you should have a comment about
this next to early_page_header, not just in setup_early_alloc_page()?
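To make that assumption explicit (and catch it if the layout ever drifts),
something like the following could sit next to the struct definitions.
This is only a rough sketch - fake_pool, fake_page_header and
fake_early_page_header are made-up stand-ins for illustration, not the
real structs in core/mempool.cc and include/osv/mempool.hh - the point is
just the static_assert on the shared first field:

    #include <cstddef>

    // Simplified stand-ins, for illustration only - not the real OSv structs.
    struct fake_pool;
    struct fake_page_header {        // models the existing page_header
        fake_pool* owner;            // non-null: page belongs to a malloc pool
        unsigned cpu_id;
        unsigned nalloc;
    };
    struct fake_early_page_header {  // models the new early_page_header
        void* owner;                 // nullptr: early (pre-SMP) page
        unsigned short allocations_count;
    };

    // Both headers must keep "owner" as their first member, so that code
    // looking only at the first word of a page header can tell early pages
    // (owner == nullptr) apart from pool pages (owner == the pool).
    static_assert(offsetof(fake_page_header, owner) == 0,
                  "page_header must start with owner");
    static_assert(offsetof(fake_early_page_header, owner) == 0,
                  "early_page_header must start with owner");

In the real tree a one-line comment plus a static_assert like that could
live right next to early_page_header.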
> +}
> +
> +static void setup_early_alloc_page() {
> +    early_object_page = early_alloc_page();
> +    early_page_header *page_header = to_early_page_header(early_object_page);
> +    // Set the owner field to null so that functions that free objects
> +    // or compute object size can differentiate between post-SMP malloc pool
> +    // and early (pre-SMP) allocation
> +    page_header->owner = nullptr;
> +    page_header->allocations_count = 0;
> +    early_alloc_next_offset = sizeof(early_page_header);
> +}
> +
> +//
> +// This functions implements simple but effective scheme
> +// of allocating objects before SMP is setup. It does so by
> +// remembering where within a page free memory starts. Then it
> +// calculates next closest offset matching specified alignment
> +// and verifies there is enough space until end of the current
> +// page to allocate from. If not it allocates next full page
> +// to find enough requested space.
> +static void* early_alloc_object(size_t size, size_t alignment)
> +{
> +    WITH_LOCK(early_alloc_lock) {
> +        if (!early_object_page) {
> +            setup_early_alloc_page();
> +        }
> +
> +        size_t offset = (early_alloc_next_offset / alignment + 1) * alignment;
> +        if (offset - early_alloc_next_offset < sizeof(unsigned short)) { // Make sure enough place for size
> +            offset += alignment; //Assume alignment >= 2
> +        }

I think you can replace the above four lines of calculation with one
clearer line:

    offset = align_up(early_alloc_next_offset + sizeof(short), alignment);

(but please check that I didn't misunderstand the calculations - I pasted
a small standalone check of this a bit further down).

> +
> +        if (offset + size > page_size) {
> +            setup_early_alloc_page();

Looking below, it seems you don't support large early allocations at all
(more on this below). If you did, a possible optimization here could be:
if size is large (e.g., bigger than a page), there is no reason to throw
away the half-allocated page. We can just make the new allocation in full
new pages and keep using the old half-full page.

> +            offset = (early_alloc_next_offset / alignment + 1) * alignment;
> +        }
> +
> +        assert(offset - early_alloc_next_offset >= sizeof(unsigned short));
> +        assert(offset + size <= page_size); // Verify we have enough space to satisfy this allocation

But if one tries to malloc size > 4096, this will fail here, no? Why is
this not a problem? Or is it just that we assume that early on we only
have small allocations, and this assertion verifies it? If that is the
case, please add a comment next to this assertion saying why we make the
assumption.

> +
> +        auto ret = early_object_page + offset;
> +        early_alloc_next_offset = offset + size;
> +
> +        // Save size of the allocated object 2 bytes before it address
> +        *reinterpret_cast<unsigned short *>(ret - sizeof(unsigned short)) =
> +                static_cast<unsigned short>(size);
> +        to_early_page_header(early_object_page)->allocations_count++;
> +        return ret;
> +    }
> +}
> +
> +//
> +// This function fairly rarely actually frees previously
> +// allocated memory. It does so only if all objects
> +// have been freed in a page based on allocations_count or
> +// if the object being freed is the last one that was allocated.

You said we free only a minority (< 20%) of the early memory allocated -
do we actually reach these specific cases?
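Coming back to the offset calculation above - here is the standalone
check I mentioned. It is a throwaway sketch, not OSv code: it uses a
local power-of-two align_up stand-in rather than OSv's own helper, and
simply compares the patch's two-step calculation against the one-liner
for every offset in a page:

    #include <cassert>
    #include <cstddef>

    // Local stand-in for align_up; alignment must be a power of two,
    // which is what malloc alignments are anyway.
    static size_t align_up(size_t v, size_t alignment)
    {
        return (v + alignment - 1) & ~(alignment - 1);
    }

    int main()
    {
        for (size_t alignment = 2; alignment <= 4096; alignment *= 2) {
            for (size_t next = 0; next <= 4096; next++) {
                // The patch's calculation...
                size_t offset = (next / alignment + 1) * alignment;
                if (offset - next < sizeof(unsigned short)) {
                    offset += alignment;
                }
                // ...should match the one-liner:
                assert(offset == align_up(next + sizeof(unsigned short), alignment));
            }
        }
        return 0;
    }

If I got the calculation right, the assert holds for every power-of-two
alignment up to a page, so the two forms should be equivalent.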
> +static void early_free_object(void *object)
> +{
> +    WITH_LOCK(early_alloc_lock) {
> +        early_page_header *page_header = to_early_page_header(object);
> +        assert(!page_header->owner);
> +        unsigned short *size_addr = reinterpret_cast<unsigned short*>(object - sizeof(unsigned short));
> +        unsigned short size = *size_addr;
> +        if (!size) {
> +            return;
> +        }
> +
> +        *size_addr = 0; // Reset size to 0 so that we know this object was freed and prevent from freeing again

This is nice protection against double-free, but I don't think we have
such protection even for normal allocations, so it's definitely not
important for early allocations. Nobody expects it to be fine to free()
an already-freed object and have free() just silently do nothing.

Then again, if we anyway have this "size" thing (and it's sad we have
to... I'm not sure we *really* need to implement object_size() correctly
for early allocations...), I guess we can use it to get double-free
protection for free...

> +        page_header->allocations_count--;
> +        if (page_header->allocations_count <= 0) { // Any early page
> +            early_free_page(page_header);
> +            if (early_object_page == page_header) {
> +                early_object_page = nullptr;
> +            }
> +        }
> +        else if(early_object_page == page_header) { // Current early page
> +            void *last_obj = static_cast<void *>(page_header) + (early_alloc_next_offset - size);
> +            // Check if we are freeing last allocated object (free followed by malloc)
> +            // and deallocate if so by moving the early_alloc_next_offset to the previous
> +            // position
> +            if (last_obj == object) {
> +                early_alloc_next_offset -= size;

Is it really -= size, or do you also need to undo the increase for
alignment, the two bytes for the size, etc.? (See the small toy model I
pasted after the end of this file's hunks below.)

> +            }
> +        }
> +    }
> +}
> +
> +static size_t early_object_size(void* v)
> +{
> +    return *reinterpret_cast<unsigned short*>(v - sizeof(unsigned short));
> +}
>
>  static void* untracked_alloc_page()
>  {
> @@ -1559,6 +1672,10 @@ static inline void* std_malloc(size_t size, size_t alignment)
>          ret = translate_mem_area(mmu::mem_area::main, mmu::mem_area::mempool,
>                                   ret);
>          trace_memory_malloc_mempool(ret, size, 1 << n, alignment);
> +    } else if (size < mmu::page_size && alignment < mmu::page_size && !smp_allocator) {
> +        ret = memory::early_alloc_object(size, alignment);
> +        ret = translate_mem_area(mmu::mem_area::main, mmu::mem_area::mempool,
> +                                 ret);
>      } else if (minimum_size <= mmu::page_size && alignment <= mmu::page_size) {
>          ret = mmu::translate_mem_area(mmu::mem_area::main, mmu::mem_area::page,
>                                        memory::alloc_page());
> @@ -1592,7 +1709,13 @@ static size_t object_size(void *object)
>      case mmu::mem_area::mempool:
>          object = mmu::translate_mem_area(mmu::mem_area::mempool,
>                                           mmu::mem_area::main, object);
> -        return memory::pool::from_object(object)->get_size();
> +        {
> +            auto pool = memory::pool::from_object(object);
> +            if (pool)
> +                return pool->get_size();
> +            else
> +                return memory::early_object_size(object);
> +        }
>      case mmu::mem_area::page:
>          return mmu::page_size;
>      case mmu::mem_area::debug:
> @@ -1639,7 +1762,13 @@ void free(void* object)
>      case mmu::mem_area::mempool:
>          object = mmu::translate_mem_area(mmu::mem_area::mempool,
>                                           mmu::mem_area::main, object);
> -        return memory::pool::from_object(object)->free(object);
> +        {
> +            auto pool = memory::pool::from_object(object);
> +            if (pool)
> +                return pool->free(object);
> +            else
> +                return memory::early_free_object(object);
> +        }
>      case mmu::mem_area::debug:
>          return dbg::free(object);
>      default:
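Here is the toy model I mentioned for the "-= size" question. It is not
the real allocator - toy_alloc, toy_free_last and the pretend 32-byte
header size are all made up, and only the offset bookkeeping is modelled -
but it shows that rolling back only the object's size leaves the
alignment padding and the two size bytes consumed, so repeated
malloc/free of the "last" object still creeps forward through the page:

    #include <cstdio>
    #include <cstddef>

    // Toy model of just the offset bookkeeping in early_alloc_object() /
    // early_free_object(); no real pages, no locking.
    static size_t next_offset = 32;   // pretend sizeof(early_page_header) == 32

    static size_t toy_alloc(size_t size, size_t alignment)
    {
        size_t offset = (next_offset / alignment + 1) * alignment;
        if (offset - next_offset < sizeof(unsigned short)) {
            offset += alignment;
        }
        next_offset = offset + size;
        return offset;
    }

    static void toy_free_last(size_t offset, size_t size)
    {
        // Mirrors the patch: only the object's size is given back, not the
        // alignment padding or the 2-byte size slot in front of the object.
        if (offset == next_offset - size) {
            next_offset -= size;
        }
    }

    int main()
    {
        // Allocate and immediately free the same 24-byte, 8-aligned object.
        for (int i = 0; i < 5; i++) {
            size_t off = toy_alloc(24, 8);
            toy_free_last(off, 24);
            printf("cycle %d: object at offset %zu, next_offset now %zu\n",
                   i, off, next_offset);
        }
        return 0;
    }

So -= size is not wrong, but a free followed by a same-sized malloc does
not come back to the same offset. If that small creep is acceptable for
the handful of pre-SMP frees, fine; otherwise the rollback would also
need to cover the padding in front of the object.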
> diff --git a/include/osv/mempool.hh b/include/osv/mempool.hh
> index 5dae7646..01bf9de2 100644
> --- a/include/osv/mempool.hh
> +++ b/include/osv/mempool.hh
> @@ -36,6 +36,11 @@ void setup_free_memory(void* start, size_t bytes);
>
>  namespace bi = boost::intrusive;
>
> +struct early_page_header {
> +    void* owner;
> +    unsigned short allocations_count;
> +};
> +
>  struct free_object {
>      free_object* next;
>  };
> --
> 2.19.1
