Ping.

On Tuesday, November 27, 2018 at 7:36:30 AM UTC-5, Waldek Kozaczuk wrote:
>
> Any feedback, please?
>
> On Thursday, November 22, 2018 at 1:07:16 PM UTC-5, Waldek Kozaczuk wrote:
>>
>> As I noted in the commit message, this patch may not be very elegant but
>> it seems to be pretty effective. Please suggest any other ideas for how
>> to implement it better. Please see my self-review comments below.
>>
>> On Thursday, November 22, 2018 at 1:28:32 AM UTC-5, Waldek Kozaczuk wrote:
>>>
>>> This patch implements a naive but efficient scheme for allocating
>>> memory before SMP is enabled and the regular malloc pools are set up.
>>> Before this patch, every single pre-SMP allocation was backed by a full
>>> 4K page, regardless of whether the requested size was 8 or 2000 bytes.
>>>
>>> The new pre-SMP allocation scheme is based on the fact that relatively
>>> few objects are allocated (~2,000), they are rarely freed (< 20%), and
>>> most are very small (< 50 bytes). The algorithm is as simple as finding
>>> the next big-enough aligned space in a 4K page of memory, recording the
>>> object's size, and advancing the offset to the next free area of the
>>> page, or otherwise allocating a new 4K page. For details, read the
>>> comments in the code.
>>>
>>> Even though this scheme is very simplistic, it is roughly 35 times more
>>> space efficient and allows us to save around 6M of memory. It was
>>> measured that the net sum of the sizes (as requested) of all allocations
>>> is typically around 125K, and the new scheme uses ~44 pages (176K) of
>>> memory to satisfy all requests.
>>>
>>> Fixes #270
>>>
>>> Signed-off-by: Waldemar Kozaczuk <[email protected]>
>>> ---
>>>  core/mempool.cc        | 133 ++++++++++++++++++++++++++++++++++++++++-
>>>  include/osv/mempool.hh |   5 ++
>>>  2 files changed, 136 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/core/mempool.cc b/core/mempool.cc
>>> index 23dc67da..8c23e916 100644
>>> --- a/core/mempool.cc
>>> +++ b/core/mempool.cc
>>> @@ -284,6 +284,9 @@ void pool::free_same_cpu(free_object* obj, unsigned cpu_id)
>>>      trace_pool_free_same_cpu(this, object);
>>>
>>>      page_header* header = to_header(obj);
>>> +    if (!header->owner) {
>>> +        return;
>>> +    }
>>>
>> Not sure if this check (and the one in the other place) is necessary,
>> but in case we do get here with an early-alloc page, the code below will
>> not execute, as it does not apply.
>>
>>>      if (!--header->nalloc && have_full_pages()) {
>>>          if (header->local_free) {
>>>              _free->erase(_free->iterator_to(*header));
>>> @@ -321,6 +324,9 @@ void pool::free(void* object)
>>>
>>>      free_object* obj = static_cast<free_object*>(object);
>>>      page_header* header = to_header(obj);
>>> +    if (!header->owner) {
>>> +        return;
>>> +    }
>>>      unsigned obj_cpu = header->cpu_id;
>>>      unsigned cur_cpu = mempool_cpuid();
>>>
>>> @@ -1432,6 +1438,113 @@ static void early_free_page(void* v)
>>>      auto pr = new (v) page_range(page_size);
>>>      free_page_range(pr);
>>>  }
>>> +//
>>> +// Following variables and functions implement simple
>>> +// early (pre-SMP) memory allocation scheme.
>>> +mutex early_alloc_lock;
>>> +// early_object_pages holds a pointer to the beginning of the current page
>>> +// intended to be used for next early object allocation
>>> +static void* early_object_page = nullptr;
>>> +// early_alloc_next_offset points to the 0-relative address of free
>>> +// memory within a page pointed by early_object_page
>>> +static size_t early_alloc_next_offset = 0;
>>> +
>>> +static early_page_header* to_early_page_header(void* object)
>>> +{
>>> +    return reinterpret_cast<early_page_header*>(
>>> +        reinterpret_cast<std::uintptr_t>(object) & ~(page_size - 1));
>>> +}
>>> +
>>> +static void setup_early_alloc_page() {
>>> +    early_object_page = early_alloc_page();
>>> +    early_page_header *page_header = to_early_page_header(early_object_page);
>>> +    // Set the owner field to null so that functions that free objects
>>> +    // or compute object size can differentiate between post-SMP malloc pool
>>> +    // and early (pre-SMP) allocation
>>> +    page_header->owner = nullptr;
>>> +    page_header->allocations_count = 0;
>>> +    early_alloc_next_offset = sizeof(early_page_header);
>>> +}
>>> +
>>> +//
>>> +// This functions implements simple but effective scheme
>>> +// of allocating objects before SMP is setup. It does so by
>>> +// remembering where within a page free memory starts. Then it
>>> +// calculates next closest offset matching specified alignment
>>> +// and verifies there is enough space until end of the current
>>> +// page to allocate from. If not it allocates next full page
>>> +// to find enough requested space.
>>> +static void* early_alloc_object(size_t size, size_t alignment)
>>> +{
>>>
>> Not sure if the mutex is really necessary, as pre-SMP there is only a
>> single thread (am I wrong?), but better safe than sorry.
>>
>>> +    WITH_LOCK(early_alloc_lock) {
>>> +        if (!early_object_page) {
>>> +            setup_early_alloc_page();
>>> +        }
>>> +
>>> +        size_t offset = (early_alloc_next_offset / alignment + 1) * alignment;
>>> +        if (offset - early_alloc_next_offset < sizeof(unsigned short)) { // Make sure enough place for size
>>> +            offset += alignment; //Assume alignment >= 2
>>> +        }
>>> +
>>> +        if (offset + size > page_size) {
>>> +            setup_early_alloc_page();
>>> +            offset = (early_alloc_next_offset / alignment + 1) * alignment;
>>> +        }
>>> +
>>> +        assert(offset - early_alloc_next_offset >= sizeof(unsigned short));
>>> +        assert(offset + size <= page_size); // Verify we have enough space to satisfy this allocation
>>> +
>>> +        auto ret = early_object_page + offset;
>>> +        early_alloc_next_offset = offset + size;
>>> +
>>> +        // Save size of the allocated object 2 bytes before it address
>>> +        *reinterpret_cast<unsigned short *>(ret - sizeof(unsigned short)) =
>>> +            static_cast<unsigned short>(size);
>>>
>> The address of this 2-byte field is effectively not 16-byte aligned, so
>> I wonder if we should actually make this field be "ret - 16 bytes".
>>
>>> +        to_early_page_header(early_object_page)->allocations_count++;
>>> +        return ret;
>>> +    }
>>> +}
>>> +
>>> +//
>>> +// This function fairly rarely actually frees previously
>>> +// allocated memory. It does so only if all objects
>>> +// have been freed in a page based on allocations_count or
>>> +// if the object being freed is the last one that was allocated.
>>> +static void early_free_object(void *object)
>>> +{
>>> +    WITH_LOCK(early_alloc_lock) {
>>> +        early_page_header *page_header = to_early_page_header(object);
>>> +        assert(!page_header->owner);
>>> +        unsigned short *size_addr = reinterpret_cast<unsigned short*>(object - sizeof(unsigned short));
>>> +        unsigned short size = *size_addr;
>>> +        if (!size) {
>>> +            return;
>>> +        }
>>> +
>>> +        *size_addr = 0; // Reset size to 0 so that we know this object was freed and prevent from freeing again
>>>
>> Not sure if checking that the object was already freed is really
>> necessary.
>>
>>> +        page_header->allocations_count--;
>>> +        if (page_header->allocations_count <= 0) { // Any early page
>>> +            early_free_page(page_header);
>>> +            if (early_object_page == page_header) {
>>> +                early_object_page = nullptr;
>>> +            }
>>> +        }
>>> +        else if(early_object_page == page_header) { // Current early page
>>> +            void *last_obj = static_cast<void *>(page_header) + (early_alloc_next_offset - size);
>>> +            // Check if we are freeing last allocated object (free followed by malloc)
>>> +            // and deallocate if so by moving the early_alloc_next_offset to the previous
>>> +            // position
>>> +            if (last_obj == object) {
>>> +                early_alloc_next_offset -= size;
>>> +            }
>>> +        }
>>> +    }
>>> +}
>>> +
>>> +static size_t early_object_size(void* v)
>>> +{
>>> +    return *reinterpret_cast<unsigned short*>(v - sizeof(unsigned short));
>>> +}
>>>
>>>  static void* untracked_alloc_page()
>>>  {
>>> @@ -1559,6 +1672,10 @@ static inline void* std_malloc(size_t size, size_t alignment)
>>>          ret = translate_mem_area(mmu::mem_area::main, mmu::mem_area::mempool,
>>>                                   ret);
>>>          trace_memory_malloc_mempool(ret, size, 1 << n, alignment);
>>
>> With this patch and the other one we have 2 more if branches. I wonder
>> if, performance-wise, we should re-order the conditions in those ifs and
>> check smp_allocator first.
>>
>>>
>>> +    } else if (size < mmu::page_size && alignment < mmu::page_size && !smp_allocator) {
>>> +        ret = memory::early_alloc_object(size, alignment);
>>> +        ret = translate_mem_area(mmu::mem_area::main, mmu::mem_area::mempool,
>>> +                                 ret);
>>>      } else if (minimum_size <= mmu::page_size && alignment <= mmu::page_size) {
>>>          ret = mmu::translate_mem_area(mmu::mem_area::main, mmu::mem_area::page,
>>>                                        memory::alloc_page());
>>> @@ -1592,7 +1709,13 @@ static size_t object_size(void *object)
>>>      case mmu::mem_area::mempool:
>>>          object = mmu::translate_mem_area(mmu::mem_area::mempool,
>>>                                           mmu::mem_area::main, object);
>>> -        return memory::pool::from_object(object)->get_size();
>>> +        {
>>> +            auto pool = memory::pool::from_object(object);
>>> +            if (pool)
>>> +                return pool->get_size();
>>> +            else
>>> +                return memory::early_object_size(object);
>>> +        }
>>>      case mmu::mem_area::page:
>>>          return mmu::page_size;
>>>      case mmu::mem_area::debug:
>>> @@ -1639,7 +1762,13 @@ void free(void* object)
>>>      case mmu::mem_area::mempool:
>>>          object = mmu::translate_mem_area(mmu::mem_area::mempool,
>>>                                           mmu::mem_area::main, object);
>>> -        return memory::pool::from_object(object)->free(object);
>>> +        {
>>> +            auto pool = memory::pool::from_object(object);
>>> +            if (pool)
>>> +                return pool->free(object);
>>> +            else
>>> +                return memory::early_free_object(object);
>>> +        }
>>>      case mmu::mem_area::debug:
>>>          return dbg::free(object);
>>>      default:
>>> diff --git a/include/osv/mempool.hh b/include/osv/mempool.hh
>>> index 5dae7646..01bf9de2 100644
>>> --- a/include/osv/mempool.hh
>>> +++ b/include/osv/mempool.hh
>>> @@ -36,6 +36,11 @@ void setup_free_memory(void* start, size_t bytes);
>>>
>>>  namespace bi = boost::intrusive;
>>>
>>> +struct early_page_header {
>>> +    void* owner;
>>> +    unsigned short allocations_count;
>>> +};
>>> +
>>>  struct free_object {
>>>      free_object* next;
>>>  };
>>> --
>>> 2.19.1
>>>
>>>
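To make the offset rounding described in the commit message and implemented
in early_alloc_object() easier to follow, here is a small standalone sketch.
It is not part of the patch: the function name next_object_offset and the
16-byte header size are assumptions made only for illustration.

// Standalone sketch of the rounding logic in early_alloc_object(); compiles
// with any C++11 compiler. The 16-byte early_page_header size is an assumed
// value for illustration only.
#include <cassert>
#include <cstddef>
#include <cstdio>

static constexpr std::size_t page_size = 4096;
static constexpr std::size_t header_size = 16; // assumed sizeof(early_page_header)

// Returns the page-relative offset at which an object of `size` bytes with
// the given alignment would be placed, leaving at least 2 bytes before it
// for the unsigned short size prefix. Returns 0 if the object does not fit
// (the real code starts a fresh page in that case).
static std::size_t next_object_offset(std::size_t next_free, std::size_t size,
                                      std::size_t alignment)
{
    // Round up to the next aligned offset strictly greater than next_free.
    std::size_t offset = (next_free / alignment + 1) * alignment;
    // Guarantee room for the 2-byte size field just before the object.
    if (offset - next_free < sizeof(unsigned short)) {
        offset += alignment; // assumes alignment >= 2
    }
    if (offset + size > page_size) {
        return 0; // does not fit in this page
    }
    return offset;
}

int main()
{
    // First allocation in a fresh page: 40 bytes, 8-byte aligned.
    std::size_t off = next_object_offset(header_size, 40, 8);
    assert(off == 24); // 16 rounds up to 24, leaving 8 >= 2 bytes for the size prefix
    std::printf("object at offset %zu, next free offset %zu\n", off, off + 40);
    return 0;
}

Note that the size prefix lands at offset - 2, so in general it is only
2-byte aligned, which is what the self-review question about using
"ret - 16 bytes" instead is getting at.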
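Similarly, the two cases in which early_free_object() actually reclaims
anything can be shown with a tiny simulation. Again this is only a sketch
with made-up names: alignment and the size prefix are ignored, and the real
code additionally requires the page to be the current early_object_page
before rolling back the offset.

// Minimal simulation of early_free_object()'s bookkeeping: a page is only
// returned when its allocation count drops to zero, and the bump offset is
// only rolled back when the freed object is the most recently allocated one.
// Names and the 16-byte header are illustrative assumptions.
#include <cstddef>
#include <cstdio>

struct page_state {
    unsigned short allocations_count;
    std::size_t next_offset; // page-relative offset of free space
};

// Returns true when the whole page becomes reclaimable.
static bool simulate_free(page_state& page, std::size_t obj_offset,
                          std::size_t obj_size)
{
    if (--page.allocations_count == 0) {
        return true; // case 1: every object in the page has been freed
    }
    if (obj_offset + obj_size == page.next_offset) {
        page.next_offset -= obj_size; // case 2: undo the most recent bump
    }
    return false;
}

int main()
{
    // Two objects of 32 and 48 bytes allocated back to back after a 16-byte header.
    page_state page{2, 16 + 32 + 48};
    bool page_free = simulate_free(page, 16 + 32, 48); // free the last object
    std::printf("page free: %d, next_offset: %zu\n", page_free, page.next_offset);
    // Freeing the remaining object empties the page entirely.
    page_free = simulate_free(page, 16, 32);
    std::printf("page free: %d\n", page_free);
    return 0;
}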
