Ping.

On Tuesday, November 27, 2018 at 7:36:30 AM UTC-5, Waldek Kozaczuk wrote:
>
> Any feedback, please?
>
> On Thursday, November 22, 2018 at 1:07:16 PM UTC-5, Waldek Kozaczuk wrote:
>>
>> As I noted in the commit message, this patch may not be very elegant, but 
>> it seems to be quite effective. Please suggest any other ideas for how to 
>> implement it better, and please see my self-review comments below. 
>>
>> On Thursday, November 22, 2018 at 1:28:32 AM UTC-5, Waldek Kozaczuk wrote:
>>>
>>> This patch implements a naive but efficient scheme for allocating 
>>> memory before SMP is enabled and the regular malloc pools are set up. 
>>> Before this patch every single pre-SMP allocation was 
>>> backed by a full 4K page, regardless of whether the requested size 
>>> was 8 or 2000 bytes. 
>>>
>>> The new pre-SMP allocation scheme is based on the observation that 
>>> relatively few objects are allocated (~2,000), they are rarely 
>>> freed (< 20%), and most are very small (< 50 bytes). 
>>> The algorithm simply finds the next aligned, large enough 
>>> space in the current 4K page, records the object's size, and advances 
>>> the offset to the next free area of the page; if the current page 
>>> cannot fit the request, a new 4K page is allocated. For details, 
>>> read the comments in the code. 
>>>
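>> As a quick worked illustration of the offset arithmetic described above (a 
>> standalone sketch with made-up numbers, not part of the patch -- it only 
>> mirrors the calculation done by early_alloc_object() further down): 
>>
>>     #include <cassert> 
>>     #include <cstddef> 
>>
>>     int main() 
>>     { 
>>         const std::size_t page_size = 4096; 
>>         std::size_t next_offset = 30;  // first free byte in the current page (example) 
>>         std::size_t alignment   = 16;  // requested alignment (example) 
>>         std::size_t size        = 40;  // requested object size (example) 
>>
>>         // Next aligned offset strictly above next_offset: (30 / 16 + 1) * 16 = 32 
>>         std::size_t offset = (next_offset / alignment + 1) * alignment; 
>>         // The gap [30, 32) is exactly 2 bytes -- just enough for the 
>>         // unsigned short holding the object size; a smaller gap would 
>>         // force one more alignment step. 
>>         if (offset - next_offset < sizeof(unsigned short)) { 
>>             offset += alignment; 
>>         } 
>>         // 32 + 40 = 72 <= 4096, so the object fits in the current page; 
>>         // otherwise a fresh page would be allocated. 
>>         assert(offset + size <= page_size); 
>>         // The caller would get page + 32 and next_offset advances to 72. 
>>         return 0; 
>>     } 
>>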
>>> Even though this scheme is very simplistic, it is 
>>> roughly 35 times more space efficient and allows 
>>> us to save around 6M of memory. Measurements show that the net sum of 
>>> the sizes (as requested) of all allocations is typically around 125K, 
>>> and the new scheme uses ~44 pages (176K) of memory to satisfy all requests. 
>>>
>>> Fixes #270 
>>>
>>> Signed-off-by: Waldemar Kozaczuk <[email protected]> 
>>> --- 
>>>  core/mempool.cc        | 133 ++++++++++++++++++++++++++++++++++++++++- 
>>>  include/osv/mempool.hh |   5 ++ 
>>>  2 files changed, 136 insertions(+), 2 deletions(-) 
>>>
>>> diff --git a/core/mempool.cc b/core/mempool.cc 
>>> index 23dc67da..8c23e916 100644 
>>> --- a/core/mempool.cc 
>>> +++ b/core/mempool.cc 
>>> @@ -284,6 +284,9 @@ void pool::free_same_cpu(free_object* obj, unsigned cpu_id) 
>>>      trace_pool_free_same_cpu(this, object); 
>>>   
>>>      page_header* header = to_header(obj); 
>>> +    if (!header->owner) { 
>>> +        return; 
>>> +    } 
>>>
>> I am not sure if this check (and the one in the other place) is necessary, 
>> but in case we get here with an early-alloc page, it ensures the code below, 
>> which does not apply to such pages, is not executed. 
>>
>>>      if (!--header->nalloc && have_full_pages()) { 
>>>          if (header->local_free) { 
>>>              _free->erase(_free->iterator_to(*header)); 
>>> @@ -321,6 +324,9 @@ void pool::free(void* object) 
>>>   
>>>          free_object* obj = static_cast<free_object*>(object); 
>>>          page_header* header = to_header(obj); 
>>> +        if (!header->owner) { 
>>> +            return; 
>>> +        } 
>>>          unsigned obj_cpu = header->cpu_id; 
>>>          unsigned cur_cpu = mempool_cpuid(); 
>>>   
>>> @@ -1432,6 +1438,113 @@ static void early_free_page(void* v) 
>>>      auto pr = new (v) page_range(page_size); 
>>>      free_page_range(pr); 
>>>  } 
>>> +// 
>>> +// Following variables and functions implement simple 
>>> +// early (pre-SMP) memory allocation scheme. 
>>> +mutex early_alloc_lock; 
>>> +// early_object_page holds a pointer to the beginning of the current page 
>>> +// intended to be used for next early object allocation 
>>> +static void* early_object_page = nullptr; 
>>> +// early_alloc_next_offset points to the 0-relative address of free 
>>> +// memory within a page pointed by early_object_page 
>>> +static size_t early_alloc_next_offset = 0; 
>>> + 
>>> +static early_page_header* to_early_page_header(void* object) 
>>> +{ 
>>> +    return reinterpret_cast<early_page_header*>( 
>>> +            reinterpret_cast<std::uintptr_t>(object) & ~(page_size - 1)); 
>>> +} 
>>> + 
>>> +static void setup_early_alloc_page() { 
>>> +    early_object_page = early_alloc_page(); 
>>> +    early_page_header *page_header = to_early_page_header(early_object_page); 
>>> +    // Set the owner field to null so that functions that free objects 
>>> +    // or compute object size can differentiate between post-SMP malloc pool 
>>> +    // and early (pre-SMP) allocation 
>>> +    page_header->owner = nullptr; 
>>> +    page_header->allocations_count = 0; 
>>> +    early_alloc_next_offset = sizeof(early_page_header); 
>>> +} 
>>> + 
>>> +// 
>>> +// This function implements a simple but effective scheme 
>>> +// of allocating objects before SMP is set up. It does so by 
>>> +// remembering where within a page free memory starts. Then it 
>>> +// calculates the next closest offset matching the specified alignment 
>>> +// and verifies there is enough space until the end of the current 
>>> +// page to allocate from. If not, it allocates a new full page 
>>> +// to satisfy the requested space. 
>>> +static void* early_alloc_object(size_t size, size_t alignment) 
>>> +{ 
>>>
>> I am not sure if the mutex is really necessary, as pre-SMP there is only a 
>> single thread (am I wrong?), but better safe than sorry. 
>>
>>> +    WITH_LOCK(early_alloc_lock) { 
>>> +        if (!early_object_page) { 
>>> +            setup_early_alloc_page(); 
>>> +        } 
>>> + 
>>> +        size_t offset = (early_alloc_next_offset / alignment + 1) * alignment; 
>>> +        if (offset - early_alloc_next_offset < sizeof(unsigned short)) { // Make sure there is enough room for the size field 
>>> +            offset += alignment; //Assume alignment >= 2 
>>> +        } 
>>> + 
>>> +        if (offset + size > page_size) { 
>>> +            setup_early_alloc_page(); 
>>> +            offset = (early_alloc_next_offset / alignment + 1) * alignment; 
>>> +        } 
>>> + 
>>> +        assert(offset - early_alloc_next_offset >= sizeof(unsigned short)); 
>>> +        assert(offset + size <= page_size); // Verify we have enough space to satisfy this allocation 
>>> + 
>>> +        auto ret = early_object_page + offset; 
>>> +        early_alloc_next_offset = offset + size; 
>>> + 
>>> +        // Save the size of the allocated object 2 bytes before its address 
>>> +        *reinterpret_cast<unsigned short *>(ret - sizeof(unsigned short)) = 
>>> +                static_cast<unsigned short>(size); 
>>>
>> The address of this 2-byte field is effectively not 16-byte aligned, so I 
>> wonder if we should actually place this field at "ret - 16 bytes" instead. 
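>> To make that concrete, here is a rough picture of the in-page layout the 
>> patch produces (offsets are illustrative only, assuming a 16-byte alignment 
>> request and some objects already allocated earlier in the page): 
>>
>>     +0    early_page_header { owner = nullptr, allocations_count } 
>>     ...   previously allocated objects 
>>     +30   early_alloc_next_offset (first free byte) 
>>     +30   unsigned short size     <- 2-byte aligned only 
>>     +32   object returned to the caller (16-byte aligned) 
>>     +72   early_alloc_next_offset after a 40-byte allocation 
>>
>> Reserving a full 16 bytes below the object ("ret - 16") would keep the size 
>> field itself aligned, at the cost of a bit more padding per object. 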
>>
>>> +        to_early_page_header(early_object_page)->allocations_count++; 
>>> +        return ret; 
>>> +    } 
>>> +} 
>>> + 
>>> +// 
>>> +// This function only rarely actually frees previously 
>>> +// allocated memory. It does so only if all objects in a page 
>>> +// have been freed (based on allocations_count), or 
>>> +// if the object being freed is the last one that was allocated. 
>>> +static void early_free_object(void *object) 
>>> +{ 
>>> +    WITH_LOCK(early_alloc_lock) { 
>>> +        early_page_header *page_header = to_early_page_header(object); 
>>> +        assert(!page_header->owner); 
>>> +        unsigned short *size_addr = reinterpret_cast<unsigned short*>(object - sizeof(unsigned short)); 
>>> +        unsigned short size = *size_addr; 
>>> +        if (!size) { 
>>> +            return; 
>>> +        } 
>>> + 
>>> +        *size_addr = 0; // Reset size to 0 so that we know this object was freed and prevent it from being freed again 
>>>
>> I am not sure if checking whether the object was already freed is really necessary. 
>>
>>> +        page_header->allocations_count--; 
>>> +        if (page_header->allocations_count <= 0) { // Any early page 
>>> +            early_free_page(page_header); 
>>> +            if (early_object_page == page_header) { 
>>> +                early_object_page = nullptr; 
>>> +            } 
>>> +        } 
>>> +        else if (early_object_page == page_header) { // Current early page 
>>> +            void *last_obj = static_cast<void *>(page_header) + (early_alloc_next_offset - size); 
>>> +            // Check if we are freeing the last allocated object (a free 
>>> +            // immediately following its malloc) and deallocate it if so 
>>> +            // by moving early_alloc_next_offset back to the previous position 
>>> +            if (last_obj == object) { 
>>> +                early_alloc_next_offset -= size; 
>>> +            } 
>>> +        } 
>>> +    } 
>>> +} 
>>> + 
>>> +static size_t early_object_size(void* v) 
>>> +{ 
>>> +    return *reinterpret_cast<unsigned short*>(v - sizeof(unsigned short)); 
>>> +} 
>>>   
>>>  static void* untracked_alloc_page() 
>>>  { 
>>> @@ -1559,6 +1672,10 @@ static inline void* std_malloc(size_t size, size_t alignment) 
>>>          ret = translate_mem_area(mmu::mem_area::main, mmu::mem_area::mempool, 
>>>                                   ret); 
>>>          trace_memory_malloc_mempool(ret, size, 1 << n, alignment);
>>
>> With this patch and the other one we have 2 more if branches. I wonder if, 
>> performance-wise, we should re-order the conditions in those ifs and check 
>> smp_allocator first. 
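>> For example, the early-allocation branch could test the cheap boolean first 
>> (just a sketch of the suggested re-ordering, not what the patch currently 
>> does): 
>>
>>     } else if (!smp_allocator && size < mmu::page_size && alignment < mmu::page_size) { 
>>
>> That way, once smp_allocator becomes true, the branch is rejected after a 
>> single load and the size/alignment comparisons are skipped thanks to 
>> short-circuit evaluation. 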
>>
>>>
>>> +    } else if (size < mmu::page_size && alignment < mmu::page_size && !smp_allocator) { 
>>> +        ret = memory::early_alloc_object(size, alignment); 
>>> +        ret = translate_mem_area(mmu::mem_area::main, mmu::mem_area::mempool, 
>>> +                                 ret); 
>>>      } else if (minimum_size <= mmu::page_size && alignment <= mmu::page_size) { 
>>>          ret = mmu::translate_mem_area(mmu::mem_area::main, mmu::mem_area::page, 
>>>                                        memory::alloc_page()); 
>>> @@ -1592,7 +1709,13 @@ static size_t object_size(void *object) 
>>>      case mmu::mem_area::mempool: 
>>>          object = mmu::translate_mem_area(mmu::mem_area::mempool, 
>>>                                           mmu::mem_area::main, object); 
>>> -        return memory::pool::from_object(object)->get_size(); 
>>> +        { 
>>> +            auto pool = memory::pool::from_object(object); 
>>> +            if (pool) 
>>> +                return pool->get_size(); 
>>> +            else 
>>> +                return memory::early_object_size(object); 
>>> +        } 
>>>      case mmu::mem_area::page: 
>>>          return mmu::page_size; 
>>>      case mmu::mem_area::debug: 
>>> @@ -1639,7 +1762,13 @@ void free(void* object) 
>>>      case mmu::mem_area::mempool: 
>>>          object = mmu::translate_mem_area(mmu::mem_area::mempool, 
>>>                                           mmu::mem_area::main, object); 
>>> -        return memory::pool::from_object(object)->free(object); 
>>> +        { 
>>> +            auto pool = memory::pool::from_object(object); 
>>> +            if (pool) 
>>> +                return pool->free(object); 
>>> +            else 
>>> +                return memory::early_free_object(object); 
>>> +        } 
>>>      case mmu::mem_area::debug: 
>>>          return dbg::free(object); 
>>>      default: 
>>> diff --git a/include/osv/mempool.hh b/include/osv/mempool.hh 
>>> index 5dae7646..01bf9de2 100644 
>>> --- a/include/osv/mempool.hh 
>>> +++ b/include/osv/mempool.hh 
>>> @@ -36,6 +36,11 @@ void setup_free_memory(void* start, size_t bytes); 
>>>   
>>>  namespace bi = boost::intrusive; 
>>>   
>>> +struct early_page_header { 
>>> +    void* owner; 
>>> +    unsigned short allocations_count; 
>>> +}; 
>>> + 
>>>  struct free_object { 
>>>      free_object* next; 
>>>  }; 
>>> -- 
>>> 2.19.1 
>>>
>>>
