I have just sent the latest V3 patch, which should address all your comments.

On Monday, December 10, 2018 at 12:33:14 PM UTC-5, Nadav Har'El wrote:
>
> On Sun, Dec 9, 2018 at 10:42 PM Waldek Kozaczuk <[email protected]> wrote:
>
>> Please see my answers below. As far as I can tell you do not have any major 
>> objections, so I will make the changes and send a revised patch. I probably 
>> need to adjust some of my comments.
>>
>> On Sunday, December 9, 2018 at 7:53:38 AM UTC-5, Nadav Har'El wrote:
>>>
>>> On Thu, Nov 22, 2018 at 8:28 AM Waldemar Kozaczuk <[email protected]> 
>>> wrote:
>>>
>>>> This patch implements a naive but efficient scheme for allocating
>>>> memory before SMP is enabled and the regular malloc pools are set up.
>>>> Before this patch, every single pre-SMP allocation was
>>>> backed by a full 4K page, regardless of whether the requested size
>>>> was 8 or 2000 bytes.
>>>>
>>>> The new pre-SMP allocation scheme is based on the observation that
>>>> relatively few objects are allocated (~2,000), they are rarely
>>>> freed (< 20%), and most are very small (< 50 bytes).
>>>> The algorithm simply finds the next big-enough aligned
>>>> space in a 4K page of memory, records the object's size, and advances
>>>> the offset to the next free area of the page; otherwise it allocates a
>>>> new 4K page. For details see the comments in the code.
>>>>
>>>> Even though this scheme is very simplistic, it is
>>>> roughly 35 times more space-efficient and allows
>>>> us to save around 6M of memory. Measurements show that the net sum of
>>>> the sizes of all requested allocations is typically around 125K, and the
>>>> new scheme uses ~44 pages (176K) of memory to satisfy all requests.
>>>>
>>>> Fixes #270
>>>>
>>>> Signed-off-by: Waldemar Kozaczuk <[email protected]>
>>>> ---
>>>>  core/mempool.cc        | 133 ++++++++++++++++++++++++++++++++++++++++-
>>>>  include/osv/mempool.hh |   5 ++
>>>>  2 files changed, 136 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/core/mempool.cc b/core/mempool.cc
>>>> index 23dc67da..8c23e916 100644
>>>> --- a/core/mempool.cc
>>>> +++ b/core/mempool.cc
>>>> @@ -284,6 +284,9 @@ void pool::free_same_cpu(free_object* obj, unsigned cpu_id)
>>>>      trace_pool_free_same_cpu(this, object);
>>>>
>>>>      page_header* header = to_header(obj);
>>>> +    if (!header->owner) {
>>>> +        return;
>>>>
>>>
>>> It would be nice to have a comment explaining what this case is, i.e., that 
>>> this is an early allocation whose free() we simply ignore.
>>> However, I don't understand why this is the case - it seems that earlier 
>>> in the code, if the owner (pool::from_object()) is null,
>>> you don't call this function, you call something else. So why bother 
>>> with this test here at all? Shouldn't we just crash if this
>>> bug happens?
>>>
>> You are right, we should never get here. This is me being obsessively 
>> defensive - just in case we somehow do. Are you saying I should remove 
>> this for clarity?  
>>
>
> So perhaps it should be an assert(header->owner), not a silent return?
> But even that is not really necessary, as if header->owner is null, we'll 
> just crash below when it's dereferenced, no?
>  
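> For illustration, the assert variant suggested here would look roughly like
> this (just a sketch of the idea, not the actual patch):
>
>     page_header* header = to_header(obj);
>     // Early (pre-SMP) objects are routed elsewhere by free(), so a null
>     // owner here can only mean a bug; fail loudly instead of returning.
>     assert(header->owner);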
>
>>
>>>> +static early_page_header* to_early_page_header(void* object)
>>>> +{
>>>> +    return reinterpret_cast<early_page_header*>(
>>>> +            reinterpret_cast<std::uintptr_t>(object) & ~(page_size - 1));
>>>>
>>>
>>> Looking at your code, it seems you are assuming that early_page_header 
>>> and page_header start with
>>> the same "owner" field (with early allocations having 0 there, normal 
>>> allocations point to the pool), but
>>> then have different content. Is that right? Maybe you should have a 
>>> comment about this next to
>>> early_page_header, not just in setup_early_alloc_page()?
>>>
>> Right. Ideally we could have some sort of "data interface" construct 
>> specifying that both page_header and early_page_header start with the same 
>> field. Maybe we can introduce this as a 1-field struct that both other 
>> structs share, or is a comment enough?
>>
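>> For illustration, the "shared first field" idea could look something like the
>> following (the names here are hypothetical, not the actual declarations in
>> the patch):
>>
>>     struct common_page_header {
>>         pool* owner;   // nullptr marks a pre-SMP (early) allocation
>>     };
>>
>>     struct page_header : common_page_header {
>>         // regular per-pool bookkeeping ...
>>     };
>>
>>     struct early_page_header : common_page_header {
>>         unsigned short allocations_count; // objects still live in this page
>>     };
>>
>> so that the owner test does not depend on which of the two layouts a given
>> page actually uses.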
>
> I think a comment is enough.
>
>
>>>> +static void* early_alloc_object(size_t size, size_t alignment)
>>>> +{
>>>> +    WITH_LOCK(early_alloc_lock) {
>>>> +        if (!early_object_page) {
>>>> +            setup_early_alloc_page();
>>>> +        }
>>>> +
>>>> +        size_t offset = (early_alloc_next_offset / alignment + 1) * alignment;
>>>> +        if (offset - early_alloc_next_offset < sizeof(unsigned short)) { // Make sure enough place for size
>>>> +            offset += alignment; // Assume alignment >= 2
>>>> +        }
>>>>
>>>>
>>>
>>> I think you can replace the above 4 lines of calculations with one 
>>> clearer line:
>>>
>>>        offset = align_up(early_alloc_next_offset + sizeof(short), 
>>> alignment);
>>>
>>> (but please check that I didn't misunderstand the calculations)
>>>
>> I am not 100% sure, as my calculations make sure that there are 2 bytes 
>> (sizeof(unsigned short)) left between the end of the previous object and the 
>> beginning of the next one to store the size of the object. Effectively each 
>> malloc'ed object is preceded by 2 bytes where the size is stored. Obviously 
>> the alignment requirement has to be satisfied as well.
>>
>
> I think that's exactly what the align_up() I suggested does, no? It asks 
> for an address after early_alloc_next_offset + sizeof(short) - so at least 
> two extra empty bytes - *and* aligns it to alignment. I think it's so much 
> easier to understand than your four lines of calculations. It took me a 
> while to understand your calculation because it does something strange: the 
> "+ 1" in the calculation sometimes adds an extra alignment, but sometimes 
> less, and in the worst case adds just one byte, so you need the next if() 
> to make sure you added at least two. But the align_up() formula just does 
> this correctly on the first go, and doesn't need an additional if().
> Unless I'm misunderstanding something.
>  
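> As a quick sanity check of that claim, here is a standalone snippet (not OSv
> code; align_up is written out as the usual power-of-two round-up helper) that
> compares the patch's four-line computation with the suggested one-liner over
> a range of offsets and alignments:
>
>     #include <cassert>
>     #include <cstddef>
>
>     static size_t align_up(size_t x, size_t a) // a must be a power of two
>     {
>         return (x + a - 1) & ~(a - 1);
>     }
>
>     int main()
>     {
>         for (size_t next = 0; next < 4096; next++) {
>             for (size_t align : {2, 4, 8, 16, 64, 4096}) {
>                 // the patch's calculation
>                 size_t off = (next / align + 1) * align;
>                 if (off - next < sizeof(unsigned short)) {
>                     off += align;
>                 }
>                 // the suggested one-liner
>                 size_t off2 = align_up(next + sizeof(unsigned short), align);
>                 // both keep the alignment and leave >= 2 bytes for the size
>                 assert(off == off2);
>             }
>         }
>     }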
>
>>  
>>>
>>>> +
>>>> +        if (offset + size > page_size) {
>>>> +            setup_early_alloc_page();
>>>>
>>>
>>> Looking below, it seems you don't support large early allocations at all 
>>> (more on this below).
>>> If you did, then a possible optimization here could be: if size is large 
>>> (e.g., bigger than a page),
>>> there is no reason to throw away the half-allocated page. We can just 
>>> satisfy the new allocation
>>> from full new pages and keep using the old half-full page.
>>>  
>>>
>>>> +            offset = (early_alloc_next_offset / alignment + 1) * alignment;
>>>>
>>>
> By the way, here is another instance of this calculation. Here the "+1" is always 
> enough (when alignment >= 2), so you don't need the extra if, but wouldn't it 
> be clearer to use the same align_up(next_offset + sizeof(short)) as above?
>
>
>>>> +        }
>>>> +
>>>> +        assert(offset - early_alloc_next_offset >= sizeof(unsigned short));
>>>>
>>>> +        assert(offset + size <= page_size); // Verify we have enough space to satisfy this allocation
>>>>
>>>
>>> But if one tries to malloc size>4096, this will fail here, no? Why is 
>>> this not a problem? 
>>> Or is it just that we assume that early on we only have small 
>>> allocations, and this assertion
>>> verifies it (if this is the case, please add a comment about this 
>>> assertion saying why we
>>> make this assumption).
>>>
>> It is OK because this code is not meant to handle allocations > 4096 bytes. If 
>> you look at the if statements in std_malloc(), the >= 4096 early allocations 
>> should still be handled by malloc_large(). 
>>
>
> I didn't notice this. However, it seems to me this code will fail for 
> sizes lower than 4096. What will malloc(4095) do? Maybe over some size it 
> should fall back to allocating an entire page (as before)?
>
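> To make the concern concrete, a rough sketch of the capacity limit (the header
> layout and field sizes below are assumptions for illustration only, not the
> exact structures from the patch):
>
>     #include <cstddef>
>
>     constexpr size_t page_size = 4096;
>
>     struct early_page_header_sketch {
>         void* owner;                      // nullptr for early pages
>         unsigned short allocations_count;
>     };
>
>     // The first object can start, at the earliest, after the page header plus
>     // the two-byte size field, so the largest request one early page can hold
>     // is noticeably smaller than page_size:
>     constexpr size_t max_early_object =
>         page_size - sizeof(early_page_header_sketch) - sizeof(unsigned short);
>     static_assert(max_early_object < 4095,
>                   "a malloc(4095) cannot be satisfied from a single early page");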
>
>>>> +
>>>> +        *size_addr = 0; // Reset size to 0 so that we know this object was freed and prevent from freeing again
>>>>
>>>
>>> This is nice protection against double-free, but I don't think we have 
>>> such protection even for normal allocations, so it's definitely not 
>>> important for early allocation. Nobody expects it to be fine to free() 
>>> an already freed object and have free() just silently do nothing.
>>> Then again, since we have this "size" thing anyway (and it's sad we have 
>>> to... I'm not sure we *really* need to implement object_size() correctly 
>>> for early allocations...), I guess we can use it to get double-free 
>>> protection for free...
>>>
>> If you think it makes things unnecessarily complicated and possibly prone 
>> to bugs, I can remove it. 
>>
>
> I don't really care. Initially I thought that you only needed "size" for 
> this protection, but once I understood that you need it for other things as 
> well (although I'm not sure it *really* needed to be supported - I think 
> the old code didn't support it either...) this protection can stay as 
> well...
>  
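> For reference, the guard being discussed boils down to roughly the following
> in the early free path (a simplified sketch, not the verbatim patch):
>
>     // The two bytes immediately preceding the object hold its size.
>     unsigned short* size_addr = reinterpret_cast<unsigned short*>(object) - 1;
>     if (*size_addr == 0) {
>         return;          // already freed: silently ignore the duplicate free()
>     }
>     size_t size = *size_addr;
>     *size_addr = 0;      // mark as freed so a repeated free() becomes a no-op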
>
>>
>>>> +        page_header->allocations_count--;
>>>> +        if (page_header->allocations_count <= 0) { // Any early page
>>>> +            early_free_page(page_header);
>>>> +            if (early_object_page == page_header) {
>>>> +                early_object_page = nullptr;
>>>> +            }
>>>> +        }
>>>> +        else if (early_object_page == page_header) { // Current early page
>>>> +            void *last_obj = static_cast<void *>(page_header) + (early_alloc_next_offset - size);
>>>> +            // Check if we are freeing the last allocated object (a malloc
>>>> +            // immediately followed by a free) and deallocate if so by moving
>>>> +            // the early_alloc_next_offset to the previous position
>>>> +            if (last_obj == object) {
>>>> +                early_alloc_next_offset -= size;
>>>>
>>>
>>> Is it really -= size, or do you also need to undo the increase for 
>>> alignment, the two bytes for the size, etc.?
>>>  
>>>
>> I think I am fine, because here I am only detecting the case of free() 
>> being called on the last malloc()-ed object, which means that 
>> early_alloc_next_offset has not yet been adjusted for the next malloc(), 
>> right? And given that in early_alloc_object() I have this logic:
>> auto ret = early_object_page + offset; 
>> early_alloc_next_offset = offset + size; 
>> subtracting size from the current early_alloc_next_offset should give me the 
>> offset of the last malloc-ed object in the current page. Is this a wrong 
>> assumption? 
>>
>
> I don't understand why this is right. Consider the case where we ask for 
> 16-byte alignment, and we start with offset=32:
> 1. Allocating 16 bytes with 16-byte alignment will first move the offset to 48 
> (for the two-byte size...) and return the object at offset 48. The new 
> offset will become 64.
> 2. Freeing the 16-byte object at 48 will subtract only 16 from 64. The new 
> offset will be 48. It did *not* return to 32 as it was before the 
> allocation.
>  
> What am I missing?
>
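> For completeness, the example above as a small standalone check (plain
> arithmetic only, not OSv code):
>
>     #include <cassert>
>     #include <cstddef>
>
>     int main()
>     {
>         size_t align = 16, size = 16;
>         size_t next = 32;                         // offset before the malloc
>         size_t off = (next / align + 1) * align;  // moves to 48
>         if (off - next < sizeof(unsigned short)) {
>             off += align;
>         }
>         next = off + size;                        // 64 after the malloc
>         next -= size;                             // free() of the last object
>         assert(next == 48);                       // not back to 32
>     }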
>
