On 3/6/19 1:12 PM, Michael S. Tsirkin wrote:
> On Wed, Mar 06, 2019 at 01:07:50PM -0500, Nitesh Narayan Lal wrote:
>> On 3/6/19 11:09 AM, Michael S. Tsirkin wrote:
>>> On Wed, Mar 06, 2019 at 10:50:42AM -0500, Nitesh Narayan Lal wrote:
>>>> The following patch-set proposes an efficient mechanism for handing 
>>>> freed memory between the guest and the host. It enables guests with 
>>>> no page cache to rapidly free and reclaim memory to and from the 
>>>> host respectively.
>>>>
>>>> Benefit:
>>>> With this patch-series, in our test-case, executed on a single system 
>>>> and a single NUMA node with 15GB memory, we were able to successfully 
>>>> launch 5 guests (each with 5 GB memory) when page hinting was enabled, 
>>>> and only 3 without it. (A detailed explanation of the test procedure 
>>>> is provided at the bottom under Test - 1.)
>>>>
>>>> Changelog in v9:
>>>>    * Guest free page hinting hook is now invoked after a page has been 
>>>> merged in the buddy.
>>>>    * Only free pages of order FREE_PAGE_HINTING_MIN_ORDER (currently 
>>>> defined as MAX_ORDER - 1) are captured.
>>>>    * Removed the kthread which was earlier used to perform the 
>>>> scanning, isolation & reporting of free pages.
>>>>    * Pages captured in the per-cpu array are sorted by zone number, to 
>>>> avoid acquiring zone locks redundantly (a rough sketch of this capture 
>>>> path follows this list).
>>>>    * Dynamically allocated space is used to hold the isolated guest 
>>>> free pages.
>>>>    * All the pages are reported asynchronously to the host via the 
>>>> virtio driver.
>>>>    * Pages are returned to the guest buddy free list only when the 
>>>> host response is received.
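>>>>
>>>> A rough user-space sketch of the capture path described above; all 
>>>> names and sizes here are illustrative stand-ins, and the report step 
>>>> (which in the series isolates the pages and queues them to the host 
>>>> via the virtio driver) is reduced to a printf:
>>>>
>>>>    #include <stdio.h>
>>>>    #include <stdlib.h>
>>>>
>>>>    #define HINT_ARRAY_SIZE 128     /* per-cpu capture array size
>>>>                                       (illustrative) */
>>>>
>>>>    struct hint {
>>>>            unsigned long pfn;      /* start pfn of the free page */
>>>>            int zone;               /* zone index, used for sorting */
>>>>    };
>>>>
>>>>    static struct hint captured[HINT_ARRAY_SIZE];
>>>>    static int nr_captured;
>>>>
>>>>    /* Sort hints by zone so the isolation/reporting side only has to
>>>>     * take each zone lock once per batch, not once per page. */
>>>>    static int cmp_zone(const void *a, const void *b)
>>>>    {
>>>>            return ((const struct hint *)a)->zone -
>>>>                   ((const struct hint *)b)->zone;
>>>>    }
>>>>
>>>>    /* Stand-in for isolating the batch and reporting it to the host. */
>>>>    static void report_batch(void)
>>>>    {
>>>>            qsort(captured, nr_captured, sizeof(captured[0]), cmp_zone);
>>>>            for (int i = 0; i < nr_captured; i++)
>>>>                    printf("hint: pfn=%lx zone=%d\n",
>>>>                           captured[i].pfn, captured[i].zone);
>>>>            nr_captured = 0;
>>>>    }
>>>>
>>>>    /* Hook point: called after a page of the minimum hinting order
>>>>     * has been merged back into the buddy free list. */
>>>>    static void capture_free_page(unsigned long pfn, int zone)
>>>>    {
>>>>            captured[nr_captured].pfn = pfn;
>>>>            captured[nr_captured].zone = zone;
>>>>            if (++nr_captured == HINT_ARRAY_SIZE)
>>>>                    report_batch();
>>>>    }
>>>>
>>>>    int main(void)
>>>>    {
>>>>            /* Simulate a run of frees from different zones. */
>>>>            for (int i = 0; i < 200; i++)
>>>>                    capture_free_page(0x100000UL + i * 512, i % 3);
>>>>            report_batch();         /* flush the remainder */
>>>>            return 0;
>>>>    }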
>>>>
>>>> Pending items:
>>>>    * Make sure that the guest free page hinting's current 
>>>> implementation doesn't break hugepages or device-assigned guests.
>>>>    * Follow up on VIRTIO_BALLOON_F_PAGE_POISON's device-side support. 
>>>> (It is currently missing.)
>>>>    * Compare reporting free pages via vring with vhost.
>>>>    * Decide between MADV_DONTNEED and MADV_FREE.
>>>>    * Analyze the overall performance impact due to guest free page 
>>>> hinting.
>>>>    * Come up with proper/traceable error messages/logs.
>>>>
>>>> Tests:
>>>> 1. Use-case - Number of guests we can launch
>>>>
>>>>    NUMA Nodes = 1 with 15 GB memory
>>>>    Guest Memory = 5 GB
>>>>    Number of cores in guest = 1
>>>>    Workload = test allocation program allocates 4GB of memory, 
>>>> touches it via memset, and exits (a sketch of such a program follows 
>>>> the results below).
>>>>    Procedure =
>>>>    The first guest is launched and, once its console is up, the test 
>>>> allocation program is executed with a 4 GB memory request (due to 
>>>> this, the guest occupies almost 4-5 GB of memory on a host without 
>>>> page hinting). Once this program exits, another guest is launched on 
>>>> the host and the same process is followed. We continue launching 
>>>> guests until a guest gets killed due to a low-memory condition on the 
>>>> host.
>>>>
>>>>    Results:
>>>>    Without hinting = 3
>>>>    With hinting = 5
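>>>>
>>>>    A minimal sketch of the kind of test allocation program described 
>>>> above (the actual program was not posted; this is just its shape):
>>>>
>>>>    #include <stdio.h>
>>>>    #include <stdlib.h>
>>>>    #include <string.h>
>>>>
>>>>    int main(void)
>>>>    {
>>>>            size_t sz = 4UL << 30;  /* 4 GB */
>>>>            char *buf = malloc(sz);
>>>>
>>>>            if (!buf) {
>>>>                    perror("malloc");
>>>>                    return 1;
>>>>            }
>>>>            memset(buf, 1, sz);     /* touch every page so the guest
>>>>                                       actually faults the memory in */
>>>>            free(buf);              /* freed pages return to the guest
>>>>                                       buddy list, where they become
>>>>                                       candidates for hinting */
>>>>            return 0;
>>>>    }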
>>>>
>>>> 2. Hackbench
>>>>    Guest Memory = 5 GB 
>>>>    Number of cores = 4
>>>>    Number of tasks         Time with Hinting (s)   Time without Hinting (s)
>>>>    4000                    19.540                  17.818
>>>>
>>> How about memhog btw?
>>> Alex reported:
>>>
>>>     My testing up till now has consisted of setting up 4 8GB VMs on a
>>>     system with 32GB of memory and 4GB of swap. To stress the memory on
>>>     the system I would run "memhog 8G" sequentially on each of the guests
>>>     and observe how long it took to complete the run. The observed
>>>     behavior is that on the systems with these patches applied in both
>>>     the guest and on the host I was able to complete the test with a time
>>>     of 5 to 7 seconds per guest. On a system without these patches the
>>>     time ranged from 7 to 49 seconds per guest. I am assuming the
>>>     variability is due to time being spent writing pages out to disk in
>>>     order to free up space for the guest.
>>>
>> Here are the results:
>>
>> Procedure: 3 guests of size 5GB are launched on a single NUMA node with
>> a total memory of 15GB and no swap. In each of the guests, memhog is run
>> with 5GB. Post-execution of memhog, host memory usage is monitored using
>> the free command.
>>
>> Without Hinting:
>>                 Time of execution     Host used memory
>> Guest 1:        45 seconds            5.4 GB
>> Guest 2:        45 seconds            10 GB
>> Guest 3:        1 minute              15 GB
>>
>> With Hinting:
>>                 Time of execution     Host used memory
>> Guest 1:        49 seconds            2.4 GB
>> Guest 2:        40 seconds            4.3 GB
>> Guest 3:        50 seconds            6.3 GB
> OK so no improvement. OTOH Alex's patches cut time down to 5-7 seconds
> which seems better. Want to try testing Alex's patches for comparison?
>
I realized that the last time I reported the memhog numbers, I didn't
enable swap, due to which the actual benefits of the series were not
visible.
I have re-run the test by including some of the changes suggested by
Alexander and David:
    * Reduced the size of the per-cpu array to 32 and the minimum hinting
threshold to 16.
    * Reported the length of the isolated pages along with the start pfn,
instead of the order, from the guest.
    * Used the reported length to madvise over the entire address range
instead of a single 4K page at a time.
    * Replaced MADV_DONTNEED with MADV_FREE (a rough host-side sketch of
these last two changes follows).
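
For reference, a rough sketch of what the host side does with one
reported range after these changes; the translation from the reported
start pfn to a host virtual address is elided, and the function name is
just illustrative:

    #include <sys/mman.h>

    /* One madvise() call over the whole reported range rather than one
     * call per 4K page. With MADV_FREE the host reclaims these pages
     * lazily, only when it actually comes under memory pressure. */
    static int hint_free_range(void *hva, size_t len)
    {
            return madvise(hva, len, MADV_FREE);
    }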

Setup for the test:
NUMA node:1
Memory: 15GB
Swap: 4GB
Guest memory: 6GB
Number of cores: 1

Process: A guest is launched and memhog is run with 6GB. Once its
execution is over, the next guest is launched. Each time, the memhog
execution time is monitored.
Results:
    Without Hinting:
                Time of execution
    Guest1:    22s
    Guest2:    24s
    Guest3:    1m29s

    With Hinting:
                Time of execution
    Guest1:    24s
    Guest2:    25s
    Guest3:    28s

When hinting is enabled, swap space is not used until memhog with 6GB is
run in the 6th guest.


-- 
Regards
Nitesh
