On 3/6/19 1:12 PM, Michael S. Tsirkin wrote:
> On Wed, Mar 06, 2019 at 01:07:50PM -0500, Nitesh Narayan Lal wrote:
>> On 3/6/19 11:09 AM, Michael S. Tsirkin wrote:
>>> On Wed, Mar 06, 2019 at 10:50:42AM -0500, Nitesh Narayan Lal wrote:
>>>> The following patch-set proposes an efficient mechanism for handing freed
>>>> memory between the guest and the host. It enables the guests with no page
>>>> cache to rapidly free and reclaims memory to and from the host
>>>> respectively.
>>>>
>>>> Benefit:
>>>> With this patch-series, in our test-case, executed on a single system and
>>>> single NUMA node with 15GB memory, we were able to successfully launch 5
>>>> guests(each with 5 GB memory) when page hinting was enabled and 3 without
>>>> it. (Detailed explanation of the test procedure is provided at the bottom
>>>> under Test - 1).
>>>>
>>>> Changelog in v9:
>>>> * Guest free page hinting hook is now invoked after a page has been
>>>>   merged in the buddy.
>>>> * Free pages only with order FREE_PAGE_HINTING_MIN_ORDER(currently
>>>>   defined as MAX_ORDER - 1) are captured.
>>>> * Removed kthread which was earlier used to perform the scanning,
>>>>   isolation & reporting of free pages.
>>>> * Pages, captured in the per cpu array are sorted based on the zone
>>>>   numbers. This is to avoid redundancy of acquiring zone locks.
>>>> * Dynamically allocated space is used to hold the isolated guest
>>>>   free pages.
>>>> * All the pages are reported asynchronously to the host via virtio
>>>>   driver.
>>>> * Pages are returned back to the guest buddy free list only when
>>>>   the host response is received.
>>>>
>>>> Pending items:
>>>> * Make sure that the guest free page hinting's current
>>>>   implementation doesn't break hugepages or device assigned guests.
>>>> * Follow up on VIRTIO_BALLOON_F_PAGE_POISON's device side support. (It
>>>>   is currently missing)
>>>> * Compare reporting free pages via vring with vhost.
>>>> * Decide between MADV_DONTNEED and MADV_FREE.
>>>> * Analyze overall performance impact due to guest free page hinting.
>>>> * Come up with proper/traceable error-message/logs.
>>>>
>>>> Tests:
>>>> 1. Use-case - Number of guests we can launch
>>>>
>>>> NUMA Nodes = 1 with 15 GB memory
>>>> Guest Memory = 5 GB
>>>> Number of cores in guest = 1
>>>> Workload = test allocation program allocates 4GB memory, touches it via
>>>> memset and exits.
>>>> Procedure =
>>>> The first guest is launched and once its console is up, the test
>>>> allocation program is executed with 4 GB memory request (Due to this the
>>>> guest occupies almost 4-5 GB of memory in the host in a system without
>>>> page hinting). Once this program exits at that time another guest is
>>>> launched in the host and the same process is followed. We continue
>>>> launching the guests until a guest gets killed due to low memory condition
>>>> in the host.
>>>>
>>>> Results:
>>>> Without hinting = 3
>>>> With hinting = 5
>>>>
>>>> 2. Hackbench
>>>> Guest Memory = 5 GB
>>>> Number of cores = 4
>>>> Number of tasks    Time with Hinting    Time without Hinting
>>>> 4000               19.540               17.818
>>>>
>>> How about memhog btw?
>>> Alex reported:
>>>
>>> My testing up till now has consisted of setting up 4 8GB VMs on a system
>>> with 32GB of memory and 4GB of swap. To stress the memory on the system I
>>> would run "memhog 8G" sequentially on each of the guests and observe how
>>> long it took to complete the run. The observed behavior is that on the
>>> systems with these patches applied in both the guest and on the host I was
>>> able to complete the test with a time of 5 to 7 seconds per guest. On a
>>> system without these patches the time ranged from 7 to 49 seconds per
>>> guest. I am assuming the variability is due to time being spent writing
>>> pages out to disk in order to free up space for the guest.
>>>
>> Here are the results:
>>
>> Procedure: 3 Guests of size 5GB is launched on a single NUMA node with
>> total memory of 15GB and no swap.
>> In each of the guest, memhog is run
>> with 5GB. Post-execution of memhog, Host memory usage is monitored by
>> using Free command.
>>
>> Without Hinting:
>>              Time of execution    Host used memory
>> Guest 1:     45 seconds           5.4 GB
>> Guest 2:     45 seconds           10 GB
>> Guest 3:     1 minute             15 GB
>>
>> With Hinting:
>>              Time of execution    Host used memory
>> Guest 1:     49 seconds           2.4 GB
>> Guest 2:     40 seconds           4.3 GB
>> Guest 3:     50 seconds           6.3 GB
> OK so no improvement. OTOH Alex's patches cut time down to 5-7 seconds
> which seems better. Want to try testing Alex's patches for comparison?
>
I realized that the last time I reported the memhog numbers, I had not
enabled swap, so the actual benefits of the series did not show.
I have re-run the test, including some of the changes suggested by
Alexander and David:
    * Reduced the size of the per-cpu array to 32 and the minimum hinting
      threshold to 16.
    * Reported the length of the isolated pages along with the start pfn,
      instead of the order, from the guest.
    * Used the reported length to madvise the entire range instead of a
      single 4K page.
    * Replaced MADV_DONTNEED with MADV_FREE.
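
To make the last three changes concrete, here is a rough host-side sketch,
not the code from the series. It assumes the hint arrives as a start pfn plus
a length in bytes, and that guest physical address 0 maps to the start of one
contiguous host mapping. The names free_page_hint, discard_hinted_range and
guest_ram_base are made up for illustration only.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical hint record: start pfn plus length, instead of an order. */
struct free_page_hint {
    uint64_t pfn;   /* first guest page frame number of the free range */
    uint64_t len;   /* length of the free range in bytes */
};

/*
 * Hypothetical helper: apply one madvise() call to the whole hinted range.
 * Assumes guest pfn 0 corresponds to guest_ram_base in the host process.
 * 'advice' would be MADV_FREE or MADV_DONTNEED.
 * Returns 0 on success, -1 on a malformed hint or madvise() failure.
 */
static int discard_hinted_range(void *guest_ram_base, size_t guest_ram_size,
                                const struct free_page_hint *hint, int advice)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    uint64_t offset;

    assert(guest_ram_base != NULL && hint != NULL);

    /* Reject empty or non page-aligned lengths. */
    if (hint->len == 0 || hint->len % page_size != 0)
        return -1;

    /* Check the pfn before multiplying so the offset cannot overflow. */
    if (hint->pfn > guest_ram_size / page_size)
        return -1;
    offset = hint->pfn * page_size;

    /* Reject ranges that run past the end of guest memory. */
    if (hint->len > guest_ram_size - offset)
        return -1;

    /* One call covers the entire length, not just a single 4K page. */
    return madvise((char *)guest_ram_base + offset, (size_t)hint->len, advice);
}
```

With MADV_DONTNEED the range of a private anonymous mapping reads back as
zeroes right away. With MADV_FREE the host may keep the old contents until it
comes under memory pressure, so the effect is not immediately visible.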

Setup for the test:
NUMA nodes: 1
Memory: 15GB
Swap: 4GB
Guest memory: 6GB
Number of cores: 1

Process: A guest is launched and memhog is run with 6GB. Once its execution
is over, the next guest is launched. The memhog execution time is recorded
each time.

Results:
Without Hinting:
             Time of execution
Guest1:      22s
Guest2:      24s
Guest3:      1m29s

With Hinting:
             Time of execution
Guest1:      24s
Guest2:      25s
Guest3:      28s

When hinting is enabled, swap space is not used until memhog with 6GB is run
in the 6th guest.

-- 
Regards
Nitesh

