On Thu, Mar 19, 2009 at 09:53:39PM -0700, Avantika Mathur wrote: > I appreciate any comments! >
Patch cleaniness issues. Not a massive deal here but a kernel submission patch would attract complaint and we try and keep to the same standard. Rename subject to something like hugeadm: Retry allocation of hugepages to static pool on failure when --hard is specified > Add the --hard flag to hugeadm that is set with --pool-pages-(min|max). When > the flag is set, try multiple times adjust the pool size to the count > specified. > Actually, it should only apply to --pool-pages-min as --pool-pages-max is only setting the overcommit value. As no actual pages are allocated until mmap(), it should never fail ===== Growing the hugepage pool with --pool-pages-min can fail due to the fragmentation within the system but one failure is not a guarantee of future failure, particularly if a very large amount of memory has to be reclaimed to satisfy the allocation. This patch adds a --hard flag to retry allocations multiple times. It tries up to 5 times to resize the pool. ===== or something. > Signed-off-by: Avantika Mathur <mat...@us.ibm.com> > --- > Index: libhugetlbfs/hugeadm.c > =================================================================== > --- libhugetlbfs.orig/hugeadm.c 2009-03-19 20:52:23.000000000 -0700 > +++ libhugetlbfs/hugeadm.c 2009-03-19 21:46:54.000000000 -0700 > @@ -1,3 +1,4 @@ > + Spurious whitespace change there. > /*************************************************************************** > * User front end for using huge pages Copyright (C) 2008, IBM * > * * > @@ -67,6 +68,9 @@ > > OPTION("--list-all-mounts", "List all current hugetlbfs mount points"); > OPTION("--pool-list", "List all pools"); > + OPTION("--hard", "specified before --pool-pages-min to make"); Is it required it be specified before? or just in conjunction. > + CONT("multiple attempts at adjusting the pool size to the"); > + CONT("specified count"); Similar comments as in the changelog. Multiple attempts are only made on failure. > OPTION("--pool-pages-min <size>:[+|-]<count>", ""); > CONT("Adjust pool 'size' lower bound"); > OPTION("--pool-pages-max <size>:[+|-]<count>", ""); > @@ -97,6 +101,7 @@ > } > > int opt_dry_run = 0; Total aside, opt_dry_run appears to be totally dead in this C file. Who sets it? > +int opt_hard = 0; > > /* > * getopts return values for options which are long only. > @@ -518,16 +523,23 @@ > INFO("setting HUGEPAGES_OC to %ld\n", (max - min)); > set_huge_page_counter(page_size, HUGEPAGES_OC, (max - min)); > } > - if (pools[pos].minimum != min) { > + > + if (opt_hard) > + cnt = 5; > + else > + cnt = 1; > + > + while ((pools[pos].minimum != min) && (cnt > 0)) { > INFO("setting HUGEPAGES_TOTAL to %ld\n", min); This will print the message multiple times without indication that it's retrying due to allocation failure when --hard is specified. > set_huge_page_counter(page_size, HUGEPAGES_TOTAL, min); > + get_pool_size(page_size, &pools[pos]); > + cnt--; Just retrying 5 times seems limited as reclaim, particularly on older kernels, can take a long time. Also, we are not delaying between retries at all giving and giving IO a change to complete. Maybe something roughly like the following? INFO("setting HUGEPAGES_TOTAL to %ld\n", min); set_huge_page_counter(page_size, HUGEPAGES_TOTAL, min); get_pool_size(page_size, &pools[pos]); /* If we fail to make an allocation, retry if user requests */ last_pool_value = pools[pos].minimum; while ((pools[pos].minimum != min) && (cnt > 0)) { /* Make note if progress is being made and sleep for IO */ if (last_pool_value == pools[pos].minimum) cnt--; else cnt = 5; sleep(6); INFO("Retrying allocation HUGEPAGES_TOTAL to %ld current %ld\n", min, pools[pos].minimum != min)); set_huge_page_counter(page_size, HUGEPAGES_TOTAL, min); get_pool_size(page_size, &pools[pos]); } cnt would need to be -1 in the !--hard case obviously but you get the idea. Sleep between attempts and retry up to 5 times for half a minute and only give up if no progress is being made. > } > /* > * HUGEPAGES_TOTAL is not guarenteed to check to exactly the figure > * requested should there be insufficient pages. Check the new > * value and adjust HUGEPAGES_OC accordingly. > */ > - get_pool_size(page_size, &pools[pos]); > if (pools[pos].minimum != min) { > WARNING("failed to set pool minimum to %ld became %ld\n", > min, pools[pos].minimum); > @@ -582,6 +594,7 @@ > {"page-sizes", no_argument, NULL, LONG_PAGE_SIZES}, > {"page-sizes-all", no_argument, NULL, LONG_PAGE_AVAIL}, > {"dry-run", no_argument, NULL, 'd'}, > + {"hard", no_argument, NULL, 'r'}, The 'r' of the switch appears to be undocumented. Leave it as --hard in the same style as the other long switches. > > {0}, > }; > @@ -619,6 +632,10 @@ > case -1: > break; > > + case 'r': > + opt_hard = 1; > + continue; > + > case LONG_LIST_ALL_MOUNTS: > mounts_list_all(); > break; -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab ------------------------------------------------------------------------------ Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com _______________________________________________ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel