[Libhugetlbfs-devel] O Segredo é Acreditar e Agir SUCESSO
<<< text/html; charset="UTF-8": Unrecognized >>> -- Put Bad Developers to Shame Dominate Development with Jenkins Continuous Integration Continuously Automate Build, Test & Deployment Start a new project now. Try Jenkins in the cloud. http://p.sf.net/sfu/13600_Cloudbees_APR___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
Re: [Libhugetlbfs-devel] problem with preview 3, on fedora 5, with gcc
On Mon, 2006-08-07 at 13:05 -0500, Steve Fox wrote: > On Sat, 2006-08-05 at 08:23 -0500, Bill Buros wrote: > You should be able to 'yum install libhugetlbfs' on Fedora Core 5 and > get the package from the Extras repository (which is enabled by > default). > > > Does the package work with OpenSuSE as well? > > It's possible. We tried to avoid doing anything Red Hat specific during > the packaging. I'm really unfamiliar with SUSE though, so I can't say > for sure. Just want to clarify one thing here... The *binary* package is definitely not advisable for use on SUSE, but there's a good chance the SRPM can be rebuilt on SUSE, producing a suitable binary package.. -- Jarod Wilson [EMAIL PROTECTED] signature.asc Description: This is a digitally signed message part - Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
Re: [Libhugetlbfs-devel] problem with preview 3, on fedora 5, with gcc
On Mon, 2006-08-07 at 15:47 -0500, Bill Buros wrote: > > > You should be able to 'yum install libhugetlbfs' on Fedora > Core 5 and > > get the package from the Extras repository (which is enabled > by > > default). > > Hadn't used the Extras or yum before. Pretty nice > > And interesting. It says the kernel with Fedora Core 5 > (2.6.15-1.2054_FC5) is in conflict with the requirements for > libhugetlbfs ( < 2.6.16). Yes, that is by design. Some of the necessary kernel-level functionality may not be there in kernels prior to 2.6.16. That's when most of it went upstream, and outside of RHEL[1], we try to stick closely to upstream whenever possible. [1] We run libhugetlbfs on 2.6.9-based RHEL4 boxes regularly, but we back-ported the necessary kernel bits. > The x86-64bit system was installed with downloaded CDs for Fedora Core > 5... If you 'yum install kernel' or 'yum upgrade', you'll get offered a 2.6.17 kernel to install. > > > Does the package work with OpenSuSE as well? > > > > It's possible. We tried to avoid doing anything Red Hat > specific during > > the packaging. I'm really unfamiliar with SUSE though, so I > can't say > > for sure. > > Just want to clarify one thing here... The *binary* package is > definitely not advisable for use on SUSE, but there's a good > chance the > SRPM can be rebuilt on SUSE, producing a suitable binary > package.. > > That's fair and understandable.. I hadn't been picturing the Extras > package.. > I'll have to be more careful with the phrasing of which package is > being used. No problem. > I'll try yum and libhugetlbfs on a ppc64 system later this week... Note that ppc64 packages aren't available, as we currently build nothing but ppc32 userland packages in Fedora Extras. Only a select few packages in Fedora Core are even built both ppc32 and ppc64. ppc64 systems are usually only the kernel and a few libs that are actually 64-bit, the rest is 32-bit (for performance reasons, or so I'm told). Of course, you can simply grab the srpm and do and rpmbuild --rebuild on it on a 64-bit box, and should be able to build yourself a ppc64 binary (which should be installable in parallel with the ppc32 package). Clear as mud? :) -- Jarod Wilson [EMAIL PROTECTED] signature.asc Description: This is a digitally signed message part - Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
[Libhugetlbfs-devel] [RFC] hugetlbfs setup script
Hey folks, We (Red Hat) get the occasional complaint, particularly from jboss folks, that setting up large page access for java app servers is too difficult. As a result, I was asked to throw together a scriptlet to make life easier for such people. The attached python script has been used successfully on Red Hat Enterprise Linux 5, Fedora 11 and Fedora 12, and is likely to work for other distros (though possibly with some minor tweaking required). I believe we're intending to include it in an upcoming RHEL release, and if at all possible and appropriate, I'd appreciate seeing it included in the official libhugetlbfs distribution for the benefit of others (and ultimately for RHEL too, so we don't have to carry anything out-of-tree), even if it were just under a not-installed-by-default contrib/ directory. Comments welcomed... Even if they are "wow, this sucks horribly, please go away" (but hopefully they aren't). :) -- Jarod Wilson ja...@redhat.com #!/usr/bin/python # # Tool to set up Linux large page support with minimal effort # # by Jarod Wilson # (c) Red Hat, Inc., 2009 # import os debug = False # config files we need access to sysctlConf = "/etc/sysctl.conf" if not os.access(sysctlConf, os.W_OK): print "Cannot access %s" % sysctlConf limitsConf = "/etc/security/limits.d/hugepages.conf" if not os.access(limitsConf, os.W_OK): print "Cannot access %s" % limitsConf # Figure out what we've got in the way of memory memInfo = open("/proc/meminfo").readlines() memTotal = 0 hugePages = 0 hugePageSize = 0 for line in memInfo: if line.startswith("MemTotal:"): memTotal = int(line.split()[1]) break for line in memInfo: if line.startswith("HugePages_Total:"): hugePages = int(line.split()[1]) break for line in memInfo: if line.startswith("Hugepagesize:"): hugePageSize = int(line.split()[1]) break # Get initial sysctl settings shmmax = 0 nr_hugepages = 0 hugeGID = 0 sysctlCur = os.popen("/sbin/sysctl -a").readlines() for line in sysctlCur: if line.startswith("kernel.shmmax = "): shmmax = int(line.split()[2]) break for line in sysctlCur: if line.startswith("vm.nr_hugepages = "): nr_hugepages = int(line.split()[2]) break for line in sysctlCur: if line.startswith("vm.hugetlb_shm_group = "): hugeGID = int(line.split()[2]) break # translate group into textual version hugeGIDName = "null" groupNames = os.popen("/usr/bin/getent group").readlines() for line in groupNames: curGID = int(line.split(":")[2]) if curGID == hugeGID: hugeGIDName = line.split(":")[0] break # dump system config as we see it before we start tweaking it print "Current configuration:" print " * Total System Memory..: %6d MB" % (memTotal / 1024) print " * Shared Mem Max Mapping...: %6d MB" % (shmmax / (1024 * 1024)) print " * System Huge Page Size: %6d MB" % (hugePageSize / 1024) print " * Number of Huge Pages.: %6d"% hugePages print " * Total size of Huge Pages.: %6d MB" % (hugePages * hugePageSize / 1024) print " * Remaining System Memory..: %6d MB" % ((memTotal / 1024) - (hugePages * hugePageSize / 1024)) print " * Huge Page User Group.: %s (%d)" % (hugeGIDName, hugeGID) print # determine some sanity safeguards halfOfMem = memTotal / 2 allMemLess2G = memTotal - 2048000 if halfOfMem >= allMemLess2G: maxHugePageReqKB = halfOfMem else: maxHugePageReqKB = allMemLess2G maxHugePageReqMB = maxHugePageReqKB / 1024 maxHugePageReq = maxHugePageReqKB / hugePageSize # ask how memory they want to allocate for huge pages userIn = None while not userIn: try: userIn = raw_input("How much memory would you like to allocate for huge pages? " "(input in MB, unless postfixed with GB): ") if userIn[-2:] == "GB": userHugePageReqMB = int(userIn[0:-2]) * 1024 elif userIn[-1:] == "G": userHugePageReqMB = int(userIn[0:-1]) * 1024 elif userIn[-2:] == "MB": userHugePageReqMB = int(userIn[0:-2]) elif userIn[-1:] == "MB": userHugePageReqMB = int(userIn[0:-1]) else: userHugePageReqMB = int(userIn) if userHugePageReqMB > maxHugePageReqMB: userIn = None print "Sorry, the most I'll let you allocate is %d MB, try again!" % maxHugePageReqMB else: break except ValueError: userIn = None print "Input must be an integer, please try again!" userHugePageReqKB = userHugePageReqMB * 1024 userHugePagesReq = userHugePageReqKB / hugePa
Re: [Libhugetlbfs-devel] [RFC] hugetlbfs setup script
On 09/09/2009 08:16 PM, David Gibson wrote: > On Wed, Sep 09, 2009 at 11:04:57AM -0400, Jarod Wilson wrote: >> Hey folks, >> >> We (Red Hat) get the occasional complaint, particularly from jboss >> folks, that setting up large page access for java app servers is too >> difficult. As a result, I was asked to throw together a scriptlet to >> make life easier for such people. >> >> The attached python script has been used successfully on Red Hat >> Enterprise Linux 5, Fedora 11 and Fedora 12, and is likely to work for >> other distros (though possibly with some minor tweaking required). I >> believe we're intending to include it in an upcoming RHEL release, and >> if at all possible and appropriate, I'd appreciate seeing it included in >> the official libhugetlbfs distribution for the benefit of others (and >> ultimately for RHEL too, so we don't have to carry anything >> out-of-tree), even if it were just under a not-installed-by-default >> contrib/ directory. >> >> Comments welcomed... Even if they are "wow, this sucks horribly, please >> go away" (but hopefully they aren't). :) > > I thought this sort of hugepage setup was the job of hugeadm. But I > haven't been around much since hugeadm came into existence. If what > this script does shouldn't be added to hugeadm, it at least looks as > if some of the things it's doing it should do by invoking hugeadm, > rather than directly. Ugh. I swear I looked around for something like this a few months ago when this was first written, and didn't find anything. At first glance, it does seem that hugeadm will do the bulk of things for me. Okay, I'll go back to the drawing board for a bit, and see if this is at all necessary any more (or if it simply needs to be updated to use hugeadm for some of the things its doing now). -- Jarod Wilson ja...@redhat.com -- Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
Re: [Libhugetlbfs-devel] [RFC] hugetlbfs setup script
On 09/09/2009 08:16 PM, David Gibson wrote: On Wed, Sep 09, 2009 at 11:04:57AM -0400, Jarod Wilson wrote: Hey folks, We (Red Hat) get the occasional complaint, particularly from jboss folks, that setting up large page access for java app servers is too difficult. As a result, I was asked to throw together a scriptlet to make life easier for such people. The attached python script has been used successfully on Red Hat Enterprise Linux 5, Fedora 11 and Fedora 12, and is likely to work for other distros (though possibly with some minor tweaking required). I believe we're intending to include it in an upcoming RHEL release, and if at all possible and appropriate, I'd appreciate seeing it included in the official libhugetlbfs distribution for the benefit of others (and ultimately for RHEL too, so we don't have to carry anything out-of-tree), even if it were just under a not-installed-by-default contrib/ directory. Comments welcomed... Even if they are "wow, this sucks horribly, please go away" (but hopefully they aren't). :) I thought this sort of hugepage setup was the job of hugeadm. But I haven't been around much since hugeadm came into existence. If what this script does shouldn't be added to hugeadm, it at least looks as if some of the things it's doing it should do by invoking hugeadm, rather than directly. I'm pretty sure some of the things this script does should NOT be added to hugeadm itself, but I've now done a minor rewrite of the script to utilize hugeadm (and pagesize) for a variety of the tasks previously done inside the script by directly touching stuff in /proc. Nb: RHEL5 ships with an older libhugetlbfs (v1.3) that has no hugeadm, but the initial script version works just fine there. -- Jarod Wilson ja...@redhat.com #!/usr/bin/python # # Tool to set up Linux large page support with minimal effort # # by Jarod Wilson # (c) Red Hat, Inc., 2009 # import os debug = True # config files we need access to sysctlConf = "/etc/sysctl.conf" if not os.access(sysctlConf, os.W_OK): print "Cannot access %s" % sysctlConf if debug == False: os._exit(1) limitsConf = "/etc/security/limits.d/hugepages.conf" if not os.access(limitsConf, os.W_OK): print "Cannot access %s" % limitsConf if debug == False: os._exit(1) # Rudimentary check to bail on NUMA systems memInfo = open("/proc/meminfo").readlines() for line in memInfo: if line.startswith("Node 0 HugePages_Total:"): print "This script does not support NUMA systems!" os._exit(1) # Figure out what we've got in the way of memory memTotal = 0 hugePageSize = 0 hugePages = 0 freeInfo = os.popen("/usr/bin/free").readlines() for line in freeInfo: if line.startswith("Mem:"): memTotal = int(line.split()[1]) break # Pick the largest available huge page size hugePageSizes = os.popen("/usr/bin/pagesize -H").readlines() for line in hugePageSizes: tmp = int(line.split()[0]) if tmp > hugePageSize: hugePageSize = tmp # Now figure out current count for largest huge page size hugeadmOut = os.popen("/usr/bin/hugeadm --pool-list").readlines() for line in hugeadmOut: if line.startswith(str(hugePageSize)): hugePages = int(line.split()[2]) break # Get initial sysctl settings shmmax = 0 nr_hugepages = 0 hugeGID = 0 sysctlCur = os.popen("/sbin/sysctl -a").readlines() for line in sysctlCur: if line.startswith("kernel.shmmax = "): shmmax = int(line.split()[2]) break for line in sysctlCur: if line.startswith("vm.nr_hugepages = "): nr_hugepages = int(line.split()[2]) break for line in sysctlCur: if line.startswith("vm.hugetlb_shm_group = "): hugeGID = int(line.split()[2]) break # translate group into textual version hugeGIDName = "null" groupNames = os.popen("/usr/bin/getent group").readlines() for line in groupNames: curGID = int(line.split(":")[2]) if curGID == hugeGID: hugeGIDName = line.split(":")[0] break # dump system config as we see it before we start tweaking it print "Current configuration:" print " * Total System Memory..: %6d MB" % (memTotal / 1024) print " * Shared Mem Max Mapping...: %6d MB" % (shmmax / (1024 * 1024)) print " * System Huge Page Size: %6d MB" % (hugePageSize / (1024 * 1024)) print " * Number of Huge Pages.: %6d"% hugePages print " * Total size of Huge Pages.: %6d MB" % (hugePages * hugePageSize / (1024 * 1024)) print " * Remaining System Memory..: %6d MB" % ((memTotal / 1024) - (hugePages * hugePageSize / (1024 * 1024))) print " * Huge Page User Group.: %s (%d)" % (hugeGIDName, hugeGID)
Re: [Libhugetlbfs-devel] [RFC] hugetlbfs setup script
On 09/17/2009 06:46 AM, Mel Gorman wrote: > On Wed, Sep 09, 2009 at 11:04:57AM -0400, Jarod Wilson wrote: >> Hey folks, >> >> We (Red Hat) get the occasional complaint, particularly from jboss >> folks, that setting up large page access for java app servers is too >> difficult. As a result, I was asked to throw together a scriptlet to >> make life easier for such people. >> > > We'd heard similar complaints. It's what led to the development of > hugeadm. It was intended to understand the wide variety of proc files > and be in the position to explain the state of the system. It's a work > in progress. > > In your defence, it's fairly new and easily missed particularly. Yeah, and it doesn't exist in RHEL5's libhugetlbfs, which is particularly relevant for me... :) >> The attached python script has been used successfully on Red Hat >> Enterprise Linux 5, Fedora 11 and Fedora 12, and is likely to work for >> other distros (though possibly with some minor tweaking required). I >> believe we're intending to include it in an upcoming RHEL release, and >> if at all possible and appropriate, I'd appreciate seeing it included in >> the official libhugetlbfs distribution for the benefit of others (and >> ultimately for RHEL too, so we don't have to carry anything >> out-of-tree), even if it were just under a not-installed-by-default >> contrib/ directory. >> > > My preference if possible would be to integrate as much as possible into > hugeadm and have this script converted to using hugeadm where > appropriate. So based on earlier feedback from David Gibson, I rewrote it slightly to make more use of hugeadm and pagesize, but I've talked it over w/my manager, and have the approval to go ahead and work on integrating as much as possible of what this script does into hugeadm itself. Pretty much everything you've proposed sounds sane, and I'll work on getting them added. A few questions and/or notes interspersed below.. ... >> # see if group already exists, use it if it does, if not, create it >> userGIDReq = -1 >> for line in groupNames: >> curGroupName = line.split(":")[0] >> if curGroupName == userGroupReq: >> userGIDReq = int(line.split(":")[2]) >> break >> >> if userGIDReq> -1: >> print "Group %s (gid %d) already exists, we'll use it" % (userGroupReq, >> userGIDReq) >> else: >> if debug == False: >> os.popen("/usr/sbin/groupadd %s" % userGroupReq) >> else: >> print "/usr/sbin/groupadd %s" % userGroupReq >> groupNames = os.popen("/usr/bin/getent group %s" % >> userGroupReq).readlines() >> for line in groupNames: >> curGroupName = line.split(":")[0] >> if curGroupName == userGroupReq: >> userGIDReq = int(line.split(":")[2]) >> break >> print "Created group %s (gid %d) for huge page use" % (userGroupReq, >> userGIDReq) >> print >> > > I think creating groups might be beyond the scope of hugeadm. This is > possibly the most distro-specific part of the entire script so I'd be a > little more wary of integrating it. Agreed. Creating users and groups definitely doesn't belong in hugeadm. My thought is that anything not belonging in there can still reside in an updated version of this script which does everything specific to huge pages using hugeadm. It'd be much more of a wrapper to hugeadm and {user,group}{add,mod} -- and possibly sysctl. >> # write out sysctl config changes to persist across reboot >> if debug == False: >> sysctlConfLines = "# sysctl configuration\n" >> if os.access(sysctlConf, os.W_OK): >> try: >> sysctlConfLines = open(sysctlConf).readlines() >> os.rename(sysctlConf, sysctlConf + ".backup") >> print("Saved original %s as %s.backup" % (sysctlConf, >> sysctlConf)) >> except: >> pass >> >> fd = open(sysctlConf, "w") >> for line in sysctlConfLines: >> if line.startswith("kernel.shmmax"): >> continue >> elif line.startswith("vm.nr_hugepages"): >> continue >> elif line.startswith("vm.hugetlb_shm_group"): >> continue >> else: >> fd.write(line); >> >> fd.write("kernel.shmmax = %d\n" % (memTo
[Libhugetlbfs-devel] [PATCH] hugeadm enhancements (was: [RFC] hugetlbfs setup script)
On 09/18/2009 03:43 AM, Mel Gorman wrote: On Thu, Sep 17, 2009 at 04:59:15PM -0400, Jarod Wilson wrote: The attached python script has been used successfully on Red Hat Enterprise Linux 5, Fedora 11 and Fedora 12, and is likely to work for other distros (though possibly with some minor tweaking required). ... My preference if possible would be to integrate as much as possible into hugeadm and have this script converted to using hugeadm where appropriate. So based on earlier feedback from David Gibson, I rewrote it slightly to make more use of hugeadm and pagesize, but I've talked it over w/my manager, and have the approval to go ahead and work on integrating as much as possible of what this script does into hugeadm itself. Great stuff. Attaching a full diff that implements the bulk of the things that aren't terribly hard to add to hugeadm itself. Semi-sanely broken out patches available here: http://people.redhat.com/jwilson/misc/hugeadm-enhancements/ I think creating groups might be beyond the scope of hugeadm. This is possibly the most distro-specific part of the entire script so I'd be a little more wary of integrating it. Agreed. Creating users and groups definitely doesn't belong in hugeadm. My thought is that anything not belonging in there can still reside in an updated version of this script which does everything specific to huge pages using hugeadm. It'd be much more of a wrapper to hugeadm and {user,group}{add,mod} -- and possibly sysctl. Haven't yet rewritten the script, but it should only have to wrap hugeadm and the user/group add/mod bits now, I think... Perhaps there is some scope for libhugetlbfs installing silently the first time and have a forced reinstallion present some configuration options such as creating a group and adding users as this script does? I'm inclined to say no. At least in the RHEL world, people will primarily be installing via packages, and anything interactive at install/uninstall/reinstall/etc is pretty much no-go. Now, we *could* theoretically have the package create something like a hugepage group at install time, and even set hugetlb_shm_group, but not in a persistent way (at least not w/o munging /etc/sysctl.conf directly from the package install scriptlet, which would also probably be frowned upon). I'm inclined to leave all of this to the user to configure after installation -- though with the possible aid of said script, once rewritten... hmm can't decide on this one. Not sure whether hugeadm should know to to make settings persist or if it should be recommended that hugeadm invocations be put into an rc script. Yeah, having hugeadm write to sysctl.conf doesn't sound like the best idea to me either. What about having hugeadm simply inform the user what sysctl settings they would need to add to have the settings persist? That makes sense. It could be suggested by --explain which I just noticed has no manual page entry. I should fix that. I've added a bit to --set-recommended-shmmax and --set-shm-group to spit out a warning "add foo to /etc/sysctl.conf to make these settings persist" for now. Didn't add anything to --explain though. So in in the limits.conf case, its a stand-alone file in /etc/security/limits.d/, so maybe its okay to scribble on this file? Certainly less contentious than munging sysctl.conf anyway. I'd view them as being very similar. I think we should be able to persist all settings or none at all. Maybe that's just me though. Ideally, yeah, all or none... But its a bit murkier, if in one case we're editing a system-wide file, vs. editing a file that could be part of the libhugetlbfs distribution itself. For example, /etc/security/limits.d/hugetlbfs.conf could be a file created by the libhugetlbfs rpm on RHEL, in which case, we're definitely free and clear to do with it as we please. But /etc/sysctl.conf isn't "ours". Bleah. We need an /etc/sysctl.d/hugetlbfs.conf. :) So hopefully, I've not butchered anything *too* badly... -- Jarod Wilson ja...@redhat.com hugeadm.c | 201 ++--- man/hugeadm.8 | 40 +--- 2 files changed, 222 insertions(+), 19 deletions(-) diff --git a/hugeadm.c b/hugeadm.c index a793267..fbaebfd 100644 --- a/hugeadm.c +++ b/hugeadm.c @@ -67,12 +67,15 @@ extern char *optarg; #define PROCMOUNTS "/proc/mounts" #define PROCHUGEPAGES_MOVABLE "/proc/sys/vm/hugepages_treat_as_movable" #define PROCMINFREEKBYTES "/proc/sys/vm/min_free_kbytes" +#define PROCHUGETLBGROUP "/proc/sys/vm/hugetlb_shm_group" +#define PROCSHMMAX "/proc/sys/kernel/shmmax" #define PROCZONEINFO "/proc/zoneinfo" #define FS_NAME "hugetlbfs" #define MIN_COL 20 #define MAX_SIZE_MNTENT (64 + PATH_MAX + 32 + 128 + 2 * sizeof(int)) #define FORMAT_
Re: [Libhugetlbfs-devel] [PATCH] hugeadm enhancements
On 10/01/2009 09:07 AM, Mel Gorman wrote: > On Wed, Sep 30, 2009 at 12:22:34PM -0400, Jarod Wilson wrote: ... >> So hopefully, I've not butchered anything *too* badly... >> > > The entire diff is a bit of a mouthful so here is a patch-by-patch > commentary. It might be easier to post as a threaded patch series the > next time. Yeah, probably should have done that in the first place, but the series was only semi-sanely broken up, as you discovered. ;) Doing too many things at once. Round two will be more sanely organized, and sent to the list as a threaded patch series. > 0002-hugeadm-add-initial-support-for-showing-and-setting.patch > Minor, but I'd recommend renaming get_recommended_shmmax to > recommended_shmmax() so it roughly matches the same pair of functions > for setting min_free_kbytes. recommended_shmmax() should comment > why it has a recommended value. Will do. > Is it really a good idea fix shmmax as the total of maximum > memory. As this is about hugepages, would a better value for shmmax > be the maximum number of hugepages that can be allocated? i.e. > all statically allocate hugepages + the allower overcommit? > > Similarly, should check_shmmax() warn warn when shmmax is less > than the number of hugepages that can be allocated? This does sound much more sane. However, I'm presently clueless exactly how the max number of hugepages that can be allocated is determined... The later patch that saves 2G or half of memory was sort of a WAG at "leaving enough for the rest of the system", but if there's something actually built-in that will tell me the upper limit for how much we can allocate for hugepages, using that would definitely be better. > In the text of shmmax(), it makes mention of the "maximum heap size" > which is somewhat specific to JVMs. It would be better if it referred > to largest shared memory segment, possibly giving the heap size in > a JVM as an example. Yeah, the "maximum heap size" bit was in fact heisted from an internal doc specific to setting up jboss for use with huge pages, and that's where the current shmmax settings came from as well... > 0003-hugeadm-add-check_user-function-to-warn-if-user-is.patch > > Return value of getgrgid is not being checked. Whoops. Adding a if (!grp) warn that an invalid gid is set. > The group check is also incomplete. If the hugetlb_shm_group > is the user-only group, then the pw_gid in struct passwd will > contain the group id but it will not necessarily be in the > list returned by getgwuid. For example > > $ grep ^mel /etc/group > mel:x:1000: > > mel is not in the group but > > $ grep ^mel /etc/passwd > mel:x:1000:1000:mel,,,:/home/mel:/bin/bash > > See, my gid is 1000. > > So this needs a bit more work. Gack. Good catch. Working on this one now... > 0004-hugeadm-fix-typo.patch > Should be separate from the set but > > Acked-by: Mel Gorman Will break it out and send it by itself in a bit. > 0005-hugeadm-trim-recommended-shmmax-size.patch > > Minimally should be merged with patch 2. The "less 2G" feels a bit > arbitrary. How do you feel about the suggestion above on basing it > on the maximum number of hugepages that can possibly be in use by > the system? Works for me. Will hunt around to see where I can extract that maximum number from, but hints would be much appreciated. > 0006-hugeadm-suggest-sysctl.conf-addition-for-persistent.patch > > Maybe make this INFO level and suppress by default? The "done" > thing with utilities AFAIK is to be silent when successful Yeah, that sounds good. > 0007-hugeadm-support-pool-size-min-DEFAULT-2G-syntax.patch > The "Returning page count of" message should be DEBUG. Similar > for the page_size message. With those changed, it's a very nice > improvement to the utility. > > Acked-by: Mel Gorman Was pretty happy with how this one turned out myself. I'll flip those messages to DEBUG for round 2. > 0009-hugeadm-print-note-on-how-to-make-hugetlb_shm_group.patch > > I have similar concertns with this as Patch 6. Maybe it > could be made part of --explain to say how settings > can be persisted? How 'bout we make it an INFO-level print when its actually set, and add spew to --explain for both this and shmmax settings? But if we do, should we go so far as to try to look and see if they have already been added to sysctl.conf, and suppress the message if they have? Or can we call just printing out that they need to be in there
Re: [Libhugetlbfs-devel] [PATCH] hugeadm enhancements
On 10/01/2009 11:14 AM, Mel Gorman wrote: > On Thu, Oct 01, 2009 at 10:18:50AM -0400, Jarod Wilson wrote: ... >>> Is it really a good idea fix shmmax as the total of maximum >>> memory. As this is about hugepages, would a better value for shmmax >>> be the maximum number of hugepages that can be allocated? i.e. >>> all statically allocate hugepages + the allower overcommit? >>> >>> Similarly, should check_shmmax() warn warn when shmmax is less >>> than the number of hugepages that can be allocated? >> >> This does sound much more sane. However, I'm presently clueless exactly >> how the max number of hugepages that can be allocated is determined... > > Take a look at pool_list() as an example. Use hpool_sizes() to get a list > of pools configured. Don't bother sorting it, just iterate through the list > summing up pools[pos].maximum * pools[pos].size. Looking at it now... I presume you meant pools[pos].maximum * pools[pos].pagesize. So if we were to use this, one couldn't do anything meaningful with --set-recommended-shmmax until after hugepages had been allocated, and if more huge pages were later allocated, they would have to adjust it again. I was sort of gunning for a set-it-and-forget-it type option, but I suppose its not too unreasonable to expect huge pages to be configured before we recommend (or set) a shmmax value, and require reconfiguring if the amount of huge pages changes. One minor issue I'm now looking at... If --set-recommended-shmmax is run without huge pages configured, the recommendation is 0. Presently, I'm refusing to do anything if I get 0 back. However... If its run after huge pages have been torn down for some reason, it might be nice to set shmmax back to the system default. Not sure if 32MB is standard on all arches/distros/kernels/whatever though, and/or if we should just leave good enough alone. >>> 0009-hugeadm-print-note-on-how-to-make-hugetlb_shm_group.patch >>> >>> I have similar concertns with this as Patch 6. Maybe it >>> could be made part of --explain to say how settings >>> can be persisted? >> >> How 'bout we make it an INFO-level print when its actually set, and add >> spew to --explain for both this and shmmax settings? >> > > Ok, that's reasonable. They'll see the message then with -v but > otherwise it'll be quiet. > >> But if we do, should we go so far as to try to look and see if they have >> already been added to sysctl.conf, and suppress the message if they >> have? > > I don't think you need to consult sysctl.conf. You could check the > existing value for the values and only print something if it changes. > It's easier than sysconf. Okay, I'm currently rolling with that. Gotta run, but I've got about 99% of an updated patchset together now, which I'll get out the door tomorrow morning, if not later tonight. Thoughts on if we should do anything for --set-recommended-shmmax with no huge pages configured would be appreciated. I'm presently leaning towards just letting it be. -- Jarod Wilson ja...@redhat.com -- Come build with us! The BlackBerry® Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9-12, 2009. Register now! http://p.sf.net/sfu/devconf ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
[Libhugetlbfs-devel] [PATCH 5/5] hugeadm: add support for setting recommended shmmax value
- print notice in --explain output about current shmmax value - add --set-recommended-shmmax switch, which (after huges pages are configured) sets shmmax to a value equal to the sum of the maximum space that has been allocated for huge pages Signed-off-by: Jarod Wilson --- hugeadm.c | 78 + man/hugeadm.8 |9 ++ 2 files changed, 87 insertions(+), 0 deletions(-) diff --git a/hugeadm.c b/hugeadm.c index 5e92de5..1666e27 100644 --- a/hugeadm.c +++ b/hugeadm.c @@ -67,6 +67,7 @@ extern char *optarg; #define PROCMOUNTS "/proc/mounts" #define PROCHUGEPAGES_MOVABLE "/proc/sys/vm/hugepages_treat_as_movable" #define PROCMINFREEKBYTES "/proc/sys/vm/min_free_kbytes" +#define PROCSHMMAX "/proc/sys/kernel/shmmax" #define PROCHUGETLBGROUP "/proc/sys/vm/hugetlb_shm_group" #define PROCZONEINFO "/proc/zoneinfo" #define FS_NAME "hugetlbfs" @@ -95,6 +96,8 @@ void print_usage() OPTION("--set-recommended-min_free_kbytes", ""); CONT("Sets min_free_kbytes to a recommended value to improve availability of"); CONT("huge pages at runtime"); + OPTION("--set-recommended-shmmax", "Sets shmmax to a recommended value to"); + CONT("maximise the size possible for shared memory pools"); OPTION("--set-shm-group ", "Sets hugetlb_shm_group to the"); CONT("specified group, which has permission to use hugetlb shared memory pools"); OPTION("--add-temp-swap[=count]", "Specified with --pool-pages-min to create"); @@ -139,6 +142,7 @@ int opt_dry_run = 0; int opt_hard = 0; int opt_movable = -1; int opt_set_recommended_minfreekbytes = 0; +int opt_set_recommended_shmmax = 0; int opt_set_hugetlb_shm_group = 0; int opt_temp_swap = 0; int opt_ramdisk_swap = 0; @@ -220,6 +224,7 @@ void verbose_expose(void) #define LONG_POOL_MAX_ADJ (LONG_POOL|'M') #define LONG_SET_RECOMMENDED_MINFREEKBYTES ('k' << 8) +#define LONG_SET_RECOMMENDED_SHMMAX('x' << 8) #define LONG_SET_HUGETLB_SHM_GROUP ('R' << 8) #define LONG_MOVABLE ('z' << 8) @@ -693,6 +698,70 @@ void check_minfreekbytes(void) } } +long recommended_shmmax(void) +{ + struct hpage_pool pools[MAX_POOLS]; + long recommended_shmmax = 0; + int pos, cnt; + + cnt = hpool_sizes(pools, MAX_POOLS); + if (cnt < 0) { + ERROR("unable to obtain pools list"); + exit(EXIT_FAILURE); + } + + for (pos = 0; cnt--; pos++) + recommended_shmmax += (pools[pos].maximum * pools[pos].pagesize); + + return recommended_shmmax; +} + +void set_recommended_shmmax(void) +{ + int ret; + long recommended = recommended_shmmax(); + + if (recommended == 0) { + printf("\n"); + WARNING("We can only set a recommended shmmax when huge pages are configured!\n"); + return; + } + + DEBUG("Setting shmmax to %ld\n", recommended); + ret = file_write_ulong(PROCSHMMAX, (unsigned long)recommended); + + if (!ret) { + INFO("To make shmmax settings persistent, add the following line to /etc/sysctl.conf:\n"); + INFO(" kernel.shmmax = %ld\n", recommended); + } +} + +void check_shmmax(void) +{ + long current_shmmax = file_read_ulong(PROCSHMMAX, NULL); + long recommended = recommended_shmmax(); + + if (current_shmmax != recommended) { + printf("\n"); + printf("A " PROCSHMMAX " value of %ld bytes may be sub-optimal. To maximise\n", current_shmmax); + printf("shared memory usage, this should be set to the size of the largest shared memory\n"); + printf("segment size you want to be able to use. Alternatively, set it to a size matching\n"); + printf("the maximum possible allocation size of all huge pages. This can be done\n"); + printf("automatically, using the --set-recommended-shmmax option.\n"); + } + + if (recommended == 0) { + printf("\n"); + WARNING("We can't make a shmmax recommendation until huge pages are configured!\n"); + return; + } + + printf("\n"); + printf("The recommended shmmax for your currently allocated huge pages is %ld bytes.\n", recommended); + printf("To make shmmax settings persistent, add the following line to /etc/sysctl.conf:\n"); + printf(" kernel.shmmax = %ld\n", recommended)
[Libhugetlbfs-devel] [PATCH 1/5] hugeadm: show amount of system memory in explain output
Signed-off-by: Jarod Wilson --- hugeadm.c | 15 +++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/hugeadm.c b/hugeadm.c index 5c110cf..21d7d5d 100644 --- a/hugeadm.c +++ b/hugeadm.c @@ -73,6 +73,7 @@ extern char *optarg; #define MAX_SIZE_MNTENT (64 + PATH_MAX + 32 + 128 + 2 * sizeof(int)) #define FORMAT_LEN 20 +#define MEM_TOTAL "MemTotal:" #define SWAP_FREE "SwapFree:" #define SWAP_TOTAL "SwapTotal:" @@ -589,6 +590,19 @@ void create_mounts(char *user, char *group, char *base, mode_t mode) } /** + * show_mem shouldn't change the behavior of any of its + * callers, it only prints a message to the user showing the + * total amount of memory in the system (in megabytes). + */ +void show_mem() +{ + long mem_total; + + mem_total = read_meminfo(MEM_TOTAL); + printf("Total System Memory: %ld MB\n\n", mem_total / 1024); +} + +/** * check_swap shouldn't change the behavior of any of its * callers, it only prints a message to the user if something * is being done that might fail without swap available. i.e. @@ -1003,6 +1017,7 @@ void page_sizes(int all) void explain() { + show_mem(); mounts_list_all(); printf("\nHuge page pools:\n"); pool_list(); -- 1.6.2.5 -- Jarod Wilson ja...@redhat.com -- Come build with us! The BlackBerry® Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9-12, 2009. Register now! http://p.sf.net/sfu/devconf ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
[Libhugetlbfs-devel] [PATCH] hugeadm: fix maximiuse typo
Signed-off-by: Jarod Wilson --- hugeadm.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/hugeadm.c b/hugeadm.c index 5c110cf..a793267 100644 --- a/hugeadm.c +++ b/hugeadm.c @@ -668,7 +668,7 @@ void check_minfreekbytes(void) /* There should be at least one pageblock free per zone in the system */ if (recommended_min > min_free_kbytes) { printf("\n"); - printf("The " PROCMINFREEKBYTES " of %ld is too small. To maximise efficiency\n", min_free_kbytes); + printf("The " PROCMINFREEKBYTES " of %ld is too small. To maximiuse efficiency\n", min_free_kbytes); printf("of fragmentation avoidance, there should be at least one huge page free per zone\n"); printf("in the system which minimally requires a min_free_kbytes value of %ld\n", recommended_min); } -- Jarod Wilson ja...@redhat.com -- Come build with us! The BlackBerry® Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9-12, 2009. Register now! http://p.sf.net/sfu/devconf ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
[Libhugetlbfs-devel] [PATCH 0/5] hugeadm: assorted enhancements v2
Prematurely sent the first time... Sorry for the noise... Here comes an updated version of hugeadm enhancements, all updated to per discussion of the first series with Mel Gorman. [Patch 1/5] hugeadm: show amount of system memory in explain output [Patch 2/5] hugeadm: add check_user function to warn if user is not in hugetlb_shm_group [Patch 3/5] hugeadm: support --pool-size-min DEFAULT:2G syntax Allow the user to allocate x G, M or K of hugepages using the default huge page size, rather than having to specify page size and number of pages (both of which of course still work too). Mixture of explicit page sizes and page counts with memory sizes and default page sizes respectively should also work. i.e., the following... hugeadm --pool-size-min 2M:1024 hugeadm --pool-size-min DEFAULT:2G hugeadm --pool-size-min 2M:2G hugeadm --pool-size-min DEFAULT:1024 ...should all be functionally equivalent on a box with a default huge page size of 2M (such as my x86_64 test rig here). [Patch 4/5] hugeadm: add support for setting hugetlb_shm_group [Patch 5/5] hugeadm: add support for setting recommended shmmax value - print notice in --explain output about current shmmax value - add --set-recommended-shmmax switch, which (after huges pages are configured) sets shmmax to a value equal to the sum of the maximum space that has been allocated for huge pages -- Jarod Wilson ja...@redhat.com -- Come build with us! The BlackBerry® Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9-12, 2009. Register now! http://p.sf.net/sfu/devconf ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
[Libhugetlbfs-devel] [PATCH 2/5] hugeadm: add check_user function to warn if user is not in hugetlb_shm_group
Signed-off-by: Jarod Wilson --- hugeadm.c | 39 +++ 1 files changed, 39 insertions(+), 0 deletions(-) diff --git a/hugeadm.c b/hugeadm.c index 21d7d5d..8871163 100644 --- a/hugeadm.c +++ b/hugeadm.c @@ -67,6 +67,7 @@ extern char *optarg; #define PROCMOUNTS "/proc/mounts" #define PROCHUGEPAGES_MOVABLE "/proc/sys/vm/hugepages_treat_as_movable" #define PROCMINFREEKBYTES "/proc/sys/vm/min_free_kbytes" +#define PROCHUGETLBGROUP "/proc/sys/vm/hugetlb_shm_group" #define PROCZONEINFO "/proc/zoneinfo" #define FS_NAME "hugetlbfs" #define MIN_COL 20 @@ -688,6 +689,43 @@ void check_minfreekbytes(void) } } +/* heisted from shadow-utils/libmisc/list.c::is_on_list() */ +static int user_in_group(char *const *list, const char *member) +{ + while (*list != NULL) { + if (strcmp(*list, member) == 0) { + return 1; + } + list++; + } + + return 0; +} + +void check_user(void) +{ + uid_t uid; + gid_t gid; + struct passwd *pwd; + struct group *grp; + + gid = (gid_t)file_read_ulong(PROCHUGETLBGROUP, NULL); + grp = getgrgid(gid); + if (!grp) { + printf("\n"); + WARNING("Group ID %d in hugetlb_shm_group doesn't appear to be a valid group!\n", gid); + return; + } + + uid = getuid(); + pwd = getpwuid(uid); + + if (gid != pwd->pw_gid && !user_in_group(grp->gr_mem, pwd->pw_name) && uid != 0) { + printf("\n"); + WARNING("User %s (uid: %d) is not a member of the hugetlb_shm_group %s (gid: %d)!\n", pwd->pw_name, uid, grp->gr_name, gid); + } +} + void add_temp_swap(long page_size) { char path[PATH_MAX]; @@ -1025,6 +1063,7 @@ void explain() page_sizes(0); check_minfreekbytes(); check_swap(); + check_user(); printf("\nNote: Permanent swap space should be preferred when dynamic " "huge page pools are used.\n"); } -- 1.6.2.5 -- Jarod Wilson ja...@redhat.com -- Come build with us! The BlackBerry® Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9-12, 2009. Register now! http://p.sf.net/sfu/devconf ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
[Libhugetlbfs-devel] [PATCH 3/5] hugeadm: support --pool-size-min DEFAULT:2G syntax
Allow the user to allocate x G, M or K of hugepages using the default huge page size, rather than having to specify page size and number of pages (both of which of course still work too). Mixture of explicit page sizes and page counts with memory sizes and default page sizes respectively should also work. i.e., the following... hugeadm --pool-size-min 2M:1024 hugeadm --pool-size-min DEFAULT:2G hugeadm --pool-size-min 2M:2G hugeadm --pool-size-min DEFAULT:1024 ...should all be functionally equivalent on a box with a default huge page size of 2M (such as my x86_64 test rig here). Signed-off-by: Jarod Wilson --- hugeadm.c | 42 ++ man/hugeadm.8 | 25 +++-- 2 files changed, 49 insertions(+), 18 deletions(-) diff --git a/hugeadm.c b/hugeadm.c index 8871163..f53aff3 100644 --- a/hugeadm.c +++ b/hugeadm.c @@ -88,9 +88,9 @@ void print_usage() OPTION("--hard", "specified with --pool-pages-min to make"); CONT("multiple attempts at adjusting the pool size to the"); CONT("specified count on failure"); - OPTION("--pool-pages-min :[+|-]", ""); + OPTION("--pool-pages-min :[+|-]>", ""); CONT("Adjust pool 'size' lower bound"); - OPTION("--pool-pages-max :[+|-]", ""); + OPTION("--pool-pages-max :[+|-]>", ""); CONT("Adjust pool 'size' upper bound"); OPTION("--set-recommended-min_free_kbytes", ""); CONT("Sets min_free_kbytes to a recommended value to improve availability of"); @@ -880,18 +880,37 @@ enum { POOL_BOTH, }; -static long value_adjust(char *adjust_str, long base) +static long value_adjust(char *adjust_str, long base, long page_size) { long adjust; char *iter; /* Convert and validate the adjust. */ + errno = 0; adjust = strtol(adjust_str, &iter, 0); - if (*iter) { + /* Catch strtol errors and sizes that overflow the native word size */ + if (errno || adjust_str == iter) { + if (errno == ERANGE) + errno = EOVERFLOW; + else + errno = EINVAL; ERROR("%s: invalid adjustment\n", adjust_str); exit(EXIT_FAILURE); } + switch (*iter) { + case 'G': + case 'g': + adjust = size_to_smaller_unit(adjust); + case 'M': + case 'm': + adjust = size_to_smaller_unit(adjust); + case 'K': + case 'k': + adjust = size_to_smaller_unit(adjust); + adjust = adjust / page_size; + } + if (adjust_str[0] != '+' && adjust_str[0] != '-') base = 0; @@ -904,6 +923,8 @@ static long value_adjust(char *adjust_str, long base) } base += adjust; + DEBUG("Returning page count of %ld\n", base); + return base; } @@ -937,7 +958,12 @@ void pool_adjust(char *cmd, unsigned int counter) page_size_str, adjust_str, counter); /* Convert and validate the page_size. */ - page_size = parse_page_size(page_size_str); + if (strcmp(page_size_str, "DEFAULT") == 0) + page_size = kernel_default_hugepage_size(); + else + page_size = parse_page_size(page_size_str); + + DEBUG("Working with page_size of %ld\n", page_size); cnt = hpool_sizes(pools, MAX_POOLS); if (cnt < 0) { @@ -957,14 +983,14 @@ void pool_adjust(char *cmd, unsigned int counter) max = pools[pos].maximum; if (counter == POOL_BOTH) { - min = value_adjust(adjust_str, min); + min = value_adjust(adjust_str, min, page_size); max = min; } else if (counter == POOL_MIN) { - min = value_adjust(adjust_str, min); + min = value_adjust(adjust_str, min, page_size); if (min > max) max = min; } else { - max = value_adjust(adjust_str, max); + max = value_adjust(adjust_str, max, page_size); if (max < min) min = max; } diff --git a/man/hugeadm.8 b/man/hugeadm.8 index 6342980..853a741 100644 --- a/man/hugeadm.8 +++ b/man/hugeadm.8 @@ -2,7 +2,7 @@ .\" First parameter, NAME, should be all caps .\" Second parameter, SECTION, should be 1-8, maybe w/ subsection .\" other parameters are allowed: see man(7), man(1) -.TH HUGEADM 8 "October 10, 2008" +.TH HUGEADM 8 "October 1, 2009" .\" Please adjust this date whenever revising the
[Libhugetlbfs-devel] [PATCH 4/5] hugeadm: add support for setting hugetlb_shm_group
Signed-off-by: Jarod Wilson --- hugeadm.c | 47 +++ man/hugeadm.8 |7 +++ 2 files changed, 54 insertions(+), 0 deletions(-) diff --git a/hugeadm.c b/hugeadm.c index f53aff3..5e92de5 100644 --- a/hugeadm.c +++ b/hugeadm.c @@ -95,6 +95,8 @@ void print_usage() OPTION("--set-recommended-min_free_kbytes", ""); CONT("Sets min_free_kbytes to a recommended value to improve availability of"); CONT("huge pages at runtime"); + OPTION("--set-shm-group ", "Sets hugetlb_shm_group to the"); + CONT("specified group, which has permission to use hugetlb shared memory pools"); OPTION("--add-temp-swap[=count]", "Specified with --pool-pages-min to create"); CONT("temporary swap space for the duration of the pool resize. Default swap"); CONT("size is 5 huge pages. Optional arg sets size to 'count' huge pages"); @@ -137,6 +139,7 @@ int opt_dry_run = 0; int opt_hard = 0; int opt_movable = -1; int opt_set_recommended_minfreekbytes = 0; +int opt_set_hugetlb_shm_group = 0; int opt_temp_swap = 0; int opt_ramdisk_swap = 0; int opt_swap_persist = 0; @@ -217,6 +220,7 @@ void verbose_expose(void) #define LONG_POOL_MAX_ADJ (LONG_POOL|'M') #define LONG_SET_RECOMMENDED_MINFREEKBYTES ('k' << 8) +#define LONG_SET_HUGETLB_SHM_GROUP ('R' << 8) #define LONG_MOVABLE ('z' << 8) #define LONG_MOVABLE_ENABLE(LONG_MOVABLE|'e') @@ -689,6 +693,19 @@ void check_minfreekbytes(void) } } +void set_hugetlb_shm_group(gid_t gid, char *group) +{ + int ret; + + DEBUG("Setting hugetlb_shm_group to %d (%s)\n", gid, group); + ret = file_write_ulong(PROCHUGETLBGROUP, (unsigned long)gid); + + if (!ret) { + INFO("To make hugetlb_shm_group settings persistent, add the following line to /etc/sysctl.conf:\n"); + INFO(" vm.hugetlb_shm_group = %d\n", gid); + } +} + /* heisted from shadow-utils/libmisc/list.c::is_on_list() */ static int user_in_group(char *const *list, const char *member) { @@ -723,6 +740,10 @@ void check_user(void) if (gid != pwd->pw_gid && !user_in_group(grp->gr_mem, pwd->pw_name) && uid != 0) { printf("\n"); WARNING("User %s (uid: %d) is not a member of the hugetlb_shm_group %s (gid: %d)!\n", pwd->pw_name, uid, grp->gr_name, gid); + } else { + printf("\n"); + printf("To make your hugetlb_shm_group settings persistent, add the following line to /etc/sysctl.conf:\n"); + printf(" vm.hugetlb_shm_group = %d\n", gid); } } @@ -1107,6 +1128,9 @@ int main(int argc, char** argv) int opt_global_mounts = 0, opt_pgsizes = 0, opt_pgsizes_all = 0; int opt_explain = 0, minadj_count = 0, maxadj_count = 0; int ret = 0, index = 0; + gid_t opt_gid = 0; + struct group *opt_grp = NULL; + int group_invalid = 0; struct option long_opts[] = { {"help", no_argument, NULL, 'h'}, {"verbose",required_argument, NULL, 'v' }, @@ -1116,6 +1140,7 @@ int main(int argc, char** argv) {"pool-pages-min", required_argument, NULL, LONG_POOL_MIN_ADJ}, {"pool-pages-max", required_argument, NULL, LONG_POOL_MAX_ADJ}, {"set-recommended-min_free_kbytes", no_argument, NULL, LONG_SET_RECOMMENDED_MINFREEKBYTES}, + {"set-shm-group", required_argument, NULL, LONG_SET_HUGETLB_SHM_GROUP}, {"enable-zone-movable", no_argument, NULL, LONG_MOVABLE_ENABLE}, {"disable-zone-movable", no_argument, NULL, LONG_MOVABLE_DISABLE}, {"hard", no_argument, NULL, LONG_HARD}, @@ -1233,6 +1258,25 @@ int main(int argc, char** argv) opt_set_recommended_minfreekbytes = 1; break; + case LONG_SET_HUGETLB_SHM_GROUP: + opt_grp = getgrnam(optarg); + if (!opt_grp) { + opt_gid = atoi(optarg); + if (opt_gid == 0 && strcmp(optarg, "0")) + group_invalid = 1; + opt_grp = getgrgid(opt_gid); + if (!opt_grp) + group_invalid = 1; + } else { + opt_gid = opt_grp->gr_gid; + } + if
[Libhugetlbfs-devel] [PATCH 0/5] hugeadm: assorted enhancements
hugeadm: show amount of system memory in explain output hugeadm: add check_user function to warn if user is not in hugetlb_shm_group hugeadm: support --pool-size-min DEFAULT:2G syntax Allow the user to allocate x G, M or K of hugepages using the default huge page size, rather than having to specify page size and number of pages (both of which of course still work too). Mixture of explicit page sizes and page counts with memory sizes and default page sizes respectively should also work. i.e., the following... hugeadm --pool-size-min 2M:1024 hugeadm --pool-size-min DEFAULT:2G hugeadm --pool-size-min 2M:2G hugeadm --pool-size-min DEFAULT:1024 ...should all be functionally equivalent on a box with a default huge page size of 2M (such as my x86_64 test rig here). hugeadm: add support for setting hugetlb_shm_group hugeadm: add support for setting recommended shmmax value - print notice in --explain output about current shmmax value - add --set-recommended-shmmax switch, which (after huges pages are configured) sets shmmax to a value equal to the sum of the maximum space that has been allocated for huge pages -- Jarod Wilson ja...@redhat.com -- Come build with us! The BlackBerry® Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9-12, 2009. Register now! http://p.sf.net/sfu/devconf ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
[Libhugetlbfs-devel] [RFC] updated huge page setup helper script
Here's an updated huge page setup helper script, now using all the facilities added to hugeadm by the 5-part hugeadm enhancements patch series. The short version: 1) script ask user how much memory they want for huge pages and what user(s) and group should have access 2) script creates and/or modifies users and group as needed 3) script calls hugeadm to do all the huge page setup dirty work, including setting shmmax, shmgroup, etc. 4) script edits two config files for persistence (/etc/sysctl.conf and /etc/security/limits.d/hugepages.conf). Interesting enough to include in the tree? (btw, you can set DEBUG = True at the top of the script to run it w/o actually doing anything) -- Jarod Wilson ja...@redhat.com #!/usr/bin/python # # Tool to set up Linux large page support with minimal effort # # by Jarod Wilson # (c) Red Hat, Inc., 2009 # # Requires hugeadm from libhugetlbfs 2.7 (or backported support) # import os debug = False # config files we need access to sysctlConf = "/etc/sysctl.conf" if not os.access(sysctlConf, os.W_OK): print "Cannot access %s" % sysctlConf if debug == False: os._exit(1) limitsConf = "/etc/security/limits.d/hugepages.conf" if not os.access(limitsConf, os.W_OK): print "Cannot access %s" % limitsConf if debug == False: os._exit(1) # Figure out what we've got in the way of memory memTotal = 0 hugePageSize = 0 hugePages = 0 hugeadmexplain = os.popen("/usr/bin/hugeadm --explain 2>/dev/null").readlines() for line in hugeadmexplain: if line.startswith("Total System Memory:"): memTotal = int(line.split()[3]) break if memTotal == 0: print "Your version of libhugetlbfs' hugeadm utility is too old!" os._exit(1) # Pick the default huge page size and see how many pages are allocated poolList = os.popen("/usr/bin/hugeadm --pool-list").readlines() for line in poolList: if line.split()[4] == '*': hugePageSize = int(line.split()[0]) hugePages = int(line.split()[2]) break # Get initial sysctl settings shmmax = 0 hugeGID = 0 for line in hugeadmexplain: if line.startswith("A /proc/sys/kernel/shmmax value of"): shmmax = int(line.split()[4]) break for line in hugeadmexplain: if line.strip().startswith("vm.hugetlb_shm_group = "): hugeGID = int(line.split()[2]) break # translate group into textual version hugeGIDName = "null" groupNames = os.popen("/usr/bin/getent group").readlines() for line in groupNames: curGID = int(line.split(":")[2]) if curGID == hugeGID: hugeGIDName = line.split(":")[0] break # dump system config as we see it before we start tweaking it print "Current configuration:" print " * Total System Memory..: %6d MB" % memTotal print " * Shared Mem Max Mapping...: %6d MB" % (shmmax / (1024 * 1024)) print " * System Huge Page Size: %6d MB" % (hugePageSize / (1024 * 1024)) print " * Number of Huge Pages.: %6d"% hugePages print " * Total size of Huge Pages.: %6d MB" % (hugePages * hugePageSize / (1024 * 1024)) print " * Remaining System Memory..: %6d MB" % (memTotal - (hugePages * hugePageSize / (1024 * 1024))) print " * Huge Page User Group.: %s (%d)" % (hugeGIDName, hugeGID) print # ask how memory they want to allocate for huge pages userIn = None while not userIn: try: userIn = raw_input("How much memory would you like to allocate for huge pages? " "(input in MB, unless postfixed with GB): ") if userIn[-2:] == "GB": userHugePageReqMB = int(userIn[0:-2]) * 1024 elif userIn[-1:] == "G": userHugePageReqMB = int(userIn[0:-1]) * 1024 elif userIn[-2:] == "MB": userHugePageReqMB = int(userIn[0:-2]) elif userIn[-1:] == "M": userHugePageReqMB = int(userIn[0:-1]) else: userHugePageReqMB = int(userIn) # As a sanity safeguard, require at least 128M not be allocated to huge pages if userHugePageReqMB > (memTotal - 128): userIn = None print "Refusing to allocate %d, you must leave at least 128MB for the system" % userHugePageReqMB else: break except ValueError: userIn = None print "Input must be an integer, please try again!" userHugePageReqKB = userHugePageReqMB * 1024 userHugePagesReq = userHugePageReqKB / (hugePageSize / 1024) print "Okay, we'll try to allocate %d MB for huge pages..." % userHugePageReqMB print # some basic user input validation badchars = list(' \\\'":;~`!$^&*(){}[]?/><,') inputIsValid = False foundbad = False # ask fo
Re: [Libhugetlbfs-devel] [RFC] updated huge page setup helper script
On 10/16/09 2:31 AM, Eric B Munson wrote: > On Fri, 02 Oct 2009, Jarod Wilson wrote: > >> Here's an updated huge page setup helper script, now using all the >> facilities added to hugeadm by the 5-part hugeadm enhancements patch >> series. >> >> The short version: >> >> 1) script ask user how much memory they want for huge pages and what >> user(s) and group should have access >> >> 2) script creates and/or modifies users and group as needed >> >> 3) script calls hugeadm to do all the huge page setup dirty work, >> including setting shmmax, shmgroup, etc. >> >> 4) script edits two config files for persistence (/etc/sysctl.conf and >> /etc/security/limits.d/hugepages.conf). >> >> Interesting enough to include in the tree? >> >> (btw, you can set DEBUG = True at the top of the script to run it w/o >> actually doing anything) >> >> > > Jarod, > > I have no problem with this going into libhugetlbfs. If no one has > any objections I am going to merge this. Excellent! I certainly have no objections... :) -- Jarod Wilson ja...@redhat.com -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
Re: [Libhugetlbfs-devel] [RFC] updated huge page setup helper script
On 10/22/09 5:30 AM, Eric B Munson wrote: > On Tue, 20 Oct 2009, Jarod Wilson wrote: > >> On 10/16/09 2:31 AM, Eric B Munson wrote: >>> On Fri, 02 Oct 2009, Jarod Wilson wrote: >>> >>>> Here's an updated huge page setup helper script, now using all the >>>> facilities added to hugeadm by the 5-part hugeadm enhancements patch >>>> series. >>>> >>>> The short version: >>>> >>>> 1) script ask user how much memory they want for huge pages and what >>>> user(s) and group should have access >>>> >>>> 2) script creates and/or modifies users and group as needed >>>> >>>> 3) script calls hugeadm to do all the huge page setup dirty work, >>>> including setting shmmax, shmgroup, etc. >>>> >>>> 4) script edits two config files for persistence (/etc/sysctl.conf and >>>> /etc/security/limits.d/hugepages.conf). >>>> >>>> Interesting enough to include in the tree? >>>> >>>> (btw, you can set DEBUG = True at the top of the script to run it w/o >>>> actually doing anything) >>>> >>>> >>> >>> Jarod, >>> >>> I have no problem with this going into libhugetlbfs. If no one has >>> any objections I am going to merge this. >> >> Excellent! I certainly have no objections... :) >> > > Jarod, > > Can you please resubmit this as a patch against libhugetlbfs. Sure. Any preference on exactly where it should go, and if it should be installed by default, or simply included in a contrib/ dir in the source tree? -- Jarod Wilson ja...@redhat.com -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
Re: [Libhugetlbfs-devel] [RFC] updated huge page setup helper script
On 10/26/09 6:23 AM, Eric B Munson wrote: > On Thu, 22 Oct 2009, Jarod Wilson wrote: > ...snip >>> >>> Jarod, >>> >>> Can you please resubmit this as a patch against libhugetlbfs. >> >> Sure. Any preference on exactly where it should go, and if it should be >> installed by default, or simply included in a contrib/ dir in the source >> tree? >> > > Jarod, > > We do not yet have a scripts sub dir and I don't know that there will be > enough to justify one so please add it to the top level directory. I am > fine with it being installed with the other utilities. Blah. Sorry for the delay, been a bit tied up with other things... However, on the up side, our internal QA folks pointed out a few places that user input validation needed to be enhanced. Will send the patch in a new thread in just a moment. -- Jarod Wilson ja...@redhat.com -- This SF.Net email is sponsored by the Verizon Developer Community Take advantage of Verizon's best-in-class app development support A streamlined, 14 day to market process makes app distribution fast and easy Join now and get one step closer to millions of Verizon customers http://p.sf.net/sfu/verizon-dev2dev ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
[Libhugetlbfs-devel] [PATCH] libhugetlbfs huge page setup helper
A common complaint we (Red Hat) get from customers is that setting up huge pages for use with their java or database applications is complex. While hugeadm makes this easier than it has been in the past, some customers want brain-dead simple, fire and forget setup. The attached patch adds a python script which attempts to implement that, asking the user only three questions (how much memory and what users and group should be able to use it), which should get them up and going. Signed-off-by: Jarod Wilson --- Makefile |2 +- huge_page_setup_helper.py | 329 + 2 files changed, 330 insertions(+), 1 deletions(-) diff --git a/Makefile b/Makefile index acd17f2..22e19d1 100644 --- a/Makefile +++ b/Makefile @@ -6,7 +6,7 @@ LIBPUOBJS = init_privutils.o debug.o hugeutils.o kernel-features.o INSTALL_OBJ_LIBS = libhugetlbfs.so libhugetlbfs.a libhugetlbfs_privutils.so BIN_OBJ_DIR=obj PM_OBJ_DIR=TLBC -INSTALL_BIN = hugectl hugeedit hugeadm pagesize +INSTALL_BIN = hugectl hugeedit hugeadm pagesize huge_page_setup_helper.py INSTALL_STAT = cpupcstat oprofile_map_events.pl oprofile_start.sh INSTALL_PERLMOD = DataCollect.pm OpCollect.pm PerfCollect.pm Report.pm INSTALL_HEADERS = hugetlbfs.h diff --git a/huge_page_setup_helper.py b/huge_page_setup_helper.py new file mode 100755 index 000..cdf3121 --- /dev/null +++ b/huge_page_setup_helper.py @@ -0,0 +1,329 @@ +#!/usr/bin/python + +# +# Tool to set up Linux large page support with minimal effort +# +# by Jarod Wilson +# (c) Red Hat, Inc., 2009 +# +# Requires hugeadm from libhugetlbfs 2.7 (or backported support) +# +import os + +debug = True + +# config files we need access to +sysctlConf = "/etc/sysctl.conf" +if not os.access(sysctlConf, os.W_OK): +print "Cannot access %s" % sysctlConf +if debug == False: +os._exit(1) + +# This file will be created if it doesn't exist +limitsConf = "/etc/security/limits.d/hugepages.conf" + + +# Figure out what we've got in the way of memory +memTotal = 0 +hugePageSize = 0 +hugePages = 0 + +hugeadmexplain = os.popen("/usr/bin/hugeadm --explain 2>/dev/null").readlines() + +for line in hugeadmexplain: +if line.startswith("Total System Memory:"): +memTotal = int(line.split()[3]) +break + +if memTotal == 0: +print "Your version of libhugetlbfs' hugeadm utility is too old!" +os._exit(1) + + +# Pick the default huge page size and see how many pages are allocated +poolList = os.popen("/usr/bin/hugeadm --pool-list").readlines() +for line in poolList: +if line.split()[4] == '*': +hugePageSize = int(line.split()[0]) +hugePages = int(line.split()[2]) +break + +if hugePageSize == 0: +print "Aborting, cannot determine system huge page size!" +os._exit(1) + +# Get initial sysctl settings +shmmax = 0 +hugeGID = 0 + +for line in hugeadmexplain: +if line.startswith("A /proc/sys/kernel/shmmax value of"): +shmmax = int(line.split()[4]) +break + +for line in hugeadmexplain: +if line.strip().startswith("vm.hugetlb_shm_group = "): +hugeGID = int(line.split()[2]) +break + + +# translate group into textual version +hugeGIDName = "null" +groupNames = os.popen("/usr/bin/getent group").readlines() +for line in groupNames: +curGID = int(line.split(":")[2]) +if curGID == hugeGID: +hugeGIDName = line.split(":")[0] +break + + +# dump system config as we see it before we start tweaking it +print "Current configuration:" +print " * Total System Memory..: %6d MB" % memTotal +print " * Shared Mem Max Mapping...: %6d MB" % (shmmax / (1024 * 1024)) +print " * System Huge Page Size: %6d MB" % (hugePageSize / (1024 * 1024)) +print " * Number of Huge Pages.: %6d"% hugePages +print " * Total size of Huge Pages.: %6d MB" % (hugePages * hugePageSize / (1024 * 1024)) +print " * Remaining System Memory..: %6d MB" % (memTotal - (hugePages * hugePageSize / (1024 * 1024))) +print " * Huge Page User Group.: %s (%d)" % (hugeGIDName, hugeGID) +print + + +# ask how memory they want to allocate for huge pages +userIn = None +while not userIn: +try: +userIn = raw_input("How much memory would you like to allocate for huge pages? " + "(input in MB, unless postfixed with GB): ") + if userIn[-2:] == "GB": +userHugePageReqMB = int(userIn[0:-2]) * 1024 + elif userIn[-1:] == "G": +userHugePageReqMB = int(userIn[0:-1]) * 1024 + elif userIn[-2:] == "MB": +userHugePageReqMB = int(userIn[0:-2]) + elif userIn[-1:] == "M": +userHugePageReqMB = int(userI
[Libhugetlbfs-devel] DIPLOMATIC DELIVERY OF YOUR FUND
FROM THE DESK OF NATIONAL SECURITY/DIPLOMATIC WAREHOUSE LONDON ENGLAND DIPLOMATIC DELIVERY OF YOUR CONSIGNMENT Hello, This is to inform you that the arrangements have been concluded in respect to shipment of your compensation funds by consignment to your country. I choose to conclude shipment to ensure it is lifted before contacting you. This process is the Airlifting of funds/consignment from one country to another via a Diplomatic means of delivery. I found out that this consignment has been lying here because you could not settle cost of fees for the release of your payment to you. This is why I decided to use my Position as the Shipment officer in charge of this Organization to convey this consignment to your Country. To enable you confirm when your consignment will arrive. I choose to do this for you because you may have paid a lot of money before abandoning this funds/consignment and I believe you will compensate me well with a good remuneration when you receive the consignment. Note: I know the content of the trunk Box because I could see the amount you are being owed for an unsettled compensation payment by a British Bank. Which is the major reason I decided to get involved. You must also know that this arrangement does not involve any of the people you were dealing with in the past because this consignment/Payment has been surrendered to the Government. Hence my involvement. Thank you for your understanding and I await your urgent respond. Regards, Lawrence A Wilson Chief Shipment Officer Linux Recovery -- Symantec Endpoint Protection 12 positioned as A LEADER in The Forrester Wave(TM): Endpoint Security, Q1 2013 and "remains a good choice" in the endpoint security space. For insight on selecting the right partner to tackle endpoint security challenges, access the full report. http://p.sf.net/sfu/symantec-dev2dev ___ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel