On Thu, Oct 30, 2008 at 11:33:11AM +0000, Mel Gorman wrote:
> On Wed, Oct 29, 2008 at 04:02:53PM +0000, Andy Whitcroft wrote:
> > Add support for multiple page sizes to hugectl.  Now that the library
> > supports specification of the backing page size to use for the various
> > mappings we should expose this functionality in hugectl.  This patch
> > adds an optional size argument to all of the segment mapping options
> > specifying the backing page size required.  These are converted into the
> > appropriate variables and passed on.  For example, the following incantation
> > requests text be placed in 2MB pages, data in 64KB pages and the heap in
> > the default size:
> > 
> >     hugectl --text=2M --data=64K --heap ls
> > 
> > Combinations which are not possible result in warnings as normal.
> > The sizes are not validated but passed verbatim to the library which will
> > then validate them.
> 
> Ok, I can confirm that on an opteron with 2M and 1G pages, that I can
> text/data one size and heap another. When running a basic program that
> just mallocs and goes to sleep, I see
> 
> 00400000-00600000 r-xp 00000000 00:11 8711 
> /var/lib/hugetlbfs/mounts/pagesize-2097152/libhugetlbfs.tmp.kCVwTh
> (deleted)
> 00800000-00a00000 rw-p 00000000 00:11 8712 
> /var/lib/hugetlbfs/mounts/pagesize-2097152/libhugetlbfs.tmp.ChUZCu
> (deleted)
> 40000000-80000000 rw-p 00000000 00:12 8713 
> /var/lib/hugetlbfs/mounts/pagesize-1073741824/libhugetlbfs.tmp.a66ImH
> (deleted)
> 
> and the pagesize mounts are what you would expect. So, this is functionally
> correct by the looks of things.
> 
> > Signed-off-by: Andy Whitcroft <[EMAIL PROTECTED]>
> > ---
> >  hugectl.c |  108 
> > ++++++++++++++++++++++++++++++++++++++++---------------------
> >  1 files changed, 71 insertions(+), 37 deletions(-)
> > 
> > diff --git a/hugectl.c b/hugectl.c
> > index 902085e..a07fcb5 100644
> > --- a/hugectl.c
> > +++ b/hugectl.c
> > @@ -154,49 +154,73 @@ void verbose_expose(void)
> >  #define LONG_LIBRARY       (LONG_BASE | 'l')
> >  
> >  /*
> > - * Mapping selectors, one bit per remappable/backable area as requested
> > + * Mapping selectors, one per remappable/backable area as requested
> >   * by the user.  These are also used as returns from getopts where they
> >   * are offset from MAP_BASE, which must be removed before they are 
> > compared.
> >   */
> > -#define MAP_DISABLE        0x0001
> > -#define MAP_TEXT   0x0002
> > -#define MAP_DATA   0x0004
> > -#define MAP_BSS            0x0008
> > -#define MAP_HEAP   0x0010
> > -#define MAP_SHM            0x0020
> > -
> > -void setup_mappings(int which)
> > +enum {
> > +   MAP_TEXT,
> > +   MAP_DATA,
> > +   MAP_BSS,
> > +   MAP_HEAP,
> > +   MAP_SHM,
> > +   MAP_DISABLE,
> > +
> > +   MAP_COUNT,
> > +};
> > +char *map_size[MAP_COUNT];
> > +
> > +char default_size[] = "the default hugepage size";
> > +#define DEFAULT_SIZE default_size
> > +
> 
> I don't believe you actually use the contents of default_size anywhere
> and you just use it as a pointer value for the purposes of comparison.
> It probably could have been NULL or &mapsize too. Not that it matters a
> whole pile.

Yes actually in places we use the current size as a string to print in
warnings and the like, so its handy for the default pointer to be a real
printable string for this.

> > +void setup_mappings(int count)
> >  {
> > -   char remap[3] = { 0, 0, 0 };
> > -   int n = 0;
> > +   char value[20 + 20];
> > +   char *ptr = value;
> >  
> 
> What's this 20 + 20 about? I'm guessing it's the maximum sensible
> string size that can be placed in the environment variable but we never
> check the bounds of it, nor do we ever check if ptr goes over it. If
> hugectl was a suid binary like a hugetlbfs user to access a mount or
> accessible under sudo, this could be an exploit?

Yep, this is a hangover from when it was only using numbers.  And I
meant to covert it to snprintf... sigh.
> 
> >     /*
> >      * HUGETLB_ELFMAP should be set to either a combination of 'R' and 'W'
> >      * which indicate which segments should be remapped.  It may also be
> >      * set to 'no' to prevent remapping.
> >      */
> 
> The comment here is now a tad stale.

Fixed up.

> > -   if (which & MAP_TEXT)
> > -           remap[n++] = 'R';
> > -   if (which & (MAP_DATA|MAP_BSS)) {
> > -           if ((which & (MAP_DATA|MAP_BSS)) != (MAP_DATA|MAP_BSS))
> > -                   WARNING("data and bss remapped together\n");
> > -           remap[n++] = 'W';
> > +   if (map_size[MAP_TEXT]) {
> > +           *ptr++ = ':';
> > +           if (map_size[MAP_TEXT] == DEFAULT_SIZE)
> > +                   *ptr++ = 'R';
> > +           else
> > +                   ptr += sprintf(ptr, "R=%s", map_size[MAP_TEXT]);
> 
> heh, took me a few minutes to figure out why the environment variable
> did not always start with a ':' :)

Added a comment to tell you why.

> 
> >     }
> > -   if (which & MAP_DISABLE) {
> > -           if (which != MAP_DISABLE)
> > -                   WARNING("--disable masks requested remap\n");
> > -           n = 0;
> > -           remap[n++] = 'n';
> > -           remap[n++] = 'o';
> > +   if (map_size[MAP_DATA] != 0 || map_size[MAP_BSS] != 0) {
> 
> Any reason why it's not simply map_size[MAP_DATA] like the MAP_TEXT
> check above?

Yep, becuase we only can ask for all wriable segments together, ie we
must ask for MAP_DATA and MAP_BSS to be the same to not get any warnings
etc.

> > +           char *size = map_size[MAP_BSS];
> > +           if (map_size[MAP_DATA])
> > +                   size = map_size[MAP_DATA];
> > +           if (map_size[MAP_DATA] != map_size[MAP_BSS])
> > +                   WARNING("data and bss remapped together in %s\n", size);
> 
> ouch, this is a curious one. It's actually pretty bad if the user
> expects these things to be mapped different. However, our default level
> is ERROR, not WARNING so this message is not displayed unless there is
> at least one -v. It's not an error either though hmmm... Not much that
> can be done about it I guess.

Yep, I am assuming this will get fixed in adams work.

> > +           *ptr++ = ':';
> > +           if (size == DEFAULT_SIZE)
> > +                   *ptr++ = 'W';
> > +           else
> > +                   ptr += sprintf(ptr, "W=%s", size);
> >     }
> > +   *ptr = '\0';
> > +   if (ptr != value)
> > +           setup_environment("HUGETLB_ELFMAP", &value[1]);
> >  
> > -   if (n)
> > -           setup_environment("HUGETLB_ELFMAP", remap);
> > +   if (map_size[MAP_DISABLE]) {
> > +           if (ptr != value)
> > +                   WARNING("--disable masks requested remap\n");
> > +           setup_environment("HUGETLB_ELFMAP", "no");
> > +   }
> >  
> > -   if (which & MAP_HEAP)
> > +   if (map_size[MAP_HEAP] == DEFAULT_SIZE)
> >             setup_environment("HUGETLB_MORECORE", "yes");
> > +   else if (map_size[MAP_HEAP])
> > +           setup_environment("HUGETLB_MORECORE", map_size[MAP_HEAP]);
> >  
> > -   if (which & MAP_SHM)
> > +   if (map_size[MAP_SHM] && map_size[MAP_SHM] != DEFAULT_SIZE)
> > +           WARNING("shm segments may only be mapped in the "
> > +                   "default hugepage size\n");
> 
> Similarly there is an important warning that can be omitted without -v.
> 
> > +   if (map_size[MAP_SHM])
> >             setup_environment("HUGETLB_SHM", "yes");
> >  }
> >  
> > @@ -256,9 +280,16 @@ void library_path(char *path)
> >     setup_environment("LD_LIBRARY_PATH", val);
> >  }
> >  
> > -void ldpreload(int which)
> > +void ldpreload(int count)
> >  {
> > -   if (which & (MAP_HEAP|MAP_SHM)) {
> > +   int allowed = 0;
> > +
> > +   if (map_size[MAP_HEAP])
> > +           allowed++;
> > +   if (map_size[MAP_SHM])
> > +           allowed++;
> > +
> > +   if (allowed == count) {
> >             setup_environment("LD_PRELOAD", "libhugetlbfs.so");
> >             WARNING("LD_PRELOAD in use for lone --heap/--shm\n");
> >     } else {
> > @@ -284,12 +315,12 @@ int main(int argc, char** argv)
> >             {"library-use-path",
> >                            no_argument, NULL, LONG_NO_LIBRARY},
> >  
> > -           {"disable",    no_argument, NULL, MAP_BASE|MAP_DISABLE},
> > -           {"text",       no_argument, NULL, MAP_BASE|MAP_TEXT},
> > -           {"data",       no_argument, NULL, MAP_BASE|MAP_DATA},
> > -           {"bss",        no_argument, NULL, MAP_BASE|MAP_BSS},
> > -           {"heap",       no_argument, NULL, MAP_BASE|MAP_HEAP},
> > -           {"shm",        no_argument, NULL, MAP_BASE|MAP_SHM},
> > +           {"disable",    optional_argument, NULL, MAP_BASE|MAP_DISABLE},
> > +           {"text",       optional_argument, NULL, MAP_BASE|MAP_TEXT},
> > +           {"data",       optional_argument, NULL, MAP_BASE|MAP_DATA},
> > +           {"bss",        optional_argument, NULL, MAP_BASE|MAP_BSS},
> > +           {"heap",       optional_argument, NULL, MAP_BASE|MAP_HEAP},
> > +           {"shm",        optional_argument, NULL, MAP_BASE|MAP_SHM},
> >  
> 
> We update for optinal arguments here, but the help message is not
> updated, or the man page.

:(  Updated.

> 
> >             {0},
> >     };
> > @@ -299,7 +330,11 @@ int main(int argc, char** argv)
> >     while (ret != -1) {
> >             ret = getopt_long(argc, argv, opts, long_opts, &index);
> >             if (ret > 0 && (ret & MAP_BASE)) {
> > -                   opt_mappings |= ret;
> > +                   if (optarg)
> > +                           map_size[ret & ~MAP_BASE] = optarg;
> > +                   else
> > +                           map_size[ret & ~MAP_BASE] = DEFAULT_SIZE;
> > +                   opt_mappings++;
> >                     continue;
> >             }
> >             switch (ret) {
> > @@ -343,7 +378,6 @@ int main(int argc, char** argv)
> >             }
> >     }
> >     index = optind;
> > -   opt_mappings &= ~MAP_BASE;
> >  
> >     if (!opt_dry_run && (argc - index) < 1) {
> >             print_usage();
> > -- 
> > 1.6.0.2.711.gf1ba4
> > 

Thanks.

-apw

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Libhugetlbfs-devel mailing list
Libhugetlbfs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel

Reply via email to