On Mon, Oct 30, 2006 at 01:45:41PM -0800, Nishanth Aravamudan wrote:
> On 30.10.2006 [15:56:43 +1100], David Gibson wrote:
> > Latest daemon removal path.  Updated to apply clean on top of the
> > latest changes in the git tree, otherwise nothing new.
> > 
> > Now that 1.0.1 is out, are we ready to merge this?  Remember that we
> > can make a git branch to hold strictly-maintenance revisions of
> > 1.0/1.0.1 for Red Hat, if necessary.
> 
> <snip>
> 
> Mostly nits below...
> 
> > +Now, the same application run twice with the same environment variable
> > +settings will use the same file both times (reducing the overall
> > +hugetlbfs file usage). If, for any reason, segment sharing is
> > +requested but unavailable, the library will fallback to unlinked
> > +files.
> 
> Since there is no longer automatic removal of files in the hugetlbfs
> mount-point after some time (as there was with hugetlbd), this should be
> documented. And perhaps the bash script, or whatever you would like to
> have as a control mechanism should also be merged with the patch in its
> final form.

Ok, some notes about this added.  Actually, while I was at it, I
rewrote the whole section on sharing, to be clearer.

> > Index: libhugetlbfs/elflink.c
> > ===================================================================
> > --- libhugetlbfs.orig/elflink.c     2006-10-30 13:18:23.000000000 +1100
> > +++ libhugetlbfs/elflink.c  2006-10-30 13:26:05.000000000 +1100
> > @@ -1,6 +1,6 @@
> >  /*
> >   * libhugetlbfs - Easy use of Linux hugepages
> > - * Copyright (C) 2005-2006 David Gibson & Adam Litke, IBM Corporation.
> > + * Copyright (C) 2005-2006 David Gibson & dam Litke, IBM Corporation.
> 
> Mistake?

Oops, fixed.

> >   *
> >   * This library is free software; you can redistribute it and/or
> >   * modify it under the terms of the GNU Lesser General Public License
> > @@ -24,17 +24,20 @@
> >  #include <stdlib.h>
> >  #include <malloc.h>
> >  #include <string.h>
> > +#include <unistd.h>
> > +#include <fcntl.h>
> >  #include <signal.h>
> >  #include <sys/syscall.h>
> > +#include <sys/file.h>
> >  #include <linux/unistd.h>
> >  #include <sys/mman.h>
> >  #include <errno.h>
> > +#include <limits.h>
> >  #include <elf.h>
> >  #include <dlfcn.h>
> >  
> >  #include "hugetlbfs.h"
> >  #include "libhugetlbfs_internal.h"
> > -
> 
> Unnecessary whitespace change.

Fixed.

> > +/**
> > + * assemble_path - handy wrapper around snprintf() for building paths
> > + * @dst: buffer of size PATH_MAX+1 to assemble string into
> > + * @...: strings to concatenate into the path, terminated with a NULL
> > + */
> > +static void assemble_path(char *dst, const char *fmt, ...)
> > +{
> > +   va_list ap;
> > +   int len;
> > +
> > +   va_start(ap, fmt);
> > +   len = vsnprintf(dst, PATH_MAX+1, fmt, ap);
> 
> Shouldn't you also check the return value of vsnprintf()? And that makes
> assemble_path() need to be a non-void...
> 
> > +   va_end(ap);
> > +
> > +   if (len > PATH_MAX) {
> > +           ERROR("Overflow assembling path");
> > +           abort();
> 
> So you can just return -1 here or something... Why are you aborting by
> the way? Shouldn't it just fail to share? [1]

No.  It's an abort() for simplicity - same rationale as xmalloc().
Path overflows should *really* only occur in pathological situations:
indeed, the most likely is someone attempting to trigger a buffer
overflow.  Therefore, avoiding the need for checks and backout code in
all the callers is a good tradeoff.

> > +static int check_env(void)
> > +{
> 
> <snip>
> 
> > +}
> 
> This whole function move & change should be submitted as a separate
> patch, it has little to do with removing the daemon and makes this diff
> larger than necessary. But it totally makes sense to merge :)

Yeah, I think I had a reason to move it (function ordering or
something) in an interim version.  Move is undone.

However, the change (adding testing of HUGELTB_SHARE) remains.  The
path in which the test of HUGETLB_SHARE was before has been
re-arranged so much anyway, this is still the most sensible place for
it.

> >  /*
> >   * Parse an ELF header and record segment information for any segments
> >   * which contain hugetlb information.
> >   */
> > -
> 
> Unrelated change.

Fixed.

> > +static int find_or_create_share_path(void)
> 
> Deserves proper commenting at the head of the function, please.

Done.

> > +{
> > +   char *env;
> > +   struct stat sb;
> > +   int ret;
> > +
> > +   env = getenv("HUGETLB_SHARE_PATH");
> > +   if (env) {
> > +           /* Given an explicit path */
> > +           if (hugetlbfs_test_path(env) != 0) {
> > +                   ERROR("HUGETLB_SHARE_PATH %s is not on a hugetlbfs\n",
> 
> perhaps "mount point" or "filesystem" (even if it's redundant) at the
> end of the message?

Done.

> > +                         share_path);
> > +                   return -1;
> > +           }
> > +           assemble_path(share_path, "%s", env);
> > +           return 0;
> > +   }
> > +
> > +   assemble_path(share_path, "%s/elflink-uid-%d",
> > +                 hugetlbfs_find_path(), getuid());
> 
> [1] ... and check the return values here and abort then. Or just return
> -1 a level higher.

No, see comments on assemble_path() above.  Avoiding these tests and
backouts is exactly why assemble_path() uses abort().

> > +   ret = mkdir(share_path, 0700);
> > +   if ((ret != 0) && (errno != EEXIST)) {
> > +           ERROR("Error creating share directory %s", share_path);
> > +           return -1;
> > +   }
> > +
> > +   /* Check the share directory is sane */
> > +   ret = lstat(share_path, &sb);
> > +   if (ret != 0) {
> > +           ERROR("Couldn't stat() %s: %s\n", share_path, strerror(errno));
> > +           return -1;
> > +   }
> > +
> > +   if (! S_ISDIR(sb.st_mode)) {
> 
> I think we've been avoiding the extra space after the '!'.

Hrm.. both forms exist in elflink.c already, neither is clearly
dominant.  I really prefer the space here, because right between the (
and capital letter it's very easy to overlook the !.

> Does this check make sense given that you're using lstat() above? If the
> HUGETLB_SHARE_PATH is a symbolic link, then lstat() is going to stat the
> link, not the dir it points to, and then we'll fail here...

Well, first, HUGETLB_SHARE_PATH is not set in this code path.  Second,
the test is intentional: it ensures that the path is not a symbolic
link, since I can see little reason to make it so, and it very simply
prevents any security silly buggers from dropping in symlinks here.
Now, if the current uid is the owner of the link, the latter is
probably not an issue, but I'd need more time to convince myself.

> > +           ERROR("%s is not a directory", share_path);
> > +           return -1;
> > +   }
> > +
> > +   if (sb.st_uid != getuid()) {
> > +           ERROR("%s has wrong owner (uid=%d instead of %d)\n",
> > +                 share_path, sb.st_uid, getuid());
> > +           return -1;
> > +   }
> 
> To be consistent, probably should have a newline here too.

Done.

> > +   if (sb.st_mode & (S_IWGRP | S_IWOTH)) {
> > +           ERROR("%s has bad permissions 0%03o\n",
> > +                 share_path, sb.st_mode);
> > +           return -1;
> > +   }
> 
> But wait, this doesn't make complete sense -- you're only doing all this
> sanity checking if we mkdir(). Shouldn't we also do it in the case of a
> custom path?

No.  Part of the point of the custom path is that you can use it to
construct your own security model instead of the default.  If you set
HUGETLB_SHARE_PATH, it's assumed you know what you're doing
w.r.t. permissions on the target directory.

> And you're not backing out changes to the system correctly. After the
> mkdir() has succeeded, if something else fails, we should rmdir() it.

I don't think so.  The only things that can fail here are:
        - mkdir() fails, in which case we never created it so there's
no need to rmdir().
        - the permissions tests fail because someone else created it -
if someone else has already created it, we shouldn't go rmdir()ing it
(and may not have permission to, anyway)
        - mkdir() succeeds, but the lstat() fails.  This will never
happen unless things are in a *very* strange state indeed.  Best not
mess with it any further, and leave everything in place for diagnosis.

> >  /* 
> >   * Look for non-zero BSS data inside a range and print out any matches
> >   */
> > -
> 
> Unrelated change.

Ok.

> >  static void check_bss(unsigned long *start, unsigned long *end)
> >  {
> >     unsigned long *addr;
> > @@ -249,13 +383,52 @@ static void check_bss(unsigned long *sta
> >     }
> >  }
> >  
> > +/**
> > + * get_shared_file_name - create a shared file name from program name, gid 
> > and phdr
> 
> The gid is no longer being used, so the comments need to be updated. And
> mention of the word-size should be added.

Done.

> > + * @htlb_seg_info: pointer to program's segment data
> > + * @file_path: pointer to a PATH_MAX array to store filename in
> > + *
> > + * The file name created is *not* intended to be unique, except when the 
> > name,
> > + * gid or phdr number differ. The goal here is to have a standar means of
> 
> Typo from my version, standar -> standard.

Done.

> >  /* 
> >   * Subtle:  Since libhugetlbfs depends on glibc, we allow it
> >   * it to be loaded before us.  As part of its init functions, it
> >   * initializes stdin, stdout, and stderr in the bss.  We need to
> >   * include these initialized variables in our copy.
> >   */
> > -
> 
> Unrelated whitespace change.

Ok.

> >  static void get_extracopy(struct seg_info *seg, void **extra_start, 
> >                                                     void **extra_end)
> >  {
> > @@ -387,7 +560,6 @@ bail3:
> >   * smallest amount of data possible, unless the user disables this 
> >   * optimization via the HUGETLB_ELFMAP environment variable.
> >   */
> > -
> 
> Ditto.
> 
> <snip>
> 
> > +/**
> > + * find_or_prepare_shared_file - get one shareable file
> > + * @htlb_seg_info: pointer to program's segment data
> 
> This seems rather insufficient for a description of the core logic.

Yeah, added some more here.

> > + * returns:
> > + *   -1, on failure
> > + *   0, on success
> > + */
> > +static int find_or_prepare_shared_file(struct seg_info *htlb_seg_info)
> > +{
> > +   int fdx, fds;
> > +   int errnox, errnos;
> > +   int ret;
> > +   char final_path[PATH_MAX+1];
> > +   char tmp_path[PATH_MAX+1];
> > +
> > +   ret = get_shared_file_name(htlb_seg_info, final_path);
> > +   if (ret < 0)
> > +           return -1;
> > +   assemble_path(tmp_path, "%s.tmp", final_path);
> > +
> > +   do {
> 
> This really needs more commenting, I think. Maybe a little paragraph at
> the top like:

Or, for example, the comment I put in the mail I sent with the first
version of the patch containing this logic.  Revised and added as a
comment.

> > +           /* NB: mode is modified by umask */
> > +           fdx = open(tmp_path, O_CREAT | O_EXCL | O_RDWR, 0666);
> > +           errnox = errno;
> > +           fds = open(final_path, O_RDONLY);
> > +           errnos = errno;
> > +
> > +           if (fds > 0) {
> 
> Stupid nit, but >= technically, right?

Ah, yes.  Just in case someone closes stdin, then exec()s a
libhugetlbfs program.

> > +           /* Both opens failed, somebody else is still preparing */
> > +           /* Wait and try again */
> > +           sleep(1);
> > +           /* FIXME: should have a timeout */
> 
> Should be easy enough to add, no?

Fairly, yes, just that time arithmetic is usually a minor pain.  Since
this can only happen if something else has gone wrong, I think it can
wait until a later patch to add.  Indeed for now, spinning here might
actually for debugging other problems, if they arise.

> Otherwise, looks sane to me...
> 
> What kind of testing has been done with the patch?

Pretty much just the testsuite.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Libhugetlbfs-devel mailing list
Libhugetlbfs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel

Reply via email to