Re: Working on the install-mh change questions
On November 18, 2002 at 20:43, Jon Steinhart wrote: Oh, some details. 1. A second getenv() call would not break the code. The copy was really unnecessary. As pointed out earlier, making the copy may be needed depending on the implementation of getenv() for a given platform. And for all we know, the person who original wrote the code had an environment where it was required. 2. It's hard for me to imagine a situation where getpwuid() would #1 get called a second time and #2 for a different uid, which is the only that a problem would occur. It is hard for us to imagine alot of things, but I never fault a programmer for being extra paranoid. Paranoia makes programs more robust against unforseen events, and can pay off big time when such events occur. 3. If there's a NULL passwd-pw_dir then libc is broken and should be fixed. Better that this gets pointed out and fixed rather than covered up so tha t it stays unnoticed and broken. Oh, and I've wasted some of my volunteer time trying to figure out what the code did before changing it. I'd waste less if there was less code. Best way to accomplish this is to get rid of the code that doesn't do anything. Again, it is safe to be paranoid. It is not unheard of where an OS's libc have bugs, and software that properly checks all data is okay in my book. I'll give a real-world example related to nmh: When I did some changes to nmh code to get it to compile on cygwin, there was a system call that did NOT return the proper value. The call was broken or was stubbed and had not been implemented by the cygwin maintainers. Plus, if you take security seriously, one should always check data retrieved from components you do not have direct control over. Even if you think, this will never happen, the day it does, you'll kick yourself in the head and have a possible security vulnerability on your hands. It's like when people say, the data will never be larger than X, so they hard-code an array size, and when the data is larger than X, and buffer overflows occur. I do agree that MH/nmh does have duplicate code in areas that can be reduced down into reusable components (especially some of the time stuff when I played with making nmh compile under cygwin). I also agree that there are areas in the code that makes a new person wondering why in the hell certain code exists. However, in my experience, when I do not have complete knowledge of an existing code base, I hesitate to remove any code that appears to be unnecessary. For all I know, the original author encountered some abhorent scenario that requires the code to exist. Remember, the MH/nmh code base is *very* old, so there is alot of environmental experience that we have never been exposed to. In cases where I think something should not exist, I typically add a comment in the code like /* XXX: Why is this here? */ or /* XXX: What the heck does this do? */, and seek advice from other maintainers, and if possible the original author, on the purpose of the code, like exactly what you have done by asking about the code to this list. But, do not be surprised that other may disagree with you. Then, if people agree a chunk of code can be axed, after it is removed, make sure to test it. My $0.02, --ewh
Re: Working on the install-mh change questions
Jon Steinhart [EMAIL PROTECTED] writes: 3.If the $HOME environment variable is set, mypath is copied from the getenv return. Why? It's never changed. 4.If the $HOME environment variable is not set, mypath is copied from the pw_dir member of the returned passwd structure. Now, I understand that this is a static structure, but getpwuid is never called again so I don't see why the copy is needed. 4.If the $HOME environment variable is not set, the pw_dir member of the passwd structure returned by getpwuid() is checked for a NULL pointer. This can never happen in a non-error return, which is already checked. So why the superfluous check? These checks are not superflous, they are for maintainability. Two years from now someone will add a second getenv(3) call and waste their (probably volunteer) time trying to figure out how they busted the home dir variable. That is why, unless you are writing super-tight-must-be-the-best-performing-code-ever applications (which mh is not), it is necessary always to make a copy of the static buffer pointed to by the return values of such functions. -- Eric Gillespie * [EMAIL PROTECTED] Build a fire for a man, and he'll be warm for a day. Set a man on fire, and he'll be warm for the rest of his life. -Terry Pratchett
Re: Working on the install-mh change questions
Jon Steinhart [EMAIL PROTECTED] writes: 3. If the $HOME environment variable is set, mypath is copied from the getenv return. Why? It's never changed. 4. If the $HOME environment variable is not set, mypath is copied from the pw_dir member of the returned passwd structure. Now, I understand that this is a static structure, but getpwuid is never called again so I don't see why the copy is needed. 4. If the $HOME environment variable is not set, the pw_dir member of the passwd structure returned by getpwuid() is checked for a NULL pointer. This can never happen in a non-error return, which is already checked. So why the superfluous check? These checks are not superflous, they are for maintainability. Two years from now someone will add a second getenv(3) call and waste their (probably volunteer) time trying to figure out how they busted the home dir variable. That is why, unless you are writing super-tight-must-be-the-best-performing-code-ever applications (which mh is not), it is necessary always to make a copy of the static buffer pointed to by the return values of such functions. -- Eric Gillespie * [EMAIL PROTECTED] This is one of those places where we'll have to respectfully disagree. I'm obviously in the minority given the quality of software that I see these days, but I think that programming is still something that should be done by professionals. I don't want someone hacking on code that doesn't take the time to figure out what's going on first. Protecting against the really silly mistakes allows such people to make really complex ones. As I said in my earlier email, I'm not going to write slow and sloppy code just because computers are fast. Matter of fact, I keep on trying to get up the courage to tackle m_getfld(). Oh, some details. 1. A second getenv() call would not break the code. The copy was really unnecessary. 2. It's hard for me to imagine a situation where getpwuid() would #1 get called a second time and #2 for a different uid, which is the only that a problem would occur. 3. If there's a NULL passwd-pw_dir then libc is broken and should be fixed. Better that this gets pointed out and fixed rather than covered up so that it stays unnoticed and broken. Oh, and I've wasted some of my volunteer time trying to figure out what the code did before changing it. I'd waste less if there was less code. Best way to accomplish this is to get rid of the code that doesn't do anything. Jon
Re: Working on the install-mh change questions
On Mon, 2002-11-18 at 22:47, Eric Gillespie wrote: Jon Steinhart [EMAIL PROTECTED] writes: 3. If the $HOME environment variable is set, mypath is copied from the getenv return. Why? It's never changed. 4. If the $HOME environment variable is not set, mypath is copied from the pw_dir member of the returned passwd structure. Now, I understand that this is a static structure, but getpwuid is never called again so I don't see why the copy is needed. 4. If the $HOME environment variable is not set, the pw_dir member of the passwd structure returned by getpwuid() is checked for a NULL pointer. This can never happen in a non-error return, which is already checked. So why the superfluous check? These checks are not superflous, they are for maintainability. Two years from now someone will add a second getenv(3) call and waste their (probably volunteer) time trying to figure out how they busted the home dir variable. The return value of getenv() is a pointer into the environment; a future getenv() call will not overwrite it. So it's safe not to copy it unless you anticipate a putenv(). (And I think it's safe even in the face of a putenv(), actually.) A judgement call. Your argument definitely holds for copying the getpwuid() value. Checking the pw_dir element for NULL is superfluous unless you feel like second-guessing the kernel and libc at every turn, a practice which doesn't have much value.
Re: Working on the install-mh change questions
Greg Hudson [EMAIL PROTECTED] writes: The return value of getenv() is a pointer into the environment; a future getenv() call will not overwrite it. So it's safe not to copy it unless you anticipate a putenv(). (And I think it's safe even in the face of a putenv(), actually.) A judgement call. Not according to the new POSIX/SUSv3 (and i suspect the original POSIX): http://www.opengroup.org/onlinepubs/007904975/functions/getenv.html The return value from getenv() may point to static data which may be overwritten by subsequent calls to getenv(), [CX] [Option Start] setenv(), or unsetenv(). [Option End] [XSI] [Option Start] On XSI-conformant systems, the return value from getenv() may point to static data which may also be overwritten by subsequent calls to putenv(). [Option End] If you think about it, that makes sense. The environment is an array of pointers to character buffers, and since what putenv(3) stores for FOO is often larger than the size of the original FOO, it will need to get a larger buffer at a different address, invalidating the original returned address. I just checked Harbison Steele, and according to them ISO C does not allow calls to putenv to modify the getenv return value, and as seen above, nor does POSIX (though the XSI extension does). Maybe i'm just not very imaginative at this late hour, but i don't see another way to implement the environment without putenv modifying it, so i find it strange that the two important standards don't allow for it. Bleh, i hope that made sense. sleep... -- Eric Gillespie * [EMAIL PROTECTED] Build a fire for a man, and he'll be warm for a day. Set a man on fire, and he'll be warm for the rest of his life. -Terry Pratchett
Re: Working on the install-mh change questions
Eric Gillespie [EMAIL PROTECTED] writes: I just checked Harbison Steele, and according to them ISO C does not allow calls to putenv to modify the getenv return value, and as seen above, nor does POSIX (though the XSI extension does). Maybe i'm just not very imaginative at this late hour, but i don't see another way to implement the environment without putenv modifying it, so i find it strange that the two important standards don't allow for it. Just thought of one, though perhaps not the most obvious. POSIX and ISO C *do* mention static buffers; a getenv implementation can copy from the environment array into its own static buffer, extending it if necessary (subsequent getenv buffers may invalidate old ones, i.e. return addresses of newly allocated buffers). Then putenv calls wouldn't invalidate the getenv return value. -- Eric Gillespie * [EMAIL PROTECTED] Build a fire for a man, and he'll be warm for a day. Set a man on fire, and he'll be warm for the rest of his life. -Terry Pratchett
Re: Working on the install-mh change questions
Jon Steinhart [EMAIL PROTECTED] wrote: 5.There's code that removes a trailing / from mypath if mypath is more than one character long. Seems completely unnecessary; the definition of path names means that // is interpreted as /. And if that wasn't the case, this code wouldn't catch multiple trailing slashes anyway. I'm not familiar enough with the MH code to know about the other issues, but with this one, if the path is shown to the user in any way, at worst this is a nice cosmetic fix. At best, it saves the user from opening the wrong file if he cut and pastes the path into Emacs, for example. Thanks for taking the time to clean things up. -- Bill Wohler [EMAIL PROTECTED] http://www.newt.com/wohler/ GnuPG ID:610BD9AD Maintainer of comp.mail.mh FAQ and mh-e. Vote Libertarian! If you're passed on the right, you're in the wrong lane.