------------ Forwarded Message ------------
Date: Saturday, July 19, 2003 16:10:36 -0700
From: Kean Johnston <[EMAIL PROTECTED]>
To: Larry Rosenman <[EMAIL PROTECTED]>
Cc:
Subject: Re: PG Patch

I'm trying to get a discussion going, as Bruce wants to do it right for ALL platforms or none. It probably WON'T happen for 7.3.4, but WILL (if I have my way) for 7.4.0.
Ok then let me explain the issue. You can forward this to Bruce since I haven't heard from him yet.
As you know, the run-time link editor (RTLD) is responsible for loading an ELF program, resolving its dependent libraries and symbols, setting things up for _start, and calling it. There are a few ELF dynamic tags that come into play. The ones we care about are:

  DT_SONAME   This is the name of the shared object that the RTLD will try to load.
  DT_RPATH    Specifies the list of paths to search for dependencies (old way)
  DT_RUNPATH  Specifies the list of paths to search for dependencies (new way)
  DT_NEEDED   Lists the dependencies for this object
There is also one environment variable that is used at load time to resolve dependencies, viz. LD_LIBRARY_PATH.
The gABI defines how and where these are used, but this is a basic summary. I refer to the current object and the dependent object. The current object is the entity which is having its dynamic section interpreted. This is the executable or shared library that has a dependency that needs to be loaded by the RTLD. The dependent object is the name of the actual dependency, and comes from the DT_NEEDED list.
1) If the dependent object name contains a /, use the name directly, with no path searching.
2) Search along DT_RPATH if the current object doesn't have DT_RUNPATH defined. DT_RPATH is a colon-separated list of paths.
3) Search along LD_LIBRARY_PATH, which is also a colon-separated list of paths. Only do this if the process does not have elevated (i.e. setuid or setgid) privileges(*).
4) If the dependency still hasn't been met, search along DT_RUNPATH (if defined for the current object). DT_RUNPATH is a colon-separated list of path names.
5) If we still haven't found it, look in the standard system places.
6) If we still haven't resolved the dependency, bail.
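The six rules above can be sketched as a small Python function that returns the candidate paths in the order they would be tried. This is only an illustrative model, not any real RTLD's code: the function name `candidate_paths`, its parameters, and the default `system_dirs` are assumptions made for the example, and empty path elements are simply skipped.

```python
from typing import List, Optional, Sequence


def candidate_paths(needed: str,
                    rpath: Optional[str] = None,
                    runpath: Optional[str] = None,
                    ld_library_path: Optional[str] = None,
                    privileged: bool = False,
                    system_dirs: Sequence[str] = ("/usr/lib", "/lib")) -> List[str]:
    """Return the paths tried for one DT_NEEDED entry, in search order.

    system_dirs stands in for "the standard system places" and is an
    assumption; real systems differ.
    """
    # Rule 1: a name containing a slash is used directly, no searching.
    if "/" in needed:
        return [needed]

    dirs: List[str] = []
    # Rule 2: DT_RPATH is honoured only when DT_RUNPATH is absent.
    if rpath and not runpath:
        dirs += rpath.split(":")
    # Rule 3: LD_LIBRARY_PATH is ignored for setuid/setgid processes.
    if ld_library_path and not privileged:
        dirs += ld_library_path.split(":")
    # Rule 4: DT_RUNPATH comes after LD_LIBRARY_PATH.
    if runpath:
        dirs += runpath.split(":")
    # Rule 5: finally the standard system places.
    dirs += list(system_dirs)

    # Rule 6 (bail) corresponds to none of these candidates existing.
    # Empty elements are skipped here for simplicity.
    return [d.rstrip("/") + "/" + needed for d in dirs if d]
```

For instance, `candidate_paths("libX11.so.5", ld_library_path="/home/me", privileged=True)` leaves out `/home/me` entirely, which is the behaviour rule 3 asks for.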
(*) This is the kicker. There are *MANY* older systems out there whose RTLDs have bugs and do not obey this rule. Consider the following. Most systems have xterm. xterm is very frequently setuid root. All you need to do is run dump -Lv on xterm to see whether any of its shared libraries, or any of the dependencies of any of those libraries, are recorded without absolute path names. If so, you can get root like this.

Let's say, as is fairly common on older systems, that libX11.so does not have a fully qualified path name in its DT_SONAME. When xterm is linked, it will have a DT_NEEDED of libX11.so.5 or .6 or whatever, without an absolute path. That means that it will use the searching algorithms described above. All I need to do to get root is craft up (fairly easily) a libX11.so.5 that has, in a call I know xterm will use like XOpenDisplay, code that copies /bin/sh to somewhere and makes it setuid root. Now I put that hacked libX11.so.5 in my home directory, set LD_LIBRARY_PATH=$HOME, run xterm, and I've got a root shell.
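The audit described above (look at a program's DT_NEEDED entries, and at those of every library it pulls in, for names without a slash) can be sketched in Python. This is a rough model only: it works on a hand-written table of DT_NEEDED lists rather than reading ELF files, and the function name `unqualified_dependencies` and the sample paths in the usage note are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def unqualified_dependencies(root: str,
                             needed: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Walk the dependency graph from root and report every
    (object, dependency) pair whose dependency name has no '/',
    i.e. every entry that falls through to path searching.

    needed maps an object name to its DT_NEEDED list; in practice that
    table would be filled in from the output of a tool such as dump.
    """
    findings: List[Tuple[str, str]] = []
    seen: Set[str] = set()
    stack = [root]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        for dep in needed.get(obj, []):
            if "/" not in dep:
                findings.append((obj, dep))
            # Dependencies of dependencies matter just as much.
            stack.append(dep)
    return sorted(findings)
```

With a made-up table such as `{"/usr/bin/X11/xterm": ["libX11.so.5", "/usr/lib/libc.so.1"], "libX11.so.5": ["libsocket.so.2"]}`, the function reports both `libX11.so.5` and `libsocket.so.2` as searchable, and leaves the absolute `/usr/lib/libc.so.1` alone.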
This can all be so easily avoided by rule (1) above. Always hard-code your library names. It's a pain sometimes, to be sure, as I will describe below, but it is completely unambiguous, it's secure and it is quicker. Granted, the RTLD isn't that slow searching paths, but hey, every bit counts.
Before going into detail on the problems of using absolute path names (there is always a catch, isn't there?), just a quick refresher on how these various dynamic tags get set in an ELF object. This varies from system to system, but almost all systems use some subtle variation of the following. AIX is a bit funky as I recall.
DT_RPATH is set if the link editor (ld) encountered LD_RUN_PATH in the environment at link edit time. Thus doing something like:

  LD_RUN_PATH=/foo:/bar ld -o libfnoz.so blah.o

would set DT_RPATH to /foo:/bar.
DT_RUNPATH is set by the -R option on System-V-ish link editors, and by -rpath with GNU-ish ones (I think, I am no GNU ld expert, please correct me if I am wrong).
DT_SONAME is set by -h on System-V-ish link editors and by -soname on GNU-ish ones.
DT_NEEDED is set by any ELF link editor based on the -l options or explicit linkage against another shared object. It uses the DT_SONAME from the dependency to put in the object's DT_NEEDED list.
While absolute path names are the way to fly, in general, they have their drawbacks too. First, it can be a right royal pain to bootstrap things. Consider this. You are building a program. As part of its build, it builds a shared library, and link edits it with an absolute DT_SONAME. Later in the build you link a program against it, and you want to use that program in the build (perhaps executing it to produce some intermediate file or whatever). If this is the very first time you are compiling the program and library, then the shared library won't exist in its specified location, and execution of the program will fail. So you have to wait for the build to fail, then copy the just-built library into its install location and continue the build, possibly repeating this several times.
Another, sometimes more frustrating, problem is encountered if you DO have an older version of the library installed. Let's say the library was /usr/lib/libfoo.so.2. You are recompiling your stuff, building a new version of libfoo. It contains bug fixes, but is not sufficiently different to warrant moving to libfoo.so.3. Now during your build you link a program against -lfoo, and when you execute it, lo and behold, it runs, because /usr/lib/libfoo.so.2 is already in place from an earlier install. But the libfoo that the program is referencing is the buggy one, and it may make it impossible to build your program, or may produce incorrect results.
So what we need is the ability to always reference the freshly built libraries while we are building the system, and to make sure that the final installed ones have full path names and that executables have been relinked against them. This is possible, and fairly easy, but it does mean that all programs and libraries need to be relinked at install time, and they need to be done in the correct order. But it's pretty straightforward.
During build time, use LD_RUN_PATH or -R (or even -h with absolute path names pointing into the build tree) and do the build. As you install, you relink each shared library with -h and the final destination path name for the library, making sure you relink all libraries in the correct order such that all DT_NEEDEDs have absolute path names. As you install each binary that depends on these libraries, you also relink it before doing the install.
libtool gets some, but not all, of this right. However, libtool has its own drawbacks, not least of which is its completely nonsensical version numbering scheme, which the docs go to great lengths to promote as an ideal solution. It's not, it's crap.
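The "correct order" for the install-time relink is a dependency order: a library has to be relinked with its final -h name before anything that records that name in DT_NEEDED. A minimal Python sketch of such a plan follows, assuming Python 3.9 or later for `graphlib`. The function name `relink_plan`, the object names in the usage note, and the single `libdir` install directory are all hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Optional, Set, Tuple

# One plan step: (object, absolute DT_SONAME or None, absolute DT_NEEDED list)
Step = Tuple[str, Optional[str], List[str]]


def relink_plan(deps: Dict[str, List[str]],
                libraries: Set[str],
                libdir: str) -> List[Step]:
    """Order the install-time relinks so dependencies come first.

    deps maps each object to the shared libraries it links against.
    libraries names the objects that are shared libraries (they get an
    absolute DT_SONAME via -h); everything else is a program.
    libdir is the assumed final install directory for all libraries.
    """
    # TopologicalSorter takes node -> predecessors, so passing the
    # dependency lists yields each library before its users.
    order = TopologicalSorter(deps).static_order()
    plan: List[Step] = []
    for name in order:
        soname = f"{libdir}/{name}" if name in libraries else None
        needed = [f"{libdir}/{dep}" for dep in deps.get(name, [])]
        plan.append((name, soname, needed))
    return plan
```

Given `{"fooctl": ["libfoo.so.2"], "libfoo.so.2": ["libbar.so.1"], "libbar.so.1": []}` and a `libdir` of `/usr/lib`, the plan relinks libbar.so.1 first, then libfoo.so.2 with a DT_NEEDED of /usr/lib/libbar.so.1, and the program last. A circular dependency makes `static_order()` raise `graphlib.CycleError`, which is a reasonable point to stop and look at the build.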
An even easier solution is one I have been thinking about a lot of late. It would simplify the build and install procedures dramatically. I am thinking of writing an open source tool called "somod". This will allow you to change the dynamic section of already-installed ELF programs, adjusting the DT_SONAME, DT_NEEDED, DT_RUNPATH and DT_RPATH entries as you see fit. This would simplify the build procedure to one extra step: after installing a shared library or binary, you run somod on it to set things up the way you want. That would be the least invasive way, and would also allow you to take remedial action on old programs you may not have the source for, or even on mis-compiled or mis-produced vendor files.
---------- End Forwarded Message ----------
--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812             E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749