On 2007-09-24, at 1120, Joshua Megerman wrote:
First off, let me preface this by saying that while I understand the concept of shared libraries, I don't understand the underlying mechanics of how the OS handles them,
i'm not sure exactly how far down "underlying" goes for you, but here's a fairly simple overview of the seedy underside of program linking and the difference between static (i.e. compile-time) and dynamic (i.e. run-time) linking.
the compiler generates ".o" files, containing the following:
- one or more "text" segments, which contain byte sequences of executable code
- a list of "exports", symbols which are available in the module, usually functions which may be called from, or global variables (such as "errno") which may be accessed by other modules.
- a list of "imports", symbols which this module needs in order to execute correctly, and the "fixup" locations where the final memory address of each symbol should be stored.
there are other types of files called ".a" files, which are basically a collection of .o files joined together for easier management- a "library", in other words. libvpopmail.a is one of these.
you can see the various imports and exports in a .o or .a file using the program "nm". for example, in vpopmail 5.4.22, the file "md5.o" contains the following symbols:
$ nm md5.o
000007e0 T MD5Final
00000038 T MD5Init
0000006c T MD5Transform
000006ec T MD5Update
00000000 T byteReverse
         U memcpy
         U memset
the symbols with "T" are exports, the functions in the module. these function names are available to be matched against other modules which may need them. the symbols with "U" are imports, names which need to be matched against other modules in order to build a final working program. in this case, the "memcpy" and "memset" functions are defined in the "memcpy.o" and "memset.o" modules within "libc.a", or in "libc.so".
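to make the connection between source code and those letters a little more concrete, here's a small sketch of a module. it is hypothetical (the file name "sketch.c" and the functions "fold" and "copy_and_sum" are made up for illustration and are not part of vpopmail), and the exact nm output will depend on your compiler and its optimization settings.

```c
/* sketch.c - a hypothetical module, not part of vpopmail */
#include <stddef.h>
#include <string.h>

/* "static" keeps this helper private to the module. nm would typically
   list it with a lowercase "t" (a local symbol), or not at all if the
   compiler inlines it. */
static unsigned int fold(unsigned int sum, unsigned char byte)
{
    return (sum << 1) ^ byte;
}

/* no "static" here, so this function is an export: expect an
   uppercase "T" next to its name in the nm output. */
unsigned int copy_and_sum(unsigned char *dst, const unsigned char *src,
                          size_t len)
{
    unsigned int sum = 0;
    size_t i;

    /* memcpy is not defined in this file, so it is an import: expect a
       "U" in the nm output, unless the compiler replaces the call with
       inline code of its own. */
    memcpy(dst, src, len);

    for (i = 0; i < len; i++)
        sum = fold(sum, dst[i]);
    return sum;
}
```

compiling this with something like "cc -c sketch.c" and running "nm sketch.o" should show the same kind of listing as the md5.o example, just with different names and addresses.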
the compile-time linker gathers a bunch of these .o and .a files, matches up the "imports" with the "exports" from the various modules, and produces a final executable with any interior links resolved. for a statically linked program, ALL links must be resolved in order to have a working program. so if your "main()" called any or all of the MD5 functions listed above, your ".o" would have "MD5Init" and friends as imports, and the linker would match those against the md5.o module. bringing in md5.o adds "memcpy" and "memset" to the list of imports, so the linker would then bring in the "memcpy.o" and "memset.o" modules from "libc.a" as part of your program's final executable.
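the matching process described above can be sketched as a toy model. this is not how a real linker is written, just the bookkeeping idea: start from main.o and keep pulling in whichever module exports each unresolved import. the module table is illustrative (the "printf.o" entry and the function names "find_exporter", "link_module", "link_program" and "is_linked" are assumptions made up for this sketch).

```c
/* toy model of compile-time linking. the module table below is
   illustrative, not real nm output. */
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 8

struct module {
    const char *name;
    const char *exports[MAX_SYMS];  /* NULL-terminated list */
    const char *imports[MAX_SYMS];  /* NULL-terminated list */
    int linked;                     /* 1 once pulled into the program */
};

static struct module modules[] = {
    { "main.o",   { "main" }, { "MD5Init", "MD5Update", "MD5Final" }, 0 },
    { "md5.o",    { "MD5Init", "MD5Update", "MD5Final", "MD5Transform",
                    "byteReverse" }, { "memcpy", "memset" }, 0 },
    { "memcpy.o", { "memcpy" }, { NULL }, 0 },
    { "memset.o", { "memset" }, { NULL }, 0 },
    { "printf.o", { "printf" }, { NULL }, 0 },  /* never asked for */
};

#define NUM_MODULES (sizeof modules / sizeof modules[0])

/* find the module which exports a given symbol, or NULL if none does */
static struct module *find_exporter(const char *symbol)
{
    size_t m, e;
    for (m = 0; m < NUM_MODULES; m++)
        for (e = 0; e < MAX_SYMS && modules[m].exports[e] != NULL; e++)
            if (strcmp(modules[m].exports[e], symbol) == 0)
                return &modules[m];
    return NULL;
}

/* pull in a module and, recursively, everything it imports.
   returns the number of imports which nothing exports. */
static int link_module(struct module *mod)
{
    int unresolved = 0;
    size_t i;

    if (mod->linked)
        return 0;
    mod->linked = 1;

    for (i = 0; i < MAX_SYMS && mod->imports[i] != NULL; i++) {
        struct module *dep = find_exporter(mod->imports[i]);
        if (dep == NULL)
            unresolved++;
        else
            unresolved += link_module(dep);
    }
    return unresolved;
}

/* link starting from main.o. 0 means every import was resolved. */
int link_program(void)
{
    return link_module(&modules[0]);
}

/* report whether a module ended up in the final "executable" */
int is_linked(const char *name)
{
    size_t m;
    for (m = 0; m < NUM_MODULES; m++)
        if (strcmp(modules[m].name, name) == 0)
            return modules[m].linked;
    return 0;
}
```

after calling link_program(), the md5.o, memcpy.o and memset.o entries are marked as linked while printf.o is not, which is the "only bring in what is needed" behaviour of linking against a .a file.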
there are two problems with this scenario:
- some functions, like printf(), have a LOT of dependencies. a three-line program which might normally generate a 4K executable can grow to over 800K because of these dependencies.
- if the underlying library changes, you have to re-compile this program to gain the benefits (security fixes, new features, etc.) of the new library.
if a program is being compiled to support dynamic linking, then instead of looking at "libc.a", the linker looks at "libc.so". and instead of copying the code from the .so into the final executable, it builds a list of "run-time fixups", which is stored in the final executable.
then, when the program is actually executed, the first thing it does is call a "run-time linker", usually called "ld.so". the run-time linker loads the necessary .so files into your program's memory space, performs the "fixups" (i.e. stores the final in-memory address of the library functions into the correct memory locations in your code), and then jumps to the starting point of your program.
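the fixup step can also be sketched as a toy model. a real run-time linker patches addresses recorded in the executable file, so plain function pointers stand in for that here, and everything in the sketch ("lib_double", "lib_negate", "apply_fixups", "program_start") is a made-up name for illustration rather than anything from ld.so or vpopmail.

```c
/* toy model of run-time fixups: the "executable" has empty slots where
   the addresses of library functions belong, and a "run-time linker"
   fills them in before the program proper starts. */
#include <stddef.h>
#include <string.h>

typedef int (*lib_fn)(int);

/* --- the "shared library": code plus a table of exported names --- */
static int lib_double(int x) { return x * 2; }
static int lib_negate(int x) { return -x; }

struct export_entry { const char *name; lib_fn addr; };
static const struct export_entry lib_exports[] = {
    { "lib_double", lib_double },
    { "lib_negate", lib_negate },
};

/* --- the "executable": slots to be patched, plus the fixup list --- */
static lib_fn slot_double = NULL;
static lib_fn slot_negate = NULL;

struct fixup { const char *name; lib_fn *slot; };
static const struct fixup fixups[] = {
    { "lib_double", &slot_double },
    { "lib_negate", &slot_negate },
};

/* the "run-time linker": look up each needed name in the library's
   export table and store its address in the matching slot.
   returns the number of fixups which could not be resolved. */
int apply_fixups(void)
{
    int missing = 0;
    size_t f, e;

    for (f = 0; f < sizeof fixups / sizeof fixups[0]; f++) {
        lib_fn found = NULL;
        for (e = 0; e < sizeof lib_exports / sizeof lib_exports[0]; e++)
            if (strcmp(fixups[f].name, lib_exports[e].name) == 0)
                found = lib_exports[e].addr;
        if (found == NULL)
            missing++;
        else
            *fixups[f].slot = found;
    }
    return missing;
}

/* the program's own code only ever calls through the patched slots,
   so it must not run until apply_fixups() has succeeded. */
int program_start(int x)
{
    return slot_negate(slot_double(x));
}
```

the order matters in the same way it does for a real program: apply_fixups() has to run first, and only then is it safe to jump to program_start().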
because modern CPUs support the concept of making a particular segment of memory "read only", and because most memory management hardware makes it possible to map a particular physical segment of memory to appear at any logical address within the address space, it is possible for shared libraries to physically exist in memory only one time, while visible to multiple processes at different addresses. this is why, if you look at a process with "ps" or "top", you'll see two memory-usage numbers- the "virtual size", which is how much address space the process has mapped in total, and the "resident set", which is how much of that is actually sitting in physical memory right now. both numbers count shared pages, so the pages of a shared library like libc.so show up in the totals of every process which maps it, even though they physically exist only once.
and thus am not sure exactly how can be affected performance-wise.
the vpopmail programs are already dynamically loaded- it's just the "libvpopmail.a" functions which are not loaded dynamically. the performance hit would be minimal- it already has to load libc.so at run-time, one more library won't take long enough to make any real difference.
1) A shared library with a stable API would make recompiling outside programs (e.g., QmailAdmin) unnecessary, which would be a Good Thing (tm).
as long as it's the same API for all of the authentication modules. i can also see having "libvpopmail.so" for the client-facing programs, then modules like "libvpopmailauth_cdb.so", "libvpopmailauth_mysql.so", and so forth, for the back-end code to handle the mechanics for that particular authentication back-end, similar to how courier-authlib is structured.
2) There has been some question regarding performance of the vpopmail programs when compiled against shared vs. static libraries. I suggest the following options be added for shared libraries at compile-time:
a) --disable-shared - don't build libvpopmail.so, which is what vpopmail does now.
b) --enable-shared - build libvpopmail.so, but don't link the vpopmail binaries against it - this gives other programs the ability to use the shared library, but keeps the vpopmail binaries statically linked.
c) --enable-shared-binaries - build libvpopmail.so and link the vpopmail binaries against it. Implies --enable-shared.
d) possibly, if it's not too difficult, have a --enable-shared-binaries= and/or --enable-static-binaries= option, which takes a list of binaries to link against the stated library, and links the rest against the other. So you could have static vdelivermail and vchkpw, but not vadduser, for example. Not sure if that really is necessary, but static linking does save space...
i vote for "a" and "c" during a transition period, then "c" as the only option after that.
in either case, i think "d" might be taking the idea too far.
3) In all cases, even if the vpopmail binaries are linked against the shared library, the static library libvpopmail.a should be built since some programs expect it.
maybe for interim versions, to give other programs' developers time to deal with the change... but i think that a "vpopmail version 6" should be "shared only".
Also, just a supposition on my part, but if you're running (e.g.) courier-authdaemon linked against libvpopmail.so all the time, wouldn't that (theoretically) mean that other dynamically linked vpopmail programs would run faster than the static version since the library would already be loaded in memory?
yes, but the difference wouldn't really be noticeable- it would still be a few milliseconds slower than having the functions hard-coded into the binaries.
If so, perhaps the speed solution for a dynamic (e.g.) vdelivermail would be to run something that was dynamically linked all the time, so libvpopmail stayed in memory...
if you're on a system which is busy enough that these few milliseconds are a significant issue, you will already have tens or hundreds of other processes with libvpopmail.so mapped into their memory space anyway- so again, it won't be an issue.
Anyway, that's it for now - I haven't even tried the patch against the latest vpopmail, though I'm guessing it should be fairly easy (albeit possibly tedious) to integrate since it's not much in the way of actual code changes...
if you have a URL for that patch, i'd like to play with it myself.
----------------------------------------------------------------
| John M. Simpson --- KG4ZOW --- Programmer At Large |
| http://www.jms1.net/ <[EMAIL PROTECTED]> |
----------------------------------------------------------------
| http://video.google.com/videoplay?docid=-1656880303867390173 |
----------------------------------------------------------------