Hi all. Sorry for the crosspost.

I am looking for some documentation on the structure of the stack when an executable starts. I know the basics - argc, then argv, then envp. What I'm interested in is what's beyond that. I've tried googling, reading the sources in the kernel for fs/binfmt_elf.c and the sources for ld-linux.so. I'm sure what I'm looking for is in there, but I just couldn't nail it.
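For illustration, here is a minimal sketch of walking past envp to see what follows it. It assumes Linux, where the words after envp's terminating NULL are commonly documented as the ELF auxiliary vector (pairs of type and value, ended by AT_NULL). The function names are hypothetical, and the walk is only valid while environ still points at the original array on the stack (no setenv/putenv yet).

```c
#include <assert.h>
#include <elf.h>       /* AT_NULL, AT_ENTRY and the other AT_* constants */
#include <link.h>      /* ElfW() selects the 32- or 64-bit ELF types */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval(), used only as a cross-check */

extern char **environ;

/* Skip the environment pointers and their terminating NULL; under the
   assumption stated above, the next word starts the auxiliary vector. */
const ElfW(auxv_t) *auxv_from_envp(char **envp)
{
    assert(envp != NULL);
    while (*envp != NULL)
        envp++;
    return (const ElfW(auxv_t) *)(envp + 1);
}

/* Return the value stored for one AT_* type, or 0 if it is absent. */
unsigned long auxv_lookup(const ElfW(auxv_t) *av, unsigned long type)
{
    for (; av->a_type != AT_NULL; av++)
        if (av->a_type == type)
            return av->a_un.a_val;
    return 0;
}

/* Compare the hand-rolled walk with the C library's accessor. */
int stack_walk_agrees_with_getauxval(void)
{
    const ElfW(auxv_t) *av = auxv_from_envp(environ);
    return auxv_lookup(av, AT_ENTRY) == getauxval(AT_ENTRY);
}
```

Calling stack_walk_agrees_with_getauxval() early in main is a quick way to confirm the layout assumption on a given system.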

In particular, this is what I'm looking for. When one tries to load an executable (say, /bin/cat), the kernel figures out it is an ELF file, reads a field called "interpreter", which has a fairly typical value (on 32 bit Intel - /lib/ld-linux.so.2), and loads the interpreter and /bin/cat into memory, and runs the interpreter code. The interpreter then looks at the ELF headers (which the kernel has loaded into memory) for /bin/cat, and based on them loads the rest of the required shared objects into memory, and then runs the /bin/cat code.
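As a sketch of the first step above, the "interpreter" field can be read straight out of the file. This assumes the field is the PT_INTERP program header, and it only handles ELF files of the same class (32 or 64 bit) as the program reading them. read_interp is a hypothetical helper, not something taken from the kernel or glibc.

```c
#include <assert.h>
#include <elf.h>     /* ELFMAG, EI_CLASS, PT_INTERP */
#include <link.h>    /* ElfW() selects the 32- or 64-bit ELF types */
#include <stdio.h>
#include <string.h>

/* Copy the PT_INTERP string of the ELF file at `path` into `out`.
   Returns 0 on success, -1 if the file is unreadable, is not an ELF
   file of the native class, or has no PT_INTERP header. */
int read_interp(const char *path, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    int rc = -1;
    ElfW(Ehdr) eh;
    unsigned char native = sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != native)
        goto done;

    for (unsigned i = 0; i < eh.e_phnum; i++) {
        ElfW(Phdr) ph;
        long off = (long)(eh.e_phoff + (ElfW(Off))i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1)
            goto done;
        if (ph.p_type != PT_INTERP)
            continue;
        if (ph.p_filesz == 0 || ph.p_filesz > outlen)
            goto done;                    /* empty, or too long for `out` */
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0 ||
            fread(out, 1, ph.p_filesz, f) != ph.p_filesz)
            goto done;
        out[ph.p_filesz - 1] = '\0';      /* enforce NUL termination */
        rc = 0;
        break;
    }
done:
    fclose(f);
    return rc;
}
```

On a dynamically linked /bin/cat this should fill the buffer with the interpreter path; a statically linked file has no such header and yields -1.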

Then again, the interpreter can also be directly run. That happens if my command line is actually "/lib/ld-linux.so.2 /bin/cat". In that case, the kernel loads just the interpreter into memory and runs it, the interpreter figures out that it was run in direct mode, loads /bin/cat into memory, and then proceeds as before. In other words, the interpreter KNOWS whether it was loaded as an interpreter or whether it was loaded directly. That is what I'm trying to figure out.

It is NOT done by looking at the args, and it is not done by querying /proc/self. It is done by examining a portion of the executable header left by the kernel somewhere in memory, and asking "where is the executable startup code located? Is it the same as mine?" If ld-linux finds that the startup code is the same as its own entry point, it assumes it was called directly. Otherwise, it assumes it is just the interpreter.
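The comparison described above can be sketched as follows. This is not the glibc code. It assumes the kernel-reported entry address is the AT_ENTRY value of the auxiliary vector, read here with getauxval(), and the type and function names are hypothetical.

```c
#include <assert.h>
#include <elf.h>       /* AT_ENTRY */
#include <sys/auxv.h>  /* getauxval() */

enum launch_mode { RUN_DIRECTLY, RUN_AS_INTERPRETER };

/* If the entry point reported for the exec'd program is the caller's
   own entry point, the caller is the program that was exec'd.
   Otherwise it was loaded on behalf of some other executable. */
enum launch_mode classify_launch(unsigned long reported_entry,
                                 unsigned long own_entry)
{
    assert(own_entry != 0);
    return reported_entry == own_entry ? RUN_DIRECTLY : RUN_AS_INTERPRETER;
}

/* Entry point recorded for the exec'd program (assumed source: AT_ENTRY). */
unsigned long reported_entry_point(void)
{
    return getauxval(AT_ENTRY);
}
```

A loader would pass reported_entry_point() together with the address of its own startup code. An ordinary program can still call reported_entry_point() to see the address recorded for it.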

I found the actual logic just described. It is in the glibc sources, in elf/rtld.c, in a function called "dl_main". It is the first "if" in that function. What I have so far failed to find is where the variables referenced by that "if" are initialized. I have reason to believe this is just a struct left on the stack by the kernel, but what the struct is and, more importantly, where on the stack it sits, I have not yet been able to figure out.

I have not yet given up. I'm just hoping someone will come up and say "oh, just look at this URL for an explanation". The code is so chock-full of things that look like preprocessor directives but seem to be, in fact, internal gcc attributes that I find the program flow somewhat unreadable. My method, right now, is to compile it with debug symbols, and then use objdump to overlay the source over the actual assembly code. It has come to the point where that is the easier way to understand what I need.

Any help would be greatly appreciated.

Shachar
