On 11/12/2018 23:54, Zachary Turner wrote:


On Tue, Dec 11, 2018 at 11:57 AM Pavel Labath <pa...@labath.sk <mailto:pa...@labath.sk>> wrote:

    The part I know nothing about is whether something similar could be
    done for PE/COFF files (and I'll need something like that there too).
    Adrian, Zachary, what is the relationship between the "image base" of
    an object file and its sections? Is there any way we could arrange so
    that the base address of a module always belongs to one of its sections?


Historically, an image base of N was used as a way to tell the loader "map the file in so that byte 0 of the file is virtual address N in the process's address space".  That is, *((char *)N) would be the first byte of the file in a running process.  Then, everything else in the file is written as an offset from N.  This includes section addresses.  So, for example, if we use dumpbin on a simple executable we can see something like this:

Dump of file bin\count.exe

PE signature found

File Type: EXECUTABLE IMAGE

OPTIONAL HEADER VALUES
                   ...
        140000000 image base (0000000140000000 to 0000000140011FFF)
                   ...
SECTION HEADER #1
    .text name
     1000 virtual address (0000000140001000 to 00000001400089AE)

So under this scheme, the first byte of the first section would be at virtual address 0000000140001000 in the running process.
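
To make the arithmetic concrete, here is a minimal Python sketch using the two numbers from the dumpbin output above. The function name rva_to_va is made up for illustration, and it assumes the image really is mapped at its preferred image base:

```python
# Values copied from the dumpbin output above.
IMAGE_BASE = 0x140000000  # "image base" from the optional header
TEXT_RVA = 0x1000         # "virtual address" of .text (relative to the base)


def rva_to_va(image_base: int, rva: int) -> int:
    """Return the virtual address for an RVA.

    Assumes the image is mapped exactly at image_base (no relocation).
    """
    return image_base + rva


text_va = rva_to_va(IMAGE_BASE, TEXT_RVA)
print(f"{text_va:016X}")  # 0000000140001000
```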

Later, ASLR came along and threw a wrench in all of that, and so these days the image base is mostly meaningless.  The loader will probably never actually load your module at the address specified in image base. But the rest of the rules still hold true.  Wherever it *does* load your module, the first byte of .text will still be at offset 0x1000 from that.

So, if you want to return this value from the PE/COFF header, or even if you want to return the actual address that the module was loaded at, then no, it will never belong to any section (because the bytes at that address will be the PE/COFF file header).
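
A rough sketch of why the base address misses every section, again in Python. The section table holds only the .text entry from the dumpbin output; the load base is a hypothetical value standing in for whatever ASLR picks, and section_containing is an illustrative helper, not anything from LLDB:

```python
from typing import Optional

# (name, RVA, size) from the dumpbin output: .text covers RVAs
# 0x1000..0x89AE inclusive, so its size is 0x79AF bytes.
SECTIONS = [(".text", 0x1000, 0x79AF)]


def section_containing(load_base: int, addr: int) -> Optional[str]:
    """Return the name of the section that holds addr, or None."""
    offset = addr - load_base  # offset from wherever the image was loaded
    for name, rva, size in SECTIONS:
        if rva <= offset < rva + size:
            return name
    return None


# Hypothetical base chosen by the loader under ASLR (an assumption).
actual_base = 0x7FF6A0000000

in_text = section_containing(actual_base, actual_base + 0x1000)
at_base = section_containing(actual_base, actual_base)
```

Here in_text is ".text" and at_base is None: offset 0 holds the PE/COFF header, which no section covers.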

Does this make sense?

I think it does.

I am aware that this address is not going to represent a valid address in target memory (the same is true for ELF and Mach-O targets), but what we're trying to ensure is that when we take this address and ask the running target to give us the "load" address for it, it will return the actual place in memory (and conversely, if the target is not running, it should give us an invalid address instead of returning something bogus).
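
The behaviour I'm after could be sketched like this (Python, purely illustrative: the function name is hypothetical, and None stands in for an invalid address):

```python
from typing import Optional


def file_to_load_address(file_addr: int, image_base: int,
                         load_base: Optional[int]) -> Optional[int]:
    """Translate a file address into a load address.

    load_base is None when the target is not running (module not loaded);
    in that case we return None rather than a bogus address.
    """
    if load_base is None:
        return None
    # Slide the address by the difference between the actual and the
    # preferred base.
    return file_addr - image_base + load_base
```

With an image base of 0x140000000 and an assumed load base of 0x7FF6A0000000, the file address 0x140000000 resolves to 0x7FF6A0000000, and to None when nothing is loaded.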

So, if I understand correctly, the PE/COFF file will always be loaded into one contiguous chunk of memory, ranging from ImageBase (modulo ASLR) to ImageBase+SizeOfImage. Then various sections are mapped into that range (according to their RVAs).

If that's the case, then we could model this as one big segment/container section/whatever, and the individual (loadable) sections would be sub-sections of that. Apart from solving my current problem, this should also improve the address lookup for these modules. E.g. right now if you ask lldb to look up the address corresponding to the memory image of the header, it will say it does not belong anywhere, but that address is clearly associated with the module.
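
Roughly what I have in mind, as a Python sketch rather than actual lldb code. The Section class and its fields are invented for illustration; the sizes are derived from the dumpbin ranges quoted above (image 0x140000000..0x140011FFF, so 0x12000 bytes):

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Section:
    name: str
    file_addr: int
    size: int
    children: List["Section"] = field(default_factory=list)

    def contains(self, addr: int) -> bool:
        return self.file_addr <= addr < self.file_addr + self.size

    def find(self, addr: int) -> Optional["Section"]:
        """Return the most specific section holding addr, or None."""
        if not self.contains(addr):
            return None
        for child in self.children:
            hit = child.find(addr)
            if hit is not None:
                return hit
        # No sub-section matched (e.g. the header): fall back to the container.
        return self


# One container spanning ImageBase..ImageBase+SizeOfImage, with the
# loadable sections as children.
image = Section("count.exe image", 0x140000000, 0x12000,
                [Section(".text", 0x140001000, 0x79AF)])

header_hit = image.find(0x140000000)
text_hit = image.find(0x140001000)
```

header_hit is the container (so the header address now resolves to the module), while text_hit is still the .text sub-section.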

I'll try looking at what kind of changes are needed to make this happen. I'll start with the ELF case, as I am more familiar with that (and it'll probably be more complicated).

thanks,
pl
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
