On Tue, 2014-11-04 at 14:37 -0800, Jason Molenda wrote:
> FWIW the use of an Address object to represent addresses was motivated by the
> hassle of address handling we had with gdb. If you start the debugger and
> give it an executable and a bunch of solibs, set an address breakpoint, and
> then run the process, how does that address breakpoint get re-set to the
> correct place when the executable and all the solibs land at their final
> address? What if multiple solibs are at the same addr (say 0x0) before we
> start execution? (on Mac OS X solibs don't have distinct virtual addresses
> these days - they're all 0-based and the dynamic loader picks a spot in the
> address space at runtime)
>
Yes, I believe that Linux does something very similar.
> The Address object represents addresses as a section +
> offset-into-that-section, if it's within the bounds of a known binary (a
> Module). When the process starts up, lldb learns where all the solibs are
> actually loaded from the dynamic loader -- it records where each Section is
> loaded in memory as part of the Target. So given an Address object, we can
> get the actual address in real memory via the Target's section load list.
>
> This abstraction solves a lot of subtle bugs we'd hit in gdb when an objfile
> would shift in memory to a new address -- we'd have to go through all the
> addresses that might have that old address and make sure they're updated
> correctly to the new one. With lldb we have the section load list in the
> Target which has the current address for each Section.
>
Yup. This abstraction makes a lot of sense.
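To check my understanding, here is a minimal Python sketch of a section +
offset address being resolved through a per-target section load list. It is
a toy model, not lldb's implementation: the Section, Address and
SectionLoadList classes, their methods, and the addresses used are all
hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Section:
    """A region of a module, with the address recorded in the object file."""
    module: str
    name: str
    file_addr: int
    size: int


@dataclass(frozen=True)
class Address:
    """Section + offset; with no section, the offset is a raw address."""
    section: Optional[Section]
    offset: int


class SectionLoadList:
    """Hypothetical stand-in for the table of where each section landed."""

    def __init__(self) -> None:
        self._load_addr: Dict[Section, int] = {}

    def set_load_address(self, section: Section, load_addr: int) -> None:
        self._load_addr[section] = load_addr

    def load_address_of(self, addr: Address) -> Optional[int]:
        if addr.section is None:
            return addr.offset  # nothing to resolve, already a raw address
        base = self._load_addr.get(addr.section)
        return None if base is None else base + addr.offset


# Two libraries whose .text sections both start at file address 0.
text_a = Section("libA", ".text", 0x0, 0x4000)
text_b = Section("libB", ".text", 0x0, 0x2000)
foo = Address(text_a, 0x1000)
bar = Address(text_b, 0x1000)

loaded = SectionLoadList()
assert loaded.load_address_of(foo) is None  # process not started yet

# The dynamic loader reports where each section ended up.
loaded.set_load_address(text_a, 0x1000000)
loaded.set_load_address(text_b, 0x2000000)
print(hex(loaded.load_address_of(foo)))  # 0x1001000
print(hex(loaded.load_address_of(bar)))  # 0x2001000

# If libA moves, only the table entry changes; foo itself is untouched.
loaded.set_load_address(text_a, 0x3000000)
print(hex(loaded.load_address_of(foo)))  # 0x3001000
```

The point of the sketch is that nothing holding "foo" needs updating when
a library moves; only the one table entry does.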
> When we're getting a real memory address (a "load address", or as Greg was
> saying to me earlier, think of it as a "live address" if that helps) out of
> the process (e.g. we read the pc register), usually the first thing we'll do
> is convert that into an Address object (again, with help of the Target's
> section load list) putting it in terms of a Section and an offset.
>
Well, "live address" is possibly clearer, but I'm quite happy now that I
understand what's meant by lldb's load address terminology. And I
believe that Target::ResolveLoadAddress() will convert a raw ("live")
address to a section:offset address, based on what the dyld has done, with
a running process.
I had added Target::ResolveFileAddress() (to SBTarget and Target) in an
attempt to take a file_address and convert it to a section:offset address.
In hindsight I think this was a bad idea, since a file_address is not
necessarily unique: we may have address x in both module A and module
B.
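The ambiguity can be sketched in a few lines of Python. This is a toy model
with made-up modules, sections and load addresses; resolve_file_address and
resolve_load_address are hypothetical helpers written for this mail, not
the Target or SBTarget methods.

```python
from typing import List, NamedTuple, Tuple


class Section(NamedTuple):
    module: str
    name: str
    file_addr: int  # address as recorded in the object file
    load_addr: int  # address picked by the dynamic loader at runtime
    size: int


# Both modules are 0-based on disk but land at distinct runtime addresses.
SECTIONS = [
    Section("A", ".text", 0x0, 0x10000000, 0x4000),
    Section("B", ".text", 0x0, 0x20000000, 0x4000),
]


def resolve_file_address(file_addr: int) -> List[Tuple[Section, int]]:
    """Every (section, offset) whose file address range holds file_addr."""
    return [(s, file_addr - s.file_addr) for s in SECTIONS
            if s.file_addr <= file_addr < s.file_addr + s.size]


def resolve_load_address(load_addr: int) -> List[Tuple[Section, int]]:
    """Every (section, offset) whose load address range holds load_addr."""
    return [(s, load_addr - s.load_addr) for s in SECTIONS
            if s.load_addr <= load_addr < s.load_addr + s.size]


file_hits = resolve_file_address(0x1000)
load_hits = resolve_load_address(0x20001000)
print([(s.module, hex(off)) for s, off in file_hits])
print([(s.module, hex(off)) for s, off in load_hits])
```

The file address lookup matches both modules, while the load address
lookup matches only module B, because loaded sections do not overlap in
the process. A file address only becomes unambiguous once it is paired
with the module it came from.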
thanks for your explanation and patience
Matt
> When a real memory address in the process doesn't belong to any of the
> binaries (Modules) -- for instance, a pointer into a heap allocation -- then
> we can put it in an Address object but it's just an offset alone with no
> section. But addresses that correspond to a function or a symbol are
> expressed in terms of the containing section and offset into that section.
>
>
>
> > On Nov 4, 2014, at 10:18 AM, Greg Clayton <[email protected]> wrote:
> >
> >
> >> On Nov 3, 2014, at 10:53 PM, Matthew Gardiner <[email protected]> wrote:
> >>
> >> Hi Greg,
> >>
> >> So what in lldb's world is the difference between a file and a load
> >> address?
> >
> >
> > "file address" (as LLDB treats them) is the address that is found in the
> > object file (ELF, MachO, or COFF). "load address" is the actual address in
> > the process where the section is loaded after being slid.
> >
> > For example you might have a shared library that has a function "foo" whose
> > file address is 0x1000, but when the shared library gets loaded into
> > memory, it will be loaded at a different address because all shared
> > libraries have functions in the low file address range (say from 0 to
> > 0x400000 for example). So if the shared library gets loaded with a slide of
> > 0x1000000, foo will have a load address of 0x1000000 + 0x1000.
> >
> > Now for most embedded debugging where there is no OS that will slide things
> > around, your file and load addresses will match. For actual OS level
> > debugging, they won't for shared libraries, and might for the main
> > executable. Sometimes the main executable doesn't get slid around, but
> > other OSs will use ASLR to slide the main executables around for security.
> >
> > The test that was written was trying to get a data section from the object
> > file. Then it made a section offset address that pointed to that data
> > section. Then it called SBTarget::ReadMemory(...) with that section offset
> > address. What was happening on MacOSX was:
> >
> > - get the data section whose file address was 0x10001000
> > - Make a section + offset address from it that was represented as a.out's
> > data section + 0 bytes
> > - call target read memory
> >
> > Prior to my fix this happened:
> >
> > SBTarget::ReadMemory() made a new section offset address:
> >
> > Address address(section_offset_addr.GetFileAddress(), NULL);
> >
> > Now we have an address that has no section. If such an address is passed to
> > anything that is trying to read memory from a live process, an address with
> > no section is considered to be a "load address". It will try and read from
> > the "load address" 0x10001000. But on MacOSX, or any OS with ASLR, the data
> > section was slid by a random amount (like 0xef0000). So we would try to
> > read from "load address" 0x10001000 and it would fail. If we leave the
> > address object as a section offset address (don't make a new address like
> > we did above), we pass this address to a read memory function and it will
> > resolve the section offset address into an address in the live process, or
> > into a load address. This will be 0x10001000 + 0xef0000 + 0. The load
> > address of the data section is 0x10001000 + 0xef0000 and the offset was 0.
> > And the resulting memory it will read from in the process is 0x10ef1000.
> > The old way it would have tried to read from 0x10001000 which was incorrect.
> >
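The arithmetic in this example can be checked with a short Python sketch.
It models only the numbers quoted above (file address 0x10001000, slide
0xef0000); read_address and its arguments are hypothetical, and this is not
how SBTarget::ReadMemory() is implemented.

```python
from typing import Optional

DATA_FILE_ADDR = 0x10001000  # file address of a.out's data section
SLIDE = 0xEF0000             # example slide applied at load time
DATA_LOAD_ADDR = DATA_FILE_ADDR + SLIDE


def read_address(section_load_addr: Optional[int], offset: int) -> int:
    """Return the process address a read would target.

    With a section, the read goes to the section's load address plus the
    offset. Without one, the offset is taken as a load address as-is.
    """
    if section_load_addr is None:
        return offset
    return section_load_addr + offset


# Old behaviour: the section is dropped and the file address passed alone.
old_target = read_address(None, DATA_FILE_ADDR)
# Fixed behaviour: the section + offset address is passed through intact.
new_target = read_address(DATA_LOAD_ADDR, 0)

print(hex(old_target))  # 0x10001000
print(hex(new_target))  # 0x10ef1000
```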
> >
> >> In my world I consider the file and load addresses to be the
> >> same thing, that is, the address (not the file offset) of the symbol in
> >> the object file, e.g. when I objdump symbols and grep for ones I know of
> >> in a kalimba ELF, I get
> >>
> >> 0000054f g DM|0 00000000 $_g_matt1
> >> ...
> >>
> >>
> >> 0x54f as the file address of g_matt1.
> >>
> >>
> >> The other address terminology I hear of is "virtual address". To me this is
> >> the address of the symbol once the binary is actually running on the
> >> processor. So in some embedded scenarios (like kalimba where there is no
> >> OS) we have code addresses in the ELF (i.e. file/load address) all
> >> starting at 80000000 e.g.
> >>
> >> 80000354 g F PM|0 00000000 $_main
> >>
> >> But on the device (since it's harvard architecture with a CODE and DATA
> >> bus), main is actually at 0x0354. So in this context I'd say 0x80000354
> >> was the "load/file address" but 0x0354 was the virtual address. I see a
> >> similar scenario with linux shared object files where in the file the
> >> symbol addresses are often based at 0, but at runtime are fixed-up to
> >> some arbitrary offset.
> >>
> >> Can you explain what lldb means by file/load/virtual and so on
> >> addresses?
> >
> >
> > So in your terms:
> >
> > virtual address is what we call the "load address".
> > file address means address as it is found in the object file you loaded it
> > from.
> >
> > Many object files speak of a virtual address when they are speaking of file
> > addresses, so I didn't think "virtual address" made as much sense as "load
> > address". For example, when parsing Mach-O segments you find "vmaddr" and
> > "vmsize" fields, which stand for virtual address and virtual size.
> >
> > Hope that clears things up.
> >
> >> thanks
> >> Matt
> >>
> >>
> >>
> >> On Tue, 2014-11-04 at 00:56 +0000, Greg Clayton wrote:
> >>> Author: gclayton
> >>> Date: Mon Nov 3 18:56:30 2014
> >>> New Revision: 221213
> >>>
> >>> URL: http://llvm.org/viewvc/llvm-project?rev=221213&view=rev
> >>> Log:
> >>> Fixed SBTarget::ReadMemory() to work correctly and the TestTargetAPI.py
> >>> test case that was reading target memory in
> >>> TargetAPITestCase.test_read_memory_with_dsym and
> >>> TargetAPITestCase.test_read_memory_with_dwarf.
> >>>
> >>> The problem was that SBTarget::ReadMemory() was making a new section
> >>> offset lldb_private::Address by doing:
> >>>
> >>>
> >>> size_t
> >>> SBTarget::ReadMemory (const SBAddress addr,
> >>>                       void *buf,
> >>>                       size_t size,
> >>>                       lldb::SBError &error)
> >>> {
> >>>     ...
> >>>     lldb_private::Address addr_priv(addr.GetFileAddress(), NULL);
> >>>     bytes_read = target_sp->ReadMemory(addr_priv, false, buf, size, err_priv);
> >>>
> >>>
> >>> This is wrong. If you get the file address from the "addr" argument and
> >>> try to read memory using that, it will think the file address is a load
> >>> address and it will try to resolve it accordingly. This will work fine if
> >>> your executable is loaded at the same address (no slide), but it won't
> >>> work if there is a slide.
> >>>
> >>> The fix is to just pass along the "addr.ref()" instead of making a new
> >>> addr_priv as this will pass along the lldb_private::Address that is
> >>> inside the SBAddress (which is what we want), and not always change it
> >>> into something that becomes a load address (if we are running), or an
> >>> ambiguous file address (think address zero when you have 150 shared
> >>> libraries that have sections that start at zero, which one would you
> >>> pick). The main reason for passing a section offset address to
> >>> SBTarget::ReadMemory() is so you _can_ read from the actual section +
> >>> offset that is specified in the SBAddress.
> >>>
> >>>
> >>>
> >>> Modified:
> >>> lldb/trunk/source/API/SBTarget.cpp
> >>> lldb/trunk/test/python_api/target/TestTargetAPI.py
> >>>
> >>> Modified: lldb/trunk/source/API/SBTarget.cpp
> >>> URL:
> >>> http://llvm.org/viewvc/llvm-project/lldb/trunk/source/API/SBTarget.cpp?rev=221213&r1=221212&r2=221213&view=diff
> >>> ==============================================================================
> >>> --- lldb/trunk/source/API/SBTarget.cpp (original)
> >>> +++ lldb/trunk/source/API/SBTarget.cpp Mon Nov 3 18:56:30 2014
> >>> @@ -1306,13 +1306,11 @@ SBTarget::ReadMemory (const SBAddress ad
> >>> if (target_sp)
> >>> {
> >>> Mutex::Locker api_locker (target_sp->GetAPIMutex());
> >>> - lldb_private::Address addr_priv(addr.GetFileAddress(), NULL);
> >>> - lldb_private::Error err_priv;
> >>> - bytes_read = target_sp->ReadMemory(addr_priv, false, buf, size,
> >>> err_priv);
> >>> - if(err_priv.Fail())
> >>> - {
> >>> - sb_error.SetError(err_priv.GetError(), err_priv.GetType());
> >>> - }
> >>> + bytes_read = target_sp->ReadMemory(addr.ref(), false, buf, size,
> >>> sb_error.ref());
> >>> + }
> >>> + else
> >>> + {
> >>> + sb_error.SetErrorString("invalid target");
> >>> }
> >>>
> >>> return bytes_read;
> >>>
> >>> Modified: lldb/trunk/test/python_api/target/TestTargetAPI.py
> >>> URL:
> >>> http://llvm.org/viewvc/llvm-project/lldb/trunk/test/python_api/target/TestTargetAPI.py?rev=221213&r1=221212&r2=221213&view=diff
> >>> ==============================================================================
> >>> --- lldb/trunk/test/python_api/target/TestTargetAPI.py (original)
> >>> +++ lldb/trunk/test/python_api/target/TestTargetAPI.py Mon Nov 3
> >>> 18:56:30 2014
> >>> @@ -213,16 +213,20 @@ class TargetAPITestCase(TestBase):
> >>> breakpoint = target.BreakpointCreateByLocation("main.c",
> >>> self.line_main)
> >>> self.assertTrue(breakpoint, VALID_BREAKPOINT)
> >>>
> >>> + # Put debugger into synchronous mode so when we
> >>> target.LaunchSimple returns
> >>> + # it will guaranteed to be at the breakpoint
> >>> + self.dbg.SetAsync(False)
> >>> +
> >>> # Launch the process, and do not stop at the entry point.
> >>> process = target.LaunchSimple (None, None,
> >>> self.get_process_working_directory())
> >>>
> >>> # find the file address in the .data section of the main
> >>> # module
> >>> data_section = self.find_data_section(target)
> >>> - data_section_addr = data_section.file_addr
> >>> - a = target.ResolveFileAddress(data_section_addr)
> >>> -
> >>> - content = target.ReadMemory(a, 1, lldb.SBError())
> >>> + sb_addr = lldb.SBAddress(data_section, 0)
> >>> + error = lldb.SBError()
> >>> + content = target.ReadMemory(sb_addr, 1, error)
> >>> + self.assertTrue(error.Success(), "Make sure memory read
> >>> succeeded")
> >>> self.assertEquals(len(content), 1)
> >>>
> >>> def create_simple_target(self, fn):
> >>>
> >>>
> >>> _______________________________________________
> >>> lldb-commits mailing list
> >>> [email protected]
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits