Re: [Lldb-commits] [lldb] r221213 - Fixed SBTarget::ReadMemory() to work correctly and the TestTargetAPI.py test case that was reading target memory in TargetAPITestCase.test_read_memory_with_dsym and TargetAPITestCase.test_read_memory_with_dwarf.

Matthew Gardiner Wed, 05 Nov 2014 03:45:41 -0800

On Tue, 2014-11-04 at 14:37 -0800, Jason Molenda wrote:
> FWIW the use of an Address object to represent addresses was motivated by the 
> hassle of address handling we had with gdb.  If you start the debugger and 
> give it an executable and a bunch of solibs, set an address breakpoint, and 
> then run the process, how does that address breakpoint get re-set to the 
> correct place when the executable and all the solibs land at their final 
> address?  What if multiple solibs are at the same addr (say 0x0) before we 
> start execution?  (on Mac OS X solibs don't have distinct virtual addresses 
> these days - they're all 0-based and the dynamic loader picks a spot in the 
> address space at runtime)
>


Yes, I believe that Linux does something very similar.


> The Address object represents addresses as a section + 
> offset-into-that-section, if it's within the bounds of a known binary (a 
> Module).  When the process starts up, lldb learns where all the solibs are 
> actually loaded from the dynamic loader -- it records where each Section is 
> loaded in memory as part of the Target.  So given an Address object, we can 
> get the actual address in real memory via the Target's section load list.
> 
> This abstraction solves a lot of subtle bugs we'd hit in gdb when an objfile 
> would shift in memory to a new address -- we'd have to go through all the 
> addresses that might have that old address and make sure they're updated 
> correctly to the new one.  With lldb we have the section load list in the 
> Target which has the current address for each Section.
> 

Yup. This abstraction makes a lot of sense.
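
To check my own understanding, here is a rough sketch of the idea in plain
Python. It is not lldb code: Section, Address and SectionLoadList are made-up
stand-ins for illustration, and the section name "data" is an assumption. The
numbers are the ones from Greg's mail quoted further down (file address
0x10001000, slide 0xef0000).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Section:
    """One section of one binary; file_addr is the address stored in the object file."""
    module: str
    name: str
    file_addr: int
    size: int


@dataclass(frozen=True)
class Address:
    """Section + offset, or a bare offset when no section is known."""
    section: Optional[Section]
    offset: int


class SectionLoadList:
    """Hypothetical per-target table of where each section currently sits in memory."""

    def __init__(self) -> None:
        self._load_base: Dict[Section, int] = {}

    def set_load_address(self, section: Section, load_base: int) -> None:
        self._load_base[section] = load_base

    def resolve(self, addr: Address) -> Optional[int]:
        """Return the load address for addr, or None if its section is not loaded."""
        if addr.section is None:
            return addr.offset  # no section: treated as a raw load address
        base = self._load_base.get(addr.section)
        return None if base is None else base + addr.offset


data = Section("a.out", "data", file_addr=0x10001000, size=0x1000)
addr = Address(data, 0)

loads = SectionLoadList()
before_launch = loads.resolve(addr)  # nothing is loaded yet

# The loader slides the section by 0xef0000.
loads.set_load_address(data, data.file_addr + 0xEF0000)
after_launch = loads.resolve(addr)

# If the section moves, only the table entry changes; addr itself is untouched.
loads.set_load_address(data, data.file_addr + 0x200000)
after_move = loads.resolve(addr)
```

The point of the sketch is the last step: nothing that holds an Address has to
be found and patched when a binary shifts, because only the table is updated.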

> When we're getting a real memory address (a "load address", or as Greg was 
> saying to me earlier, think of it as a "live address" if that helps) out of 
> the process (e.g. we read the pc register), usually the first thing we'll do 
> is convert that into an Address object (again, with help of the Target's 
> section load list) putting it in terms of a Section and an offset.
> 

Well, "live address" is possibly clearer, but I'm quite happy now that I
understand what's meant by lldb's load address terminology. And I
believe that Target::ResolveLoadAddress() will convert a raw ("live")
address to a section:offset address, based on what the dyld has done, with
a running process.
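
As I picture it, that conversion is a range lookup over the loaded sections.
The following is only a sketch of that picture in plain Python, not how lldb
implements it; resolve_load_address and the tuple layout are hypothetical, and
the library and section names are invented. The numbers reuse the examples
quoted below (slide 0x1000000 with foo at file address 0x1000, and the data
section loaded at 0x10ef1000).

```python
from typing import List, Optional, Tuple

# (module, section name, load base, size) for each section the loader has placed.
LoadedSection = Tuple[str, str, int, int]


def resolve_load_address(
    loaded: List[LoadedSection], load_addr: int
) -> Tuple[Optional[str], Optional[str], int]:
    """Map a raw process address to (module, section, offset).

    Addresses outside every known section (a heap pointer, say) come back
    with no module or section and the raw address as the offset.
    """
    for module, section, base, size in loaded:
        if base <= load_addr < base + size:
            return module, section, load_addr - base
    return None, None, load_addr


loaded = [
    ("libfoo.so", "text", 0x1000000, 0x400000),
    ("a.out", "data", 0x10EF1000, 0x1000),
]

pc = resolve_load_address(loaded, 0x1000000 + 0x1000)   # inside libfoo.so
var = resolve_load_address(loaded, 0x10EF1000)          # start of a.out's data
heap = resolve_load_address(loaded, 0x7F0000001000)     # not in any module
```

The last case matches Jason's description further down: an address that
belongs to no binary is kept as an offset alone, with no section.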

I had added Target::ResolveFileAddress() (to SBTarget and Target) in an
attempt to take a file_address and convert it to a section:offset address.
In hindsight I think this was a bad idea: a file_address is not
necessarily unique, since we may have address x in both module A and
module B....
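
A small sketch of the ambiguity, again in plain Python rather than lldb code.
The function name, the dictionary layout and the two library names are all
hypothetical; the only thing taken from the discussion is that shared
libraries commonly have sections based at file address 0.

```python
from typing import Dict, List, Tuple

# module name -> {section name: (file address of section start, size)}
Modules = Dict[str, Dict[str, Tuple[int, int]]]


def sections_containing_file_address(
    modules: Modules, file_addr: int
) -> List[Tuple[str, str, int]]:
    """Return every (module, section, offset) whose file range covers file_addr."""
    matches = []
    for module_name, sections in modules.items():
        for section_name, (start, size) in sections.items():
            if start <= file_addr < start + size:
                matches.append((module_name, section_name, file_addr - start))
    return matches


modules = {
    "libA.so": {"text": (0x0, 0x2000)},
    "libB.so": {"text": (0x0, 0x3000)},
}

candidates = sections_containing_file_address(modules, 0x1000)
```

With both libraries based at 0, file address 0x1000 yields two candidates, so
a lookup that takes only a file address has no way to pick the right one. The
caller would have to name the module as well.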

thanks for your explanation and patience
Matt


> When a real memory address in the process doesn't belong to any of the 
> binaries (Modules) -- for instance, a pointer into a heap allocation -- then 
> we can put it in an Address object but it's just an offset alone with no 
> section.  But addresses that correspond to a function or a symbol are 
> expressed in terms of the containing section and offset into that section.
> 
> 
> 
> > On Nov 4, 2014, at 10:18 AM, Greg Clayton <[email protected]> wrote:
> > 
> > 
> >> On Nov 3, 2014, at 10:53 PM, Matthew Gardiner <[email protected]> wrote:
> >> 
> >> Hi Greg,
> >> 
> >> So what in lldb's world is the difference between a file and a load
> >> address?
> > 
> > 
> > "file address" (as LLDB treats them) is the address that is found in the 
> > object file (ELF, MachO, or COFF). "load address" is the actual address in 
> > the process where the section is loaded after being slid.
> > 
> > For example you might have a shared library that has a function "foo" whose 
> > file address is 0x1000, but when the shared library gets loaded into 
> > memory, it will be loaded at a different address because all shared 
> > libraries have functions in the low file address range (say from 0 to 
> > 0x400000 for example). So if the shared library gets loaded with a slide of 
> > 0x1000000, foo will have a load address of 0x1000000 + 0x1000.
> > 
> > Now for most embedded debugging where there is no OS that will slide things 
> > around, your file and load addresses will match. For actual OS level 
> > debugging, they won't for shared libraries, and might for the main 
> > executable. Sometimes the main executable doesn't get slid around, but 
> > other OSs will use ASLR to slide the main executables around for security.
> > 
> > The test that was written was trying to get a data section from the object 
> > file. Then it made a section offset address that pointed to that data 
> > section. Then it called SBTarget::ReadMemory(...) with that section offset 
> > address. What was happening on MacOSX was:
> > 
> > - get the data section whose file address was 0x10001000
> > - Make a section + offset address from it that was represented as a.out's 
> > data section + 0 bytes
> > - call target read memory
> > 
> > Prior to my fix this happened:
> > 
> > SBTarget::ReadMemory() made a new section offset address:
> > 
> > Address address(section_offset_addr.GetFileAddress(), NULL);
> > 
> > Now we have an address that has no section. If such an address is passed to 
> > anything that is trying to read memory from a live process, an address with 
> > no section is considered to be a "load address". It will try and read from 
> > the "load address" 0x10001000. But on MacOSX, or any OS with ASLR, the data 
> > section was slid by a random amount (like 0xef0000). So we would try to 
> > read from "load address" 0x10001000 and it would fail. If we leave the 
> > address object as a section offset address (don't make a new address like 
> > we did above), we pass this address to a read memory function and it will 
> > resolve the section offset address into an address in the live process, or 
> > into a load address. This will be 0x10001000 + 0xef0000 + 0. The load 
> > address of the data section is 0x10001000 + 0xef0000 and the offset was 0. 
> > And the resulting memory it will read from in the process is 0x10ef1000. 
> > The old way it would have tried to read from 0x10001000 which was incorrect.
> > 
> > 
> >> In my world I consider the file and load addresses to be the
> >> same thing, that is, the address (not the file offset) of the symbol in
> >> the object file, e.g. when I objdump symbols and grep for ones I know of
> >> in a kalimba ELF, I get
> >> 
> >> 0000054f g       DM|0      00000000 $_g_matt1
> >> ...
> >> 
> >> 
> >> 0x54f as the file address of g_matt1.
> >> 
> >> 
> >> The other address terminology I hear of is "virtual address". To me this is
> >> the address of the symbol once the binary is actually running on the
> >> processor. So in some embedded scenarios (like kalimba where there is no
> >> OS) we have code addresses in the ELF (i.e. file/load address) all
> >> starting at 80000000 e.g.
> >> 
> >> 80000354 g     F PM|0      00000000 $_main
> >> 
> >> But on the device (since it's Harvard architecture with a CODE and DATA
> >> bus), main is actually at 0x0354. So in this context I'd say 0x80000354
> >> was the "load/file address" but 0x0354 was the virtual address. I see a
> >> similar scenario with Linux shared object files where in the file the
> >> symbol addresses are often based at 0, but at runtime are fixed-up to
> >> some arbitrary offset.   
> >> 
> >> Can you explain what lldb means by file/load/virtual and so on
> >> addresses?   
> > 
> > 
> > So in your terms:
> > 
> > virtual address is what we call the "load address".
> > file address means address as it is found in the object file you loaded it 
> > from.
> > 
> > Many object files speak of a virtual address when they are speaking of file 
> > addresses, so I didn't think "virtual address" made as much sense as "load 
> > address". For example, the Mach-O segments have "vmaddr" and "vmsize" 
> > fields, which stand for virtual address and virtual size.
> > 
> > Hope that clears things up. 
> > 
> >> thanks    
> >> Matt
> >> 
> >> 
> >> 
> >> On Tue, 2014-11-04 at 00:56 +0000, Greg Clayton wrote:
> >>> Author: gclayton
> >>> Date: Mon Nov  3 18:56:30 2014
> >>> New Revision: 221213
> >>> 
> >>> URL: http://llvm.org/viewvc/llvm-project?rev=221213&view=rev
> >>> Log:
> >>> Fixed SBTarget::ReadMemory() to work correctly and the TestTargetAPI.py 
> >>> test case that was reading target memory in 
> >>> TargetAPITestCase.test_read_memory_with_dsym and 
> >>> TargetAPITestCase.test_read_memory_with_dwarf.
> >>> 
> >>> The problem was that SBTarget::ReadMemory() was making a new section 
> >>> offset lldb_private::Address by doing:
> >>> 
> >>> 
> >>> size_t
> >>> SBTarget::ReadMemory (const SBAddress addr,
> >>>                     void *buf,
> >>>                     size_t size,
> >>>                     lldb::SBError &error)
> >>> {
> >>>       ...
> >>>       lldb_private::Address addr_priv(addr.GetFileAddress(), NULL);
> >>>       bytes_read = target_sp->ReadMemory(addr_priv, false, buf, size, 
> >>> err_priv);
> >>> 
> >>> 
> >>> This is wrong. If you get the file address from the "addr" argument and 
> >>> try to read memory using that, it will think the file address is a load 
> >>> address and it will try to resolve it accordingly. This will work fine if 
> >>> your executable is loaded at the same address (no slide), but it won't 
> >>> work if there is a slide.
> >>> 
> >>> The fix is to just pass along the "addr.ref()" instead of making a new 
> >>> addr_priv as this will pass along the lldb_private::Address that is 
> >>> inside the SBAddress (which is what we want), and not always change it 
> >>> into something that becomes a load address (if we are running), or an 
> >>> ambiguous file address (think address zero when you have 150 shared 
> >>> libraries that have sections that start at zero, which one would you 
> >>> pick). The main reason for passing a section offset address to 
> >>> SBTarget::ReadMemory() is so you _can_ read from the actual section + 
> >>> offset that is specified in the SBAddress. 
> >>> 
> >>> 
> >>> 
> >>> Modified:
> >>>   lldb/trunk/source/API/SBTarget.cpp
> >>>   lldb/trunk/test/python_api/target/TestTargetAPI.py
> >>> 
> >>> Modified: lldb/trunk/source/API/SBTarget.cpp
> >>> URL: 
> >>> http://llvm.org/viewvc/llvm-project/lldb/trunk/source/API/SBTarget.cpp?rev=221213&r1=221212&r2=221213&view=diff
> >>> ==============================================================================
> >>> --- lldb/trunk/source/API/SBTarget.cpp (original)
> >>> +++ lldb/trunk/source/API/SBTarget.cpp Mon Nov  3 18:56:30 2014
> >>> @@ -1306,13 +1306,11 @@ SBTarget::ReadMemory (const SBAddress ad
> >>>    if (target_sp)
> >>>    {
> >>>        Mutex::Locker api_locker (target_sp->GetAPIMutex());
> >>> -        lldb_private::Address addr_priv(addr.GetFileAddress(), NULL);
> >>> -        lldb_private::Error err_priv;    
> >>> -        bytes_read = target_sp->ReadMemory(addr_priv, false, buf, size, 
> >>> err_priv);
> >>> -        if(err_priv.Fail())
> >>> -        {
> >>> -            sb_error.SetError(err_priv.GetError(), err_priv.GetType());
> >>> -        }
> >>> +        bytes_read = target_sp->ReadMemory(addr.ref(), false, buf, size, 
> >>> sb_error.ref());
> >>> +    }
> >>> +    else
> >>> +    {
> >>> +        sb_error.SetErrorString("invalid target");
> >>>    }
> >>> 
> >>>    return bytes_read;
> >>> 
> >>> Modified: lldb/trunk/test/python_api/target/TestTargetAPI.py
> >>> URL: 
> >>> http://llvm.org/viewvc/llvm-project/lldb/trunk/test/python_api/target/TestTargetAPI.py?rev=221213&r1=221212&r2=221213&view=diff
> >>> ==============================================================================
> >>> --- lldb/trunk/test/python_api/target/TestTargetAPI.py (original)
> >>> +++ lldb/trunk/test/python_api/target/TestTargetAPI.py Mon Nov  3 
> >>> 18:56:30 2014
> >>> @@ -213,16 +213,20 @@ class TargetAPITestCase(TestBase):
> >>>        breakpoint = target.BreakpointCreateByLocation("main.c", 
> >>> self.line_main)
> >>>        self.assertTrue(breakpoint, VALID_BREAKPOINT)
> >>> 
> >>> +        # Put debugger into synchronous mode so when we 
> >>> target.LaunchSimple returns
> >>> +        # it will guaranteed to be at the breakpoint
> >>> +        self.dbg.SetAsync(False)
> >>> +        
> >>>        # Launch the process, and do not stop at the entry point.
> >>>        process = target.LaunchSimple (None, None, 
> >>> self.get_process_working_directory())
> >>> 
> >>>        # find the file address in the .data section of the main
> >>>        # module            
> >>>        data_section = self.find_data_section(target)
> >>> -        data_section_addr = data_section.file_addr
> >>> -        a = target.ResolveFileAddress(data_section_addr)
> >>> -
> >>> -        content = target.ReadMemory(a, 1, lldb.SBError())
> >>> +        sb_addr = lldb.SBAddress(data_section, 0)
> >>> +        error = lldb.SBError()
> >>> +        content = target.ReadMemory(sb_addr, 1, error)
> >>> +        self.assertTrue(error.Success(), "Make sure memory read 
> >>> succeeded")
> >>>        self.assertEquals(len(content), 1)
> >>> 
> >>>    def create_simple_target(self, fn):
> >>> 
> >>> 
> >>> _______________________________________________
> >>> lldb-commits mailing list
> >>> [email protected]
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
> >>> 
> >>> 
> >> 
> >> 
> >> 
> >> 
> > 
> > 
> 

