The first issue is tricky. I'll have a look at your dump and play around with a few ideas, and I'll let you know what I come up with.
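For what it's worth, here is a rough sketch (Python, with every value copied by hand from the swig dump quoted below) that checks whether the blobs referenced by the load commands fall inside the __LINKEDIT file range. The 16-byte nlist_64 entry size and the 4-byte indirect symbol entries are the usual 64-bit Mach-O sizes; in_linkedit is just a name I made up for illustration, not anything in LLDB.

```python
# Values copied by hand from the quoted mach_o.py dump of /usr/local/bin/swig.
LINKEDIT_FILEOFF = 0x000F2000
LINKEDIT_FILESIZE = 0x00004380

NLIST_64_SIZE = 16       # size of one 64-bit symbol table entry (struct nlist_64)
INDIRECT_ENTRY_SIZE = 4  # each indirect symbol table entry is a 32-bit index

# (name, file offset, size in bytes) for each blob the load commands point at.
blobs = [
    ("rebase info", 0x000F2000, 216),
    ("bind info", 0x000F20D8, 400),
    ("weak bind info", 0x000F2268, 48),
    ("lazy bind info", 0x000F2298, 1120),
    ("export info", 0x000F26F8, 32),
    ("function starts", 0x000F2718, 2464),
    ("symbol table", 0x000F30D8, 82 * NLIST_64_SIZE),
    ("indirect symbols", 0x000F35F8, 152 * INDIRECT_ENTRY_SIZE),
    ("string table", 0x000F3858, 944),
    ("code signature", 0x000F3C10, 10096),
]


def in_linkedit(offset, size):
    """Hypothetical helper: is [offset, offset + size) inside __LINKEDIT's file range?"""
    end = LINKEDIT_FILEOFF + LINKEDIT_FILESIZE
    return LINKEDIT_FILEOFF <= offset and offset + size <= end


for name, offset, size in blobs:
    inside = in_linkedit(offset, size)
    print(f"{name:18s} 0x{offset:08x}..0x{offset + size:08x} in __LINKEDIT: {inside}")
```

For this particular binary every range lands inside __LINKEDIT, and the code signature ends exactly at fileoff + filesize, which matches the "all bits must be spoken for" description.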
> I am thinking we need to fix the MachO file producer in llvm/clang to make
> a __LINKEDIT segment.

I'll see what I can do about that.

On Thu, Jun 5, 2014 at 1:38 PM, Greg Clayton <[email protected]> wrote:

>
> > On Jun 5, 2014, at 9:47 AM, Keno Fischer <[email protected]> wrote:
> >
> > > - What is updateSectionLoadAddress(...) doing when it checks
> > >   "if (section_sp->GetFileAddress() > 0x100000)"?
> >
> > LLVM allocates sections with relocations outside of the actual symbol
> > file and then updates the section vmaddr accordingly. What this code does
> > is basically traverse through the section tree and for every leaf section
> > adjust the load address accordingly. The only problem is that LLVM doesn't
> > actually relocate all sections, so we have to have some kind of check to
> > determine whether the section was relocated or not. The condition in there
> > right now is a stop gap and I'd like to come up with something more
> > reasonable (I meant to ask about that in the initial review). I think some
> > sort of comparison to the file size would be appropriate, but I don't know
> > enough about Mach O object files to know about the relation of vmaddr and
> > file offset. Any ideas?
>
> The file size can be zero for BSS sections, so the file size doesn't
> necessarily correlate with the vmsize. What is the file type of the mach-o
> file? In the mach header there is a "filetype" field. I have attached a raw
> macho dump of the load commands for the "swig" executable:
>
> % mach_o.py `which swig`
> 0x00000000: /usr/local/bin/swig (x86_64)
> Mach Header
>        magic: 0xfeedfacf MH_MAGIC_64
>      cputype: 0x01000007 x86_64
>   cpusubtype: 0x80000003
>     filetype: 0x00000002 MH_EXECUTE
>        ncmds: 0x00000012 18
>   sizeofcmds: 0x000007d0
>        flags: 0x00210085 MH_NOUNDEFS | MH_DYLDLINK | MH_TWOLEVEL | MH_BINDS_TO_WEAK | MH_PIE
>
>                                    VMADDR             VMSIZE             FILEOFF            FILESIZE           PROTECT
> 0x00000020: <0x0048> LC_SEGMENT_64 0x0000000000000000 0x0000000100000000 0x0000000000000000 0x0000000000000000 --- --- 0 0x00000000 __PAGEZERO
> 0x00000068: <0x02c8> LC_SEGMENT_64 0x0000000100000000 0x00000000000ed000 0x0000000000000000 0x00000000000ed000 rwx r-x 8 0x00000000 __TEXT
> 0x00000330: <0x02c8> LC_SEGMENT_64 0x00000001000ed000 0x0000000000009000 0x00000000000ed000 0x0000000000005000 rwx rw- 8 0x00000000 __DATA
> 0x000005f8: <0x0048> LC_SEGMENT_64 0x00000001000f6000 0x0000000000006000 0x00000000000f2000 0x0000000000004380 rwx r-- 0 0x00000000 __LINKEDIT
> 0x00000640: <0x0030> LC_DYLD_INFO_ONLY rebase_off = 0x000f2000, rebase_size = 216, bind_off = 0x000f20d8, bind_size = 400, weak_bind_off = 0x000f2268, weak_bind_size = 48, lazy_bind_off = 0x000f2298, lazy_bind_size = 1120, export_off = 0x000f26f8, export_size = 32,
> 0x00000670: <0x0018> LC_SYMTAB symoff = 0x000f30d8, nsyms = 82, stroff = 0x000f3858, strsize = 944
> 0x00000688: <0x0050> LC_DYSYMTAB      ilocalsym = 0         ,    nlocalsym = 1
>                                      iextdefsym = 1         ,   nextdefsym = 1
>                                       iundefsym = 2         ,    nundefsym = 80
>                                          tocoff = 0x00000000,         ntoc = 0
>                                       modtaboff = 0x00000000,      nmodtab = 0
>                                    extrefsymoff = 0x00000000,  nextrefsyms = 0
>                                  indirectsymoff = 0x000f35f8, nindirectsyms = 152
>                                       extreloff = 0x00000000,      nextrel = 0
>                                       locreloff = 0x00000000,      nlocrel = 0
> 0x000006d8: <0x0020> LC_LOAD_DYLINKER /usr/lib/dyld
> 0x000006f8: <0x0018> LC_UUID f0c6b9ae-2ab8-3305-9746-f7275e37cc94
> 0x00000710: <0x0010> LC_VERSION_MIN_MACOSX
> 0x00000720: <0x0010> 0x0000002a
> 0x00000730: <0x0018> 0x80000028
> 0x00000748: <0x0030> LC_LOAD_DYLIB 0x00000002 0x00780000 0x00010000 /usr/lib/libc++.1.dylib
> 0x00000778: <0x0038> LC_LOAD_DYLIB 0x00000002 0x04bc0000 0x00010000 /usr/lib/libSystem.B.dylib
> 0x000007b0: <0x0010> LC_FUNCTION_STARTS dataoff = 0x000f2718, datasize = 2464
> 0x000007c0: <0x0010> 0x00000029
> 0x000007d0: <0x0010> 0x0000002b
> 0x000007e0: <0x0010> LC_CODE_SIGNATURE dataoff = 0x000f3c10, datasize = 10096
>
> And the sections look like:
>
> INDEX ADDRESS            SIZE               OFFSET     ALIGN      RELOFF     NRELOC     FLAGS      RESERVED1  RESERVED2  RESERVED3  NAME
> ===== ------------------ ------------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------------------
> [  1] 0x0000000100000e10 0x00000000000c1852 0x00000e10 0x00000004 0x00000000 0x00000000 0x80000400 0x00000000 0x00000000 0x00000000 __TEXT.__text
> [  2] 0x00000001000c2662 0x00000000000001aa 0x000c2662 0x00000001 0x00000000 0x00000000 0x80000408 0x00000000 0x00000006 0x00000000 __TEXT.__stubs
> [  3] 0x00000001000c280c 0x00000000000002d6 0x000c280c 0x00000002 0x00000000 0x00000000 0x80000400 0x00000000 0x00000000 0x00000000 __TEXT.__stub_helper
> [  4] 0x00000001000c2af0 0x0000000000023288 0x000c2af0 0x00000004 0x00000000 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 __TEXT.__cstring
> [  5] 0x00000001000e5d80 0x00000000000054a0 0x000e5d80 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__const
> [  6] 0x00000001000eb220 0x0000000000000128 0x000eb220 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__ustring
> [  7] 0x00000001000eb348 0x0000000000000ac4 0x000eb348 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__gcc_except_tab
> [  8] 0x00000001000ebe0c 0x00000000000011f0 0x000ebe0c 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__unwind_info
> [  9] 0x00000001000ed000 0x0000000000000040 0x000ed000 0x00000003 0x00000000 0x00000000 0x00000006 0x00000047 0x00000000 0x00000000 __DATA.__got
> [ 10] 0x00000001000ed040 0x0000000000000010 0x000ed040 0x00000003 0x00000000 0x00000000 0x00000006 0x0000004f 0x00000000 0x00000000 __DATA.__nl_symbol_ptr
> [ 11] 0x00000001000ed050 0x0000000000000238 0x000ed050 0x00000003 0x00000000 0x00000000 0x00000007 0x00000051 0x00000000 0x00000000 __DATA.__la_symbol_ptr
> [ 12] 0x00000001000ed288 0x0000000000000030 0x000ed288 0x00000003 0x00000000 0x00000000 0x00000009 0x00000000 0x00000000 0x00000000 __DATA.__mod_init_func
> [ 13] 0x00000001000ed2c0 0x0000000000001908 0x000ed2c0 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__const
> [ 14] 0x00000001000eebd0 0x0000000000002fd4 0x000eebd0 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__data
> [ 15] 0x00000001000f1bb0 0x00000000000000f8 0x00000000 0x00000004 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 __DATA.__common
> [ 16] 0x00000001000f1cb0 0x0000000000003860 0x00000000 0x00000004 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 __DATA.__bss
>
> Not sure if this helps you see anything?
>
> > > - Why are we preloading everything with the code: ...
> >
> > Yes, you're right, that was for debugging and slipped past my cleanup.
>
> Ah, phew!
>
> > > - Your fix to ObjectFileMachO.cpp is not correct...
> >
> > The code in if (process) doesn't do anything if we don't have a
> > linkedit_section_sp. Maybe we need to duplicate that code in an else block
> > for linkedit_section_sp ...
>
> I am thinking we need to fix the MachO file producer in llvm/clang to make
> a __LINKEDIT segment. The __LINKEDIT segment contains anything that isn't
> in any other section that isn't needed for running. It is just a bunch of
> linker bits like the symbol table, string table, compact unwind info and
> more. So all bits in a mach-o file must be spoken for and must be in a
> segment. The mach-o file starts with a bunch of load commands (as you can
> see above in the swig dump).
>
> The LC_SYMTAB load command contains information about the symbol table and
> it contains:
>
> 0x00000670: <0x0018> LC_SYMTAB symoff = 0x000f30d8, nsyms = 82, stroff = 0x000f3858, strsize = 944
>
> This tells us the symbol table offset in the file (offset from the start
> of the mach header) and the size, and the string table offset + size. The
> symbol table and string table should be in a __LINKEDIT segment.
>
> Note there are other load commands that point to data in the __LINKEDIT
> segment:
>
> 0x00000640: <0x0030> LC_DYLD_INFO_ONLY rebase_off = 0x000f2000, rebase_size = 216, bind_off = 0x000f20d8, bind_size = 400, weak_bind_off = 0x000f2268, weak_bind_size = 48, lazy_bind_off = 0x000f2298, lazy_bind_size = 1120, export_off = 0x000f26f8, export_size = 32,
> 0x00000688: <0x0050> LC_DYSYMTAB      ilocalsym = 0         ,    nlocalsym = 1
>                                      iextdefsym = 1         ,   nextdefsym = 1
>                                       iundefsym = 2         ,    nundefsym = 80
>                                          tocoff = 0x00000000,         ntoc = 0
>                                       modtaboff = 0x00000000,      nmodtab = 0
>                                    extrefsymoff = 0x00000000,  nextrefsyms = 0
>                                  indirectsymoff = 0x000f35f8, nindirectsyms = 152
>                                       extreloff = 0x00000000,      nextrel = 0
>                                       locreloff = 0x00000000,      nlocrel = 0
>
> 0x000007b0: <0x0010> LC_FUNCTION_STARTS dataoff = 0x000f2718, datasize = 2464
> 0x000007e0: <0x0010> LC_CODE_SIGNATURE dataoff = 0x000f3c10, datasize = 10096
>
> > Thank you for your comments. I'm learning as I'm going here.
>
> No worries, I can definitely help out with getting this ready, a few more
> iterations and we should be good.
>
> >
> > Keno
> >
> > On Thu, Jun 5, 2014 at 12:37 PM, Greg Clayton <[email protected]> wrote:
> > Can you explain a few things?:
> >
> > - What is updateSectionLoadAddress(...) doing when it checks
> >   "if (section_sp->GetFileAddress() > 0x100000)"?
> > - Why are we preloading everything with the code:
> >
> > // load the symbol table right away
> > module_sp->GetObjectFile()->GetSymtab();
> >
> > module_sp->GetSymbolVendor()->GetNumCompileUnits();
> > module_sp->GetSymbolVendor()->GetCompileUnitAtIndex(0);
> > module_sp->ParseAllDebugSymbols();
> >
> > This seems like we should just let it load things lazily. Parsing all
> > debug symbols is not advised, it should be allowed to lazily parse the
> > DWARF as it needs to.
> >
> > - Your fix to ObjectFileMachO.cpp is not correct. If we have a process,
> > then we load the symbol table from memory (the code in the "if (process)"),
> > else we load it from the load commands (in the "else") and from the file
> > itself. We don't want to always load the symbol table from the load
> > commands as the symtab_load_command.symoff and symtab_load_command.stroff
> > are not correct when a mach-o file is being read from memory.
> >
> > > On Jun 3, 2014, at 9:46 AM, Keno Fischer <[email protected]> wrote:
> > >
> > > This is the LLDB side of http://reviews.llvm.org/D4005
> > >
> > > http://reviews.llvm.org/D4006
> > >
> > > Files:
> > >   lib/Makefile
> > >   source/Core/Section.cpp
> > >   source/Plugins/JITLoader/GDB/JITLoaderGDB.cpp
> > >   source/Plugins/Makefile
> > >   source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
> > > <D4006.10055.patch>_______________________________________________
> > > lldb-commits mailing list
> > > [email protected]
> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
> > >
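On the BSS point above: a small sketch, again with values copied by hand from the swig dump, of why a section's file offset and size don't have to track its vmaddr and vmsize. SECTION_TYPE (0xff) and S_ZEROFILL (0x1) are the constants from <mach-o/loader.h>; is_zerofill is a hypothetical helper for illustration, not LLDB API, and I'm not proposing it as the actual check for updateSectionLoadAddress.

```python
SECTION_TYPE_MASK = 0x000000FF  # SECTION_TYPE in <mach-o/loader.h>
S_ZEROFILL = 0x1                # zero-fill-on-demand section, no bytes in the file

# (name, file offset, flags) copied from the quoted section table.
sections = [
    ("__DATA.__const", 0x000ED2C0, 0x00000000),
    ("__DATA.__data", 0x000EEBD0, 0x00000000),
    ("__DATA.__common", 0x00000000, 0x00000001),
    ("__DATA.__bss", 0x00000000, 0x00000001),
]

# __DATA segment sizes from the LC_SEGMENT_64 load command in the same dump.
DATA_VMSIZE = 0x9000
DATA_FILESIZE = 0x5000


def is_zerofill(flags):
    """Hypothetical helper: does the section type say 'no file contents'?"""
    return (flags & SECTION_TYPE_MASK) == S_ZEROFILL


for name, offset, flags in sections:
    print(f"{name:18s} file offset 0x{offset:08x} zerofill: {is_zerofill(flags)}")

print(f"__DATA vmsize - filesize = 0x{DATA_VMSIZE - DATA_FILESIZE:x}")
```

Here __common and __bss report a file offset of 0 while still having real addresses, and __DATA occupies 0x4000 more bytes in memory than in the file.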
