I missed your question about Mach-O file types. These are all MH_OBJECT files.
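
In case it helps anyone reproduce the check, here is a minimal sketch (not LLDB code) of reading the "filetype" field from a 64-bit mach header. The function name read_filetype and the hand-packed sample header are assumptions made for illustration; the header values are copied from the swig dump quoted below, and the constants are the ones defined in <mach-o/loader.h>.

```python
import struct

# Constants from <mach-o/loader.h>.
MH_MAGIC_64 = 0xFEEDFACF
FILETYPE_NAMES = {0x1: "MH_OBJECT", 0x2: "MH_EXECUTE", 0x6: "MH_DYLIB"}

# mach_header_64 layout: magic, cputype, cpusubtype, filetype,
# ncmds, sizeofcmds, flags, reserved (eight 32-bit fields, 32 bytes).
HEADER_64 = struct.Struct("<8I")


def read_filetype(data):
    """Return (filetype, name) for a little-endian 64-bit Mach-O header.

    Hypothetical helper for illustration only.
    """
    if len(data) < HEADER_64.size:
        raise ValueError("buffer too small for mach_header_64")
    magic, _cputype, _cpusubtype, filetype, *_rest = HEADER_64.unpack_from(data)
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O header")
    return filetype, FILETYPE_NAMES.get(filetype, "unknown")


# Header values copied from the swig dump (an MH_EXECUTE), packed by hand.
swig_header = HEADER_64.pack(
    0xFEEDFACF, 0x01000007, 0x80000003, 0x2, 18, 0x7D0, 0x00210085, 0
)
print(read_filetype(swig_header))
```

The JIT-produced files discussed here would report MH_OBJECT (0x1) in the same field.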

On Thu, Jun 5, 2014 at 1:43 PM, Keno Fischer <[email protected]> wrote:
> The first issue is tricky, I'll have a look at your dump and play around
> with a few ideas. I'll let you know what I come up with.
>
> > I am thinking we need to fix the MachO file producer in llvm/clang to
> > make a __LINKEDIT segment.
>
> I'll see what I can do about that.
>
>
> On Thu, Jun 5, 2014 at 1:38 PM, Greg Clayton <[email protected]> wrote:
>
>> > On Jun 5, 2014, at 9:47 AM, Keno Fischer <[email protected]> wrote:
>> >
>> > > - What is updateSectionLoadAddress(...) doing when it checks
>> > > "if (section_sp->GetFileAddress() > 0x100000)"?
>> >
>> > LLVM allocates sections with relocations outside of the actual symbol
>> > file and then updates the section vmaddr accordingly. What this code
>> > does is basically traverse through the section tree and for every leaf
>> > section adjust the load address accordingly. The only problem is that
>> > LLVM doesn't actually relocate all sections, so we have to have some
>> > kind of check to determine whether the section was relocated or not.
>> > The condition in there right now is a stopgap and I'd like to come up
>> > with something more reasonable (I meant to ask about that in the
>> > initial review). I think some sort of comparison to the file size would
>> > be appropriate, but I don't know enough about Mach-O object files to
>> > know about the relation of vmaddr and file offset. Any ideas?
>>
>> The file size can be zero for BSS sections, so the file size doesn't
>> necessarily correlate with the vmsize. What is the file type of the mach-o
>> file? In the mach header there is a "filetype" field. I have attached a raw
>> macho dump of the load commands for the "swig" executable:
>>
>>
>> % mach_o.py `which swig`
>> 0x00000000: /usr/local/bin/swig (x86_64)
>> Mach Header
>>        magic: 0xfeedfacf MH_MAGIC_64
>>      cputype: 0x01000007 x86_64
>>   cpusubtype: 0x80000003
>>     filetype: 0x00000002 MH_EXECUTE
>>        ncmds: 0x00000012 18
>>   sizeofcmds: 0x000007d0
>>        flags: 0x00210085 MH_NOUNDEFS | MH_DYLDLINK | MH_TWOLEVEL | MH_BINDS_TO_WEAK | MH_PIE
>>
>>                                    VMADDR             VMSIZE             FILEOFF            FILESIZE           PROTECT
>> 0x00000020: <0x0048> LC_SEGMENT_64 0x0000000000000000 0x0000000100000000 0x0000000000000000 0x0000000000000000 --- --- 0 0x00000000 __PAGEZERO
>> 0x00000068: <0x02c8> LC_SEGMENT_64 0x0000000100000000 0x00000000000ed000 0x0000000000000000 0x00000000000ed000 rwx r-x 8 0x00000000 __TEXT
>> 0x00000330: <0x02c8> LC_SEGMENT_64 0x00000001000ed000 0x0000000000009000 0x00000000000ed000 0x0000000000005000 rwx rw- 8 0x00000000 __DATA
>> 0x000005f8: <0x0048> LC_SEGMENT_64 0x00000001000f6000 0x0000000000006000 0x00000000000f2000 0x0000000000004380 rwx r-- 0 0x00000000 __LINKEDIT
>> 0x00000640: <0x0030> LC_DYLD_INFO_ONLY rebase_off = 0x000f2000, rebase_size = 216, bind_off = 0x000f20d8, bind_size = 400, weak_bind_off = 0x000f2268, weak_bind_size = 48, lazy_bind_off = 0x000f2298, lazy_bind_size = 1120, export_off = 0x000f26f8, export_size = 32
>> 0x00000670: <0x0018> LC_SYMTAB symoff = 0x000f30d8, nsyms = 82, stroff = 0x000f3858, strsize = 944
>> 0x00000688: <0x0050> LC_DYSYMTAB ilocalsym = 0, nlocalsym = 1
>>                                  iextdefsym = 1, nextdefsym = 1
>>                                  iundefsym = 2, nundefsym = 80
>>                                  tocoff = 0x00000000, ntoc = 0
>>                                  modtaboff = 0x00000000, nmodtab = 0
>>                                  extrefsymoff = 0x00000000, nextrefsyms = 0
>>                                  indirectsymoff = 0x000f35f8, nindirectsyms = 152
>>                                  extreloff = 0x00000000, nextrel = 0
>>                                  locreloff = 0x00000000, nlocrel = 0
>> 0x000006d8: <0x0020> LC_LOAD_DYLINKER /usr/lib/dyld
>> 0x000006f8: <0x0018> LC_UUID f0c6b9ae-2ab8-3305-9746-f7275e37cc94
>> 0x00000710: <0x0010> LC_VERSION_MIN_MACOSX
>> 0x00000720: <0x0010> 0x0000002a
>> 0x00000730: <0x0018> 0x80000028
>> 0x00000748: <0x0030> LC_LOAD_DYLIB 0x00000002 0x00780000 0x00010000 /usr/lib/libc++.1.dylib
>> 0x00000778: <0x0038> LC_LOAD_DYLIB 0x00000002 0x04bc0000 0x00010000 /usr/lib/libSystem.B.dylib
>> 0x000007b0: <0x0010> LC_FUNCTION_STARTS dataoff = 0x000f2718, datasize = 2464
>> 0x000007c0: <0x0010> 0x00000029
>> 0x000007d0: <0x0010> 0x0000002b
>> 0x000007e0: <0x0010> LC_CODE_SIGNATURE dataoff = 0x000f3c10, datasize = 10096
>>
>>
>> And the sections look like:
>>
>> INDEX ADDRESS            SIZE               OFFSET     ALIGN      RELOFF     NRELOC     FLAGS      RESERVED1  RESERVED2  RESERVED3  NAME
>> ===== ------------------ ------------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------------------
>> [  1] 0x0000000100000e10 0x00000000000c1852 0x00000e10 0x00000004 0x00000000 0x00000000 0x80000400 0x00000000 0x00000000 0x00000000 __TEXT.__text
>> [  2] 0x00000001000c2662 0x00000000000001aa 0x000c2662 0x00000001 0x00000000 0x00000000 0x80000408 0x00000000 0x00000006 0x00000000 __TEXT.__stubs
>> [  3] 0x00000001000c280c 0x00000000000002d6 0x000c280c 0x00000002 0x00000000 0x00000000 0x80000400 0x00000000 0x00000000 0x00000000 __TEXT.__stub_helper
>> [  4] 0x00000001000c2af0 0x0000000000023288 0x000c2af0 0x00000004 0x00000000 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 __TEXT.__cstring
>> [  5] 0x00000001000e5d80 0x00000000000054a0 0x000e5d80 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__const
>> [  6] 0x00000001000eb220 0x0000000000000128 0x000eb220 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__ustring
>> [  7] 0x00000001000eb348 0x0000000000000ac4 0x000eb348 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__gcc_except_tab
>> [  8] 0x00000001000ebe0c 0x00000000000011f0 0x000ebe0c 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__unwind_info
>> [  9] 0x00000001000ed000 0x0000000000000040 0x000ed000 0x00000003 0x00000000 0x00000000 0x00000006 0x00000047 0x00000000 0x00000000 __DATA.__got
>> [ 10] 0x00000001000ed040 0x0000000000000010 0x000ed040 0x00000003 0x00000000 0x00000000 0x00000006 0x0000004f 0x00000000 0x00000000 __DATA.__nl_symbol_ptr
>> [ 11] 0x00000001000ed050 0x0000000000000238 0x000ed050 0x00000003 0x00000000 0x00000000 0x00000007 0x00000051 0x00000000 0x00000000 __DATA.__la_symbol_ptr
>> [ 12] 0x00000001000ed288 0x0000000000000030 0x000ed288 0x00000003 0x00000000 0x00000000 0x00000009 0x00000000 0x00000000 0x00000000 __DATA.__mod_init_func
>> [ 13] 0x00000001000ed2c0 0x0000000000001908 0x000ed2c0 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__const
>> [ 14] 0x00000001000eebd0 0x0000000000002fd4 0x000eebd0 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__data
>> [ 15] 0x00000001000f1bb0 0x00000000000000f8 0x00000000 0x00000004 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 __DATA.__common
>> [ 16] 0x00000001000f1cb0 0x0000000000003860 0x00000000 0x00000004 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 __DATA.__bss
>>
>> Not sure if this helps you see anything?
>>
>> >
>> > > - Why are we preloading everything with the code: ...
>> >
>> > Yes, you're right, that was for debugging and slipped past my cleanup.
>>
>> Ah, phew!
>>
>> >
>> > > - Your fix to ObjectFileMachO.cpp is not correct...
>> >
>> > The code in if (process) doesn't do anything if we don't have a
>> > linkedit_section_sp. Maybe we need to duplicate that code in an else
>> > block for linkedit_section_sp ...
>>
>> I am thinking we need to fix the MachO file producer in llvm/clang to
>> make a __LINKEDIT segment. The __LINKEDIT segment contains anything that
>> isn't in any other section that isn't needed for running. It is just a
>> bunch of linker bits like the symbol table, string table, compact unwind
>> info and more. So all bits in a mach-o file must be spoken for and must be
>> in a segment. The mach-o file starts with a bunch of load commands (as you
>> can see above in the swig dump).
>>
>> The LC_SYMTAB load command contains information about the symbol table
>> and it contains:
>>
>> 0x00000670: <0x0018> LC_SYMTAB symoff = 0x000f30d8, nsyms = 82, stroff = 0x000f3858, strsize = 944
>>
>> This tells us the symbol table offset in the file (offset from the start
>> of the mach header) and the size, and the string table offset + size. The
>> symbol table and string table should be in a __LINKEDIT segment.
>>
>> Note there are other load commands that point to data in the __LINKEDIT
>> segment:
>>
>> 0x00000640: <0x0030> LC_DYLD_INFO_ONLY rebase_off = 0x000f2000, rebase_size = 216, bind_off = 0x000f20d8, bind_size = 400, weak_bind_off = 0x000f2268, weak_bind_size = 48, lazy_bind_off = 0x000f2298, lazy_bind_size = 1120, export_off = 0x000f26f8, export_size = 32
>> 0x00000688: <0x0050> LC_DYSYMTAB ilocalsym = 0, nlocalsym = 1
>>                                  iextdefsym = 1, nextdefsym = 1
>>                                  iundefsym = 2, nundefsym = 80
>>                                  tocoff = 0x00000000, ntoc = 0
>>                                  modtaboff = 0x00000000, nmodtab = 0
>>                                  extrefsymoff = 0x00000000, nextrefsyms = 0
>>                                  indirectsymoff = 0x000f35f8, nindirectsyms = 152
>>                                  extreloff = 0x00000000, nextrel = 0
>>                                  locreloff = 0x00000000, nlocrel = 0
>>
>> 0x000007b0: <0x0010> LC_FUNCTION_STARTS dataoff = 0x000f2718, datasize = 2464
>> 0x000007e0: <0x0010> LC_CODE_SIGNATURE dataoff = 0x000f3c10, datasize = 10096
>>
>> > Thank you for your comments. I'm learning as I'm going here.
>>
>> No worries, I can definitely help out with getting this ready, a few more
>> iterations and we should be good.
>>
>> >
>> > Keno
>> >
>> >
>> >
>> > On Thu, Jun 5, 2014 at 12:37 PM, Greg Clayton <[email protected]> wrote:
>> > Can you explain a few things?:
>> >
>> > - What is updateSectionLoadAddress(...) doing when it checks
>> >   "if (section_sp->GetFileAddress() > 0x100000)"?
>> > - Why are we preloading everything with the code:
>> >
>> >     // load the symbol table right away
>> >     module_sp->GetObjectFile()->GetSymtab();
>> >
>> >     module_sp->GetSymbolVendor()->GetNumCompileUnits();
>> >     module_sp->GetSymbolVendor()->GetCompileUnitAtIndex(0);
>> >     module_sp->ParseAllDebugSymbols();
>> >
>> > This seems like we should just let it load things lazily. Parsing all
>> > debug symbols is not advised, it should be allowed to lazily parse the
>> > DWARF as it needs to.
>> >
>> > - Your fix to ObjectFileMachO.cpp is not correct. If we have a process,
>> >   then we load the symbol table from memory (the code in the
>> >   "if (process)"), else we load it from the load commands (in the
>> >   "else") and from the file itself. We don't want to always load the
>> >   symbol table from the load commands as the symtab_load_command.symoff
>> >   and symtab_load_command.stroff are not correct when a mach-o file is
>> >   being read from memory.
>> >
>> >
>> >
>> > > On Jun 3, 2014, at 9:46 AM, Keno Fischer <[email protected]> wrote:
>> > >
>> > > This is the LLDB side of http://reviews.llvm.org/D4005
>> > >
>> > > http://reviews.llvm.org/D4006
>> > >
>> > > Files:
>> > >   lib/Makefile
>> > >   source/Core/Section.cpp
>> > >   source/Plugins/JITLoader/GDB/JITLoaderGDB.cpp
>> > >   source/Plugins/Makefile
>> > >   source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
>> > > <D4006.10055.patch>
>> > > _______________________________________________
>> > > lldb-commits mailing list
>> > > [email protected]
>> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
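
To make the __LINKEDIT point from the quoted discussion concrete, here is a small sketch (again, not LLDB code) that checks whether the file ranges named by LC_SYMTAB fall inside a segment's file range. The helper range_in_segment is hypothetical; the numbers are taken from the swig dump above, and the 16-byte entry size is sizeof(struct nlist_64).

```python
NLIST_64_SIZE = 16  # sizeof(struct nlist_64) on 64-bit Mach-O


def range_in_segment(off, size, seg_fileoff, seg_filesize):
    """True if [off, off + size) lies within the segment's bytes in the file.

    Hypothetical helper for illustration only.
    """
    return seg_fileoff <= off and off + size <= seg_fileoff + seg_filesize


# __LINKEDIT from the swig dump: (FILEOFF, FILESIZE).
linkedit = (0x000F2000, 0x4380)

# LC_SYMTAB from the swig dump.
symoff, nsyms = 0x000F30D8, 82
stroff, strsize = 0x000F3858, 944

symtab_ok = range_in_segment(symoff, nsyms * NLIST_64_SIZE, *linkedit)
strtab_ok = range_in_segment(stroff, strsize, *linkedit)
print(symtab_ok, strtab_ok)
```

For the swig executable both ranges land inside __LINKEDIT. As I understand the thread, the JIT-produced MH_OBJECT files have no __LINKEDIT segment, so there is no segment range to check against, which is the gap being discussed.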
