great work! :)
On Fri, Dec 27, 2013 at 3:21 PM, Ben Gamari <[email protected]> wrote:
> Simon Marlow <[email protected]> writes:
>
> > This sounds right to me. Did you submit a patch?
> >
> > Note that dynamic linking with LLVM is likely to produce significantly
> > worse code than with the NCG right now, because the LLVM back end uses
> > dynamic references even for symbols in the same package, whereas the NCG
> > back-end uses direct static references for these.
> >
> Today with the help of Edward Yang I examined the code produced by the
> LLVM backend in light of this statement. I was surprised to find that
> LLVM's code appears to be no worse than the NCG with respect to
> intra-package references.
>
> My test case can be found here[2] and can be built with the included
> `build.sh` script. The test consists of two modules built into a shared
> library. One module, `LibTest`, exports a few simple members while the
> other module (`LibTest2`) defines members that consume them. Care is
> taken to ensure the members are not inlined.
>
> The tests were done on x86_64 running LLVM 3.4 and GHC HEAD with the
> patches[1] I referred to in my last message. Please let me know if I've
> missed something.
>
>
> # Evaluation
>
> ## First example ##
>
> The first member is a simple `String` (defined in `LibTest`),
>
>     helloWorld :: String
>     helloWorld = "Hello World!"
>
> The use-site is quite straightforward,
>
>     testHelloWorld :: IO String
>     testHelloWorld = return helloWorld
>
> With `-O1` the code looks reasonable in both cases. Most importantly,
> both backends use IP relative addressing to find the symbol.
>
> ### LLVM ###
>
>     0000000000000ef8 <rKw_info>:
>      ef8:   48 8b 45 00             mov    0x0(%rbp),%rax
>      efc:   48 8d 1d cd 11 20 00    lea    0x2011cd(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
>      f03:   ff e0                   jmpq   *%rax
>
>     0000000000000f28 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
>      f28:   eb ce                   jmp    ef8 <rKw_info>
>      f2a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
>
> ### NCG ###
>
>     0000000000000d58 <rH1_info>:
>      d58:   48 8d 1d 71 13 20 00    lea    0x201371(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
>      d5f:   ff 65 00                jmpq   *0x0(%rbp)
>
>     0000000000000d88 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
>      d88:   eb ce                   jmp    d58 <rH1_info>
>
>
> With `-O0` the code is substantially longer but the relocation behavior
> is still correct, as one would expect.
>
> Looking at the definition of `helloWorld`[3] itself it becomes clear that
> the LLVM backend is more likely to use PLT relocations over GOT. In
> general, `stg_*` primitives are called through the PLT. As far as I can
> tell, both of these call mechanisms will incur two memory
> accesses. However, in the case of the PLT the call will consist of two
> JMPs whereas the GOT will consist of only one. Is this a cause for
> concern? Could these two jumps interfere with prediction?
>
> In general the LLVM backend produces a few more instructions than the
> NCG although this doesn't appear to be related to handling of
> relocations. For instance, the inexplicable (to me) `mov` at the
> beginning of LLVM's `rKw_info`.
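
A quick way to double-check the IP-relative claim for this first example: on x86-64 a `disp(%rip)` operand resolves to the address of the next instruction plus the displacement. The sketch below just redoes that arithmetic in Haskell with the numbers from the two `lea` instructions quoted above. `ripTarget` and `hex` are helper names made up for this illustration; they are not part of GHC or of the test repository.

```haskell
import Numeric (showHex)

-- | Target of an x86-64 RIP-relative operand: the displacement is added to
-- the address of the *next* instruction (instruction address + its length).
-- Hypothetical helper, for illustration only.
ripTarget :: Integer -> Integer -> Integer -> Integer
ripTarget insnAddr insnLen disp = insnAddr + insnLen + disp

-- Addresses and displacements copied from the -O1 listings quoted above.
-- Both `lea` instructions are 7 bytes long.
llvmHelloWorld, ncgHelloWorld :: Integer
llvmHelloWorld = ripTarget 0xefc 7 0x2011cd  -- lea in LLVM's rKw_info
ncgHelloWorld  = ripTarget 0xd58 7 0x201371  -- lea in the NCG's rH1_info

-- | Render an address the way objdump prints it (hex, no prefix).
hex :: Integer -> String
hex n = showHex n ""
```

Loaded into GHCi, `hex llvmHelloWorld` and `hex ncgHelloWorld` both give `2020d0`, which is the `helloWorld_closure` address objdump shows in the comment column. So both backends reach the closure with a single IP-relative `lea`.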
>
>
> ## Second example ##
>
> The second example demonstrates an actual call,
>
>     -- Definition (in LibTest)
>     infoRef :: Int -> Int
>     infoRef n = n + 1
>
>     -- Call site
>     testInfoRef :: IO Int
>     testInfoRef = return (infoRef 2)
>
> With `-O1` this produces the following code,
>
> ### LLVM ###
>
>     0000000000000fb0 <rLy_info>:
>      fb0:   48 8b 45 00             mov    0x0(%rbp),%rax
>      fb4:   48 8d 1d a5 10 20 00    lea    0x2010a5(%rip),%rbx        # 202060 <rLx_closure>
>      fbb:   ff e0                   jmpq   *%rax
>
>     0000000000000fe0 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
>      fe0:   eb ce                   jmp    fb0 <rLy_info>
>
> ### NCG ###
>
>     0000000000000e10 <rI3_info>:
>      e10:   48 8d 1d 51 12 20 00    lea    0x201251(%rip),%rbx        # 202068 <rI2_closure>
>      e17:   ff 65 00                jmpq   *0x0(%rbp)
>
>     0000000000000e40 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
>      e40:   eb ce                   jmp    e10 <rI3_info>
>
> Again, LLVM is a bit more verbose but seems to handle
> intra-package calls efficiently.
>
>
>
> [1] https://github.com/bgamari/ghc/commits/llvm-dynamic
> [2] https://github.com/bgamari/ghc-linking-tests/tree/master/ghc-test
> [3] `helloWorld` definitions:
>
> LLVM:
>
>     00000000000010a8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
>         10a8:   50                      push   %rax
>         10a9:   4c 8d 75 f0             lea    -0x10(%rbp),%r14
>         10ad:   4d 39 fe                cmp    %r15,%r14
>         10b0:   73 07                   jae    10b9 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x11>
>         10b2:   49 8b 45 f0             mov    -0x10(%r13),%rax
>         10b6:   5a                      pop    %rdx
>         10b7:   ff e0                   jmpq   *%rax
>         10b9:   4c 89 ef                mov    %r13,%rdi
>         10bc:   48 89 de                mov    %rbx,%rsi
>         10bf:   e8 0c fd ff ff          callq  dd0 <newCAF@plt>
>         10c4:   48 85 c0                test   %rax,%rax
>         10c7:   74 22                   je     10eb <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x43>
>         10c9:   48 8b 0d 18 0f 20 00    mov    0x200f18(%rip),%rcx        # 201fe8 <_DYNAMIC+0x228>
>         10d0:   48 89 4d f0             mov    %rcx,-0x10(%rbp)
>         10d4:   48 89 45 f8             mov    %rax,-0x8(%rbp)
>         10d8:   48 8d 05 21 00 00 00    lea    0x21(%rip),%rax        # 1100 <cJC_str>
>         10df:   4c 89 f5                mov    %r14,%rbp
>         10e2:   49 89 c6                mov    %rax,%r14
>         10e5:   58                      pop    %rax
>         10e6:   e9 b5 fc ff ff          jmpq   da0 <ghczmprim_GHCziCString_unpackCStringzh_info@plt>
>         10eb:   48 8b 03                mov    (%rbx),%rax
>         10ee:   5a                      pop    %rdx
>         10ef:   ff e0                   jmpq   *%rax
>
>
> NCG:
>
>     0000000000000ef8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
>      ef8:   48 8d 45 f0             lea    -0x10(%rbp),%rax
>      efc:   4c 39 f8                cmp    %r15,%rax
>      eff:   72 3f                   jb     f40 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x48>
>      f01:   4c 89 ef                mov    %r13,%rdi
>      f04:   48 89 de                mov    %rbx,%rsi
>      f07:   48 83 ec 08             sub    $0x8,%rsp
>      f0b:   b8 00 00 00 00          mov    $0x0,%eax
>      f10:   e8 1b fd ff ff          callq  c30 <newCAF@plt>
>      f15:   48 83 c4 08             add    $0x8,%rsp
>      f19:   48 85 c0                test   %rax,%rax
>      f1c:   74 20                   je     f3e <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x46>
>      f1e:   48 8b 1d cb 10 20 00    mov    0x2010cb(%rip),%rbx        # 201ff0 <_DYNAMIC+0x238>
>      f25:   48 89 5d f0             mov    %rbx,-0x10(%rbp)
>      f29:   48 89 45 f8             mov    %rax,-0x8(%rbp)
>      f2d:   4c 8d 35 1c 00 00 00    lea    0x1c(%rip),%r14        # f50 <cGG_str>
>      f34:   48 83 c5 f0             add    $0xfffffffffffffff0,%rbp
>      f38:   ff 25 7a 10 20 00       jmpq   *0x20107a(%rip)        # 201fb8 <_DYNAMIC+0x200>
>      f3e:   ff 23                   jmpq   *(%rbx)
>      f40:   41 ff 65 f0             jmpq   *-0x10(%r13)
>
> _______________________________________________
> ghc-devs mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/ghc-devs
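
On the PLT versus GOT question in [3], the difference is visible in the instruction encodings themselves: LLVM's tail call to `unpackCStringzh_info` is a direct `jmpq` (opcode `e9`) whose rel32 lands on a PLT stub, while the NCG emits an indirect `jmpq *disp(%rip)` (opcode `ff 25`) that loads its target from a memory slot. A rough sketch that decodes the displacement bytes from the listings follows. All function names here are hypothetical, and reading the `_DYNAMIC+0x200` slot as a GOT entry is my assumption, since objdump only prints the raw address.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Int (Int32)
import Data.Word (Word32, Word8)

-- | Decode four little-endian bytes as a signed 32-bit displacement.
rel32 :: [Word8] -> Integer
rel32 bytes = toInteger (fromIntegral word :: Int32)
  where
    word :: Word32
    word = foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0 bytes

-- | Target of a direct `call`/`jmp` rel32: one opcode byte plus four
-- displacement bytes, so the next instruction is 5 bytes further on.
directTarget :: Integer -> [Word8] -> Integer
directTarget insnAddr disp = insnAddr + 5 + rel32 disp

-- | Address of the slot read by `jmpq *disp(%rip)` (bytes `ff 25` plus disp32).
indirectSlot :: Integer -> [Word8] -> Integer
indirectSlot insnAddr disp = insnAddr + 6 + rel32 disp

-- Bytes copied from the listings in [3].
llvmNewCAF, ncgNewCAF, llvmUnpack, ncgUnpackSlot :: Integer
llvmNewCAF    = directTarget 0x10bf [0x0c, 0xfd, 0xff, 0xff]  -- 0xdd0, newCAF@plt
ncgNewCAF     = directTarget 0xf10  [0x1b, 0xfd, 0xff, 0xff]  -- 0xc30, newCAF@plt
llvmUnpack    = directTarget 0x10e6 [0xb5, 0xfc, 0xff, 0xff]  -- 0xda0, unpackCStringzh_info@plt
ncgUnpackSlot = indirectSlot 0xf38  [0x7a, 0x10, 0x20, 0x00]  -- 0x201fb8, _DYNAMIC+0x200
```

Both backends call `newCAF` through the PLT, so the only difference this listing shows is the tail call to `unpackCStringzh_info`: one direct jump into a stub (which then jumps again) for LLVM, one indirect jump through a slot for the NCG.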
