#5748: ghci segfault on OS X after dlsym failed lookup
------------------------+---------------------------------------------------
 Reporter:  gwright     |          Owner:                  
     Type:  bug         |         Status:  new             
 Priority:  normal      |      Component:  GHCi            
  Version:  7.2.1       |       Keywords:                  
       Os:  MacOS X     |   Architecture:  Unknown/Multiple
  Failure:  GHCi crash  |       Testcase:                  
Blockedby:              |       Blocking:                  
  Related:              |  
------------------------+---------------------------------------------------
 I've had repeatable segfaults with ghci 7.2.2 (OS X 10.6) and 7.0.4 (OS X
 10.7 and 10.6).  The immediate cause is a failed lookup of an external
 symbol in {{{rts/Linker.c}}}.  The failure is not detected and the NULL
 value returned is eventually dereferenced, leading to a segfault.  The
 underlying bug is still present in HEAD.

 This is what happens:
 {{{
 gwright-macbook> ghci -v Test.hs
 GHCi, version 7.2.2: http://www.haskell.org/ghc/  :? for help
 Glasgow Haskell Compiler, Version 7.2.2, stage 2 booted by GHC version 7.0.4
 Using binary package database: /usr/local/lib/ghc-7.2.2/package.conf.d/package.cache
 Using binary package database: /Users/gwright/.ghc/x86_64-darwin-7.2.2/package.conf.d/package.cache
 hiding package Cabal-1.12.0 to avoid conflict with later version Cabal-1.13.3
 wired-in package ghc-prim mapped to ghc-prim-0.2.0.0-14e0c022e5d4efa3a40ab5991f2b2a1b
 wired-in package integer-gmp mapped to integer-gmp-0.3.0.0-2e2b0fd56be1a5f60c50913e615691d9
 wired-in package base mapped to base-4.4.1.0-5ca60b2acbb66fd59e5f81685cb72740
 wired-in package rts mapped to builtin_rts
 wired-in package template-haskell mapped to template-haskell-2.6.0.0-e7db5d1205f362bb792ab7bd5c7bbfae
 wired-in package dph-seq not found.
 wired-in package dph-par not found.
 Hsc static flags: -static
 Loading package ghc-prim ... linking ... done.
 Loading package integer-gmp ... linking ... done.
 Loading package base ... linking ... done.
 Loading package ffi-1.0 ... linking ... done.
 *** Chasing dependencies:
 Chasing modules from:
 Stable obj: []
 Stable BCO: []
 unload: retaining objs []
 unload: retaining bcos []
 Ready for upsweep []
 Upsweep completely successful.
 *** Deleting temp files:
 Deleting:
 *** Chasing dependencies:
 Chasing modules from: *Test.hs
 Stable obj: []
 Stable BCO: []
 unload: retaining objs []
 unload: retaining bcos []
 Ready for upsweep
   [NONREC
       ModSummary {
          ms_hs_date = Sun Jan  1 18:20:14 EST 2012
          ms_mod = main:Main,
          ms_textual_imps = [import Prelude,
                             import Math.Symbolic.Wheeler.TensorUtilities,
                             import Math.Symbolic.Wheeler.TensorBasics,
                             import Math.Symbolic.Wheeler.Tensor,
                             import Math.Symbolic.Wheeler.IO,
                             import Math.Symbolic.Wheeler.Symbol,
                             import Math.Symbolic.Wheeler.Numeric,
                             import Math.Symbolic.Wheeler.Expr,
                             import Math.Symbolic.Wheeler.Commutativity,
                             import Math.Symbolic.Wheeler.Canonicalize,
                             import Math.Symbolic.Wheeler.Basic, import Data.Ratio,
                             import Data.Maybe]
          ms_srcimps = []
       }]
 *** Deleting temp files:
 Deleting:
 compile: input file Test.hs
 Created temporary directory: /var/folders/4j/4jmo0VgVHgu2WNrlXKFTB++++TI/-Tmp-/ghc61560_1
 *** Checking old interface for main:Main:
 [1 of 1] Compiling Main             ( Test.hs, interpreted )
 *** Parser:
 *** Renamer/typechecker:
 *** Desugar:
 Result size of Desugar = 788
 *** Simplifier:
 Result size of Simplifier iteration=1 = 800
 Result size of Simplifier = 788
 *** Tidy Core:
 Result size of Tidy Core = 788
 *** CorePrep:
 Result size of CorePrep = 1010
 *** ByteCodeGen:
 Upsweep completely successful.
 *** Deleting temp files:
 Deleting: /var/folders/4j/4jmo0VgVHgu2WNrlXKFTB++++TI/-Tmp-/ghc61560_1/ghc61560_0.c /var/folders/4j/4jmo0VgVHgu2WNrlXKFTB++++TI/-Tmp-/ghc61560_1/ghc61560_0.o
 Warning: deleting non-existent /var/folders/4j/4jmo0VgVHgu2WNrlXKFTB++++TI/-Tmp-/ghc61560_1/ghc61560_0.c
 Warning: deleting non-existent /var/folders/4j/4jmo0VgVHgu2WNrlXKFTB++++TI/-Tmp-/ghc61560_1/ghc61560_0.o
 Ok, modules loaded: Main.
 *Main> x
 *** Parser:
 *** Desugar:
 *** Simplify:
 *** CorePrep:
 *** ByteCodeGen:
 Loading package bytestring-0.9.2.0 ... linking ... done.
 Loading package transformers-0.2.2.0 ... linking ... done.
 Loading package mtl-2.0.1.0 ... linking ... done.
 Loading package array-0.3.0.3 ... linking ... done.
 Loading package deepseq-1.2.0.1 ... linking ... done.
 Loading package text-0.11.1.12 ... linking ... done.
 Loading package parsec-3.1.2 ... linking ... done.
 Loading package uniqueid-0.1.1 ... linking ... done.
 Loading package Wheeler-0.1 ... linking ... done.
 Segmentation fault
 gwright-macbook>
 }}}

 The program is trying to display a record type (the variable "x").  The
 record type has a Show instance associated with it, and that is apparently
 not being resolved correctly, leading to a NULL dereference and a
 segfault.

 The problem is isolated to OS X as far as I can tell. The code responsible
 for the error is in the function {{{relocateSection}}}:

 {{{
         else if(reloc->r_extern)
         {
             struct nlist *symbol = &nlist[reloc->r_symbolnum];
             char *nm = image + symLC->stroff + symbol->n_un.n_strx;

             IF_DEBUG(linker, debugBelch("relocateSection: looking up external symbol %s\n", nm));
             IF_DEBUG(linker, debugBelch("               : type  = %d\n", symbol->n_type));
             IF_DEBUG(linker, debugBelch("               : sect  = %d\n", symbol->n_sect));
             IF_DEBUG(linker, debugBelch("               : desc  = %d\n", symbol->n_desc));
             IF_DEBUG(linker, debugBelch("               : value = %p\n", (void *)symbol->n_value));
             if ((symbol->n_type & N_TYPE) == N_SECT) {
                 value = relocateAddress(oc, nSections, sections,
                                         symbol->n_value);
                 IF_DEBUG(linker, debugBelch("relocateSection, defined external symbol %s, relocated address %p\n", nm, (void *)value));
             }
             else {
                 value = (uint64_t) lookupSymbol(nm);
                 IF_DEBUG(linker, debugBelch("relocateSection: external symbol %s, address %p\n", nm, (void *)value));
             }
         }
 }}}

 The returned value from {{{lookupSymbol}}} is not checked for failure.
 A simple check for NULL and a call to {{{errorBelch}}} is all that's
 needed to fix this.  There's another place where the return value from
 {{{lookupSymbol}}} is not checked and it should be fixed similarly.
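
 A minimal, standalone sketch of the kind of check described above.  It is
 not the actual patch: {{{lookup_symbol_stub}}} and {{{error_belch_stub}}}
 are hypothetical stand-ins for the RTS functions {{{lookupSymbol}}} and
 {{{errorBelch}}}, and {{{resolve_external}}} is an invented helper name
 used only for illustration.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the RTS's errorBelch: report on stderr. */
void error_belch_stub(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
}

/* Storage whose address plays the role of a successfully resolved symbol. */
int known_symbol_storage;

/* Hypothetical stand-in for lookupSymbol: returns NULL when the
 * symbol is not known, which is the case the ticket is about. */
void *lookup_symbol_stub(const char *nm)
{
    if (strcmp(nm, "_known_symbol") == 0)
        return &known_symbol_storage;
    return NULL;
}

/* Resolve an external symbol.  On success store its address in *value
 * and return 1.  On failure report the symbol name, leave *value
 * untouched, and return 0 so the caller can abort the relocation
 * instead of carrying a NULL address forward. */
int resolve_external(const char *nm, uint64_t *value)
{
    void *addr = lookup_symbol_stub(nm);
    if (addr == NULL) {
        error_belch_stub("relocateSection: failed to look up symbol `%s'", nm);
        return 0;
    }
    *value = (uint64_t)(uintptr_t)addr;
    return 1;
}
```

 The point of returning a status rather than the raw pointer is that the
 caller has to decide what to do on failure, rather than silently using a
 zero address.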

 In a bit more detail, the program "Test.hs" I was loading does some simple
 tests on a library.  I admit to having been sloppy while sorting out the
 module exports, but the library compiles without warnings when -Wall is
 set.  The library has a top-level module that re-exports the most commonly
 used symbols, but the test program doesn't use it, importing all of the
 modules it needs explicitly.  Am I doing something that is known to be
 dangerous?

 The failed symbol lookup is

 {{{
 lookupSymbol: looking up _Wheelerzm0zi1_MathziSymbolicziWheelerziSimpleSymbol_zdfShowSzuzdcshowsPrec_closure
 }}}

 which is the Show instance for a {{{SimpleSymbol}}} data type.  (The
 library implements a symbolic algebra DSL.)  The symbol is undefined in
 the object module:

 {{{
 gwright-macbook> nm HSWheeler-0.1.o  | grep "_Wheelerzm0zi1_MathziSymbolicziWheelerziSimpleSymbol_zdfShowSzuzdcshowsPrec_closure"
                  U _Wheelerzm0zi1_MathziSymbolicziWheelerziSimpleSymbol_zdfShowSzuzdcshowsPrec_closure
 gwright-macbook>
 }}}

 so some sort of failure is expected when I try to show something of the
 {{{SimpleSymbol}}} type.
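
 For readers unfamiliar with GHC's symbol names: the symbol above is
 Z-encoded.  The following is a rough sketch of a decoder that handles only
 the escapes occurring in this symbol ({{{zm}}}, {{{zi}}}, {{{zd}}},
 {{{zu}}}) plus {{{zz}}}; it is not a complete implementation of the
 encoding, and {{{zdecode_subset}}} is a hypothetical name.

```c
#include <stddef.h>

/* Decode a small subset of GHC's Z-encoding:
 *   zm -> '-'   zi -> '.'   zd -> '$'   zu -> '_'   zz -> 'z'
 * Any other escape is copied through unchanged.  Writes at most
 * outlen-1 characters plus a terminating NUL, and returns out. */
char *zdecode_subset(const char *in, char *out, size_t outlen)
{
    size_t o = 0;
    if (outlen == 0)
        return out;
    while (*in != '\0' && o + 1 < outlen) {
        char c = *in++;
        if (c == 'z' && *in != '\0') {
            switch (*in) {
            case 'm': c = '-'; in++; break;
            case 'i': c = '.'; in++; break;
            case 'd': c = '$'; in++; break;
            case 'u': c = '_'; in++; break;
            case 'z': c = 'z'; in++; break;
            default:  break; /* unhandled escape: keep the 'z' as-is */
            }
        }
        out[o++] = c;
    }
    out[o] = '\0';
    return out;
}
```

 Applied to the symbol above, this yields a package name
 ({{{Wheeler-0.1}}}), a module name
 ({{{Math.Symbolic.Wheeler.SimpleSymbol}}}) and the compiler-generated
 name of a {{{showsPrec}}} method, which is how the symbol can be
 recognised as belonging to a Show instance.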

 I'm puzzled that the first indication of failure is a segfault, or, after
 I patch {{{rts/Linker.c}}}, an error from deep inside the linker.  It
 seems that there is something else going wrong which ought to generate a
 warning at least.

 I will generate a patch against HEAD to check for the failed symbol
 lookups; it would be good if it were included in the final 7.4.1.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5748>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
