Brian Hulley wrote:
With -O2 enabled, __ccall_GC duma_vertex3f is indeed called directly
instead of vertex3f, from a different module, so that proves that
different monads can indeed be used to wrap IO operations without any
performance penalty at all.

However I've just discovered there *is* a penalty for converting between callback functions that return a different monad from the IO monad. For example, if I have a RenderM monad that allows primitives to be drawn to the screen, and a callback:

      newtype RenderM a = RenderM (IO a) deriving (Functor, Monad, MonadIO)

      type RenderCallback = Int -> Int -> RenderM ()

where the intention is that the callback will take the width and height of the window and return a RenderM action, the problem is that because the FFI does not allow RenderM to appear in a foreign type, the actual render function has to be converted into a function which returns an IO action instead of a RenderM action eg by:

      type RenderCallbackIO = Int -> Int -> IO ()

      dropRenderM :: RenderCallback -> RenderCallbackIO
      dropRenderM f x y = let RenderM io = f x y in io

foreign import ccall duma_onRender :: FunPtr RenderCallbackIO -> IO ()

      foreign import ccall "wrapper" mkRenderCallbackIO
           :: RenderCallbackIO -> IO (FunPtr RenderCallbackIO)

      onRender :: RenderCallback -> IO ()
      onRender f = mkRenderCallbackIO (dropRenderM f) >>= duma_onRender

With -O2 optimization, GHC does not seem to be able to optimize out the call to dropRenderM even though this just changes the return value of f from RenderM (IO a) to IO a, so RenderM is not transparent after all:

Duma.onRender = \ (f :: Duma.RenderCallback)
   (eta :: GHC.Prim.State# GHC.Prim.RealWorld) ->
case (# GHC.Prim.State# GHC.Prim.RealWorld, () #) Duma.mkRenderCallbackIO
             (Duma.dropRenderM f) eta
   of wild { (# new_s, a86 #) ->
   case (# GHC.Prim.State# GHC.Prim.RealWorld, () #) a86
   of ds { GHC.Ptr.FunPtr ds1 ->
   case (# GHC.Prim.State# GHC.Prim.RealWorld,
    () #) {__ccall_GC duma_onRender GHC.Prim.Addr#
    -> GHC.Prim.State# GHC.Prim.RealWorld
    -> (# GHC.Prim.State# GHC.Prim.RealWorld #)}
     ds1 new_s
   of wild1 { (# ds2 #) ->
   (# ds2, GHC.Base.() #)
   }
   }
   }

I must admit I'm not at all clear how to read the -ddump-simpl output so I may have got this wrong, but since Duma.dropRenderM is mentioned, I think this means this has not been optimized out.

Therefore there does seem to be an overhead for using different monads at the moment (?)

Regards, Brian.
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Reply via email to