RE: Minor tweaks to ffi addendum

Simon Marlow Mon, 03 Jun 2002 03:12:28 -0700

> section 5.5:
> 
>   It's going to be really hard to implement ForeignPtr as specified.
>   The problem is that invocation of the cleanup function is triggered
>   by the garbage collector.  I don't want the garbage collector to be
>   recursively invoked so I don't want the garbage collector to
>   directly invoke Haskell function.  The Hugs garbage collector is not
>   supposed to be a mutator of the heap - so putting a simple wrapper
>   round the garbage colector won't work.  And there's no mechanism
>   outside the GC to look for cleanup functions to execute.
> 
>   GHC gets round this by scheduling a (preemptive) thread to execute
>   the cleanup code.  How on earth does NHC get round this?  Does
>   anyone have a suggestion for how it might be implemented in Hugs?
> 
>   Proposed change: none at present but I'm deeply sceptical of a
>   design which takes a simple task (invoke a C cleanup function) and,
>   for no discernable reason, generalizes it to the point that it 
>   requires a much more complex runtime system.


I've attached some mail I dug up on the subject, from before the
creation of [EMAIL PROTECTED] (there's more in the archives of that list).
The upshot is:

  - Haskell finalizers are much more in keeping with the general
    philosophy of the FFI: do as much in Haskell as possible.  It
    looks a bit strange if you have to do your allocation in Haskell
    but deallocation in C.

  - If we forced finalizers to be C functions because calling Haskell
    from the GC is inconvenient, then you have to add a constraint
    that the C function invoked from a finalizer can't call any
    functions foreign-exported from Haskell land.  That's an annoying
    constraint to have to add, because it means that C libraries
    can't be transparent w.r.t. whether they invoke Haskell code or
    not (actually the hs_init() problem that someone else brought
    up recently also has this side-effect).

Malcolm can comment on exactly how nhc98 handles the finalizers, but I
believe that there's a list of pending finalizers maintained by the RTS
which are run on re-entry to Haskell land.   The equivalent would be to
add a check in Hugs's eval loop, I imagine.

> Table 2 (section 6):
> 
>   Is there a reason to be so coy about the concrete C types used for
>   HsChar, HsFloat, HsDouble and HsBool?
> 
>   Proposed change:
> 
>     C symbol    Haskell Symbol  Constraint on concrete C type
>     HsChar      Char            unsigned char

HsChar is an unsigned 32-bit int in GHC.  We probably ought to say that
the type is unsigned, at least - and the Haskell 98 standard requires
that it can at least contain all the Unicode character values (20
bits?).

Cheers,
        Simon

--- Begin Message ---

>    * The type of the second argument of makeForeignObj (the finalizer)
>      has been changed from Addr to IO (). This is more consistent and
>      IMHO Addr is a hack here.

I liked it better before :-( as
    makeForeignObj :: Addr -> Addr -> IO ForeignObj
I would guess that the motivation to make the "finalizer" argument of
type IO () is that you want to write finalizers in Haskell.
Personally, I like to use the C finalizer, because if I'm importing the
value itself as an opaque type, then why not simply use the finalizer
code opaquely too?  In any case, the C finalizer has type
    void finalizer(real_c_type_for_foreign_obj);
Surely if you have this finalizer in the Haskell world, its type
must be something like the following?
    finalizer :: Addr -> IO ()
rather than
    finalizer :: IO ()
So we should really have
    makeForeignObj :: Addr -> (Addr->IO ()) -> IO ForeignObj
    writeForeignFinalizer :: ForeignObj -> (Addr->IO ()) -> IO ()
no?


> To finish the first round, I'd like to make a quick poll: Should
> stable names be mandatory for the new FFI libs or not?

I haven't got any immediate use for stable names, so I would vote to
exclude them (for simplicity) until they are better understood.

>    * The mapping of StablePtr to a C type.

I go with Manuel's (void*).

>    * Should addresses on the Haskell side be parameterized by the
>      type of the object they point to? How exactly should they be
>      mapped to C types?

Yes, I go with (Ptr a).  For simplicity, I think there should be no
compiler-supported mapping to C-types at this stage.
[  In earlier discussions, we thought it would be nice to have
   Ptr Char  -->  char*  and so on.  However, you can quickly run
   into tricky cases once the types become more complex, e.g.
   Ptr (a -> IO a)  ->  void**()(void*)
]

>    * Are the lifetime rules for ForeignObj (section 2.4 of the FFI
>      document) satisfying?

I'm still not sure that I fully understand the difficulties with
ForeignObj finalisation in systems other than nhc98, so I'm reasonably
happy with the current (minimal) rules.

Regards,
    Malcolm

--- End Message ---

--- Begin Message ---

To summarize the comments on the 1st shot, I've changed the proposal
in the following ways:

   * `Marshalable' has been renamed to `Storable'. I like adjectives as
     names of type classes and this sounds less esoteric than `Marshal'.

   * deref and friends have been nuked from Storable, because no one
     came up with a reason to keep it.

   * `fooElemOffs' has been renamed to `fooElemOff', this is shorter
     and in the tradition of `Addr', `Obj', `Ptr', ...

   * The module `Foreign' has been renamed to `ForeignObj'.

   * The type of the second argument of makeForeignObj (the finalizer)
     has been changed from Addr to IO (). This is more consistent and
     IMHO Addr is a hack here.

   * `writeForeignObj' has been renamed to `writeForeignAddr'. This
     makes clear which part is overwritten.

   * `addForeignFinalizer' has been renamed to `writeForeignFinalizer',
     because it *overwrites* any existing finalizer.

   * For enhanced symmetry and reduced confusion with makeForeignObj,
     `mkForeignObj' has been renamed to `addrToForeignObj'.

These are a lot of changes related to names, but it is important to
be consistent here and there are probably not very many programs out
there which need a change.

To finish the first round, I'd like to make a quick poll: Should
stable names be mandatory for the new FFI libs or not?

Remaining low-level topics are:

   * The mapping of StablePtr to a C type.

   * Should addresses on the Haskell side be parameterized by the
     type of the object they point to? How exactly should they be
     mapped to C types?

   * Are the lifetime rules for ForeignObj (section 2.4 of the FFI
     document) satisfying?

It would be nice if we could come to a conclusion here soon, because
experimenting with the the higher-level library is not much fun when
the foundations are a moving target...

Cheers,
   Sven
-- 
Sven Panne                                        Tel.: +49/89/2178-2235
LMU, Institut fuer Informatik                     FAX : +49/89/2178-2211
LFE Programmier- und Modellierungssprachen              Oettingenstr. 67
mailto:[EMAIL PROTECTED]            D-80538 Muenchen
http://www.informatik.uni-muenchen.de/~Sven.Panne

--- End Message ---

--- Begin Message ---

Malcolm Wallace wrote:
> >    * The type of the second argument of makeForeignObj (the finalizer)
> >      has been changed from Addr to IO (). This is more consistent and
> >      IMHO Addr is a hack here.
> 
> I liked it better before :-( as
>     makeForeignObj :: Addr -> Addr -> IO ForeignObj
> I would guess that the motivation to make the "finalizer" argument of
> type IO () is that you want to write finalizers in Haskell.

That's right: When GC time comes and the finalizer is called, you
want to do *two* things in general: Clean up something in Haskell land
and clean up the dirty realms of C. So you basically have two options:

   * The type of the finalizer is `Addr': To clean up Haskell land,
     you have to call back via any mechanism provided, return to C,
     and finally return to Haskell's GC. This seems a bit twisted to
     me. (OK, not as twisted as Keith's proposed family of -xx RTS
     options... ;-)

   * The type of the finalizer is `Addr -> IO ()' (not simply
     `IO ()' as I suggested prematurely :-}  This way you can do
     your Haskell cleanup naturally and call C via the FFI. The
     common deallocation via `free' for example would boil down to:

        -- The following is already exported from PreludeFFI:
        -- foreign import unsafe free :: Addr -> IO ()
        ...
        writeForeignFinalizer myForeignObj free

     Much nicer than the first alternative, IMHO.

> [...] So we should really have
>     makeForeignObj :: Addr -> (Addr->IO ()) -> IO ForeignObj
>     writeForeignFinalizer :: ForeignObj -> (Addr->IO ()) -> IO ()
> no?

I'm going to change the proposal exactly this way tomorrow.

> > To finish the first round, I'd like to make a quick poll: Should
> > stable names be mandatory for the new FFI libs or not?
> 
> I haven't got any immediate use for stable names, so I would vote to
> exclude them (for simplicity) until they are better understood.

Same for me. Hopefully Simon M is not to sad about this...

> >    * The mapping of StablePtr to a C type.
> 
> I go with Manuel's (void*).

This is somehow related to the next topic, but it should be a pointer
in any case.

> >    * Should addresses on the Haskell side be parameterized by the
> >      type of the object they point to? How exactly should they be
> >      mapped to C types?
> 
> Yes, I go with (Ptr a).  For simplicity, I think there should be no
> compiler-supported mapping to C-types at this stage.
> [  In earlier discussions, we thought it would be nice to have
>    Ptr Char  -->  char*  and so on.  However, you can quickly run
>    into tricky cases once the types become more complex, e.g.
>    Ptr (a -> IO a)  ->  void**()(void*)
> ]

What's wrong with this? It shouldn't be too hard to give a recursive
translation from the type of Ptr's argument to a C type. What bothers
me more is the question: How should we represent C's `const' qualifier
on the Haskell side?

> >    * Are the lifetime rules for ForeignObj (section 2.4 of the FFI
> >      document) satisfying?
> 
> I'm still not sure that I fully understand the difficulties with
> ForeignObj finalisation in systems other than nhc98, so I'm
>  reasonably happy with the current (minimal) rules.

<aol>
   Me too!  :-)
</aol>

Cheers,
   Sven
-- 
Sven Panne                                        Tel.: +49/89/2178-2235
LMU, Institut fuer Informatik                     FAX : +49/89/2178-2211
LFE Programmier- und Modellierungssprachen              Oettingenstr. 67
mailto:[EMAIL PROTECTED]            D-80538 Muenchen
http://www.informatik.uni-muenchen.de/~Sven.Panne

--- End Message ---

--- Begin Message ---

I'd like to return once more to the issue of finalizers for ForeignObjs.

|>      finalizer :: Addr               -- pointer to C function
    vs.
|>      finalizer :: Addr -> IO ()      -- Haskell finalizer

I have just tried to implement the latter, and realised there is a
small problem.  The finalizer is normally called by the GC when the
ForeignObj is no longer live.  It is somewhat unfortunate that at this
very moment, there is no heap-space available for running Haskell
computations.  :-)   No doubt someone already has a horribly complex
way round this, so please speak up.

Regards,
    Malcolm

--- End Message ---

--- Begin Message ---

To keep our Inboxes small and to advance to some more interesting stuff,
I think the time has come to fix the low-level part of the FFI proposal:

\begin{BenevolentDictator}
   * Module PreludeFFI is renamed to FFI.

   * Module FFI is renamed to Marshal.

   * The type of finalizer stays `Addr -> IO ()'. Even if it was simply
     `Addr', Haskell stuff could be invoked during GC (via the FFI). And
     Simon's proposal (simply use a list of pending finalizers) seems
     workable.

   * writeForeignFinalizer is renamed to addForeignFinalizer back again.
     I must have mis-read the GHC sources...  :-}

   * To keep the interface small, addForeignCFinalizer is not included.
     As already stated, it's easy to define it with f.i.d.

   * Although makeForeignObj could be defined via addrToForeignObj and
     addForeignFinalizer, it is included in the interface, because the
     whole point of creating a foreign object is to attach finalizers to
     it. One could argue to leave out addrToForeignObj, but I like the
     symmetry with foreignObjToAddr. BTW, could somebody explain, why
     this "is basically an elephant gun pointed at your foot"?

   * The stable pointer related stuff in the proposal is left as it is
     (representation as void*, no deRefStablePtr and splitStablePtr on
     the C side required, etc.).

   * `Addr' keeps its kind *. As SPJ pointed out, it's not clear how to
     get a design with `Ptr a' which "is right". But for experimentation
     purposes it would be good if a future GHC includes `Ptr a' as an
     extension. <= HINT!
\end{BenevolentDictator}

The discussion about the module Marshal is best made in a smaller group
of people first, because I doubt that everybody in this audience is
interested in efficiency considerations of the Marshal monad...   :-)

Cheers,
   Sven
-- 
Sven Panne                                        Tel.: +49/89/2178-2235
LMU, Institut fuer Informatik                     FAX : +49/89/2178-2211
LFE Programmier- und Modellierungssprachen              Oettingenstr. 67
mailto:[EMAIL PROTECTED]            D-80538 Muenchen
http://www.informatik.uni-muenchen.de/~Sven.Panne

--- End Message ---

RE: Minor tweaks to ffi addendum

Reply via email to