RE: Updates to FFI spec

2002-08-13 Thread Simon Marlow


 On 12-Aug-2002, Simon Marlow [EMAIL PROTECTED] wrote:
  
  I'd be equally happy (perhaps happier) if the header file spec was
  removed altogether.  In a sense, this would leave the Haskell part of a
  foreign binding even more portable, because it doesn't have to specify
  the names of header files which might change between platforms.
 
 This is a C interface we're talking about, right?
 
 In C, the name of the header file is part of the API.
 It doesn't change between different platforms unless
 the API changes.
 
 Specifying the header name is essential if Haskell implementations are to
 ever apply any type-checking to these foreign interfaces.  If they don't,
 then in practice I think Haskell programs using the FFI are likely to be
 less portable, and certainly more error-prone, since they will contain
 type errors that may cause problems on one platform but not another.

Specifying the header name is also essential for certain implementations
(e.g. GHC).  I wasn't suggesting not supplying the header file at all,
just not supplying it in the foreign declaration and not defining it as
part of the standard.  But I take your point about the header file(s)
being a proper part of the API.

Cheers,
Simon
___
FFI mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/ffi



RE: Updates to FFI spec

2002-08-13 Thread Simon Marlow

  System.Mem.performGC does a major GC.  When would a partial GC be
  enough?
 
 I've described the image-processing example a bunch of times.
 
 We have an external resource (e.g., memory used to store images) which
 is somewhat abundant and cheap but not completely free (e.g.,
 eventually you start to swap).  It is used up at a different rate than
 the Haskell heap so Haskell GCs don't occur at the right times to keep
 the cost low and we want to trigger GCs ourselves.

Hmmm, the garbage collector is a black box and has its own complicated
heuristics for managing memory usage, but you are describing a mechanism
that depends rather heavily on certain assumed behaviours.  At the
least, that gives the garbage collector less flexibility to change its
own behaviour, lest it invalidate the assumptions made by the external
allocator.

 (In the image
 processing example, images were megabytes and an expression like (x +
 (y * mask)) would generate 2 intermediate images (several megabytes)
 while doing just 2 reductions in Haskell.)

I think I'd be tempted to try to use a more predictable allocation
scheme given the size of the objects involved.  Perhaps arenas? 
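
As a rough illustration of the arena idea, here is a minimal sketch in C,
assuming a single fixed-size region from which intermediate images are
carved and then released all at once.  The names (arena, arena_alloc,
arena_reset) are made up for the example and do not refer to any existing
library.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump-pointer arena: allocations are carved from one
   block and all released together by arena_reset. */
typedef struct {
    unsigned char *base;   /* start of the backing block */
    size_t capacity;       /* size of the backing block in bytes */
    size_t used;           /* bytes handed out so far */
} arena;

/* Returns 1 on success, 0 if the backing block could not be allocated. */
int arena_init(arena *a, size_t capacity) {
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out nbytes rounded up to a multiple of 16, or NULL if the
   arena does not have enough room left. */
void *arena_alloc(arena *a, size_t nbytes) {
    size_t aligned = (nbytes + 15u) & ~(size_t)15u;
    if (aligned < nbytes || aligned > a->capacity - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* Releases every allocation at once; the backing block is kept. */
void arena_reset(arena *a) {
    a->used = 0;
}

/* Returns the backing block to the system. */
void arena_free(arena *a) {
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

With something like this, the intermediates of one expression could be
allocated from the arena and dropped with a single arena_reset once the
result has been copied out, independently of when the Haskell GC runs.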

 How often and how hard should we GC?  We can't do a full GC too often,
 or we'll spend a lot of time GCing, destroy our cache and cause
 premature promotion of Haskell objects into the old generation which
 will make the GC behave poorly.  So if all we can do is a full GC,
 we'll GC rarely and use a lot of the external resource.
 
 Suppose we could collect just the allocation arena.  That would be
 much less expensive (time taken, effect on caches, confusion of object
 ages) but not always effective.  It would start out cheap and
 effective but more and more objects would slip into older generations
 and have to wait for a full GC.
 
 To achieve any desired tradeoff between GC cost and excess resource
 usage, we want a number of levels of GC: gc1, gc2, gc3, gc4, ...  Each
 one more effective than the last and each one more expensive than the
 last.  We'll use gc1 most often, gc2 less often, gc3 occasionally, gc4
 rarely, ...

But there seems to be no way to reasonably decide how often one should
call these.  Doesn't it depend on the garbage collector's own parameters
too?
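
To make the gc1, gc2, gc3, ... proposal concrete, here is one possible
policy sketched in C.  It is only an assumption about how the levels might
be scheduled: a collection is requested every threshold_bytes of external
allocation, and every GC_RATIO-th request is escalated by one level.  The
names, the threshold and the ratio are all invented for the example; the
question of how to pick them remains open.

```c
#include <stddef.h>

enum { GC_LEVELS = 4, GC_RATIO = 4 };   /* assumed values, not from any spec */

typedef struct {
    size_t threshold_bytes;     /* external bytes allowed between collections */
    size_t bytes_since_gc;      /* external bytes allocated since the last one */
    unsigned long collections;  /* collections requested so far */
} gc_policy;

/* Records an external allocation of nbytes.  Returns 0 if no collection
   is due, otherwise the level (1..GC_LEVELS) that should be run:
   level 1 most often, each higher level GC_RATIO times less often. */
int gc_level_for_alloc(gc_policy *p, size_t nbytes) {
    p->bytes_since_gc += nbytes;
    if (p->bytes_since_gc < p->threshold_bytes)
        return 0;

    p->bytes_since_gc = 0;
    p->collections++;

    int level = 1;
    unsigned long n = p->collections;
    while (level < GC_LEVELS && n % GC_RATIO == 0) {
        n /= GC_RATIO;
        level++;
    }
    return level;
}
```

The caller would map the returned level onto whatever the runtime offers,
for example treating the highest level as a full collection.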

  I think the spec should be clarified along these lines:
 
  Header files have no impact on the semantics of a foreign call,
  and whether an implementation uses the header file or not is
  implementation-defined.  Some implementations may require a header
  file which supplies a correct prototype for the function in order to
  generate correct code.
 
 I still don't like the fact that compilers are free to ignore header
 files.  Labelling it an error instead of a change in semantics doesn't
 affect the fact that portability is compromised.

I don't see any alternative - would you require a compiler that has only
a native code generator to read header files?  When there's no C
compiler on the system? (this is realistic - at some point we'd like to
make the via-C route in GHC completely optional, so we can ship a
compiler on Windows that doesn't need to be bundled with GCC).

  Perhaps on GHC you should be required to register the top module
  in your program first, maybe something like
 
  registerModule(__stginit_Main);
 
  that way you can register multiple modules (which isn't possible at
  the moment, you have to have another module which imports all the
  others).
 
 What does that do?  Is it for threading, GC, profiling, ...?

Each module has a little initialisation fragment that calls all the
initialisation fragments for the modules it imports.

At the moment, there are two kinds of initialisation done for each
module:

  - each foreign export is registered as a stable pointer.  This
    prevents the garbage collector from collecting any CAFs which
    might be required (indirectly) by a foreign export.

  - when profiling, all the cost centres in the current module
    are initialised.
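
The scheme described above, where each module's fragment calls the
fragments of the modules it imports, can be sketched in C roughly as
follows.  This is an illustration only and not GHC's generated code: the
module struct, init_module and the three example modules are hypothetical,
and the per-module work is reduced to bumping a counter.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MAX_IMPORTS = 4 };   /* arbitrary limit for the sketch */

typedef struct module module;
struct module {
    const char *name;
    module *imports[MAX_IMPORTS];  /* NULL-terminated list of imports */
    bool initialised;              /* guards against running twice */
};

int modules_initialised = 0;

/* Initialises m and, transitively, everything it imports.  The flag is
   set before recursing so that shared imports are only visited once. */
void init_module(module *m) {
    if (m->initialised)
        return;
    m->initialised = true;

    for (size_t i = 0; i < MAX_IMPORTS && m->imports[i] != NULL; ++i)
        init_module(m->imports[i]);

    /* A real fragment would register foreign exports as stable
       pointers and initialise cost centres here. */
    modules_initialised++;
}

/* Hypothetical import graph: Main imports Prelude and List,
   List imports Prelude. */
module prelude_mod = { "Prelude", { NULL }, false };
module list_mod    = { "List",    { &prelude_mod, NULL }, false };
module main_mod    = { "Main",    { &prelude_mod, &list_mod, NULL }, false };
```

Calling init_module(&main_mod) once is enough to reach all three modules,
which is why registering only the top module suffices in this scheme.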

It might be possible to do this using linker sets, but I haven't tried
(and it would probably be highly non-portable too).

Cheers,
Simon



Re: Updates to FFI spec

2002-08-13 Thread Alastair Reid


 At the moment, there are two kinds of initialisation done for each
 module:

Both ELF and DLLs on Windows provide a way of specifying initializers.

Or, easier yet, since the user is already using the hs_init function,
you could use that.  The way you'd do that in ELF is to define a
special section 'hs_initializers'.  Every module would contain a
single object in that section: the address of the initializer.  (In
gcc you do this by attaching an attribute to the variable.  In asm it
is even easier since assemblers directly expose sections to the
programmer.)  The linker will do what it does with all sections:
concatenate the pieces from all object files.  hs_init would treat the
section as an array of function pointers.  I'm sure that the Windows
linker must have a similar mechanism - the main trick is figuring out
what name to give the sections.
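
A minimal sketch of the ELF variant in C, assuming gcc (or clang) and the
GNU linker.  GNU ld defines __start_<name> and __stop_<name> symbols for a
section whose name is a valid C identifier, which is what lets the section
be walked as an array.  The two initializer functions and
run_hs_initializers are placeholders invented for the example; a real
hs_init would do the walk as part of its own start-up.

```c
typedef void (*hs_initializer)(void);

int modules_initialised = 0;

/* Stand-ins for two modules' initialisation fragments. */
static void init_module_a(void) { modules_initialised++; }
static void init_module_b(void) { modules_initialised++; }

/* Each module contributes one object to the section: the address of
   its initializer.  "used" stops the compiler from discarding the
   otherwise unreferenced variables. */
static hs_initializer reg_a
    __attribute__((used, section("hs_initializers"))) = init_module_a;
static hs_initializer reg_b
    __attribute__((used, section("hs_initializers"))) = init_module_b;

/* Bounds of the concatenated section, supplied by the linker. */
extern hs_initializer __start_hs_initializers[];
extern hs_initializer __stop_hs_initializers[];

/* Treats the section as an array of function pointers and calls each. */
void run_hs_initializers(void) {
    for (hs_initializer *p = __start_hs_initializers;
         p < __stop_hs_initializers; ++p)
        (*p)();
}
```

In a real program the registrations would live in separate object files;
the linker concatenates their hs_initializers pieces in link order, so the
order in which initializers run should not be relied upon.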

--
Alastair



RE: Updates to FFI spec

2002-08-13 Thread Simon Marlow

  At the moment, there are two kinds of initialisation done for each
  module:
 
 Both ELF and DLLs on Windows provide a way of specifying initializers.
 
 Or, easier yet, since the user is already using the hs_init function,
 you could use that.  The way you'd do that in ELF is to define a
 special section 'hs_initializers'.  Every module would contain a
 single object in that section: the address of the initializer.  (In
 gcc you do this by attaching an attribute to the variable.  In asm it
 is even easier since assemblers directly expose sections to the
 programmer.)  The linker will do what it does with all sections:
 concatenate the pieces from all object files.  hs_init would treat the
 section as an array of function pointers.  I'm sure that the Windows
 linker must have a similar mechanism - the main trick is figuring out
 what name to give the sections.

That's what I meant by a linker set :-)

Cheers,
Simon