Re: Allocation & Marshalling Question (again)

Wolfgang Thaller Thu, 29 May 2003 22:28:12 -0700

Hello Adrian,

It would be nice to just partially apply a foreign function of type..
  f :: CString -> CString -> IO CInt
.. to a Haskell string, as in..
  g :: CString -> IO CInt
  g = f <+ withCString "Hello"


OK... This is the special case of _constant_ values...
It might be safe to write the following:

newtype GarbageCollectedCString = GarbageCollectedCString (ForeignPtr CChar)

toCString :: String -> GarbageCollectedCString

After creation, the C String should never be modified (treat it as a const char* in C, not a char*). it would be allocated at some point in the future by the garbage collector.

The with* and alloca* functions, however, return 'Ptr's, not 'ForeignPtr's. Ptrs are just address values, it's not possible to garbage-collect them or to attach finalizers (that's what ForeignPtrs are for, there's a slight performance hit).

Also, as soon as you plan to modify the block pointed to by a ForeignPtr, you have to be in the IO monad or lose referential transparency.


Of couse there's nothing to stop you partially applying the function
that results from the explicit lambdas code above, but only at the
expense of re-marshalling the same Haskell string every time the
underlying foreign function is called.

Another example, this foreign import seems quite legal (not necessarily
safe, but legal:-)
  foreign import ccall unsafe "cfunc" cfunc :: CString -> Word32
but its awkward to use it in conjunction with withCString.

  withCString "Hello" cfunc
gives a type error (of course), you need this instead..
  (unsafePerformIO $ withCString "Hello" (\p -> return $! cfunc p))

OK, that particular unsafePerformIO looks safe to me. However, if you have

foreign import ccall unsafe "cfunc" cfunc :: CString -> Word32 -> Word32

(unsafePerformIO $ withCString "Hello" (\p -> return $! cfunc p)) 42

... you're in terrible danger;

Before applying the argument '42', the unsafePerformIO is evaluated; and before the argument '42' is applied, the unsafePerformIO will have to have _finished_ evaluating. That means withCString will be finished, which, in turn, means that withCString will already have deallocated the buffer for the string. Bad. Very bad. Perhaps it won't segfault, because the memory for the buffer won't have been reused and overwritten yet, but it's still bad.

What could the pure function you're passing to allocaBytes do with
the pointer?
[My understanding of Haskell was that all functions are pure, even if their type is (Ptr a -> IO b). What's impure is the action itself (which always has type IO <something>).]

Sorry. Sloppy terminology on my part. My point was that allocaBytes (in it's current form) relies on getting it's hands onto the action. The action

allocaBytes 42 f

will first allocate 42 bytes, then execute the action (f ptr) and then free the 42 bytes again. If (f ptr) is not of type IO a, but of type a -> b (where b might be IO c or c -> d or ...), what can possibly happen between before the block is freed again?

When would the storage be freed again? The original allocaBytes frees
it after the IO action has finished, but before returing itself. How
would that apply to the pure version?


Without understanding the technical details of various Haskell
implementations I can't say for sure, though I would imagine the
storage would be freed when the corresponding pointer became
garbage. This assumes the storage space is (haskell) heap allocated
(as I gather it is in ghc from Simon M's reply). If it was stack
allocated then I guess there would be problems.

It's a Ptr, not a ForeignPtr, so the Garbage Collector cannot keep track of whether it's still live or not. GHC's implementation of allocaBytes does some black magic for superior performance, and I think that's what SimonM was referring to. But the use of the Haskell heap is just an implementation detail here, not a feature...

From the library source code (libraries/base/Foreign/Marshal/Alloc.hs):

#ifdef __GLASGOW_HASKELL__
-- ... some completely unintelligible black magic ...
#else
allocaBytes      :: Int -> (Ptr a -> IO b) -> IO b
allocaBytes size  = bracket (mallocBytes size) free
#endif

Look at the "generic" non-glasgow-haskell implementation: it just mallocs the block and then frees it, no chance for GC (malloc & free are really implemented using the C stdlib).

Now it might well be that with current implementations it really
is unsafe to use withCString with any function that is not of
type (CString -> IO b). What I want to know is..
        * is that so? (maybe)

If you want to use Ptrs (and CString is basically Ptr CChar), yes, because the block has to be explicitly freed by the *action* withCString. If you use ForeignPtrs instead of Ptrs, you can have that, or something like the toCString function whose signature I presented above. But you can have that only if you "promise" not to modify the data block. For a function like allocaBytes, you will definitely want to modify the allocated block, so it's allocation and use has to be in the IO monad in order to be safe.

* should that be so? (no IMHO)

Yes, it should be. If I have a Ptr, I want to be able to do pointer arithmetic with it, and I want it to be represented as efficiently as possible. That means I have to stay in the IO monad. If I have a ForeignPtr, I don't want pointer arithmetic, and I want it to be garbage collected.

Now the question we should be asking is, why isn't there a version of CString based on ForeignPtrs?

Cheers,

Wolfgang

_______________________________________________
FFI mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/ffi

Re: Allocation & Marshalling Question (again)

Reply via email to