Re: Cheap ForeignPtr allocation
I agree with SimonM that the proposed routines have useful applications. Furthermore, it is trivial for Haskell systems to implement these routines. Hence, I will include them in the spec unless there are serious objections.

Cheers,
Manuel
___
FFI mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/ffi
Re: Cheap ForeignPtr allocation
> Nevertheless, I think even without the tricks I'm using in GHC, the case where a ForeignPtr is used in conjunction with malloc()/free() is one which is likely to be optimisable in any system with its own memory management.

I wasn't meaning so much that only GHC could take advantage of it (though I think that is true at present) but that someone might come along next week with a technique which avoids the problem altogether.

> [...] using a ForeignPtr here, with free() as its finalizer, adds so much overhead that [...]

Where is the overhead coming from? Is it the cost of a C call or the cost of the standard malloc library?

If the latter, I imagine that a custom allocator would have similar performance to using pinned objects. (I'm sort of assuming that pinned objects are more expensive than normal objects.)

btw I don't know if it's relevant but there's an important semantic difference between allocating on the GHC heap and allocating on the C heap. The C version can be manipulated by the malloc.h functions. In particular, you can call realloc on it. Do that on the GHC heap version and... well, it won't be pretty.

I don't know if this is relevant because using realloc with ForeignPtr'd memory is likely to be a delicate procedure no matter how it got allocated.

-- Alastair
RE: Cheap ForeignPtr allocation
>> [...] using a ForeignPtr here, with free() as its finalizer, adds so much overhead that [...]
>
> Where is the overhead coming from? Is it the cost of a C call or the cost of the standard malloc library?

It's the combined cost of

  - malloc()
  - creating a weak pointer to register the finalizer
  - running the finalizer
  - free()

I can't remember the exact break-down, but I believe more than half the cost is in malloc+free.

> If the latter, I imagine that a custom allocator would have similar performance to using pinned objects.

Yes, using a custom allocator is likely to get you most of the benefit (I say most, because you'd still need a finalizer and a free() routine, compared to GC). But this is an argument in favour of the new interface, because it abstracts away from the actual allocator used, so the implementor is free to provide a custom allocator. That's a win, isn't it?

> (I'm sort of assuming that pinned objects are more expensive than normal objects.)
>
> btw I don't know if it's relevant but there's an important semantic difference between allocating on the GHC heap and allocating on the C heap. The C version can be manipulated by the malloc.h functions. In particular, you can call realloc on it. Do that on the GHC heap version and... well, it won't be pretty.
>
> I don't know if this is relevant because using realloc with ForeignPtr'd memory is likely to be a delicate procedure no matter how it got allocated.

Yes, I thought about this. Fortunately a ForeignPtr isn't mutable, even using GHC extensions, so I can't see a way to safely call realloc() once you've made a ForeignPtr. Anyway, the docs for mallocForeignPtr would have to say something like "the pointer is not guaranteed to have been returned by malloc()".

Cheers,
Simon
Re: Cheap ForeignPtr allocation
Simon Marlow [EMAIL PROTECTED] wrote,

> I'd like to propose two new functions for the ForeignPtr interface:
>
>   mallocForeignPtr      :: Storable a => IO (ForeignPtr a)
>   mallocForeignPtrBytes :: Int -> IO (ForeignPtr a)
>
> (the names can change, of course). The implementations are trivial in terms of existing things:
>
>   mallocForeignPtr = do
>     p <- malloc
>     newForeignPtr p free
>
>   mallocForeignPtrBytes size = do
>     p <- mallocBytes size
>     newForeignPtr p free
>
> However, in GHC we can provide a far more efficient implementation by using pinned ByteArray#s, avoiding the overhead of malloc()/free() and the finalizer. Since this is quite a common idiom when using ForeignPtrs, I think it's a good case to optimise.
>
> I did a little test, and using the above functions gave a 6x improvement in a small example which just repeatedly allocated a new ForeignPtr and passed it to a foreign function.
>
> The GHC implementation is to extend the ForeignPtr type like this:
>
>   data ForeignPtr a
>     = ForeignPtr ForeignObj#
>     | MallocPtr (MutableByteArray# RealWorld)
>
> so it does in theory slow down normal ForeignPtrs slightly, but I didn't measure any difference in the limited tests I did.

I vaguely remember that in the context of the withForeignPtr discussion we were once trying to achieve some similar effect (but couldn't come up with something that would work). Do you remember?

Does this, then, effectively solve this old problem? Wouldn't you want newXXX and withXXX variants of the above, too?

Cheers,
Manuel
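For readers who want to try the portable definitions quoted above, here is a minimal sketch against the current `base` library. It is an illustration under stated assumptions, not the implementation that went into the spec or into GHC. The names `portableMallocForeignPtr`, `portableMallocForeignPtrBytes`, and `roundTrip` are hypothetical, chosen so they do not clash with the functions `base` exports today. Note also that current `base` takes the finalizer first and as a `FinalizerPtr` (`finalizerFree`), whereas the thread writes `newForeignPtr p free`.

```haskell
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, malloc, mallocBytes)
import Foreign.Storable (Storable, peek, poke)

-- Hypothetical name: malloc() a block sized for the element type and
-- attach C's free() as the finalizer.
portableMallocForeignPtr :: Storable a => IO (ForeignPtr a)
portableMallocForeignPtr = do
  p <- malloc
  newForeignPtr finalizerFree p

-- Hypothetical name: the same idea, with an explicit size in bytes.
portableMallocForeignPtrBytes :: Int -> IO (ForeignPtr a)
portableMallocForeignPtrBytes size = do
  p <- mallocBytes size
  newForeignPtr finalizerFree p

-- Hypothetical helper: write a value into a fresh block and read it back.
roundTrip :: Int -> IO Int
roundTrip x = do
  fp <- portableMallocForeignPtr
  withForeignPtr fp $ \p -> poke p x >> peek p
```

Because callers only ever see a `ForeignPtr`, an implementation is free to replace the malloc()/free() pair with a cheaper allocator without changing code written against this interface.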
RE: Cheap ForeignPtr allocation
> I vaguely remember that in the context of the withForeignPtr discussion we were once trying to achieve some similar effect (but couldn't come up with something that would work). Do you remember?

Uh, my memory's a bit vague too :-) For a long time we were trying to get cheap allocation/freeing for temporary storage, i.e. a cheap alloca. We managed to achieve that when I realised I could pull a trick with GHC's garbage collector and have pinned objects as long as they don't contain any pointers into the heap. So instead of using malloc()/free() for alloca, we allocate a pinned ByteArray# and let the GC free it. (It needs to be pinned so that it can be passed to foreign functions which might re-enter the RTS and trigger GC, etc.)

This proposed extension to ForeignPtr is just taking the idea one step further: we can use the same trick for ForeignPtrs too, at least in the common case where you want the finalizer to free() the object again.

> Does this, then, effectively solve this old problem? Wouldn't you want newXXX and withXXX variants of the above, too?

The two functions I mentioned are all that's needed:

  mallocForeignPtr      :: Storable a => IO (ForeignPtr a)
  mallocForeignPtrBytes :: Int -> IO (ForeignPtr a)

They both do the job of a combined malloc/newForeignPtr. withForeignPtr still works fine with a ForeignPtr constructed this way.

Cheers,
Simon
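The two use cases described above can be sketched as follows, assuming the functions as they are exported by current `base` (`alloca` from `Foreign.Marshal.Alloc`, `mallocForeignPtr` and `withForeignPtr` from `Foreign.ForeignPtr`). The helper names `scratchSquare`, `newCounter`, and `bumpCounter` are hypothetical and exist only for illustration. The sketch shows the interface from the caller's side; it says nothing about how a given implementation allocates the memory.

```haskell
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (peek, poke)

-- Temporary storage: the block is only valid inside the callback.
scratchSquare :: Int -> IO Int
scratchSquare x = alloca $ \p -> do
  poke p (x * x)
  peek p

-- Longer-lived storage: the block stays alive for as long as the
-- ForeignPtr is reachable, and no explicit free is written by the caller.
newCounter :: Int -> IO (ForeignPtr Int)
newCounter start = do
  fp <- mallocForeignPtr
  withForeignPtr fp $ \p -> poke p start
  return fp

-- withForeignPtr is used exactly as with any other ForeignPtr.
bumpCounter :: ForeignPtr Int -> IO Int
bumpCounter fp = withForeignPtr fp $ \p -> do
  n <- peek p
  poke p (n + 1)
  return (n + 1)
```

As the earlier messages in the thread point out, memory obtained this way should not be handed to free() or realloc(), since it is not guaranteed to have come from malloc().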
Re: Cheap ForeignPtr allocation
Can you achieve the same performance gain by adding some rewrite rules?

-- Alastair
RE: Cheap ForeignPtr allocation
> Can you achieve the same performance gain by adding some rewrite rules?

Perhaps you could try to spot

  (>>=) malloc (\p -> newForeignPtr p free)

Hmmm. Actually I'd like to do both: add the functions, because they encapsulate a common case and guarantee a speed improvement if you use them, and add the rewrite rule because it might catch some cases in existing code.

The problem with relying on rewrite rules exclusively is that they tend to be a bit fragile and can fail to trigger without you noticing.

Cheers,
Simon