Hi Roman,

On Tue, Jan 25, 2011 at 6:35 PM, Roman Leshchinskiy <[email protected]> wrote:
> Daniel Peebles wrote:
>>
>> Johan Tibell and I have been working on some new primops for GHC that
>> would allow people to use fast optimized memcpys on unpinned memory.
>
> Why not just use memcpy?

I'm working with unpinned memory, so memcpy is not an option: the GC might
move the memory during the foreign call. I also want to avoid filling the
array twice (first with an initial element I don't care about, followed by
the elements of another array).

There's some research on /derived pointers/ (e.g. in HotSpot) that would
allow traversals of arbitrary memory using pointer arithmetic and still let
the GC move objects around. We don't have anything like that in GHC at the
moment.

I don't want to use pinned memory as the arrays I'm dealing with are pretty
small (e.g. 32 elements). Even though they're small, using a copyArray#
primop is faster than a loop using indexArray#/writeArray#. In addition,
I'd like to avoid unnecessarily filling the arrays with a default element
just to overwrite them right after.

Speaking of array initialization, I'm not sure the CMM compiler is up to
optimizing the initialization loop well either. My understanding is that
the CMM compiler is not as strong as e.g. GCC when it comes to optimizing
loops.

>> copyByteArray# :: ByteArray# -> Int# -> MutableByteArray#
>> s -> Int# -> Int# -> State# s -> State# s
>> copyMutableByteArray# :: MutableByteArray# s -> Int# -> MutableByteArray#
>> s -> Int# -> Int# -> State# s -> State# s
>
> These are just instances of memcpy. FWIW, very similar operations are
> provided by the primitive package.

The GC issue applies here too. There are real users of unpinned
ByteArray#s, e.g. Text.

>> cloneArray# :: Array# s a -> State# s -> (# State s,
>> MutableArray# s a #)
>> cloneMutableArray# :: MutableArray# s a -> State# s -> (# State s,
>> MutableArray# s a #)
>
> It would be nice if those could be used on array slices.
> Maybe this:
>
> cloneArray# :: Array# a -> Int# -> Int# -> State# s -> Array# a
> cloneMutableArray# :: MutableArray# s a -> Int# -> Int# -> State# s -> (#
> State# s, MutableArray# s a #)
> freezeArray# :: MutableArray# s a -> Int# -> Int# -> State# s -> (# State#
> s, Array# a #)
> thawArray# :: Array# a -> Int# -> Int# -> State# s -> (# State# s,
> MutableArray# s a #)

Supporting slicing would be nice. I'd prefer if we had e.g.

    cloneArray# :: Array# a -> Int# -> Int# -> State# s
                -> (# State# s, MutableArray# s a #)

as you might want to mutate the clone before freezing it. The use case I
really want to support is updating a single array element, which would
entail first cloning the array, then writing an element, and finally
freezing the array.

> Note that freezeArray# and thawArray# would be safe, i.e., would always
> copy.

That sounds awfully expensive!

> Perhaps a block write would also be useful:
>
> fillMutableArray# :: MutableArray# s a -> Int# -> Int# -> a -> State# s ->
> (# State# s, () #)

That could be useful.

>> cloneByteArray# :: ByteArray# -> State# s -> (# State s,
>> MutableByteArray# s #)
>> cloneMutableByteArray# :: MutableByteArray# s -> State# s -> (# State s,
>> MutableByteArray# s #)
>
> Aren't these just newMutableByteArray# followed by copy? Why do you want
> them to be primops? There shouldn't be any speed advantage.

You're right (assuming that we don't zero the memory on allocation).

> IIRC, memmove just calls memcpy if the ranges don't overlap so always
> calling memmove and having it do the check should be just as fast.

That would be a nice simplification. We should double check that this is
indeed what happens in e.g. libc.

Cheers,
Johan

_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc
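
A minimal sketch of the single-element update discussed above (clone, write,
freeze), assuming a thawArray# primop with the slice-taking signature
proposed in this thread is available from GHC.Exts. The names Arr, update,
replicateArr and indexArr are made up for illustration, and there is no
bounds checking. The point is only that the array is copied once and never
pre-filled with a dummy element.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.ST (ST (..), runST)

-- Lifted wrapper around Array# (hypothetical name) so it can be
-- returned from ST and passed around like an ordinary value.
data Arr a = Arr (Array# a)

-- Copy the whole array in one go, overwrite index i, then freeze in
-- place. Assumes 0 <= i < length; nothing is checked here.
update :: Arr a -> Int -> a -> Arr a
update (Arr arr) (I# i) x = runST (ST (\s0 ->
    case thawArray# arr 0# (sizeofArray# arr) s0 of
      (# s1, marr #) ->
        case writeArray# marr i x s1 of
          s2 ->
            case unsafeFreezeArray# marr s2 of
              (# s3, arr' #) -> (# s3, Arr arr' #)))

-- Helper for trying the sketch out: an n-element array of one value.
replicateArr :: Int -> a -> Arr a
replicateArr (I# n) x = runST (ST (\s0 ->
    case newArray# n x s0 of
      (# s1, marr #) ->
        case unsafeFreezeArray# marr s1 of
          (# s2, arr #) -> (# s2, Arr arr #)))

-- Helper: read one element (again without bounds checking).
indexArr :: Arr a -> Int -> a
indexArr (Arr arr) (I# i) = case indexArray# arr i of
    (# x #) -> x
```

With these definitions, `update (replicateArr 4 'a') 2 'z'` yields a new
array whose element at index 2 is 'z', while the original array is left
untouched. Using unsafeFreezeArray# for the final step avoids a second copy,
which is safe here only because the mutable clone never escapes.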
