Re: [Haskell-cafe] Call for discussion: OverloadedLists extension
Simon Peyton-Jones wrote: | I remember a similar discussion a few years ago. The question of whether | or not overloading list literals a good idea notwithstanding, the problem | with this is that fromList for vectors is highly inefficient. So if | something like this gets implemented and if vector/array literals are one | of the main motivations then I really hope there will be no lists | involved. Would you like to remind us why it is so inefficient? Can't the vector construction be a fold over the list? Ah... you need to know the *length* of the list, don't you? So that you can allocate a suitably-sized vector. Which of course we do for literal lists. So what if fromList went fromList :: Int - [b] - a b where the Int is the length of the list? That's part of a problem. There are really two aspects to it. Firstly, a naive list-based implementation would be a loop. But when I write ([x,y] :: Vector Double) somewhere in an inner loop in my program, I *really* don't want a loop with two iterations at runtime - I want just an allocation and two writes. I suppose this could be solved by doing something like this: {-# INLINE fromList #-} fromList [] = V.empty fromList [x] = V.singleton x fromList [x,y] = ... -- and so on up to 8? 16? 32? fromList xs = fromList_loop xs But it's ugly and, more importantly, inlines a huge term for every literal. The other problem is with literals where all values are known at compile time. Suppose I have ([2.5,1.4] :: Vector Double) in an inner loop. Here, I don't want a complicated CAF for the constant vector which would have to be entered on every loop iteration. I'd much rather just have a pointer to the actual data somewhere in memory and use that. This is more or less what happens for strings at the moment, even though you have to use rewrite rules to get at the pointer which, in my opinion, is neither ideal nor really necessary. IMO, the right design shouldn't rely on rewrite rules. Also, strings give you an Addr# whereas vector supports ByteArray#, too. Since enumerated literals have been mentioned in a different post, I'll just mention that the Enum class as it is now can't support those efficiently for arrays because there is no way to determine either the length or the nth element of [x..y] in constant time. This would have to be fixed. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Call for discussion: OverloadedLists extension
Simon Peyton-Jones wrote: | pointer to the actual data somewhere in memory and use that. This is | more or less what happens for strings at the moment, even though you | have to use rewrite rules to get at the pointer which, in my opinion, is | neither ideal nor really necessary. IMO, the right design shouldn't | rely on rewrite rules. Also, strings give you an Addr# whereas vector | supports ByteArray#, too. If it's not necessary, I wonder if you have an idea for the right design? For strings, we could have something like this: data StringPtr stringFromStringPtr :: StringPtr - Int - String unsafeStringPtrToPtr :: StringPtr - Ptr CChar class IsString a where fromString :: String - a fromStringPtr :: StringPtr - Int - a fromStringPtr p n = fromString $ stringFromStringPtr p n abc would then desugar to fromStringPtr (address of abc) 3. Note that we couldn't just use Ptr CChar instead of StringPtr because stringFromPtr would only be safe if the data that the pointer references never changes. It's much trickier for general-purpose arrays. It's also much trickier to support both Ptr and ByteArray. I'd have to think about how to do that. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Call for discussion: OverloadedLists extension
Michael Snoyman wrote: The simplest example I can think of is allowing easier usage of Vector: [1, 2, 3] :: Vector Int In order to allow this, we could use a typeclass approach similar to how OverloadedStrings works: class IsList a where fromList :: [b] - a b instance IsList Vector where fromList = V.fromList foo :: Vector Int foo = fromList [1, 2, 3] I remember a similar discussion a few years ago. The question of whether or not overloading list literals a good idea notwithstanding, the problem with this is that fromList for vectors is highly inefficient. So if something like this gets implemented and if vector/array literals are one of the main motivations then I really hope there will be no lists involved. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Dynamic Programming with Data.Vector
Myles C. Maxfield wrote: Aha there it is! Thanks so much. I didn't see it because it's under the Unfolding section instead of the Construction section. You're quite right, having a separate Unfolding section isn't the best idea. I'll fix this. Roman On Mon, Sep 17, 2012 at 6:07 AM, Roman Leshchinskiy r...@cse.unsw.edu.auwrote: Myles C. Maxfield wrote: Overall, I'm looking for a function, similar to Data.Vector's 'generate' function, but instead of the generation function taking the destination index, I'd like it to take the elements that have previously been constructed. Is there such a function? If there isn't one, is this kind of function feasible to write? If such a function doesn't exist and is feasible to write, I'd be happy to try to write and contribute it. Indeed there is, it's called constructN (or constructrN if you want to construct it right to left). Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Dynamic Programming with Data.Vector
Myles C. Maxfield wrote: Overall, I'm looking for a function, similar to Data.Vector's 'generate' function, but instead of the generation function taking the destination index, I'd like it to take the elements that have previously been constructed. Is there such a function? If there isn't one, is this kind of function feasible to write? If such a function doesn't exist and is feasible to write, I'd be happy to try to write and contribute it. Indeed there is, it's called constructN (or constructrN if you want to construct it right to left). Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Performance with do notation, mwc-random and unboxed vector
On 12 Jun 2012, at 12:52, Dmitry Dzhus d...@dzhus.org wrote: 12.06.2012, 01:08, Roman Leshchinskiy r...@cse.unsw.edu.au: perhaps the state hack is getting in the way. I don't quite understand the internals of this yet, but `-fno-state-hack` leads to great performance in both cases! How safe is that? It doesn't change the semantics of your program but it can make it significantly slower (or faster, as in this case). The various state hack related tickets on trac might give you an idea of what is happening here. We really need some proper arity analysis! Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector operations
On 11/06/2012, at 18:52, Evan Laforge wrote: On Tue, May 29, 2012 at 12:52 PM, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: Vector should definitely fuse this, if it doesn't it's a bug. Please report if it doesn't for you. To verify, just count the number of letrecs in the optimised Core. You'll see one letrec if it has been fused and two if it hasn't. I see two letrecs in find_before2, but both of them are on findIndex. I only have one findIndex so I'm not sure what's going on. The first one calls the second, but there's an boxed Either argument in there, which must be coming out of vector internals. Hmm, which version of GHC and what compiler flags are you using? I'm not familiar with ghc-core, maybe that's doing something wrong. Just run ghc -O2 -ddump-simpl and look at the output. Below is the code I'm getting for find_before2 with 7.4.2. As you can see, everything has been fused (although I notice that GHC isn't pushing x_a11p and y1_a124 into the branches for some reason, looks like a new regression but not a particularly bad one and nothing to do with fusion). find_before2_rkk :: Int - Vector Int - Int [GblId, Arity=2, Caf=NoCafRefs, Str=DmdType LU(LLL)m] find_before2_rkk = \ (n_arE :: Int) (vec_arF :: Vector Int) - case vec_arF `cast` ... of _ { Vector ipv_s2Jf ipv1_s2Jg ipv2_s2Jh - case n_arE of _ { I# y_a11t - case # 0 y_a11t of _ { False - letrec { $sfindIndex_loop_s2Qz [Occ=LoopBreaker] :: Int# - Int# - Int# - Id (Maybe Int) [LclId, Arity=3, Str=DmdType LLL] $sfindIndex_loop_s2Qz = \ (sc_s2Q8 :: Int#) (sc1_s2Q9 :: Int#) (sc2_s2Qa :: Int#) - case =# sc_s2Q8 ipv1_s2Jg of _ { False - case indexIntArray# ipv2_s2Jh (+# ipv_s2Jf sc_s2Q8) of wild_a2JM { __DEFAULT - let { x_a11p [Dmd=Just L] :: Int# [LclId, Str=DmdType] x_a11p = +# sc1_s2Q9 wild_a2JM } in case # x_a11p y_a11t of _ { False - $sfindIndex_loop_s2Qz (+# sc_s2Q8 1) x_a11p (+# sc2_s2Qa 1); True - (Just @ Int (I# sc2_s2Qa)) `cast` ... } }; True - (Nothing @ Int) `cast` ... }; } in case ($sfindIndex_loop_s2Qz 0 0 1) `cast` ... of _ { Nothing - lvl_r2QO; Just i_arH - case i_arH of _ { I# x_a11Q - let { y1_a124 [Dmd=Just L] :: Int# [LclId, Str=DmdType] y1_a124 = -# x_a11Q 1 } in case =# 0 y1_a124 of _ { False - lvl_r2QO; True - I# y1_a124 } } }; True - lvl_r2QO } } } Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Performance with do notation, mwc-random and unboxed vector
On 11/06/2012, at 10:38, Dmitry Dzhus wrote: Consider this simple source where we generate an unboxed vector with million pseudo-random numbers: 8 - import qualified Data.Vector.Unboxed as VU import System.Random.MWC import System.Random.MWC.Distributions (standard) count = 100 main = do g - create e' - VU.replicateM count $ standard g return () 8 - Being compiled with -O2, this runs for 0.052 s on my machine. Changing the replicateM line to use do notation brings the runtime down to 11.257 s! See below: 8 - import qualified Data.Vector.Unboxed as VU import System.Random.MWC import System.Random.MWC.Distributions (standard) count = 100 main = do g - create e' - VU.replicateM count $ do v - standard g return v return () 8 - The former essentially generates this: replicateM n ((letrec f = ... in f) `cast` ...) and the latter this: replicateM n (\(s :: State# RealWorld) - (letrec f = ... in f s) `cast` ...) I'd look further into this but mwc-random just inlines too much stuff. Could you perhaps find a smaller example that doesn't use mwc-random? In any case, it looks like a GHC bug, perhaps the state hack is getting in the way. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector operations
On 29/05/2012, at 19:49, Evan Laforge wrote: Good question.. I copied both to a file and tried ghc-core, but it inlines big chunks of Data.Vector and I can't read it very well, but it looks like the answer is no, it still builds the the list of sums. I guess the next step is to benchmark and see how busy the gc is on each version. Vector should definitely fuse this, if it doesn't it's a bug. Please report if it doesn't for you. To verify, just count the number of letrecs in the optimised Core. You'll see one letrec if it has been fused and two if it hasn't. But my impression was that stream fusion can't handle early aborts, which was why I was wondering why Vector lacks a foldAbort type function. Stream fusion easily handles early aborts. There isn't anything like foldAbort precisely because it can be built out of existing operations at no extra cost. Roman On Wed, May 23, 2012 at 5:13 AM, Jake McArthur jake.mcart...@gmail.com wrote: Have you already verified that stream fusion won't just do this for you? On May 23, 2012 12:35 AM, Evan Laforge qdun...@gmail.com wrote: So I wanted to find the first index in a vector whose running sum is greater than a given number. The straightforward way is to create the running sum and then search: Vector.findIndex (=target) (Vector.scanl' (+) 0 vector) But vectors are strict so it could do extra work, and what if I don't want to generate garbage? I could do it with a fold, but it would have to have the ability to abort early. Of course I could write such a fold myself using indexing: import qualified Data.Vector.Generic as Vector fold_abort :: (Vector.Vector v a) = (accum - a - Maybe accum) - accum - v a - accum fold_abort f accum vec = go 0 accum where go i accum = maybe accum (go (i+1)) $ f accum = vec Vector.!? i find_before :: (Vector.Vector v a, Num a, Ord a) = a - v a - Int find_before n = fst . fold_abort go (0, 0) where go (i, total) a | total + a = n = Nothing | otherwise = Just (i+1, total+a) So it's bigger and clunkier, but I would think it would be much more efficient (provided using Data.Vector.Generic won't inhibit inlining and unboxing). But I'm a bit surprised there isn't already something like fold_abort... or is there? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] strict version of Haskell - does it exist?
Marc Weber wrote: Replying to all replies at once: Malcolm Wallace At work, we have a strict version of Haskell :-) which proofs that it is worth thinking about it. But doesn't necessarily prove that it's a good idea. Just (Item id ua t k v) - M.insertWith (+) k 1 st Does replacing this by insertWith' help? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] strict version of Haskell - does it exist?
Marc Weber wrote: Replying to all replies at once: Malcolm Wallace At work, we have a strict version of Haskell :-) which proofs that it is worth thinking about it. But doesn't necessarily prove that it's a good idea. Just (Item id ua t k v) - M.insertWith (+) k 1 st Does replacing this by insertWith' help? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Unboxed Rationals?
On 12/01/2012, at 21:01, Artyom Kazak wrote: Yves Parès limestr...@gmail.com писал(а) в своём письме Thu, 12 Jan 2012 13:14:16 +0200: uvector is deprecated, its functionnalities has been ported into vector. Yes, but a Ratio a instance hasn't been ported. FWIW, vector isn't a port of uvector in any sense, shape or form. Rather, uvector was a fork of a very old version of an internal DPH package, whereas vector is a from-scratch implementation of arrays based on the experience gained while working on DPH. Vector is an open-source project and has a trac. If you need something, open a ticket or better yet, send me patches! Emails sometimes work, too :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Unboxed Rationals?
On 11/01/2012, at 17:00, Artyom Kazak wrote: In fact, I am surprised that Data.Vector doesn't have a Ratio instance, but has a Complex instance. Any ideas, why? Nobody has asked for it so far. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Documenting strictness properties for Data.Map.Strict
Johan Tibell wrote: map (\ v - undefined) == undefined mapKeys (\ k - undefined) == undefined Not really related to the question but I don't really understand how these properties can possibly hold. Shouldn't it be: map (\v - undefined) x = undefined And even then, does this really hold for empty maps? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Stream fusion
Yves Parès wrote: While re-reading RealWorldHaskell, chapter 25, I saw that -- unlike I believed -- loop fusion wasn't activated by default under GHC for lists (but that module Data.List.Stream from package stream-fusion could provide it). Note that stream fusion is only one way to do fusion. For lists, GHC uses foldr/build fusion which is a different approach but still fuses loops. You get this by default when compiling with optimisations. Is that still the case? If not, then are there some cases of list processing where loop fusion would be a bad thing? (Ergo cases where you should stick to Prelude/Data.List functions and not use Data.List.Stream implementation) I'm not sure if anybody has actually benchmarked the stream-fusion package with a modern GHC. I suspect it wouldn't hold up well, too many things have changed in the compiler since it was written. So I'm not really sure you should be using it at all. Chances are, if you really care about having tight loops you shouldn't be using lists at all. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Data.Vector.Mutable.mapM
Joachim Breitner wrote: Hi, Iâm consdering to change some performance critical code from Vector to MVector, hopefully avoiding a lot of copying and garbage collecting. But it seems that the Data.Vector.Mutable interface at http://hackage.haskell.org/packages/archive/vector/0.9/doc/html/Data-Vector-Mutable.html is quite limited; e.g. I am missing simple functions having type modifyM :: PrimMonad m = (a - m a) - MVector (PrimState m) a - m () that would do something with each element in the vector. At the moment, the best way to do this is: modifyM = Data.Vector.Generic.Mutable.transform . Data.Vector.Fusion.Stream.Monadic.mapM Note that transform will return a new vector but that is guaranteed to be a slice of the original one. Since mapM doesn't change the number of elements, you can safely ignore the return value as it will be always your original vector. Is this an indication that such use is actually not desired, or is it just the case that nobody has developed that yet? The latter. I need to come up with a nice mechanism for specifying loops over mutable vectors but this isn't entirely trivial and I haven't had enough time to really work on this lately. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0
Conrad Parker wrote: On 15 October 2011 23:18, Ivan Lazar Miljenovic ivan.miljeno...@gmail.com wrote: On 16 October 2011 01:15, Bas van Dijk v.dijk@gmail.com wrote: I agree that you shouldn't use ByteStrings or Vectors of Word8s for Unicode strings. However I can imagine that for quick sessions in ghci it can be quite handy if they are shown as strings. For example, currently we have: import Network.HTTP.Enumerator simpleHttp http://code.haskell.org/~basvandijk/; Chunk html\nheadtitleBas van Dijk/title/head\nbody\nh1Bas van Dijk/h1\n\npEmail: a href=\mailto://v.dijk@gmail.com\;v.dijk@gmail.com/a/p\n\npNick on IRC: ttbasvandijk/tt/p\n\na href=\http://www.haskellers.com/user/basvandijk/\;\n img src=\http://www.haskellers.com/static/badge.png\; \n alt=\I'm a Haskeller\\n border=\0\\n/a\n\npSee my a href=\https://github.com/basvandijk\;GitHub/a page for a list of projects I work on./p\n\n/body\n/html\n Empty If ByteStrings were not shown as strings this would look like: Chunk ( fromList [60,104,116,109,108,62,10,60,104,101,97,100,62,60,116,105,116,108,101,62,66,97,115,32,118,97,110,32,68,105,106,107,60,47,116,105,116,108,101,62,60,47,104,101,97,100,62,10,60,98,111,100,121,62,10,60,104,49,62,66,97,115,32,118,97,110,32,68,105,106,107,60,47,104,49,62,10,10,60,112,62,69,109,97,105,108,58,32,60,97,32,104,114,101,102,61,34,109,97,105,108,116,111,58,47,47,118,46,100,105,106,107,46,98,97,115,64,103,109,97,105,108,46,99,111,109,34,62,118,46,100,105,106,107,46,98,97,115,64,103,109,97,105,108,46,99,111,109,60,47,97,62,60,47,112,62,10,10,60,112,62,78,105,99,107,32,111,110,32,73,82,67,58,32,60,116,116,62,98,97,115,118,97,110,100,105,106,107,60,47,116,116,62,60,47,112,62,10,10,60,97,32,104,114,101,102,61,34,104,116,116,112,58,47,47,119,119,119,46,104,97,115,107,101,108,108,101,114,115,46,99,111,109,47,117,115,101,114,47,98,97,115,118,97,110,100,105,106,107,47,34,62,10,32,32,60,105,109,103,32,115,114,99,61,34,104,116,116,112,58,47,47,119,119,119,46,104,97,115,107,101,108,108,101,114,115,46,99,111,109,47,115,116,97,116,105,99,47,98,97,100,103,101,46,112,110,103,34,32,10,32,32,32,32,32,32,32,97,108,116,61,34,73,39,109,32,97,32,72,97,115,107,101,108,108,101,114,34,10,32,32,32,32,32,32,32,98,111,114,100,101,114,61,34,48,34,62,10,60,47,97,62,10,10,60,112,62,83,101,101,32,109,121,32,60,97,32,104,114,101,102,61,34,104,116,116,112,115,58,47,47,103,105,116,104,117,98,46,99,111,109,47,98,97,115,118,97,110,100,105,106,107,34,62,71,105,116,72,117,98,60,47,97,62,32,112,97,103,101,32,102,111,114,32,97,32,108,105,115,116,32,111,102,32,112,114,111,106,101,99,116,115,32,73,32,119,111,114,107,32,111,110,46,60,47,112,62,10,10,60,47,98,111,100,121,62,10,60,47,104,116,109,108,62,10]) Empty Personally, I don't work in ghci that often so I don't care that much if we have or don't have specialized Show instances for Vectors of Word8s. So what do other people think about this? Actually, for my current use case of Bytestrings (binary encoding of graphs using existing encoding schemes), I would prefer this [Word8]-based Show instance as it would help with my debugging, since the output looks along the lines of: Chunk (fromList [3,2,3,0,3,1,3,0,2,2,1,0]). I am the first to admit that my use case is probably different from others though. And I often work with mixed text/binary data (eg. text annotations in video streams). I'd want the Show/Read instances to be in the form of a hexdump with char representation alongside (like xxd or od -xc output). It roundtrips well, so why not? :-) So it seems that (1) people have very different requirements and (2) the Show instance only really matters for debugging in ghci. Here is a thought. What if ghci allowed Show instances to be overridden dynamically? So you could put your preferred Show instance for Vector Word8 in you .ghci file and ghci would use that when displaying stuff (but not when actually evaluating things). Would that solve most of the problems without messing with vector's Show instances? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0
Ivan Lazar Miljenovic wrote: On 19 October 2011 22:09, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: So it seems that (1) people have very different requirements and (2) the Show instance only really matters for debugging in ghci. Here is a thought. What if ghci allowed Show instances to be overridden dynamically? So you could put your preferred Show instance for Vector Word8 in you .ghci file and ghci would use that when displaying stuff (but not when actually evaluating things). Would that solve most of the problems without messing with vector's Show instances? Would this hypothetical ghci feature also work for cases where you have a ByteString as part of another type that derives Show and Read? Yes. The idea would be to evaluate the expression, then build the Show instance for the type of the result taking the ghci overrides into account and then use that to display the result. I have to admit that I have no idea how difficult it would be to do this but surely it can't be that hard. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0
Michael Snoyman wrote: On Mon, Oct 17, 2011 at 12:14 PM, Bas van Dijk v.dijk@gmail.com wrote: My idea is that when vector-bytestring is as fast as bytestring, it can replace it. When that happens it doesn't matter if users use the vector interface. I would even recommend it over using the bytestring interface so that bytestring can eventually be deprecated in favor of vector. +1. I'm in favor of using the OverlappingInstances/no newtype and specialized Show instance. I think that, if there was *ever* a case where OverlappingInstances was a good fit, it's this one. We're talking about a single module exporting both the base and overlapped instance, so which instance gets used should be completely decidable. (Unless of course someone defines an orphan instance elsewhere, but that's a different issue IMO.) And even in a worst-case-scenario where somehow we get the wrong instance, we're only talking about output used as a debugging aid, so the damage is minimal. So suppose we change the Show and Read instances for Storable vectors of Word8 and Char. What happens with unboxed and boxed vectors of these types? Should these be changed as well? Should these be changed as well? If not, why not? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0
On 14/10/2011, at 12:37, Bas van Dijk wrote: If there's need for a specific Show instance for Vectors of Word8s we can always add one directly to vector. (Roman, what are your thoughts on this?) Personally, I think that ByteString and especially Vector Word8 aren't strings and shouldn't be treated as such. But I wouldn't be strongly against showing them as strings. However, I *am* strongly against using UndecidableInstances in vector and I don't see how to implement this without using them. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0
On 15/10/2011, at 12:26, Roman Leshchinskiy wrote: On 14/10/2011, at 12:37, Bas van Dijk wrote: If there's need for a specific Show instance for Vectors of Word8s we can always add one directly to vector. (Roman, what are your thoughts on this?) Personally, I think that ByteString and especially Vector Word8 aren't strings and shouldn't be treated as such. But I wouldn't be strongly against showing them as strings. However, I *am* strongly against using UndecidableInstances in vector and I don't see how to implement this without using them. I meant OverlappingInstances, of course. To clarify, I would still consider it if everybody thinks it's a really good idea. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Combining stream and list fusion
Bas van Dijk wrote: Hello, I'm trying to make the following faster: Data.Vector.Generic.fromList list where 'list' is some expression yielding a list. Unfortunately, I don't think that's possible. The problem is that you 'list' will be expressed in terms of foldr/build and fromList would have to produce a Stream, i.e., basically an unfoldr. But AFAIK, there is no unfoldr/build fusion rule. There is one the other way round and vector makes use of that in toList. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] instance Enum Double considered not entirely great?
On 25/09/2011, at 18:20, Chris Smith wrote: class Ord a = Range a where rangeFromTo :: a - a - [a] -- subsumes Ix.range / Enum.enumFromTo rangeFromThenTo :: a - a - a - [a] inRange :: (a, a) - a - Bool -- Does have instances for Float/Double. List ranges desugar to this. -- Also has instances for tuples class Range a = InfiniteRange a where -- [1] rangeFrom :: a - [a] rangeFromThen :: a - a - [a] -- Has instances for Float/Double -- No instances for tuples I realise I'm slightly late to the discussion but IMO, the rangeFrom* (or enumFrom*) functions shouldn't be methods. Rather, a redesign of Enum should ensure that they can be defined generically for all types. The rationale is that other data structures (like arrays) want to provide similar functions without having to go through lists. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] instance Enum Double considered not entirely great?
On 28/09/2011, at 23:23, Ivan Lazar Miljenovic wrote: On 29 September 2011 07:56, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: On 25/09/2011, at 18:20, Chris Smith wrote: class Ord a = Range a where rangeFromTo :: a - a - [a] -- subsumes Ix.range / Enum.enumFromTo rangeFromThenTo :: a - a - a - [a] inRange :: (a, a) - a - Bool -- Does have instances for Float/Double. List ranges desugar to this. -- Also has instances for tuples class Range a = InfiniteRange a where -- [1] rangeFrom :: a - [a] rangeFromThen :: a - a - [a] -- Has instances for Float/Double -- No instances for tuples I realise I'm slightly late to the discussion but IMO, the rangeFrom* (or enumFrom*) functions shouldn't be methods. Rather, a redesign of Enum should ensure that they can be defined generically for all types. The rationale is that other data structures (like arrays) want to provide similar functions without having to go through lists. Wouldn't this require something like the ListLike class? Not at all. You could have something like: class Enum a where enumFromToSize :: a - a - Integer advance :: a - Integer - a ... And then [x..y] would desugar to map (advance x) (enumFromTo_Integer 0 $ enumFromToSize x y) where enumFromTo_Integer would be primitive. Of course, it's possible to design a much more efficient interface but this should give a general idea. An added benefit would be that you could generate the sequence in parallel (which is quite crucial for, e.g., DPH). Basically, the requirements would be that you can get the size of a range and compute the nth element of a range (or, equivalently, split the range) in constant time. Are there any Enum instances which don't satisfy this (apart from the broken floating point instances which *could* satisfy this)? As it stands, none of the array libraries that I've participated in designing and writing can use the Enum class properly (or, in the case of DPH, at all). For instance, vector has 230 lines of code (including comments) and 16 rules to implement enumFromTo (the vector version) halfway efficiently when the element type is known statically. I haven't bothered with enumFromThenTo so far. Interestingly, GHC's *list* library has to jump through similar hoops to make enumFromTo and enumFromThenTo work with foldr/build fusion (again, only when the element type is known statically). IMO, making enumFromThen and friends into methods just doesn't work, not even for lists really. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Data.IArray rant
Jon Fairbairn wrote: Roman Leshchinskiy r...@cse.unsw.edu.au writes: No, arrays were not considered to be bad, they were designed with parallelism in mind. I'm not sure how this can be the case if, as discussed below, most array operations have to go through lists, an inherently sequential data structure. It's rather that some considered the IArray API to be inadequate most of the time. Really, H98 arrays aren't very good at anything they do. For collective operations, you are supposed to convert the array to a list, work on the list and then convert it back to an array which just seems wrong. I am unconvinded that this is any more wrong than using a for loop in an imperative language. Oh, definitely. In fact, I consider using a for loop even more wrong, you really want to use collective operations instead whenever possible. But I don't think for loops are the right benchmark for ease of use. Remember that the lists are lazy, so itâs misleading to say âconvert the array to a listâ since what happens most of the time is that elements are taken out of the array and passed to the processing function and then thrown away before the next element is processed. Efficiency isn't even the biggest problem here. Whenever you want to perform a collective operation on an array, you have to do the actual work on an entirely different data structure. So I, as a programmer, have to say: Convert this array to a list, then do something with the list and then convert the resulting list back to an array. To me, at least, this is very inconvenient and requires a context switch. No other widely used container type requires you to do that. Multidimensional arrays can't be sliced and diced in the style of Repa or NumPy. Iâm not familiar with Repa or NumPy, but what can they do that cannot be done with judicious use of ixmap, which is a very powerful mechanism. Yes, it is quite powerful but not very convenient. Just as an example, if A is a matrix, then A[3,:] gives you the 4th row and A[:,3] the 4th column in NumPy. You can do that with ixmap but it's far more involved. It also isn't powerful enough since it doesn't support certain highly useful uses of shape polymorphism (an example is a generic concat which decreases the dimensionality of any array by 1). This is discussed in detail in http://www.cse.unsw.edu.au/~chak/papers/repa.pdf. In general, H98 arrays seem to have been designed with the goal of providing a container with O(1) indexing. They do that, I suppose, although they aren't very well integrated with the rest of the language Can you give examples? My favourite is the interaction between arrays and Enum. If I want to implement a generic enumFromTo m n that produces an array, I have to create the list [m .. n], take its length (which forces the entire list into memory) and then create an array of that length and fill it from the list. There doesn't seem to be any other way of doing it. This is a deficiency of Enum, really, but that's what I mean by not being well integrated. There are other examples like this. and they have conceptual problems (such as requiring two bounds checks for one array access). Assuming that you mean that for safe array access where nothing is known about the index at compile time, since any sort of array has at least a beginning and an end, they all require two bounds checks. Once you do know something about the index, itâs a question of implementation. That's not what I meant. You really need to do two full bounds checks, one against inRange and one against the actual length of the array. Here is the relevant comment from the GHC libraries: Note [Double bounds-checking of index values] ~ When you index an array, a!x, there are two possible bounds checks we might make: (A) Check that (inRange (bounds a) x) holds. (A) is checked in the method for 'index' (B) Check that (index (bounds a) x) lies in the range 0..n, where n is the size of the underlying array (B) is checked in the top-level function (!), in safeIndex. Of course it *should* be the case that (A) holds iff (B) holds, but that is a property of the particular instances of index, bounds, and inRange, so GHC cannot guarantee it. * If you do (A) and not (B), then you might get a seg-fault, by indexing at some bizarre location. Trac #1610 * If you do (B) but not (A), you may get no complaint when you index an array out of its semantic bounds. Trac #2120 At various times we have had (A) and not (B), or (B) and not (A); both led to complaints. So now we implement *both* checks (Trac #2669). Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Idiomatic usage of the fixpoint library
Roman Cheplyaka wrote: {-# LANGUAGE TypeFamilies, FlexibleContexts, UndecidableInstances, FlexibleInstances #-} import Data.Fixpoint newtype Expr = Expr { unExpr :: Pre Expr Expr } instance Functor (Pre Expr) = Fixpoint Expr where data Pre Expr a = Add a a | Const Int project = unExpr inject = Expr instance Functor (Pre Expr) where fmap f (Const x) = Const x fmap f (Add x1 x2) = Add (f x1) (f x2) eval = cata eval' where eval' (Const x) = x eval' (Add x1 x2) = x1 + x2 There are some issues with this code, compared to simply using newtype Fix f = In { out :: f (Fix f) } to build an Expr. 1. Since 'Pre' is a data (not type) family, we cannot simply make use of a functor defined elsewhere. We need to define the functor inside the instance declaration (or at least wrap an existing functor). Yes, it would be nicer if it was a type family. There is a single reason why this isn't the case but I find that reason pretty compelling: you couldn't type hylo if it was. 2. I wasn't able to derive the Functor instance, getting an error Derived instance `Functor (Pre Expr)' requires illegal partial application of data type family Pre In the data type instance declaration for `Pre' That's really a GHC problem. There is no reason why it shouldn't be able to do this. 3. Having to use UndecidableInstances makes me feel a bit uncomfortable. You don't need UndecidableInstances. Just get rid of the Functor (Pre Expr) constraint on the Fixpoint Expr instance, it's doesn't do anything anyway. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Data.IArray rant
On 03/09/2011, at 03:04, Ivan Lazar Miljenovic wrote: On 3 September 2011 11:38, Evan Laforge qdun...@gmail.com wrote: The result is that my first contact with haskell arrays left me with the impression that they were complicated, hard to use, and designed for someone with different priorities than me. Of course, Data.Vector didn't exist back then, but if someone were new to haskell now I would recommend they skip Data.IArray and head straight for vector. To an extent, I wonder how much of this has been that arrays were considered to be bad in Haskell, so no-one used them and no-one bothered to try and improve the API much (and instead went and created Vector, etc.). It's rather that some considered the IArray API to be inadequate most of the time. Really, H98 arrays aren't very good at anything they do. For collective operations, you are supposed to convert the array to a list, work on the list and then convert it back to an array which just seems wrong. Multidimensional arrays can't be sliced and diced in the style of Repa or NumPy. In general, H98 arrays seem to have been designed with the goal of providing a container with O(1) indexing. They do that, I suppose, although they aren't very well integrated with the rest of the language and they have conceptual problems (such as requiring two bounds checks for one array access). But requirements have shifted quite a bit since then. Now, people want to write real array programs and they want those to be fast. Personally, I don't know how to improve the H98 array API to provide this. You basically need to create a completely new API based on different principles. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] attoparsec and vectors
Gregory Collins wrote: On Tue, Jun 28, 2011 at 6:20 PM, Eric Rasmussen ericrasmus...@gmail.com wrote: It runs quickly, but I naively thought I could outperform it by reworking many to build a vector directly, instead of having to build a list first and then convert it to a vector: manyVec :: Alternative f = f a - f (V.Vector a) manyVec v = many_v  where many_v = some_v | pure V.empty        some_v = V.cons $ v * many_v That's an O(n^2) loop, and a thunk leak to boot. If you don't know the size of the vector ahead of time, the only way I can think of to beat Vector.fromList is to use a mutable vector with a highwater mark, and double the size if you fill it. At the end, you'd use unsafeFreeze to turn the mutable vector into a pure one, and unsafeTake to truncate the vector into the correct size. That's basically what fromList does. You could do this at a higher abstraction level by generating a Stream rather than a list and then using unstream to create a vector. I don't know if it's possible to do that with attoparsec. But you'd only save allocating and deallocating a lazily consumed list anyway. I'm not sure if it will be even noticable compared to how much parsing costs. For an example of a similar technique (minus the freezing part), I did a similar thing in the hashtables library: You might be interested in 'grow' :-) http://hackage.haskell.org/packages/archive/vector/0.7.1/doc/html/Data-Vector-Generic-Mutable.html#g:8 Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Is fusion overrated?
Roman Cheplyaka wrote: Of course I don't claim that fusion is useless -- just trying to understand the problem it solves. Are we saving a few closures and cons cells here? In addition to what everyone else said, fusion can be a big win when it allows further optimisations. For instance, fusing map (+1) . map (+2) can eliminate 1 addition per iteration. Even without taking allocation into account, most of the reasons for why loop fusion is a worthwhile optimisation in C apply to Haskell, too! Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library
Johan Tibell wrote: I'm working on a patch that provides O(1) size right now. The trick is to define HashMap as: data HashMap k v = HM {-# UNPACK #-} !Int !(Tree k v) Another possibility is: data HashMap k v = HM Int !(Tree k v) hashMap t = HM (treeSize t) t That way size is O(n) on first use but O(1) afterwards. Then again, if someone really needs this they can program it themselves. I've never needed an O(1) size for maps. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] rewrite rules to specialize function according to type class?
Max Bolingbroke wrote: 2011/2/15 Simon Peyton-Jones simo...@microsoft.com: but currently any pragmas in a class decl are treated as attaching to the *default method*, not to the method selector: I see. I didn't realise that that was what was happening. Personally I find this a bit surprising, but I can see the motivation. Of course, a sensible alternative design would be to have them control the selectors, and then you could declare that you want your default methods to be inlined like this: {{{ class MyClass a where foo :: a - a foo = default_foo {-# INLINE default_foo #-} default_foo = ... big expression ... }}} I wouldn't necessarily expect this to guarantee inlining for the same reason that the following code doesn't guarantee that foo gets rewritten to big: foo = bar {-# INLINE bar #-} bar = big It might work with the current implementation (I'm not even sure if it does) but it would always look dodgy to me. Also, what if I write: class MyClass a where foo :: a - a foo x = default_foo x I assume this wouldn't guarantee inlining? In any event, perhaps it would be worth warning if you write an INLINE pragma for some identifier in a class declaration where no corresponding default method has been declared, in just the same way you would if you wrote an INLINE pragma for a non-existant binding? +1 Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] rewrite rules to specialize function according to type class?
Max Bolingbroke wrote: On 15 February 2011 11:23, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: I wouldn't necessarily expect this to guarantee inlining for the same reason that the following code doesn't guarantee that foo gets rewritten to big: foo = bar {-# INLINE bar #-} bar = big It might work with the current implementation (I'm not even sure if it does) but it would always look dodgy to me. In this case there doesn't seem to be any point inlining anyway, because nothing is known about the context into which you are inlining. Nonetheless, what will happen (I think) is that any users of foo will get the definition of foo inlined (because that doesn't increase program size) so now they refer to bar instead. Now GHC can look at the use site of bar and the definition of bar and decide whether it is a good idea to inline. Ah, but you assume that bar won't be inlined into foo first. Consider that it is perfectly acceptable for GHC to generate this: foo = big {-# INLINE bar #-} bar = big We did ask to inline bar, after all. Basically, I expect the small RHS for the default in my class declaration to be inlined unconditionally, and then GHCs heuristics will determine how and when to inline the actual default definition (e.g. default_foo). As soon as GHC generates a Core term for the RHS of the default method all bets are off because it might inline default_foo into that term which would make it too big to be inlined somewhere else. I thought you were suggesting to treat foo = default_foo specially by not generating a separate RHS for the default definition of foo and just rewriting it to default_foo instead. What it basically comes down to is a staging problem. You don't want default_foo to be inlined into the RHS of foo before the latter is inlined but the only way to achieve this is by marking foo as INLINE which is precisely what you want to avoid. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] rewrite rules to specialize function according to type class?
Max Bolingbroke wrote: On 15 February 2011 15:12, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: Ah, but you assume that bar won't be inlined into foo first. Consider that it is perfectly acceptable for GHC to generate this: foo = big {-# INLINE bar #-} bar = big We did ask to inline bar, after all. Well, yes, but when considering the use site for foo don't we now inline the *original RHS* of foo? This recent change means that it doesn't matter whether bar gets inlined into foo first - use sites of foo will only get a chance to inline the bar RHS. Only if foo has an INLINE pragma. Otherwise, GHC uses whatever RHS is available when it wants to inline. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] rewrite rules to specialize function according to type class?
Max Bolingbroke wrote: On 15 February 2011 16:45, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: Only if foo has an INLINE pragma. Otherwise, GHC uses whatever RHS is available when it wants to inline. Ah, I see! Well yes, in that case my workaround is indeed broken in the way you describe, and there is no way to repair it because in my proposal you wouldn't be able to write an INLINE pragma on the actual default method definition. There is an alternative, actually. When compiling a module with a function that doesn't have an INLINE pragma, GHC uses its optimised rhs for inlining in every stage and then records its unfolding for use in other modules if it is small enough to be inlined. This has some unfortunate (IMO) implications. Consider the following code: {-# INLINE [1] f #-} f = big g = f h = g Will big be inlined into h? This depends on the module that h is defined in. If it's in the same module as g, then g will most likely be inlined into h in phase 2, i.e., before f has been inlined into g. Then, f will be inlined into both g and h in phase 1. However, after f is inlined into g, g's rhs becomes too big for inlining. So if h is defined in a different module, g won't be inlined into it. We could just as well say that a function's rhs should be recorded forever as soon as it becomes small enough to be considered for inlining. So GHC could notice that g is very small in phase 2 and basically add an INLINABLE pragma to it at that point, regardless of what happens to its rhs afterwards. This would ensure that inlining isn't affected by splitting things into modules and would probably also make your proposal work. But it would also result in a lot more inlining compared to now. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Vector library
Pierre-Etienne Meunier wrote: This is mostly a question for Roman : how do you use your vector library with multi-dimensional arrays ? I mean, the array library from the standard libraries does something more intelligent than the C-like solution with indirections. Vector doesn't include any support for multidimensional arrays. This is by design as the library has exactly one purpose: to provide a fast implementation of contiguous, one-dimensional arrays. As you point out, it is well possible to build multidimensional arrays on top of it but that would be a separate library. You might want to take a look at Repa (http://hackage.haskell.org/package/repa) which does exactly that (it sits on top of DPH which sits on top of vector). It also gives you parallelism. FWIW, I don't think we've nailed the right API for multidimensional arrays yet. It's a hard problem. But we are getting there. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ghc/dph
On 14/12/2010, at 13:35, Johannes Waldmann wrote: I want to use dph (data parallel haskell) for a presentation. (Nothing fancy, just compile and run some demos.) What ghc version should I use and where do I get it? That's a tricky question. We are currently working on getting DPH to work properly with GHC 7 but we aren't quite done yet. You might want to try 7.0.1 + the DPH sources from the darcs repo. I read the advice use HEAD but when I build from the 7.1.20101213 source snapshot, dph is not installed (should it be?) Nowadays, you have to install DPH separately which isn't easy since we haven't released the packages yet. In any case, DPH doesn't work with the current HEAD at all because I haven't adapted it to the new superclass story yet. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Type families again
On 2 Dec 2010, at 21:29, Andrew Coppin andrewcop...@btinternet.com wrote: Does anybody have any suggestions? class Mappable t a b where type Rebind t a b map :: (a - b) - t - Rebind a b This is based on an old C++ trick. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Eta-expansion and existentials (or: types destroy my laziness)
On 16/10/2010, at 12:36, Max Bolingbroke wrote: On 16 October 2010 12:16, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: eta :: Stream a - Stream a eta s = Stream s next where next (Stream s next') = case next' s of Just (x,s') - Just (x,Stream s' next') Nothing - Nothing Making GHC optimise stream code involving eta properly is hard :-) Good point, I don't exactly mean non-recursive for requirement 3) then - I mean an adjective with a fuzzier definition like GHC-optimisable :-) I suspect the easiest way to achieve this is to expand the set of GHC-optimisable things until it includes eta :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Eta-expansion and existentials (or: types destroy my laziness)
On 16/10/2010, at 12:00, Max Bolingbroke wrote: Hi Cafe, I've run across a problem with my use of existential data types, whereby programs using them are forced to become too strict, and I'm looking for possible solutions to the problem. I'm going to explain what I mean by using a literate Haskell program. First, the preliminaries: {-# LANGUAGE ExistentialQuantification #-} import Control.Arrow (second) import Unsafe.Coerce Let's start with a simple example of an existential data type: data Stream a = forall s. Stream s (s - Maybe (a, s)) [...] In fact, to define a correct cons it would be sufficient to have some function (eta :: Stream a - Stream a) such that (eta s) has the same semantics as s, except that eta s /= _|_ for any s. That's easy. eta :: Stream a - Stream a eta s = Stream s next where next (Stream s next') = case next' s of Just (x,s') - Just (x,Stream s' next') Nothing - Nothing Making GHC optimise stream code involving eta properly is hard :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Construction of short vectors
On 25/06/2010, at 06:41, Alexey Khudyakov wrote: Then constructor like one below arise naturally. And I don't know how to write them properly. It's possible to use fromList but then list could be allocated which is obviously wasteful. vector2 :: Double - Double - Vec2D vector2 x y = ... -- Vec2D is some wrapper around Vector data type Your best bet is probably singleton x ++ singleton y. Unfortunately, GHC doesn't seem to provide any real way of specifying array literals. Another question is there any specific problems with short vectors? They could be just 2 elements long. I mean performance problems A data type like this one should be faster: data Vec2D = Vec2D {-# UNPACK #-} !Double {-# UNPACK #-} !Double Firstly, this needs one less indirection for accessing the components. Secondly, GHC's simplifier knows a lot more about algebraic data types than it does about arrays so the above definition will often lead to better optimisations. Whether or not the difference is significant probably depends on the program. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The mother of all functors/monads/categories
On 27/06/2010, at 19:54, Max Bolingbroke wrote: Q: What is the mother of all X, where X is some type class? A: It is a data type D such that: 1. There exist total functions: lift :: X d = d a - D a lower :: X d = D a - d a Are those universally quantified over d? If so, then none of your examples fit this definition. I assume you mean this: lift :: X d = d a - D d a lower :: X d = D d a - d a In that case, isn't D just the dictionary for (X d) and a value of type (d a)? I.e., couldn't we always define it as: data D d a where { D :: X d = d a - D d a } Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Bug with [Double]
On 19/05/2010, at 19:24, Dmitry Olshansky wrote: Prelude [1,1+2/3..10] [1.0,1.6665,2.333,2.9996,3.666,4.332,4.998,5.664,6.33,6.9964,7.6625,8.329,8.995,9.66,10.327] -- It is a bug! Unfortunately, it isn't. Section 6.3.4 of the Haskell report says: For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3+i/2 for positive increment i, or when they become less than e3+i/2 for negative i. In this case, i = 2/3 so the last value in the list is 10+1/3. The same applies to the other examples. Personally, I consider the Enum class itself to be broken. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Bug with [Double]
On 19/05/2010, at 20:36, Ivan Lazar Miljenovic wrote: Roman Leshchinskiy r...@cse.unsw.edu.au writes: Personally, I consider the Enum class itself to be broken. Oh? In what sense? Firstly, the enumFrom* family of functions shouldn't be methods and the class itself should provide enough facilities for implementing them generically. GHC, for instance, specialises them for all primitive numeric types just to get foldr/build fusion to work. That shouldn't be necessary and doesn't help with overloaded code anyway. For instance, this generates an intermediate list: foo :: Enum a = a - a - [Int] foo a b = [fromEnum x | x - [a..b]] It's even worse when you want to implement similar functionality for other data structures. In vector, I basically had to duplicate all those specialisations to get decent performance. The generic case is horribly inefficient: enumFromTo x y = fromList [x .. y] There is no other sensible definition. Secondly, it should be possible to compute the length and the nth element of [a..b] in constant time. At the moment, it's impossible to distribute [a..b] efficiently across multiple threads - you have to generate the entire list first and then split it into chunks. It's completely unclear to me what [:a .. b:] should mean in DPH, for instance. So basically, Enum only provides enough functionality to desugar [a..b] and friends and even here, it doesn't interact well with fusion. Of course, these concerns weren't relevant back when the class was designed. But it is really broken now, IMO. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Bug with [Double]
On 19/05/2010, at 23:44, Ben Millwood wrote: On Wed, May 19, 2010 at 10:57 AM, Serguey Zefirov sergu...@gmail.com wrote: PS Rationals: Prelude [1,1+2/3..10] :: [Rational] [1 % 1,5 % 3,7 % 3,3 % 1,11 % 3,13 % 3,5 % 1,17 % 3,19 % 3,7 % 1,23 % 3,25 % 3,9 % 1,29 % 3,31 % 3] Same result. This sounds like a bug to me. The section of the Haskell Report that deals with the Enum class mentions Float and Double, not Rational, and there's really no sensible reason why Rationals would exhibit this behaviour given that they don't have rounding error. From Section 12.1 of the Library Report: instance (Integral a) = Enum (Ratio a) where succ x = x+1 pred x = x-1 toEnum = fromIntegral fromEnum = fromInteger . truncate -- May overflow enumFrom = numericEnumFrom -- These numericEnumXXX functions enumFromThen = numericEnumFromThen -- are as defined in Prelude.hs enumFromTo = numericEnumFromTo -- but not exported from it! enumFromThenTo = numericEnumFromThenTo The numericEnum functions are defined in Section 8 of the Language Report and have semantics required for Float and Double. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Numerical Analysis
On 17/05/2010, at 05:17, Gregory Crosswhite wrote: As an aside, while there are advantages to writing numerical analysis routines in Haskell, it might be better strategy to instead link in something like LAPACK and provide nice wrappers to it in Haskell, since this way you can harness the work of the experts who have spent a lot of time perfecting their code rather than re-inventing the wheel. I don't see think this is an either/or question. A good array library ought to provide BLAS, Lapack, FFTW etc. bindings *and* allow writing high-performance code in pure Haskell. I haven't implemented any of these bindings for vector only because I'm still deciding what to do with multidimensional arrays. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Numerical Analysis
On 17/05/2010, at 02:52, Pierre-Etienne Meunier wrote: You are quite right that vector only supports nested arrays but not multidimensional ones. This is by design, however - the library's only goal is to provide efficient one-dimensional, Int-indexed arrays. I'm thinking about how to implement multidimensional arrays on top of vector but it's not that easy. While repa is a step in that direction, I also need to support mutable arrays and interoperability with C which complicates things immensely. I understand. What complicates it even more (at least in what I imagine) is that C uses the same syntax for multidimensional and nested arrays, and I do not believe that for instance GHC's FFI allows for array types such as int x[19][3]. Actually, it does since an argument of that type is equivalent to int *x. FWIW, I always say nested array when I mean that the individual subarrays can have different lengths as opposed to multidimensional ones where they are all the same. So the former are similar to int *x[]. I was also wondering about how to do linear algebra : an infinite number of types would be needed to express all the constraints on matrix multiplication : we need types such as array of size m * n. Is there a way to generate these automatically with for instance template haskell (again ! But I know nothing of template haskell, neither, sorry !) Encoding the bounds in the type system is possible but rather messy. In general, simply saying the array has indices of type (Int,Int) and doing dynamic bounds check when necessary seems to work best in Haskell. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Numerical Analysis
On 16/05/2010, at 10:17, Pierre-Etienne Meunier wrote: I've also just noticed a lack in the vector library : multidimensional arrays seem to require indirections like in caml, whereas in C or in Data.Ix, there is a way to avoid this. This is especially important for avoiding cache misses with many dimensions, as well as for providing a clean interface. For instance if a 10x10 matrix is initialized unproperly like Data.Vector.replicate 10 $ Data.Vector.replicate 10 0 The result is a total mess. Surely, every programmer knows that a computer has got memory, and that this memory has to be allocated, but from what I understand of haskell, I would expect the interface and the RTS to do it for me. And an integer multiplication, followed by an addition, is way cheaper than accessing uncached memory. Or maybe I do not understand that pipelines, hyperthreading and all that stuff would give you the same result ? You are quite right that vector only supports nested arrays but not multidimensional ones. This is by design, however - the library's only goal is to provide efficient one-dimensional, Int-indexed arrays. I'm thinking about how to implement multidimensional arrays on top of vector but it's not that easy. While repa is a step in that direction, I also need to support mutable arrays and interoperability with C which complicates things immensely. That said, if all you need is a matrix it's quite easy to implement the necessary index calculations yourself. Also, since you are working with numerics I highly recommend that you use either Data.Vector.Unboxed or Data.Vector.Storable instead of Data.Vector as boxing tends to be prohibitively expensive in this domain. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Monadic style with Streams (as in Data.Array.Parallel.Stream)
On 16/05/2010, at 11:54, Mark Wassell wrote: Hi, This possibly might go against the spirit of what Stream programming is about but I having difficulties converting an imperative algorithm [1] into Haskell and think it would be easier if I was able to write it in a monadic style with operations to read and write from and to the streams. I first tried to approach it by delving into the innards of other Stream functions to devise what I needed. I only got so far and the sticking point was defining the Monad. I then approached it from the Monad side and although what I have is workable, it probably isn't going to perform (for one it uses fromStream and tailS on each read off the front of the stream). Data.Array.Parallel.Stream serves only one purpose: to represent loops produced by DPH in such a way that the compiler is able to optimise them well. Putting a monad on top of that will very very likely break this. To be honest, I'm not sure why you need the monad anyway. I would expect compression/decompression to be pure functions of type Stream Word8 - Stream Word8. In any case, I would urgently recommend not to use Data.Array.Parallel.Stream for anything at this point. This whole subsystem will soon die of old age and be replaced by the much nicer stuff from package vector, specifically Data.Vector.Fusion.Stream and Data.Vector.Fusion.Stream.Monadic. Note that the latter implements monadic streams as described in http://www.cse.unsw.edu.au/~rl/publications/recycling.html. Perhaps those can be useful for you if you really need a monad. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Strict type system allows for a maximum number of programming errors to be caught at compile time.
On 04/05/2010, at 13:30, Luke Palmer wrote: On Mon, May 3, 2010 at 11:07 AM, Kyle Murphy orc...@gmail.com wrote: The fact that it doesn't is proof enough that there's a problem with it even if that problem is simply that the types you're using aren't exactly correct. Further, I'd argue that in the first instance with a non-strict type system, the instance of wrong code that compiles would be higher. The only argument to support non-strict typing would be if you could show that it takes less time to track down runtime bugs than it does to fix compile time type errors, and any such claim I'd be highly skeptical of. Clearly. But many people believe in this methodology, and use test suites and code coverage instead of types. Indeed, such practices are essentially empirical type checking, and they afford the advantage that their verification is much more expressive (however less reliable) than our static type system, because they may use arbitrary code to express their predicates. I don't think it's a question of types vs. testing. Rather, it's types + testing vs. just testing. How is the latter more expressive than the former for defining properties of programs? Also, testing loses a lot of appeal once you start dealing with concurrent programs. Testing for this program doesn't have race conditions isn't exactly easy. You want as many static guarantees as possible. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Haskell and the Software design process
On 03/05/2010, at 06:02, Jaco van Iterson wrote: I was just wondering what methods are best to design/model the software in bigger projects when you are planning to use Haskell. Is there no difference compared to other languages? Are there any Haskell tools? In addition to what Don said, here are a couple of things I've learned. This is just from personal experience so YMMV. Design in Haskell is much more often bottom-up than in, say, traditional OO where it's frequently top-down all the way. I believe this is mainly due to purity. When you have some kind of global state, your design process often has to be top-down because of intricate interactions between program components which modify that state. Designing Haskell software tends to involve much fewer diagrams than OO. Your most important design tool is the type system. You can often express large chunks of your design through types and have the compiler check and enforce them. Fiddling with types is often part of the design process and should be treated accordingly. If you stumble on a useful design pattern, think about how to encode it in the type system (this is quite different from OO patterns). Higher-order functions and type classes are very powerful tools for reducing coupling and for implementing design patterns. Prototyping is very cheap and easy. Writing prototypes and playing with them in ghci allows you to see how your subsystems will behave and adjust the design accordingly. In general, you ought to write code (esp. type signatures) while designing. Some libraries/subsystems will evolve into or start out as EDSLs. This is good and should be encouraged. Identifying EDSLs that would be useful for implementing your software is an important step in the design process. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 24/04/2010, at 22:42, Roman Leshchinskiy wrote: On 24/04/2010, at 22:06, Barak A. Pearlmutter wrote: I was thinking of this: data T = T Double deriving ( Eq, Ord ) ... GHC basically produces instance Ord T where compare (T x) (T y) = compare x y t u = compare t u == LT That is indeed what it does. Which is a plain old bug, since it leads to inconsistent behaviour between wrapped vs unwrapped values. *Main T (0/0) == T (0/0) False *Main T (0/0) T (0/0) False *Main T (0/0) T (0/0) True *Main (0/0) (0/0) False Urgh. You're right, I hadn't thought of this. Would you care to submit a bug report? I submitted one but on further reflection, this is not so simple. Let's look at pairs as an example. At the moment, () is implemented basically like this: (a,b) (c,d) = case compare a c of LT - False EQ - compare b d GT - True Of course, this means that (0/0,'a') (0/0,'a'). So we could change the implementation: (a,b) (c,d) = a c || (a == c b d) But now we compare a to c twice which is very bad for, say, ([Int],Int). Clearly, we want to use the first definition but it leads to inconsistent results for Doubles. I don't see how to solve this while keeping IEEE semantics of silent NaNs. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The instability of Haskell libraries
On 24/04/2010, at 07:29, Don Stewart wrote: Oh, the Platform has very strict standards about APIs, What is an API? The package versioning policy only seems to talk about types and function signatures. John's old-locale example shows that this is not enough. Would it perhaps make sense for at least the Platform to require packages to have unit tests and to require versions to be bumped whenever those change (sufficiently)? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The instability of Haskell libraries
On 24/04/2010, at 18:06, Ivan Lazar Miljenovic wrote: Roman Leshchinskiy r...@cse.unsw.edu.au writes: On 24/04/2010, at 07:29, Don Stewart wrote: Oh, the Platform has very strict standards about APIs, What is an API? The package versioning policy only seems to talk about types and function signatures. John's old-locale example shows that this is not enough. I would think that the API is all the functions/classes/datatypes/instances/etc. exported from the library in combination with their types. So the semantics of those functions doesn't matter at all? Would it perhaps make sense for at least the Platform to require packages to have unit tests and to require versions to be bumped whenever those change (sufficiently)? I don't get this; just because someone changes a unit test (because they thought of a new case, etc.) they should bump the package version even if all the changes were internal and not exported? Adding new tests (i.e., new postconditions) doesn't change the API. Loosening preconditions doesn't, either. Also, the tests would only cover the exposed part of the library, of course. Internal tests are of no concern to the library's clients. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The instability of Haskell libraries
On 24/04/2010, at 18:54, Ivan Lazar Miljenovic wrote: Roman Leshchinskiy r...@cse.unsw.edu.au writes: On 24/04/2010, at 18:06, Ivan Lazar Miljenovic wrote: I would think that the API is all the functions/classes/datatypes/instances/etc. exported from the library in combination with their types. So the semantics of those functions doesn't matter at all? What do you refer to by semantics? Can you provide an example of when what you consider to be the API to change when the functions, types, etc. don't? John Goerzen gave one in the very first post of this thread: the fix to old-locale which didn't change any types but apparently changed the behaviour of a function quite drastically. Another example would be a change to the Ord instances for Float and Double which would have compare raise an exception on NaNs as discussed in a different thread on this list. Another one, which is admittedly silly but demonstrates my point, would be changing the implementation of map to map _ _ = [] In general, any significant tightening/changing of preconditions and loosening/changing of postconditions would qualify. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 24/04/2010, at 19:56, Barak A. Pearlmutter wrote: And yet a lot of generic code is written in terms of compare. That's can be an advantage, because often that code *should* blow up when it gets a NaN. E.g., sorting a list of Floats which includes a NaN. However, often you will know that the list doesn't contain NaNs and will still have to pay a performance penalty. It's a question of what the right default is - safety or performance. In the case of floating point numbers, I'm leaning towards performance. That said, I would be very much in favour of providing a SafeFloat or whatever type with much safer semantics than IEEE floats and trying to get people to use that type by default unless they really need the performance. Even deriving(Ord) only produces compare and relies on standard definitions for other methods. I don't think that's actually a problem. Surely the IEEE Floating Point types would give their own definitions of not just compare but also , =, etc, overriding the problematic deriving(Ord) definitions of comparison in terms of compare and vice-versa. I was thinking of this: data T = T Double deriving ( Eq, Ord ) Unless I'm mistaken, at the moment GHC basically produces instance Ord T where compare (T x) (T y) = compare x y t u = compare t u == LT ... That is, all comparisons on T would be paying the NaN performance tax. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 24/04/2010, at 22:06, Barak A. Pearlmutter wrote: Currently the standard prelude has default definition: ... compare x y | x == y= EQ | x = y= LT | otherwise = GT I'd suggest [...] compare x y | x y = LT | x == y= EQ | x y = GT | otherwise = error no consistent ordering It is not clear to me that this would cause a measurable performance hit in the case of floating point numbers. We're talking about at most two extra instructions: a compare and a conditional branch. The The problem are not so much the additional instructions. Rather, it's the fact that compare for Float and Double can fail at all which inhibits some optimisations. For instance, GHC is free to eliminate the comparison in (x `compare` y) `seq` a but wouldn't be with your change. It doesn't actually do that at the moment, which looks like an optimiser deficiency to me. But in any case, the property can fail has a significant effect on optimisations sometimes. I was thinking of this: data T = T Double deriving ( Eq, Ord ) ... GHC basically produces instance Ord T where compare (T x) (T y) = compare x y t u = compare t u == LT That is indeed what it does. Which is a plain old bug, since it leads to inconsistent behaviour between wrapped vs unwrapped values. *Main T (0/0) == T (0/0) False *Main T (0/0) T (0/0) False *Main T (0/0) T (0/0) True *Main (0/0) (0/0) False Urgh. You're right, I hadn't thought of this. Would you care to submit a bug report? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 23/04/2010, at 01:34, Barak A. Pearlmutter wrote: I'd suggest that compare involving a NaN should yield error violation of the law of the excluded middle Please think of the poor guys trying to write high-performance code in Haskell! Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 24/04/2010, at 07:15, Barak A. Pearlmutter wrote: In all seriousness, I think it is reasonable when isNaN x for x C x == C x C C x C == x C x to all be False, for all floats C, even C=x, as a sort of efficient weak Bool bottom. This is what the FP hardware does --- so it is very efficient. But if you force the system to choose one of the three, which is what compare x C is doing, I think the result should be _|_. Because there is no way to choose, no reasonable Ordering to return. It is possible to write generic Ord n = code under these conditions, if you're careful to case out ,==, when you don't want a NaN to kill the computation, and when necessary handle the case that all three come out false. That's what good numeric programmers actually do. But compare giving a wrong Ordering is an invitation to get it wrong. And yet a lot of generic code is written in terms of compare. Even deriving(Ord) only produces compare and relies on standard definitions for other methods. Don't get me wrong, I don't think the current situation is ideal (although it doesn't seem all that bad to me). But this change would have far-reaching implications for performance which ought to be evaluated before it can be seriously considered, in my opinion. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector recycling
On 18/04/2010, at 08:07, Ben wrote: On Fri, Apr 16, 2010 at 11:19 PM, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: That said, it would be quite possible to provide something like the following: fold_inplace :: Vector v a = (v a - b - v a) - v a - [b] - v a as far as i understand there would be two ways of writing such a function : 1) to use mutable vectors monadically underneath and hide them inside some kind of unsafeX, or 2) to give a specialized fold with sufficient hints to the compiler to use the rewriting framework. Right, I meant 2. I'm not saying it's necessarily a good idea, just that it would be possible. This could use the recycling framework to safely do as much in-place as possible while still preserving a purely functional interface. I have to think about it. Really, this looks like just a poor man's substitute for linear types. although i am supposed to know something about category theory, since my training is in math, i don't know about girard's later work. is there a short precis you can give (or a pointer?) This is a nice introduction: http://homepages.inf.ed.ac.uk/wadler/papers/linear/linear.ps Also, Clean's uniqueness types are quite similar. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector recycling
On 17/04/2010, at 13:32, Ben wrote: module Main where import qualified Data.Vector.Generic as V import qualified Data.Vector.Unboxed as UV type Vec = UV.Vector Double axpy :: Double - Vec - Vec - Vec axpy a x y = V.zipWith (+) (V.map (* a) x) y sumVecs :: [(Double, Vec)] - Vec sumVecs axs = let (a, x) = head axs in foldl adder (V.map (* a) x) (tail axs) where adder :: Vec - (Double, Vec) - Vec adder v1 (a, x) = axpy a x v1 how to i write this in a way which ensures recycling / fusion, e.g. in-place updates? Unfortunately, recycling won't help you here. It is a purely local optimisation which doesn't work across function boundaries (inlining notwithstanding) and recursive calls. Your best bet is to use a mutable vector and do the fold in the ST monad. That said, it would be quite possible to provide something like the following: fold_inplace :: Vector v a = (v a - b - v a) - v a - [b] - v a This could use the recycling framework to safely do as much in-place as possible while still preserving a purely functional interface. I have to think about it. Really, this looks like just a poor man's substitute for linear types. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Strange error with type classes + associated types
On 17/04/2010, at 11:00, Conal Elliott wrote: I'm unsure now, but I think I tried making Basis a data type (not syn) and ran into the problem I mentioned above. The Basis *synonyms* also have HasTrie instances, which is crucially important. If we switch to (injective) data types, then we lose the HasTrie instances. I'd be okay with defining HasTrie instances (preferably via deriving) for the associated Basis data types, but I couldn't figure out how to. Maybe it's not possible currently, or maybe I just didn't know how. Could you perhaps make (:-*) a proper type rather than a synonym? That would help with the ambiguity. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Floyd Warshall performance (again)
On 16/04/2010, at 18:06, Mathieu Boespflug wrote: shortestPath :: [(Int, Int, Int)] - UArray Int Int shortestPath g = runSTUArray $ do let mnew = newArray (0, SIZE * SIZE) 1 mread arr i j = unsafeRead arr (i * SIZE + j) mwrite arr i j x = unsafeWrite arr (i * SIZE + j) x unsafeIOToST $ hSetBuffering stdout LineBuffering unsafeIOToST $ putStrLn Allocating ... pm - mnew unsafeIOToST $ putStrLn Allocating ... done let loop1 SIZE = return () loop1 k = let loop2 SIZE = return () loop2 i = let loop3 SIZE = return () loop3 j = do xij - mread pm i j xik - mread pm i k xkj - mread pm k j mwrite pm i j (min xij (xik + xkj)) loop3 (j + 1) in loop3 0 loop2 (i + 1) in loop2 0 loop1 (k + 1) loop1 0 return pm In general, GHC doesn't like nested loops. You might want to try the following structure: loop1 SIZE = return () loop1 k = loop2 k 0 loop2 k SIZE = loop1 (k+1) loop2 k i = loop3 k i 0 loop3 k i SIZE = loop2 k (i+1) loop3 k i j = do ... loop3 k i (j+1) And, as Max suggested, the llvm backend ought to improve things. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Vector to Monadic Stream and back, how?
On 14/04/2010, at 09:05, Xiao-Yong Jin wrote: I want to use 'mapM' on Data.Vector.Vector, but it looks like the only 'mapM' defined is in Data.Vector.Fusion.Stream.Monadic. I'm able to use 'stream' and 'liftStream' to convert a 'Vector' to a monadic stream, on which I can use 'mapM'. But I couldn't find a way to convert the monadic stream back to Vector without using an intermediate list. I don't think I understand the internal of monadic stream that much. But it looks not so fusion to me. Is it the only way back to Vector? Unfortunately, it's not at all clear to me how to implement mapM on vectors/arrays without going through an intermediate list for arbitrary monads (it's easy for ST and IO). The next version of vector will include mapM and friends but it will probably be implemented much like you describe, with appropriate specialisations for ST and IO. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Strange error with type classes + associated types
On 15/04/2010, at 00:30, Brent Yorgey wrote: On Wed, Apr 14, 2010 at 09:51:52AM +0100, Stephen Tetley wrote: On 14 April 2010 03:48, Brent Yorgey byor...@seas.upenn.edu wrote: Can someone more well-versed in the intricacies of type checking with associated types explain this? Or is this a bug in GHC? If you take the definition of append out out the class - GHCi will give it a type: append (Affine a2 b2) (Affine a1 b1) = Affine (a2 *.* a1) (lapply a2 b1 ^+^ b2) *VectorSpace :t append append :: (Scalar v ~ Scalar v1, Basis v ~ Basis u, Basis v1 ~ Basis v, VectorSpace v1, HasTrie (Basis v), HasBasis v, HasBasis u) = Affine v1 - Affine v - Affine v1 Right, this seems weird to me. Why is there still a 'u' mentioned in the constraints? Actually, I don't even see why there ought to be both v and v1. The type of (*.*) mentions three type variables, u, v, and w: (*.*) :: (HasBasis u, HasTrie (Basis u), HasBasis v, HasTrie (Basis v), VectorSpace w, Scalar v ~ Scalar w) = (v :-* w) - (u :-* v) - u :-* w Note that (:-*) is a type synonym: type :-* u v = MSum (Basis u :-: v) Substituting this into the type of (*.*), we get: (*.*) :: ... = MSum (Basis v :-* w) - MSum (Basis u :-* v) - MSum (Basis u :-* w) Now, Basis is an associated type: class VectorSpace v = HasBasis v where type Basis v ... This means that there is no way to obtain u from Basis u. Since u only ever occurs as an argument to Basis, a type family, it can never be unified with anything. This, in turn, means that there is no way to call (*.*) at all (unless I'm severely mistaken). Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] what are the safety conditions for unsafeIOToST
On 08/04/2010, at 01:38, Henning Thielemann wrote: On Apr 6, 2010, at 5:30 PM, Roman Leshchinskiy wrote: In fact, the only safe-ish use for it I have found is to use Storable-related functions in ST, hoping that the instances don't actually use any real IO functionality. Arguably, this shouldn't be necessary as Storable should live in ST anyway. But Storable in ST monad would be still dangerous, because pointers may point to non-allocated memory or point outside of an array. I don't think that's the kind of safety the original poster had in mind. You can have invalid memory accesses even in pure code but that's ok since we know what the semantics is: bottom. I understood the question to be about the conditions under which unsafeIOToST can violate referential transparency. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] what are the safety conditions for unsafeIOToST
On 07/04/2010, at 07:33, Nicolas Frisby wrote: I haven't been able to find it via Google or Haddock. An old message suggests is was just a matter of exceptions? I don't think that's correct. You can implement unsafePerformIO in terms unsafeIOToST: unsafePerformIO :: IO a - a unsafePerformIO p = runST (unsafeIOToST p) In fact, the only safe-ish use for it I have found is to use Storable-related functions in ST, hoping that the instances don't actually use any real IO functionality. Arguably, this shouldn't be necessary as Storable should live in ST anyway. I only want to use the IO for generating Data.Uniques to pair with STRefs in order to make a map of them. I'm guessing this would be a safe use since it's exception free (... right?). It's hard to tell without looking at your code. But if you are generating Uniques in ST then it's probably unsafe: foo :: () - Unique foo _ = runST (unsafeIOToST newUnique) What's the value of foo ()? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: replicateM over vectors
On 04/04/2010, at 05:33, Chad Scherrer wrote: Roman Leshchinskiy rl at cse.unsw.edu.au writes: Ah. I missed that. Then your best bet is probably replicate n action = munstream v $ Fusion.Stream.Monadic.generateM n (const action) $ new n It's uglier that it should be but vector simply doesn't define the right combinators for this at the moment. I'm having trouble getting this to typecheck. I'll reread your Recycle Your Arrays paper; maybe then it will make more sense. Ugh. I shouldn't write emails while frantically scrambling to make a conference deadline. What I meant is this: replicate n action = do { v - new n; v' - munstream v (generate M n (const action)) } Sorry for the confusion. There are two things one would have to do. First, add a function to Generic.New which initialises a New from a Monadic.Stream and fusion rules for it. That's easy. The hard part is to generalise New to work with arbitrary monads: at the moment it is defined as: data New a = New (forall mv s. MVector mv a = ST s (mv s a)) This is because its basic reason for existence is to be passed to Vector.new which then does a runST to produce an immutable vector. It is perhaps possible to make New more general but it's quite tricky. I'll think about it after the ICFP deadline But the m I'm interested in happens to be ST. Sounds like it's still easy in principle, but not immediate. Is that right? Not really. The big step is getting from the type above to a fixed s so that you can use it in a particular ST computation. It's not just a question of making the types work, either. I also have to convince myself that it is actually safe to do so (in particular, that the rewrite rules in the library can't break things). Getting from there to IO is very easy. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] replicateM over vectors
On 02/04/2010, at 12:16, Don Stewart wrote: Chad.Scherrer: Hi, I'd like to be able to do replicateM, but over a vector instead of a list. Right now I'm doing this: The operation you are looking for is called newWith. It probably should be called replicate. Roman? Can we generate frozen arrays for monadic generators, and still fuse in the current New/Mutable/MStream architecture? For monadic stuff, fusion happens on things of type New. For instance, you could write this (I'm omitting the class contexts and Data.Vector prefixes): replicate :: Int - a - New a replicate n x = Generic.New.unstream (Fusion.Stream.replicate n x) and then either Mutable.run (replicate n x) to get a mutable vector or new (replicate n x) to get an immutable one. You could also chain operations on New, including monadic ones: v - Mutable.run $ Generic.New.transform (Fusion.Stream.Monadic.mapM f) $ replicate n x and this ought to fuse. Note that the New stuff is quite rough and only really intended for internal use at the moment. I wanted to get the normal APIs working properly first. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] replicateM over vectors
On 02/04/2010, at 13:01, Don Stewart wrote: rl: replicate :: Int - a - New a replicate n x = Generic.New.unstream (Fusion.Stream.replicate n x) and then either Mutable.run (replicate n x) to get a mutable vector or new (replicate n x) Hmm, but here 'a' is pure. I don't think he wants newWith :: (PrimMonad m, MVector v a) = Int - a - m (v (PrimState m) a) but more: newWithM :: (PrimMonad m, MVector v a) = Int - m a - m (v (PrimState m) a) Ah. I missed that. Then your best bet is probably replicate n action = munstream v $ Fusion.Stream.Monadic.generateM n (const action) $ new n It's uglier that it should be but vector simply doesn't define the right combinators for this at the moment. to get an immutable one. You could also chain operations on New, including monadic ones: v - Mutable.run $ Generic.New.transform (Fusion.Stream.Monadic.mapM f) $ replicate n x Oh, that's interesting. But what if we want to fill directly with the monadic action? We wouldn't mapM (const a) $ replicate n undefined So how do we best do a fusible, e.g.: replicateM :: G.Vector v a = Int - IO a - IO (v a) There are two things one would have to do. First, add a function to Generic.New which initialises a New from a Monadic.Stream and fusion rules for it. That's easy. The hard part is to generalise New to work with arbitrary monads: at the moment it is defined as: data New a = New (forall mv s. MVector mv a = ST s (mv s a)) This is because its basic reason for existence is to be passed to Vector.new which then does a runST to produce an immutable vector. It is perhaps possible to make New more general but it's quite tricky. I'll think about it after the ICFP deadline :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Data Structures GSoC
On 31/03/2010, at 18:14, Achim Schneider wrote: We have a lot of useful interfaces (e.g. ListLike, Edison), but they don't seem to enjoy wide-spread popularity. Perhaps that's an indication that we need different interfaces? IMO, huge classes which generalise every useful function we can think of just isn't the right approach. We need small interfaces between containers and algorithms. In fact, the situation is perhaps somewhat similar to C++ where by providing exactly that the STL has been able to replace OO-style collection libraries which never really worked all that well. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Shootout update
I'm wondering... Since the DPH libraries are shipped with GHC by default are we allowed to use them for the shootout? Roman On 30/03/2010, at 19:25, Simon Marlow wrote: The shootout (sorry, Computer Language Benchmarks Game) recently updated to GHC 6.12.1, and many of the results got worse. Isaac Gouy has added the +RTS -qg flag to partially fix it, but that turns off the parallel GC completely and we know that in most cases better results can be had by leaving it on. We really need to tune the flags for these benchmarks properly. http://shootout.alioth.debian.org/u64q/haskell.php It may be that we have to back off to +RTS -N3 in some cases to avoid the last-core problem (http://hackage.haskell.org/trac/ghc/ticket/3553), at least until 6.12.2. Any volunteers with a quad-core to take a look at these programs and optimise them for 6.12.1? Cheers, Simon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC vs JHC
On 29/03/2010, at 02:27, Lennart Augustsson wrote: Does anything change if you swap the first two rhss? No, not as far as I can tell. On Sun, Mar 28, 2010 at 1:28 AM, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: On 28/03/2010, at 09:47, Lennart Augustsson wrote: It's important to switch from mod to rem. This can be done by a simple abstract interpretation. Also, changing the definition of rem from a `rem` b | b == 0 = divZeroError | a == minBound b == (-1) = overflowError | otherwise = a `remInt` b to a `rem` b | b == 0 = divZeroError | b == (-1) a == minBound = overflowError | otherwise = a `remInt` b speeds up the GHC version by about 20%. Figuring out why is left as an exercise to the reader :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC
On 28/03/2010, at 01:36, Jan-Willem Maessen wrote: It's worth pointing out that there's a bit of bang-pattern mysticism going on in this conversation (which has not been uncommon of late!). A non-buggy strictness analyzer should expose the strictness of these functions without difficulty. Actually, rangeJ is lazy in i and rangeK is lazy in i and j. GHC does unbox everything important here but that needs more optimisations than just strictness analysis. You are right, though, that GHC doesn't need bang patterns here. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC vs JHC
On 27/03/2010, at 05:27, John Meacham wrote: Here are jhc's timings for the same programs on my machine. gcc and ghc both used -O3 and jhc had its full standard optimizations turned on. jhc: ./hs.out 5.12s user 0.07s system 96% cpu 5.380 total gcc: ./a.out 5.58s user 0.00s system 97% cpu 5.710 total ghc: ./try 31.11s user 0.00s system 96% cpu 32.200 total I really don't understand these GHC numbers. I get about 3s for the C version, about 5s for GHC with rem and about 7.5s for GHC with mod. Is this perhaps on a 64-bit system? What is sizeof(int) in C and sizeOf (undefined :: Int) in Haskell? That said, I suspect the only thing this benchmark really measures is how fast the various compilers can compute i * i + j * j + k * k `mod` 7. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC vs JHC
On 28/03/2010, at 11:07, John Meacham wrote: I have not thoroughly checked it, but I think there are a couple things going on here: It could also be worthwhile to float out (i*i + j*j) in rangeK instead of computing it in every loop iteration. Neither ghc nor gcc can do this; if jhc can then that might explain the performance difference (although I would expect it to be larger in this case). Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC vs JHC
On 28/03/2010, at 09:47, Lennart Augustsson wrote: It's important to switch from mod to rem. This can be done by a simple abstract interpretation. Also, changing the definition of rem from a `rem` b | b == 0 = divZeroError | a == minBound b == (-1) = overflowError | otherwise = a `remInt` b to a `rem` b | b == 0 = divZeroError | b == (-1) a == minBound = overflowError | otherwise = a `remInt` b speeds up the GHC version by about 20%. Figuring out why is left as an exercise to the reader :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Performance question
On 19/03/2010, at 08:48, Daniel Fischer wrote: Am Donnerstag 18 März 2010 21:57:34 schrieb Daniel Fischer: Contrary to my expectations, however, using unboxed arrays is slower than straight arrays (in my tests). However, a few {-# SPECIALISE #-} pragmas set the record straight. This is because without specialising, unsafeAt is a straight (inlineable) function call for boxed arrays but is overloaded and hence much slower for unboxed ones. In general, unboxed arrays tend to be slower in generic code. The only real solution is making functions such as binarySearch INLINE. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector stream fusion, inlining and compilation time
On 06/03/2010, at 03:10, stefan kersten wrote: i'm still curious, though, why my three versions of direct convolution perform so differently (see attached file). in particular, i somehow expected conv_3 to be the slowest and conv_2 to perform similar to conv_1. any ideas? i haven't had a look at the core yet, mainly because i'm lacking the expertise ... Hmm, one problem is that the current definition of reverse is suboptimal to say the least. I'll fix that. Could you perhaps send me your complete benchmark? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: [Haskell] Recursive definition of fibonacci with Data.Vector
On 08/03/2010, at 12:17, Alexander Solla wrote: GHC even optimizes it to: fib = fib Sounds like an implementation bug, not an infinite dimensional vector space bug. My guess is that strictness is getting in the way, and forcing what would be a lazy call to fib in the corresponding list code -- fib = 0 : 1 : (zipWith (+) fib (tail fib)) -- into a strict one. In fact, I'm pretty sure that's what the problem is: data Vector a = Vector {-# UNPACK #-} !Int {-# UNPACK #-} !Int {-# UNPACK #-} !(Array a) The problem is that you have to allocate an Array of a specific length when creating a Vector. Arrays are finite by definition. It's not a bug, it's a feature. Note that in the context of package vector, vector means a 1-dimensional, 0-indexed array. This is not unusual - see, for instance, the standard C++ library. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector stream fusion, inlining and compilation time
On 05/03/2010, at 04:34, stefan kersten wrote: i've been hunting down some performance problems in DSP code using vector and the single most important transformation seems to be throwing in INLINE pragmas for any function that uses vector combinators and is to be called from higher-level code. failing to do so seems to prevent vector operations from being fused and results in big performance hits (the good news is that the optimized code is quite competitive). does anybody have some more info about the do's and don'ts when programming with vector? This is a general problem when working with RULES-based optimisations. Here is an example of what happens: suppose we have foo :: Vector Int - Vector Int foo xs = map (+1) xs Now, GHC will generate a nice tight loop for this but if in a different module, we have something like this: bar xs = foo (foo xs) then this won't fuse because (a) foo won't be inlined and (b) even if GHC did inline here, it would inline the nice tight loop which can't possibly fuse instead of the original map which can. By slapping an INLINE pragma on foo, you're telling GHC to (almost) always inline the function and to use the original definition for inlining, thus giving it a chance to fuse. GHC could be a bit cleverer here (perhaps by noticing that the original definition is small enough to inline and keeping it) but in general, you have to add INLINE pragmas in such cases if you want to be sure your code fuses. A general-purpose mechanism for handling situations like this automatically would be great but we haven't found a good one so far. the downside after adding the INLINE pragmas is that now some of my modules take _really_ long to compile (up to a couple of minutes); any ideas where i can start looking to bring the compilation times down again? Alas, stream fusion (and fusion in general, I guess) requires what I would call whole loop compilation - you need to inline everything into loops. That tends to be slow. I don't know what your code looks like but you could try to control inlining a bit more. For instance, if you have something like this: foo ... = ... map f xs ... where f x = ... you could tell GHC not to inline f until fairly late in the game by adding {-# INLINE [0] f #-} to the where clause. This helps sometimes. i'm compiling with -O2 -funbox-strict-fields instead of -Odph (with ghc 6.10.4 on OSX 10.4), because it's faster for some of my code, but -O2 vs. -Odph doesn't make a noticable difference in compilation time. If you're *really* interested in performance, I would suggest using GHC head. It really is much better for this kind of code (although not necessarily faster wrt to compilation times). This is what -Odph does: -- -Odph is equivalent to -- ---O2 optimise as much as possible ---fno-method-sharing sharing specialisation defeats fusion -- sometimes ---fdicts-cheap always inline dictionaries ---fmax-simplifier-iterations20 this is necessary sometimes ---fsimplifier-phases=3 we use an additional simplifier phase -- for fusion ---fno-spec-constr-thresholdrun SpecConstr even for big loops ---fno-spec-constr-countSpecConstr as much as possible I'm surprised -Odph doesn't produce faster code than -O2. In any case, you could try turning these flags on individually (esp. -fno-method-sharing and the spec-constr flags) to see how they affect performance and compilation times. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 23:28, Dan Doel wrote: On Thursday 11 February 2010 8:54:15 pm Dan Doel wrote: On Thursday 11 February 2010 12:43:10 pm stefan kersten wrote: On 10.02.10 19:03, Bryan O'Sullivan wrote: I'm thinking of switching the statistics library over to using vector. that would be even better of course! an O(0) solution, at least for me ;) let me know if i can be of any help (e.g. in testing). i suppose uvector-algorithms would also need to be ported to vector, then. I could do this. To this end, I've done a preliminary port of the library, such that all the modules compile. I've just used safe operations so far, so it's probably a significant decrease in performance over the 0.2 uvector-algorithms (unless perhaps you turn off the bounds checking flag), but it's a start. It can be gotten with: darcs get http://code.haskell.org/~dolio/vector-algorithms That's great, thanks! FWIW, vector has two kinds of bounds checks: real ones which catch invalid indices supplied by the user (on by default) and internal ones which catch bugs in the library (off by default since the library is, of course, bug-free ;-). I guess you'd eventually want to use the latter but not the former; that's exactly what unsafe operations provide. I only encountered a couple snags during the porting so far: * swap isn't exported from D.V.Generic.Mutable, so I'm using my own. Ah, I'll export it. Also, I gladly accept patches :-) * I use a copy with an offset into the from and to arrays, and with a length (this is necessary for merge sort). However, I only saw a whole array copy (and only with identical sizes) in vector (so I wrote my own again). That's actually a conscious decision. Since vectors support O(1) slicing, you can simply copy a slice of the source vector into a slice of the target vector. * Some kind of thawing of immutable vectors into mutable vectors, or other way to copy the former into the latter would be useful. Right now I'm using unstream . stream, but I'm not sure that's the best way to do it. At the moment, it is (although it ought to be wrapped in a nicer interface). Something like memcpy doesn't work for Data.Vector.Unboxed because the ByteArrays aren't pinned. I don't really want to provide thawing until someone convinces me that it is actually useful. BTW, vector also supports array recycling so you could implement true in-place sorting for fused pipelines. Something like map (+1) . sort . update xs wouldn't allocate any temporary arrays in that case. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 12:39, Don Stewart wrote: bos: I'm thinking of switching the statistics library over to using vector. uvector is pretty bit-rotted in comparison to vector at this point, and it's really seeing no development, while vector is The Shiny Future. Roman, would you call the vector library good enough to use in production at the moment? uvector's not seeing much development, but at least in the last round of benchmarks it was still consistently faster -- since it's been micro-optimized. FWIW, the development version of vector is usually faster the both uvector and dph-prim-seq, at least for the development version of NoSlow. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 12:40, Don Stewart wrote: rl: On 11/02/2010, at 05:03, Bryan O'Sullivan wrote: I'm thinking of switching the statistics library over to using vector. uvector is pretty bit-rotted in comparison to vector at this point, and it's really seeing no development, while vector is The Shiny Future. Roman, would you call the vector library good enough to use in production at the moment? Yes, with the caveat that I haven't really used it in production code (I have tested and benchmarked it, though). BTW, I'll release version 0.5 as soon as get a code.haskell.org account and move the repo there. That's the main problem. I think we could move to vector as a whole, if the suite of testing/ performance/documentation stuff from uvector was ported. Hmm, I'm not sure what you mean here. Mostly thanks to Max Bolingbroke's efforts, vector has a fairly extensive testsuite. I benchmark it a lot (with NoSlow) and haven't found any significant performance problems in a while. As to documentation, there are comments for most of the functions :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 12:54, Dan Doel wrote: I also notice that vector seems to have discarded the idea of Vec (A * B) = Vec A * Vec B Oh no, it hasn't. In contrast to uvector/DPH, which use a custom strict tuple type for rather outdated reasons, vector uses normal tuples. For instance, Data.Vector.Unboxed.Vector (a,b,c) is internally represented as a triple of unboxed vectors of a, b and c. In general, vector supports 4 kinds of arrays at the moment: Data.Vector.Primitive wrappers around ByteArray#, can store primitive types Data.Vector.Unboxed uses type families, can store everything D.V.Primitive can plus tuples and can be extended for user-defined types Data.Vector.Storable wrappers around ForeignPtr, can store Storable things Data.Vector boxed arrays Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 13:49, Don Stewart wrote: rl: On 12/02/2010, at 12:39, Don Stewart wrote: bos: I'm thinking of switching the statistics library over to using vector. uvector is pretty bit-rotted in comparison to vector at this point, and it's really seeing no development, while vector is The Shiny Future. Roman, would you call the vector library good enough to use in production at the moment? uvector's not seeing much development, but at least in the last round of benchmarks it was still consistently faster -- since it's been micro-optimized. FWIW, the development version of vector is usually faster the both uvector and dph-prim-seq, at least for the development version of NoSlow. Ah ha -- that's useful. Public benchmarks soon? In time for the Zurich Hackathon?? (March 20) I've been trying to find the time to put the benchmarks on my blog since the beginning of January but, alas, unsuccessfully so far. In any case, vector and NoSlow currently live in http://www.cse.unsw.edu.au/~rl/code/darcs/vector http://www.cse.unsw.edu.au/~rl/code/darcs/NoSlow If Roman declares the vector to be faster -- my main concern here for flat uarrays -- and makes the repo available so we can work on it, I'd be willing to merge uvector's tests and docs and extra array operations in. It is generally faster than dph-prim-seq. Benchmarking against uvector is a bit difficult because it's missing operations necessary for implementing most of the algorithms in NoSlow (in particular, bulk updates). For the ones that uvector supports, vector tends to be faster. BTW, this is for unsafe operations which don't use bounds checking. Bounds checking can make things a little slower but often doesn't cost anything as long as only collective operations are used. Sometimes it makes things faster which means that the simplifier still gets confused in some situations. There are also some significant differences between 6.12 and the HEAD (the HEAD is much more predictable). In general, I find it hard to believe that the performance differences I'm seeing really matter all that much in real-world programs. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 11/02/2010, at 05:03, Bryan O'Sullivan wrote: I'm thinking of switching the statistics library over to using vector. uvector is pretty bit-rotted in comparison to vector at this point, and it's really seeing no development, while vector is The Shiny Future. Roman, would you call the vector library good enough to use in production at the moment? Yes, with the caveat that I haven't really used it in production code (I have tested and benchmarked it, though). BTW, I'll release version 0.5 as soon as get a code.haskell.org account and move the repo there. Roman On Wed, Feb 10, 2010 at 9:59 AM, stefan kersten s...@k-hornz.de wrote: hi, i've been using the vector [1] library for implementing some signal processing algorithms, but now i'd like to use the statistics [2] package on my data, which is based on the uvector [3] library. is there a (straightforward) way of converting between vectors and uvectors, preferrably O(1)? thanks, sk [1] http://hackage.haskell.org/package/vector [2] http://hackage.haskell.org/package/statistics [3] http://hackage.haskell.org/package/uvector ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Restrictions on associated types for classes
On 18/12/2009, at 00:37, Stephen Lavelle wrote: Given class MyClass k where type AssociatedType k :: * Is there a way of requiring AssociatedType be of class Eq, say? This works with -XFlexibleContexts: class Eq (AssociatedType k) = MyClass k where type AssociatedType k :: * Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Boxed Mutable Arrays
On 15/12/2009, at 06:53, Brad Larsen wrote: On another note, does this (or perhaps better phrased, will this) bug also affect Data Parallel Haskell? Luckily, no. DPH represents arrays of user-defined types by unboxed arrays (that's essentially what the vectoriser does). It does use boxed arrays in a couple of places internally but they are small. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: [Haskell] ANN: NoSlow - Microbenchmarks for array libraries
On 28/11/2009, at 07:45, Henning Thielemann wrote: Is there also a darcs repository? Yes, http://www.cse.unsw.edu.au/~rl/code/darcs/NoSlow. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Status of TypeDirectedNameResolution proposal?
On 18/11/2009, at 21:10, Simon Peyton-Jones wrote: Yes I think it can, although you are right to point out that I said nothing about type inference. One minor thing is that you've misunderstood the proposal a bit. It ONLY springs into action when there's a dot. So you'd have to write bar1 x = x.foo bar2 x = x.foo Yes, that's what I meant to write, silly me. I promise to pay more attention next time. OK so now it works rather like type functions. Suppose, the types with which foo was in scope were foo :: Int - Int foo :: Bool - Char Now imagine that we had a weird kind of type function type instance TDNR_foo Int = Int - Int type instance TDNR_foo Bool = Bool - Char Each 'foo' gives a type instance for TDNR_foo, mapping the type of the first argument to the type of that foo. Hmm... GHC doesn't allow this: type instance TDNR_foo () = forall a. () - a - a IIUC this restriction is necessary to guarantee termination. Given your analogy, wouldn't this proposal run into similar problems? | Another example: suppose we have | | data T a where | TInt :: T Int | TBool :: T Bool | | foo :: T Int - u | foo :: T Bool - u | | bar :: T a - u | bar x = case x of | TInt - foo x | TBool - foo x | | Here, (foo x) calls different functions in the two alternatives, right? To be | honest, that's not something I'd like to see in Haskell. You mean x.foo and x.foo, right? Then yes, certainly. Of course that's already true of type classes: data T a where T1 :: Show a = T a T2 :: Sow a = T a bar :: a - T a - String bar x y = case y of T1 - show x T2 - show x Then I get different show's. How so? Surely you'll get the same Show instance in both cases unless you have conflicting instances in your program? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Status of TypeDirectedNameResolution proposal?
Simon, have you given any thought to how this interacts with type system extensions, in particular with GADTs and type families? The proposal relies on being able to find the type of a term but it's not entirely clear to me what that means. Here is an example: foo :: F Int - Int foo :: Int - Int bar1 :: Int - Int bar1 = foo bar2 :: Int ~ F Int = Int - Int bar2 = foo IIUC, bar1 is ok but bar2 isn't. Do we realy want to have such a strong dependency between name lookup and type inference? Can name lookup be specified properly without also having to specify the entire inference algorithm? Another example: suppose we have data T a where TInt :: T Int TBool :: T Bool foo :: T Int - u foo :: T Bool - u bar :: T a - u bar x = case x of TInt - foo x TBool - foo x Here, (foo x) calls different functions in the two alternatives, right? To be honest, that's not something I'd like to see in Haskell. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] poor perfomance of indexU in uvector package
On 16/11/2009, at 22:46, Alexey Khudyakov wrote: Problems begin when you need non-contiguous block. Easiest way to so is indexing. FWIW, this operation is called backpermute and is probably exported as bpermute in uvector. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re[2]: [Haskell-cafe] What's the deal with Clean?
On 05/11/2009, at 04:01, Bulat Ziganshin wrote: oh, can we stop saying about shootout? if you want to see speed of pure haskell code, look at papers about fast arrays/strings - their authors have measured that lazy lists are hundreds times slower than idiomatic C code. is use of lazy lists counted as mistake too and paper authors had too small haskell experience? In the papers I coauthored, I don't think we measured any such thing. What we measured was that in algorithms that are best implemented with (unboxed) arrays, using boxed lists is going to cost you. That's not a very surprising conclusion and it's by no means specific to Haskell. The problem was/is the lack of nice purely declarative array libraries but that changing, albeit slowly. It's a question of using the right data structure for the algorithm, not a C vs. Haskell thing. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What's the deal with Clean?
On 04/11/2009, at 13:23, Daniel Peebles wrote: In the presence of fusion (as is the case in uvector), it's hard to give meaningful time complexities for operations as they depend on what operations they are paired with. We need to think of a better way to express this behavior in the documentation though. I have to disagree here. Fusion never makes the complexity of operations worse. If it does, it's a bug. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What's the deal with Clean?
On 04/11/2009, at 13:12, brian wrote: indexU :: UA e = UArr e - Int - e O(n). indexU extracts an element out of an immutable unboxed array. This is a typo (unless Don inserted a nop loop into the original DPH code). Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What's the deal with Clean?
On 04/11/2009, at 13:35, wren ng thornton wrote: Roman Leshchinskiy wrote: On 04/11/2009, at 13:23, Daniel Peebles wrote: In the presence of fusion (as is the case in uvector), it's hard to give meaningful time complexities for operations as they depend on what operations they are paired with. We need to think of a better way to express this behavior in the documentation though. I have to disagree here. Fusion never makes the complexity of operations worse. If it does, it's a bug. I think the point was more that the relevant complexity bound can change in the presence of fusion. For a poor example: the first map over a list is O(n) but all subsequent ones in a chain of maps are O(1) with fusion. I'm sure there are better examples than that, but you get the idea. Some people may care to know about that latter complexity rather than just the independent complexity. I think asymptotic complexity is the wrong tool for what you're trying to do. You implement your algorithm using operations with known complexities. This allows you to compute the complexity of the entire algorithm. That's all you can use operation complexities for. The compiler is then free to optimise the algorithm as it sees fit but is supposed to preserve (or improve) its complexity. It is not guaranteed or even supposed to preserve the original operations. To stay with your example, each of the two maps is linear regardless of whether fusion happens. Executing the two maps, be it one after another or interlocked, is linear simply because O(n) + O(n) = O(n), not because of fusion. Essentially, you're trying to use complexity to describe an optimisation which doesn't actually affect the complexity. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What's the deal with Clean?
On 04/11/2009, at 14:07, Gregory Crosswhite wrote: Actually, it's not a typo. If you look at the source, what you'll see is indexU arr n = indexS (streamU arr) n I suspect it gets rewritten back to the O(1) version somewhere after is has had a chance to fuse. If not, then it's a bug. In the vector package, I do this instead, though: indexU arr n = O(1) implemetation {-# RULES indexU/unstreamU forall s n. indexU (unstreamU s) n = indexS s n #-} Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Arrays in Clean and Haskell
On 04/11/2009, at 14:38, Philippos Apolinarius wrote: And here comes the reason for writing this article. In the previous version of the Gauss elimination algorithm, I have imported Data.Array.IO. I also wrote a version of the program that imports Data.Array.ST. The problem is that I don't know how to read an STUArray from a file, process it, and write it back to a file. Why don't you use the IOUArray directly instead of converting it to STUArray and back? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Stream-fusion without the lists
On 12/05/2009, at 14:45, Reiner Pope wrote: The Stream datatype seems to be much better suited to representing loops than the list datatype is. So, instead of programming with the lists, why don't we just use the Stream datatype directly? I think the main reason is that streams don't store data and therefore don't support sharing. That is, in let xs = map f ys in (sum xs, product xs) the elements of xs will be computed once if it is a list but twice if it is a stream. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] bytestring vs. uvector
On 09/03/2009, at 11:47, Claus Reinke wrote: Btw, have any of the Haskell array optimization researchers considered fixpoints yet? This, for instance, is a very nice paper: http://www.pllab.riec.tohoku.ac.jp/~ohori/research/OhoriSasanoPOPL07.pdf However, in the context of high-performance array programming explicit recursion is bad because it is very hard if not impossible to parallelise automatically except in fairly trivial cases. And if your array program is not parallelisable then you don't really care about performance all that much :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe