Re: [Haskell-cafe] Increasing memory use in stream computation
Arie Peterson wrote: (Sorry for the long email.) Summary: why does the attached program have non-constant memory use? Unfortunately, I don't know. I'll intersperse some remarks and propose an alternative to stream fusion at the end, which allows your test program to run in constant space. A simple program When running the program, the resident memory quickly grows to about 3.5 MB (which I am fine with); this stays constant for a long time, but after about 7 minutes, it starts to grow further. The growth is slow, but I really would hope this program to run in constant memory. A quicker way to spot the increased memory usage is to look at GC statistics. I used ./Test +RTS -Sstderr 21 | grep 'Gen: 1' 569904 8192 65488 0.00 0.000.010.0100 (Gen: 1) 516520 9768 67080 0.00 0.004.234.2300 (Gen: 1) 513824 14136 71448 0.00 0.008.438.4400 (Gen: 1) 515856 16728 74040 0.00 0.00 12.70 12.7500 (Gen: 1) 515416 19080 76392 0.00 0.00 17.01 17.1100 (Gen: 1) 515856 22248 79560 0.00 0.00 21.33 21.4800 (Gen: 1) 514936 25080 82392 0.00 0.00 25.65 25.8400 (Gen: 1) 514936 28632 85944 0.00 0.00 29.94 30.1600 (Gen: 1) 513512 32328 89640 0.00 0.00 34.24 34.4800 (Gen: 1) 515224 37032127112 0.00 0.00 38.35 38.6200 (Gen: 1) Note the increasing values in the third column; that's the live bytes after each major GC. The code Note that I added an instance for Monad Stream, using concatMap. This is implicitly used in the definition of the big stream. The source of Data.Stream contains many alternative implementations of concat and concatMap, and alludes to the difficulty of making it fuse properly. Could it be that the fusion did not succeed in this case? I had a glimpse at the core code generated by ghc, but the amount of code is overwhelming. From reading the source code, and as far as my intuition goes, the code *should* run in constant space. As an experiment, I rewrote the code using difference lists, and the result ran in constant memory. I then tried to abstract this idea into a nice data type. (I lost the low-level difference list code on the way, but the code was quite hard to read anyway.) I ended up with an odd mixture of a difference lists and a continuation that should be applied to each element: data Stream a where Stream :: (forall r. (a - r) - [r] - [r]) - Stream a with Stream s representing the list s id []. The motivation for the (a - r) argument is that it makes fmap trivial: fmap f (Stream s) = Stream (\g - s (g . f)) I'll attach the full code below (it's a separate module, Stream.hs that can be imported instead of Data.Stream for your small example.) With that replacement, the code runs in constant space and becomes about 3x faster. Using the 'singleton' function for 'return' results in an additional, but very modest (about 10%) speedup. I wonder whether that approach scales up to your real code. Enjoy, Bertram {-# LANGUAGE GADTs, Rank2Types #-} -- A difference list based implementation of a small part of the -- Data.Stream interface from the stream-fusion package. module Stream where import Prelude hiding (concatMap) import qualified Data.List as List data Stream a where Stream :: { unStream :: forall r. (a - r) - [r] - [r] } - Stream a empty :: Stream a empty = Stream (\_ - id) singleton :: a - Stream a singleton x = Stream (\f - (f x :)) fromList :: [a] - Stream a fromList xs = Stream (\f zs - foldr (\x xs - f x : xs) zs xs) toList :: Stream a - [a] toList (Stream s) = s id [] instance Functor Stream where fmap f (Stream s) = Stream (\g - s (g . f)) concatMap :: (a - Stream b) - Stream a - Stream b concatMap f g = Stream $ \h zs - foldr (\x - unStream (f x) h) zs (toList g) filter :: (a - Bool) - Stream a - Stream a filter p = concatMap (\x - if p x then singleton x else empty) stream :: [a] - Stream a stream = fromList foldl' :: (a - b - a) - a - Stream b - a foldl' f i s = List.foldl' f i (toList s) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Is withAsync absolutely safe?
Roman Cheplyaka wrote: Can withAsync guarantee that its child will be terminated if the thread executing withAsync gets an exception? To remind, here's an implementation of withAsync: withAsyncUsing :: (IO () - IO ThreadId) - IO a - (Async a - IO b) - IO b -- The bracket version works, but is slow. We can do better by -- hand-coding it: withAsyncUsing doFork = \action inner - do var - newEmptyTMVarIO mask $ \restore - do t - doFork $ try (restore action) = atomically . putTMVar var let a = Async t (readTMVar var) r - restore (inner a) `catchAll` \e - do cancel a; throwIO e cancel a return r I am interested in the case when an exception arrives which transfers control to 'cancel', and then another exception arrives to the same thread. Even though 'catchAll' (which is a type-restricted synonym for catch) masks the exception handler, 'throwTo' inside 'cancel' is interruptible (as stated by the documentation). Will this scenario lead to a thread leakage? Yes. I guess that 'cancel' should use 'uninterruptibleMask_', but it's a hard call to make (if an async action becomes unresponsive, do we want to risk not being able to deliver any exceptions to the controlling thread just because it wants to terminate the async action?) Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] I killed performance of my code with Eval and Strategies
Dear Janek, I am reading Simon Marlow's tutorial on parallelism and I have problems with correctly using Eval monad and Strategies. I *thought* I understand them but after writing some code it turns out that obviously I don't because parallelized code is about 20 times slower. Here's a short example (code + criterion benchmarks): Actually, (sin . sqrt) is simply too cheap. The overhead of constructing chunks (which have to be constructed on the heap) and concatenating the results far outweighs the cost of computing the list elements. If, for example, you replace sin . sqrt by f defined by f :: Double - Double f x | x 10 = x*x | otherwise = sin x * f (x-100) the picture will change. The loss also becomes far less dramatic if you construct the chunks outside of the benchmark: main :: IO () main = defaultMain [ bench Seq $ nf (map calculateSeq) xs , bench Par $ nf calculatePar xs ] where xs = chunk 2048 [1..16384] f, f' :: Double - Double f x = sqrt (sin x) f' x | x 10 = x*x | otherwise = sin x * f' (x-100) calculateSeq :: [Double] - [Double] calculateSeq [] = [] calculateSeq (x:xs) = f x : calculateSeq xs calculatePar :: [[Double]] - [[Double]] calculatePar xss = runEval $ parList (rdeepseq . calculateSeq) xss chunk :: Int - [a] - [[a]] chunk _ [] = [] chunk n xs = as : chunk n bs where !(as, bs) = splitAt n xs The parallel version (with f = sqrt . sin) is still somewhat slower than the sequential version with -N1 -- probably due to rdeepseq. Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What does unpacking an MVar really mean?
Leon Smith wrote: I am familiar with the source of Control.Concurrent.MVar, and I do see {-# UNPACK #-}'ed MVars around, for example in GHC's IO manager. What I should have asked is, what does an MVar# look like? This cannot be inferred from Haskell source; though I suppose I could have tried to read the Runtime source. So let's have a brief look at the source. MVar# is an RTS specific heap object which contains three pointers: (from ghc/includes/rts/storage/Closures.h) typedef struct { StgHeaderheader; struct StgMVarTSOQueue_ *head; struct StgMVarTSOQueue_ *tail; StgClosure* value; } StgMVar; The 'value' pointer refers to the actual value held by the mutable variable, if any. The 'head' and 'tail' pointers are used for managing a linked list of threads blocked on the mutable variable. An MVar (if evaluated) contains just a pointer the MVar# object. To access the value of an MVar, one starts with a pointer to the MVar heap object. Then, 1. Make sure that the MVar is evaluated, using standard lazy evaluation (follow indirections, enter thunks, ...). In the best case that's a check of a tag bit in the pointer. 2. Read the pointer to the MVar# in the MVar. 3. access the 'value' field of the StgMVar record, which results in another pointer to a heap object representing the actual data held by the MVar. (In reality the code has to check whether the MVar is full or not, and block if necessary. This is quite involved; see stg_takeMVarzh in ghc/rts/PrimOps.cmm) That's two dereferences and some bookkeeping work. In loops, the compiler will often unpack the MVar, so that you can expect the first two steps to be performed just once. Unpacking an MVar into a bigger record means that the pointer to the MVar# will be stored in the record directly, rather than a pointer to an MVar object that holds a pointer to the MVar#. Note that MVar# itself cannot be unpacked -- the StgMVar record will always be a separate heap object. I was asking the dual question: if the MVar# exists in some data structure, can that data structure still be garbage collected when there is a reference to the MVar#, but not the data structure it is contained within. Yes, because the data structure only contains a pointer to an MVar# (StgMVar record) that will live on. Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] not enough fusion?
Hi, Johannes Waldmann wrote: s2 :: Int - Int s2 n = sum $ do x - [ 0 .. n-1 ] y - [ 0 .. n-1 ] return $ gcd x y This code shows some interesting behaviour: its runtime depends heavily on the allocation area size. For comparison, with main = print $ s1 1 I see the following GC statistics. # ./a.out +RTS -A512k -s 15,201,891,976 bytes allocated in the heap 4,272,144 bytes copied during GC Total time9.97s ( 10.00s elapsed) %GC time 1.1% (1.1% elapsed) For s2, using main = print $ s2 1, I get # ./s2 +RTS -A512k -s 20,801,251,976 bytes allocated in the heap 3,438,870,504 bytes copied during GC Total time 15.90s ( 15.95s elapsed) %GC time 34.3% (34.4% elapsed) # ./s2 +RTS -A1m -s 20,801,251,976 bytes allocated in the heap 2,724,903,032 bytes copied during GC Total time 14.74s ( 14.80s elapsed) %GC time 29.2% (29.3% elapsed) # ./s2 +RTS -A2m -s 20,801,251,976 bytes allocated in the heap 20,311,952 bytes copied during GC Total time 10.74s ( 10.78s elapsed) %GC time 1.2% (1.2% elapsed) # ./a.out +RTS -A2052k -s 20,801,251,976 bytes allocated in the heap 1,812,776 bytes copied during GC Total time 10.35s ( 10.38s elapsed) %GC time 0.8% (0.8% elapsed) Note that the number of bytes copied during GC drops by three orders of magnitude when we increase the allocation area size from 1 MB to 2 MB, and another order of magnitude when adding an additional 4kB to that. Why does this happen? The reason is a failure of generational GC heuristics. If we look at the core, we find that the inner loop (which generates a list that is consumed by sum) translates to something like nums = [0..] go xs0 = case xs0 of [] - [] x : xs - let go' ys0 = case ys0 of [] - [] y : ys - GHC.Real.gcdInt x y : go' ys in case go' nums of [] - go xs (z : zs) - z : zs ++ go xs At a first glance, this looks fine - there's one big chunk of fixed data (the nums list) that will end up in the old generation. The rest of the data is consumed as it is created. However, this is not quite true: The thunk for the second argument to (++) (representing 'go xs') is already allocated on the heap when the first element of the result of (++), i.e., the first element of zs, is consumed. While zs is processed, this thunk is never touched. If it survives a whole GC-mutator cycle, then the next garbage collection will consider the thunk mature and put it into the old generation. But when the code starts evaluating this 'go xs' call, it produces a long list, all of which is being kept alive by the (now updated) thunk in generation 1, and as a result will be copied during GC, until the next major GC. So the observed behaviour hinges on the question whether the go xs thunk can survive processing the zs list. The memory allocated during this time is constant - besides the list, some memory is allocated for thunks in go', and for intermediate Integers[*] in gcdInt. If the allocation area size exceeds this constant, then the program will run as expected. Note that every time a 'go xs' thunk survives, a lot of extra work is caused for the GC -- this explains the sharp increase in bytes copied observed above. Bertram [*] gcdInt really ought to only do a single allocation for the result, with an inlining opportunity to get rid of that as well. It is defined in GHC.Real as gcdInt :: Int - Int - Int gcdInt a b = fromIntegral (gcdInteger (fromIntegral a) (fromIntegral b)) which used to optimize to a single low-level GHC.Integer.Type.gcdInt call in ghc 7.2. With 7.4 and HEAD, integerToInt and smallInteger are no longer inlined, resulting in worse code. See also http://hackage.haskell.org/trac/ghc/ticket/7041 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] mueval leaving behind tmp files
Johannes Waldmann wrote: The following program prints Right (test,Bool,True) as it should, but it leaves behind in /tmp two files (name is a long string of digits) and an empty directory (name is ghcN_N). I have a partial solutions for this; see the attached patch for hint. It cleans up the numbered files, which are generated for phantom modules. There are two phantom modules generated -- one that defines a _show function for hint's internal use, and one that provides a bunch of pre-imported modules for mueval. The latter will not be generated if the 'modules' mueval option is set to Nothing. For some reason, cleaning the ghc* directory fails if the executed code fails (for example, using 'test = undefined' in Johannes' code). Can anybody explain why? The `finally` handler is run, and the phantom module files are actually deleted. ... and it deletes the input file (/tmp/Main.hs). That appears to be an undocumented mueval feature. It's not a good one, I think. That's not nice. Ideally, I would want to read input from a String (instead of the file), and not write to disk at all. I suppose that ghc's interface does not support this, but I have not checked. Best regards, Bertram 1 patch for repository http://darcsden.com/jcpetruzza/hint: Wed Apr 4 14:59:33 CEST 2012 Bertram Felgenhauer in...@gmx.de * clean temporary files in runInterpreterT(withArgs) New patches: [clean temporary files in runInterpreterT(withArgs) Bertram Felgenhauer in...@gmx.de**20120404125933 Ignore-this: ff9abed505645f81131a57182d371861 ] hunk ./src/Hint/Context.hs 10 PhantomModule(..), ModuleText, addPhantomModule, removePhantomModule, getPhantomModules, + cleanPhantomModules, allModulesInContext, onAnEmptyContext, hunk ./src/Hint/Context.hs 254 -- onState (\s -s{qual_imports = quals}) --- | All imported modules are cleared from the context, and --- loaded modules are unloaded. It is similar to a @:load@ in --- GHCi, but observe that not even the Prelude will be in --- context after a reset. -reset :: MonadInterpreter m = m () -reset = +-- | 'cleanPhantomModules' works like 'reset', but skips the +-- loading of the support module that installs '_show'. Its purpose +-- is to clean up all temporary files generated for phantom modules. +cleanPhantomModules :: MonadInterpreter m = m () +cleanPhantomModules = do -- Remove all modules from context runGhc2 Compat.setContext [] [] -- hunk ./src/Hint/Context.hs 280 import_qual_hack_mod = Nothing, qual_imports = []}) liftIO $ mapM_ (removeFile . pm_file) (old_active ++ old_zombie) - -- - -- Now, install a support module - installSupportModule + +-- | All imported modules are cleared from the context, and +-- loaded modules are unloaded. It is similar to a @:load@ in +-- GHCi, but observe that not even the Prelude will be in +-- context after a reset. +reset :: MonadInterpreter m = m () +reset = do -- clean up context + cleanPhantomModules + -- + -- Now, install a support module + installSupportModule -- Load a phantom module with all the symbols from the prelude we need installSupportModule :: MonadInterpreter m = m () hunk ./src/Hint/InterpreterT.hs 164 ifInterpreterNotRunning $ do s - newInterpreterSession `catch` rethrowGhcException -- SH.protectHandlers $ execute s (initialize args action) - execute s (initialize args action) + execute s (initialize args action `finally` cleanSession) where rethrowGhcException = throw . GhcException . showGhcEx #if __GLASGOW_HASKELL__ 610 newInterpreterSession = do s - liftIO $ hunk ./src/Hint/InterpreterT.hs 170 Compat.newSession GHC.Paths.libdir newSessionData s + cleanSession = cleanPhantomModules -- clean ghc session, too? #else -- GHC = 610 newInterpreterSession = newSessionData () hunk ./src/Hint/InterpreterT.hs 174 + cleanSession = + do cleanPhantomModules + runGhc $ do dflags - GHC.getSessionDynFlags + GHC.defaultCleanupHandler dflags (return ()) #endif {-# NOINLINE uniqueToken #-} Context: [bump to version 0.3.3.4 jcpetru...@gmail.com**20111220224039 Ignore-this: 23d55959cc61ebbd20f5ebd4f2a86bd9 ] [authors file updated jcpetru...@gmail.com**20111220224018 Ignore-this: e4f66f8324ac599e74e665a1e2292c12 ] [compile with ghc 7.4 snapshot Mark Wright gie...@gentoo.org**20111220114907 Ignore-this: cc43ccb4e716324ccfbfbb1d38f2668c ] [TAG 0.3.3.3 jcpetru...@gmail.com**2004192050 Ignore-this: f2f8da08437fa759cb41f0f4e35a11a ] Patch bundle hash: 7ee3dacd0f2fd8713494534c000231ed2e85b783 ___ Haskell-Cafe mailing list Haskell-Cafe
[Haskell-cafe] Cabal-1.10.1.0 and bytestring-0.9.2.0 hackage problem.
Dear list, Cabal-1.10.1.0 contains a bug that causes it to fail to parse the test-suite target of bytestring-0.9.2.0. Since cabal-install parses all package descriptions to before resolving dependencies, users with that version of Cabal are stuck. Now it seems somebody realised this problem and removed bytestring-0.9.2.0 from hackage. However, 1. http://hackage.haskell.org/package/bytestring is now broken. 2. The downloadeble package index (00-index.tar) still contains the bytestring-0.9.2.0 cabal file, so the problem persists. As a workaround, one can remove the cabal file from the downloaded index manually, tar -f ~/.cabal/packages/hackage.haskell.org/00-index.tar --delete bytestring/0.9.2.0 (or rebuild Cabal / cabal-install starting with 'cabal unpack Cabal; cabal unpack cabal-install') But it really needs to be fixed on the hackage server. Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] pointer equality
Carl Howells wrote: On Tue, Jul 19, 2011 at 11:14 PM, yi huang yi.codepla...@gmail.com wrote: 2011/7/20 Eugene Kirpichov ekirpic...@gmail.com reallyUnsafePointerEq#, and it really is as unsafe as it sounds :) Why is it so unsafe? i can't find any documentation on it. I think always compare pointer first is a good optimization. False positives and false negatives are both possible, depending on GC timing. At the moment, as implemented in ghc, false positives are not possible, because GC only happens on allocation [*], and there is no allocation happening in that primitive operation. I don't think this is going to change without a total rewrite of ghc, since allowing GC (i.e., moving pointers) at arbitrary times would be a fundamental change to the STG execution model. Pretty much everything else imaginable can happen; in particular, if two variables a and b compared equal at one point, they may later become different pointers again. In the parallel RTS, if you're unlucky, this may even be a permanent effect. Best regards, Bertram [*] we'll have thread-local GC for the first generation soon, but a lot of effort went into ensuring consistentcy of pointers seen by other threads. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] pointer equality
David Barbour wrote: On Wed, Jul 20, 2011 at 10:40 AM, Chris Smith cdsm...@gmail.com wrote: The point, I think, is that if pointer equality testing really does what it says, then there shouldn't *be* any correct implementation in which false positives are possible. It seems the claim is that the garbage collector might be moving things around, have just by chance happened to place the second value in the spot formerly occupied by the first, and have not updated the first pointer yet. But if that's the case, and it's executing arbitrary user code that may refer to that memory, then the garbage collector contains race conditions! You assume that the GC uses the same primitive as the developer, and is inherently subject to its own race conditions. But Bertram has said that false positives are not possible. I can only assume that the pointer comparison is atomic with respect to the GC. That's right. A lot of things that the CMM code (and eventually the machine code) generated by ghc does is atomic with respect to GCs - from a single worker thread's point of view, GCs only happen when it tries to allocate some memory. (Then it does a heap check, and if that fails, saves some state and hands control over to the garbage collector. If the state contains pointers, the GC will know that and adjust them. Finally the state is restored and execution resumes.) Between these points, the code is free to access pointers on the stack and heap and dereference them, without having to worry about GC changing the memory under its nose. The reallyUnsafePointerEquality# primitive is implemented at this low level, and there are no intervening heap checks, and thus no GCs that could interfere with the comparison. Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Splitting Hackage Packages and re-exporting entire modules (with same module name)
Antoine Latter wrote: If you give the module a new name in the new package then the old module can re-export all of the symbols in the new module. In GHC I don't think there is a way for two packages to export the same module and have them be recognized as the same thing, as far as I know. Right, but you don't have to rename the module if you use the PackageImports extension. (Incidentally, this used by the haskell2010 ibrary, which is implemented in terms of base) http://haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#package-imports So the following should work, assuming 'original' is a package exporting the Foo.Bar module. {-# LANGUAGE PackageImports #-} module Foo.Bar (module Original) where import original Foo.Bar as Original Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Generating random graph
Hi Mitar, I have made this function to generate a random graph for Data.Graph.Inductive library: generateGraph :: Int - IO (Gr String Double) generateGraph graphSize = do when (graphSize 1) $ throwIO $ AssertionFailed $ Graph size out of bounds ++ show graphSize let ns = map (\n - (n, show n)) [1..graphSize] es - fmap concat $ forM [1..graphSize] $ \node - do nedges - randomRIO (0, graphSize) others - fmap (filter (node /=) . nub) $ forM [1..nedges] $ \_ - randomRIO (1, graphSize) gen - getStdGen Others have already remarked that you could implement this as a pure function. However, the mistake is the use of getStdGen here, which is (almost?) never what you need: two consecutive valls of getStdGen will return the same generator. You should call newStdGen instead. Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Faster timeout but is it correct?
Hi Bas, The solution is probably to reverse the order of: unsafeUnmask $ forkIO to forkIO $ unsafeUnmask. Or just use forkIOUnmasked. The reason I didn't used that in the first place was that it was much slower for some reason. The reason is probably that in order for the forkIOUnmaske-d thread to receive an exception (which it must, to be killed), it first has to be activated at least once, to perform the unsafeUnmask. So the worker thread's call to killThread will block. Now if there were a way to specify the exception mask of the newly created thread directly, it should be just as fast as the unsafeUnmask . forkIO version. I have not checked the event manager based implementation in detail, but from your numbers it looks like the best option at this time. Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Misleading MVar documentation
Mitar wrote: Hi! On Sat, Dec 25, 2010 at 11:58 AM, Edward Z. Yang ezy...@mit.edu wrote: I think you're right. A further comment is that you don't really need stringent timing conditions (which is the only thing I think of when I hear race) to see another thread grab the mvar underneath you Yes, MVars are (bounded, 1 space long) queues with predictable behavior. Maybe we should change documentation for swapMVar (and others) and replace notion of race condition with that it can block. But this was not the issue this thread was about, namely: If readMVar (to pick one example) is applied to a full MVar with a pending putMVar, then readMVar will block *after* reading the value from the MVar, before being able to put it back. This is surprising, as it means that the value of the MVar can change during the execution of readMVar. (it will be changed back before readMVar completes but then the damage may already be done.) I think atomicity is the right concept here: All these operations consist of two separate steps (and can be interrupted in the middle) and this behaviour is observable if the MVar is not used in a single token mutex fashion, where either the MVar is full or there is exactly one thread owning the token. (Unlike simple mutexes, the MVar token can be labeled and re-labeled by the thread owning it.) Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Are newtypes optimised and how much?
| Then we can define | | safeCoerce :: (a ~~ b) = a - b | safeCoerce = unsafeCoerce Yes, that's right. When I said we have the technology I meant that we (will) have something similar to ~~. See our paper Generative Type Abstraction and Type-level Computation http://www.cis.upenn.edu/~sweirich/newtypes.pdf. No unsafeCoerce required. The idea was to put safeCoerce into a library. The syntax extension would be light-weight because contexts, unlike expressions, still have plenty of room for extensions. The idea is is based on the assumption that to the compiler, 'unsafeCoerce' looks like an artificial safe coercion, so that after inlining safeCoerce, we get exactly the effect of a safe coercion during type checking and further compilation. Perhaps that assumption is wrong. I'll look at the paper. Regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Scrap your rolls/unrolls
Max Bolingbroke wrote: On 23 October 2010 15:32, Sjoerd Visscher sjo...@w3future.com wrote: A little prettier (the cata detour wasn't needed after all): data IdThunk a type instance Force (IdThunk a) = a Yes, this IdThunk is key - in my own implementation I called this Forced, so: type instance Force (Forced a) = a You can generalise this trick to abstract over type functions, though I haven't had a need to do this yet. Let us suppose you had these definitions: ... With type functions, Haskell finally type-level lambda of a sort :-) Indeed. I had a lot of fun with the ideas of this thread, extending the 'Force' type family (which I call 'Eval' below) to a small EDSL on the type level: The EDSL supports constants data Con (t :: *)-- type constant data Con1 (t :: * - *) -- unary type constructor data ConE (t :: * - *) -- like Con1, but the argument is used with Eval and a few operators data App a b -- apply a to b data Fix a -- compute fixpoint of a data a :.: b -- compose two unary type constructors There is a type family Eval that maps expressions from that EDSL to actual types, type family Eval t :: * The basic operations are straight-forward to implement, type instance Eval (Con t) = t type instance Eval (App (Con1 t) a) = t (Eval a) type instance Eval (App (ConE t) a) = t a type instance Eval (App (a :.: b) c) = Eval (App a (App b c)) Now we turn to fixed points. An easy definition would be type instance Eval (Fix f) = Eval (App f (Fix f)) Let's play with that. For that, we defined data TreeF a t = Node a [Eval t] type Tree a = Eval (Fix (ConE (TreeF a))) -- works fine in ghc 7.1 -- ghc 6.12.3 chokes on it for the recursive types deriving instance (Show a, Show (Eval t)) = Show (TreeF a t) And indeed, ghci Node 1 [Node 2 []] :: Tree Int Node 1 [Node 2 []] and a lot of other things work as expected. But what if we want to work with fixed points of the composition of several functors? This works fine: type Tree2 a b = Eval (Fix (ConE (TreeF a) :.: ConE (TreeF b))) t0 :: Tree2 Bool Int t0 = Node True [Node 1 [Node False []]] t1 :: Tree2 Int Bool t1 = Node 1 [Node True [Node 1 [Node False [ but this doesn't: t1 :: Tree2 Int Bool t1 = Node 1 [t0] We can help the type checker out by evaluating fixed points for compositions differently: Instead of applying the whole sequence of functors at once, apply them one by one, and keep the remaining sequence in a nice shape so that the type checker can identify equivalent compositions. The implementation is somewhat verbose, but quite straight-forward tree manipulation. -- easy case first type instance Eval (Fix (ConE f)) = f (Fix (ConE f)) -- handle compositions, phase 1.: find last element. type instance Eval (Fix (a :.: (b :.: c))) = Eval (Fix ((a :.: b) :.: c)) type instance Eval (Fix (a :.: (ConE b))) = Eval (Fix1 a (ConE b)) type instance Eval (Fix (a :.: (Con1 b))) = Eval (Fix1 a (Con1 b)) data Fix1 a b -- compositions, phase 2.: build left-associative composition chain type instance Eval (Fix1 (a :.: (b :.: c)) d) = Eval (Fix1 ((a :.: b) :.: c) d) type instance Eval (Fix1 (a :.: ConE b) c) = Eval (Fix1 a (ConE b :.: c)) type instance Eval (Fix1 (a :.: Con1 b) c) = Eval (Fix1 a (Con1 b :.: c)) -- compositions, final step: apply first element to fixpoint of shifted cycle type instance Eval (Fix1 (ConE a) b) = a (Fix (b :.: ConE a)) type instance Eval (Fix1 (Con1 a) b) = a (Eval (Fix (b :.: Con1 a))) And with that implementation, the above definition of t1 typechecks. Full code with more examples is available at http://int-e.cohomology.org/~bf3/haskell/Fix.hs Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Are newtypes optimised and how much?
Simon Peyton-Jones wrote: What you really want is to say is something like this. Suppose my_tree :: Tree String. Then you'd like to say my_tree ::: Tree Foo meaning please find a way to convert m_tree to type (Tree Foo), using newtype coercions. The exact syntax is a problem (as usual). We have the technology now. The question is how important it is. I think extending the syntax for contexts would be sufficient: Write a ~~ b for a can be converted to b by wrapping / unwrapping newtypes, which is a conservative approximation of a and b have the same representation. Then we can define safeCoerce :: (a ~~ b) = a - b safeCoerce = unsafeCoerce and your example would become safeCoerce my_tree :: Tree Foo The feature would add convenience to the language when working with newtypes, and reduce the tension between type safety and performance (where the choice is between using a newtype and unsafeCoerce, and just working with the plain underlying type.) So while the pressure is quite low, I imagine it would become quite a useful feature once we'd have it, but that's of course speculation. As far as I can see, the feature is nontrivial: Care has to be taken to not break abstractions (like safelycoercing IO to ST), so it's quite possible that the engineering effort outweighs the potential benefits. Best regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Eta-expansion destroys memoization?
Simon Marlow wrote: Interesting. You're absolutely right, GHC doesn't respect the report, on something as basic as sections! The translation we use is (e op) == (op) e once upon a time, when the translation in the report was originally written (before seq was added) this would have been exactly identical to \x - e op x, so the definition in the report was probably used for consistency with left sections. We could make GHC respect the report, but we'd have to use (e op) == let z = e in \x - z op x to retain sharing without relying on full laziness. We should keep in mind that this was changed deliberately in ghc 6.6, in order to support postfix operators. http://www.haskell.org/ghc/docs/6.6/html/users_guide/release-6-6.html The motivating example was the factorial operator which can currently be written as (n !) in ghc-Haskell. Cheers, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] hClose: invalid argument (Invalid or incomplete multibyte or wide character)
Hi, Daniel Fischer wrote: On Tuesday 05 October 2010 23:34:56, Johannes Waldmann wrote: main = writeFile check.out ü that's u-umlaut, and the source file is utf-8-encoded and ghc-6.12.3 compiles it without problems but when running, I get hClose: invalid argument (Invalid or incomplete multibyte or wide character) In order to make the behaviour independent of the locale (which is desirable for programs storing state in text files), you can use functions like writeFileUTF8 and readFileUTF8 here: import System.IO writeFileUTF8 file text = withFile file WriteMode $ \handle - do hSetEncoding handle utf8 hPutStr handle text readFileUTF8 file = do handle - openFile file ReadMode hSetEncoding handle utf8 hGetContents handle main = do let s = äöü writeFileUTF8 test.out s s' - readFileUTF8 test.out putStrLn $ unwords [s, ==, s'] Of course using System.IO.UTF8 from utf8-string would also work. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Why isn't there a cheaper split-in-two operation for Data.Set?
Ryan Newton wrote: Would there be anything wrong with a Data.Set simply chopping off half its (balanced) tree and returning two approximately balanced partitions ... cleave :: Set a - (Set a, Set a) cleave Tip = (Tip, Tip) cleave (Bin _ x l r) | size l size r = (l, insertMin x r) | otherwise = (insertMax x l, r) This function would expose some of the internal structure of Set - i.e. there could be equal sets s1 == s2 with cleave s1 /= cleave s2. Maybe a better idea than to expose such a function would be to split Data.Set into Data.Set.Internal and Data.Set, where Data.Set.Internal would export the actual Tip and Bin constructors. Then people who want to break the abstraction, for example to experiment with parallel folds, could do that easily. regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: base-3 -gt; base-4
Johannes Waldmann wrote: Ivan Lazar Miljenovic ivan.miljenovic at gmail.com writes: ... the only thing that changed of significance was the exception handling: Control.Exception now uses extensible exceptions base-4 also introduced the Control.Category.Category class and restructured Control.Arrow to use that. I'm pretty sure ghc-6.12.3 gives warnings this will not work with base-4 in some more places. I was hoping there is a comprehensive list somewhere. There's http://www.haskell.org/haskellwiki/Upgrading_packages but it's not up-to-date, probably because updating to ghc-6.12 was comparatively painless. And of course there are the ghc-6.10.1 release notes which also cover the changes in base-4: http://www.haskell.org/ghc/docs/6.10.1/html/users_guide/release-6-10-1.html Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Fast Integer Input
Serguey Zefirov wrote: 2010/8/23 200901...@daiict.ac.in: This function takes 1.8 seconds to convert 2000 integers of length 10^13000. I need it to be smaller that 0.5 sec. Is it possible? 2000 integers of magnitude 10^13000 equals to about 26 MBytes of data (2000 numbers each 13000 digits long). Rounding 1.8 seconds to two seconds we conclude that you proceed with speed about 13MBytes per second. Assuming you have CPU clock frequency near 2.6GHz, you performing about 200 clock cycles per input digit. 10^13000 roughly equal to 2^39000. Or (2^32)^1219 - 1219 32-bit words of representation. So you're doing some last nextN = (n*10)+currentDigit conversion operations in less that one clock cycle per word. You can do better than calculating (n*10 + d) repeatedly, using a divide and conquer scheme, which is in fact implemented in ByteString's readInteger: ((a*10 + b) * 100 + (c*10 + d)) * 1 + ... This helps because now we're multiplying numbers of roughly equal size, which using FFT and related methods, as gmp uses, can be sped up immensely, beating the quadratic complexity that you get with the naive approach. The timings seem about right. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space leak with unsafePerformIO
Henning Thielemann wrote: Attached is a program with a space leak that I do not understand. I have coded a simple 'map' function, once using unsafePerformIO and once without. UnsafePerformIO has a space leak in some circumstances. In the main program I demonstrate cases with and without space leak. Without space leak the program writes a file to the disk until it is full. Any idea? The program relies on the GC doing short-cut evaluation of record selectors to avoid a space leak. If the user of the function splitAtLazy | splitAtLazy :: [b] - [a] - ([a],[a]) | splitAtLazy nt xt = |(\ ~(ys,zs) - (ys,zs)) $ |case (nt,xt) of | (_:ns, x:xs) - | let (ys,zs) = splitAtLazy ns xs | in (x:ys,zs) | (_, xs) - ([],xs) somehow holds on to a reference of the returned pair while processing the first part of the list, there will be a space leak, because that means that the whole prefix remains reachable. splitAtLazy itself is not leaky, because the value returned by the recursive call is scrutinized as follows, Main.$wsplitAtLazy = ... (# case ds_sLb of wild_B1 { (ys_agC, zs_agE) - ys_agC }, case ds_sLb of wild_B1 { (ys_agC, zs_agE) - zs_agE } #) and ghc turns that into record selector thunks in the code generator. The precise rule can be found in compiler/codeGen/StgCmmBind.hs: | Note [Selectors] | ~~~ | We look at the body of the closure to see if it's a selector---turgid, | but nothing deep. We are looking for a closure of {\em exactly} the | form: | | ... = [the_fv] \ u [] - | case the_fv of |con a_1 ... a_n - a_i Now let's look at how the result of splitAtLazy is used. non-leaky version (case 0): Main.lvl1 = case Main.$wsplitAtLazy @ () @ GHC.Types.Char Main.xs Main.xs1 of ww_sLv { (# ww1_sLx, ww2_sLy #) - Main.go (GHC.Base.++ @ GHC.Types.Char ww1_sLx ww2_sLy) } The return values are passed on to (++) directly. The result pair is actually never built at all, so no reference to it can be kept. leaky version (case 3): Main.ds = case Main.$wsplitAtLazy @ () @ GHC.Types.Char Main.xs Main.xs1 of ww_sKy { (# ww1_sKA, ww2_sKB #) - (ww1_sKA, ww2_sKB) } This builds the pair returned by splitAtLazy. Main.lvl1 = case Main.ds of wild_Xw { (prefix_aCf, suffix_aCh) - prefix_aCf } Use of prefix: it's a record selector. This is fine. Main.lvl2 = Main.go Main.lvl1 The prefix is then passed to some worker function. Main.lvl3 = case Main.ds of wild_Xw { (prefix_aCf, suffix_aCh) - Main.go1 suffix_aCh } Use of suffix: Due to the call of Main.go1 this is *not* a record selector. It is compiled to an actual case expression, which to the garbage collector looks just like an ordinary thunk. A reference to Main.ds is kept around until the suffix is about to be processed and a memory leak ensues. If the compiler had produced Main.lvl3 = case Main.ds of wild_Xw { (prefix_aCf, suffix_aCh) - suffix_aCh } Main.lvl4 = Main.go1 Main.lvl3 instead, then there would not be a leak. This whole record selector thunk business is very fragile. The good news is that the problem is completely unrelated to unsafePerformIO (the presence of unsafePerformIO makes optimisations more difficult, but any pure function of sufficient complexity would have the same effect). There's a simple fix for the problem, too: Change let (prefix, suffix) = makeTwoLists 'a' to let !(prefix, suffix) = makeTwoLists 'a' in which case the compiler produces code similar to the non-leaky case for all alternatives. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Huffman Codes in Haskell
Andrew Bromage wrote: But honestly, it's just not that hard to do in linear time, assuming the symbols are sorted by frequency: Or maybe not so easy. But not much harder. data Tree a = Branch (Tree a) (Tree a) | Leaf a deriving Show huffmanTree :: (Ord a, Num a) = [(a, b)] - Tree b huffmanTree [] = error huffmanTree: empty code huffmanTree xs = let xs' = sortBy (comparing fst) [(a, Leaf b) | (a, b) - xs] branches ((a, l) : (b, r) : ts) = (a+b, Branch l r) : branches ts merged = merge (-1 :: Int) xs' (branches merged) merge n [] ys = take n ys merge n (x:xs) ys | n = 0 = x : merge (n+1) xs ys merge n (x:xs) (y:ys) = case comparing fst x y of GT - y : merge (n-1) (x:xs) ys _ - x : merge (n+1) xs (y:ys) in snd $ last merged regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] C variable access via FFI
Tom Hawkins wrote: I have a bunch of global variables in C I would like to directly read and write from Haskell. Is this possible with FFI, Yes it is, as explained in section 4.1.1. in the FFI specification [1]. An import for a global variable int bar would look like this: foreign import ccall bar bar :: Ptr CInt The difference to an import of a function int foo() is the extra . HTH, Bertram [1] http://www.cse.unsw.edu.au/~chak/haskell/ffi/ffi/ffise4.html#x7-170004.1.1 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC, odd concurrency space leak
Bulat Ziganshin wrote: This expands as always a = a always a = a a always a = a a a always a ... where each application is represented by a newly allocated object (or several, I have not looked at it in detail) on the heap. why you think so? At the time I wrote this, because it explains the space leak and because the space leak disappears if I address this precise issue. But I've since verified the theory by inspecting Core and Cmm code. i always thought that in ghc just sequentially executes statements, the RealWorld magic exists only at compile-time Yes, that's what happens once () gets actually executed in IO. But this fact and the RealWorld token have nothing to do with the whole issue, which is about accumulating a chain of IO actions that have not yet been executed. I'll continue to write a b, which in IO, modulo newtypes, stands for \(s :: RealWorld#) - case a s of (s', _) - b s' The fact that the state token disappears at runtime does not change that this is a closure, represented by a (function) heap node. So we have some IO action let x = always a Now we run x, but also hold onto the corresponding thunk to reuse it later, say let x = always a in x x In order to execute that, x is forced, and evaluated to let x = let x' = always a in a x' in x x or, equivalently, let x' = always a x = a x' in x x Then the first step of the IO action is performed, resulting in let x' = always a x = a x' in x' x And now the same reduction happens again for x', let x2 = always a x' = a x2 x = a x' in x2 x and then again for x2, let x3 = always a x2 = a x3 x' = a x2 x = a x' in x2 x and so on, ad infinitum. This leaks memory because x, x', x2 etc. can't be garbage collected - there's still a reference to x. Note that this also explains why the space leak disappears if we remove the 'forever' in the spawner thread in the original example. This would not happen if the 'always a' was reused, i.e. if the code tied a knot as let act = a act in act does, but as you can see in the Core (and even Cmm if you look closely enough) that does not happen in those cases where the code leaks memory. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC, odd concurrency space leak
Daniel Fischer wrote: Am Samstag 17 April 2010 14:41:28 schrieb Simon Peyton-Jones: I have not been following the details of this, I'm afraid, but I notice this: forever' m = do _ - m forever' m When I define that version of forever, the space leak goes away. What was the old version of forever that led to the leak? Control.Monad.forever forever :: Monad m = m a - m b forever m = m forever m However, that isn't the problem. In my tests, both variants of forever exhibit the same behaviour, what makes it leak or not is the optimisation level. This definition, plus sharing, is the source of the space leak. Consider this modification of your code: import Control.Concurrent always :: Monad m = m a - m b always a = -- let act = a act in act do _ - a always a noop :: IO () noop = return () body :: IO () body = always noop spawner :: IO () spawner = do forkIO $ body putStrLn Delaying threadDelay 100 body `seq` return () main :: IO () main = do putStrLn Spawning forkIO spawner putStrLn Delaying main threadDelay 400 Note that the 'always' in 'spawner' is gone, but it still exhibits the space leak. The leak goes away if the final line of 'spawner' is removed, hinting at the real problem: 'always' actually creates a long chain of actions instead of tying the knot. Indeed the following definition of 'always' (or 'forever') fares better in that regard, but is more susceptible to producing unproductive loops: always a = let act = a act in act (I used noop = yield for avoiding that problem in my tests) regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC, odd concurrency space leak
Bulat Ziganshin wrote: Hello Bertram, Sunday, April 18, 2010, 12:11:05 AM, you wrote: always a = -- let act = a act in act do _ - a always a hinting at the real problem: 'always' actually creates a long chain of actions instead of tying the knot. can you explain it deeper? it's what i see: always definition is equivalent to always a = do a always a what's the same as always a = a always a This expands as always a = a always a = a a always a = a a a always a ... where each application is represented by a newly allocated object (or several, I have not looked at it in detail) on the heap. With always a = let act = a act in act there's only one application being allocated. The principle is the same as with repeat x = x : repeat x versus repeat x = let xs = x : xs in xs HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC, odd concurrency space leak
Daniel Fischer wrote: Except that with optimisations turned on, GHC ties the knot for you (at least if always isn't exported). Without -fno-state-hack, the knot is tied so tightly that always (return ()) is never descheduled (and there's no leak). Yes, I was concentrating on -O2, without -fno-state-hack. With -fno-state-hack, I get Rec { Main.main_always :: GHC.Types.IO () - GHC.Types.IO () GblId [Arity 1 NoCafRefs Str: DmdType L] Main.main_always = \ (a_aeO :: GHC.Types.IO ()) - let { k_sYz :: GHC.Types.IO () LclId [Str: DmdType] k_sYz = Main.main_always a_aeO } in (\ (eta_ann :: GHC.Prim.State# GHC.Prim.RealWorld) - case (a_aeO `cast` (GHC.Types.NTCo:IO () :: GHC.Types.IO () ~ (GHC.Prim.State# GHC.Prim.RealWorld - (# GHC.Prim.State# GHC.Prim.RealWorld, () # eta_ann of _ { (# new_s_anz, _ #) - (k_sYz `cast` (GHC.Types.NTCo:IO () :: GHC.Types.IO () ~ (GHC.Prim.State# GHC.Prim.RealWorld - (# GHC.Prim.State# GHC.Prim.RealWorld, () # new_s_anz }) `cast` (sym (GHC.Types.NTCo:IO ()) :: (GHC.Prim.State# GHC.Prim.RealWorld - (# GHC.Prim.State# GHC.Prim.RealWorld, () #)) ~ GHC.Types.IO ()) end Rec } Which is always = \a_aeO - let k_sYz = always a_aeO in a_aeO k_sYz specialised to IO, and with () inlined. Where is the knot? regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Fwd: [Haskell-cafe] Re: Simple game: a monad for each player
Limestraël wrote: Okay, I just understood that 'Prompt' was just a sort of view for 'Program'. Right. runMyStackT :: MyStackT (Player m) a - Player m a According to what Bertram said, each strategy can pile its own custom monad stack ON the (Player m) monad. Yes, and I meant what Heinrich wrote, you wrap some transformer around the common Player m monad. game :: Monad m = Player m () - Player m () - m () As it is written, it requires both players to run in the SAME monad. And if have a network player ( e.g.* Player (StateT Handle IO)* ) and an AI storing former opponent's moves ( e.g. *(Monad m) = Player (StateT [Move] m)* ), then they can't be in the same monad... The idea is to pick m = IO, and then use type NetPlayer a = StateT Handle (Player IO) a and type AIPlayer a = StateT [Move] (Player IO) a or possibly type AIPlayer a = StateT [Move] (Player Identity) a using the mapPlayerM (or mapMonad as suggested by Heinrich) function. You'd then provide functions like runAIPlayer :: AIPlayer a - Player IO a runAIPlayer player = {- mapMonad (return . runIdentity) $ -} evalStateT player [] This gives you most of what you want: You can add custom state and the like to each player. You can not hope to exchange the base monad m, because then the 'game' function would have to know how to run both of those base monads simultaneously. A function like mapMonad is the best device you can hope for, I think. regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Simple game: a monad for each player
Yves Parès wrote: I answered my own question by reading this monad-prompt example: http://paste.lisp.org/display/53766 But one issue remains: those examples show how to make play EITHER a human or an AI. I don't see how to make a human player and an AI play SEQUENTIALLY (to a TicTacToe, for instance). A useful idea is to turn the construction upside-down - rather than implementing the game logic using MonadPrompt (or operational), implement the players in such a monad. A sketch: {-# LANGUAGE GADTs, EmptyDataDecls #-} import Control.Monad.Prompt hiding (Lift) data Game -- game state data Move -- move data Request m a where Board:: Request m Game MakeMove :: Move - Request m () Lift :: m a - Request m a type Player m a = Prompt (Request m) a The core game logic would be provided by functions initGame :: Monad m = m Game initGame = undefined makeMove :: Monad m = Move - Game - m Game makeMove = undefined To run a game we need to mediate between the two players, performing their moves. To make this easier we turn the Player's program into a list of actions. (This is essentially the Prompt type of the operational package.) data Program p a where Return :: a - Program p a Then :: p b - (b - Program p a) - Program p a programView :: Prompt p a - Program p a programView = runPromptC Return Then game :: Monad m = Player m () - Player m () - m () game first second = do g - initGame let first' = programView first second' = programView second go :: Monad m = Game -- current state - Program (Request m) () -- player 1 - Program (Request m) () -- player 2 - m () go g (Return _) pl2 = return () go g (Then (Lift l) pl1) pl2 = l = \a - go g (pl1 a) pl2 go g (Then Board pl1) pl2 = go g (pl1 g) pl2 go g (Then (MakeMove mv) pl1) pl2 = makeMove mv g = \g - go g pl2 (pl1 ()) go g first' second' Note that MakeMove swaps the two players. What have we achieved? Both players still can only access functions from whatever monad m turns out to be. But now each strategy can pile its own custom monad stack on the Player m monad! And of course, the use of the m Monad is completely optional. Mapping between various 'm' monads may also be useful: mapPlayerM :: forall m1 m2 a . (forall a . m1 a - m2 a) - Player m1 a - Player m2 a mapPlayerM m1m2 pl = runPromptC return handle pl where handle :: Request m1 x - (x - Player m2 a) - Player m2 a handle (Lift a) x = prompt (Lift (m1m2 a)) = x handle (MakeMove mv) x = prompt (MakeMove mv) = x handle (Board) x = prompt (Board) = x This could be used to lock out the AI player from using IO, say. HTH, Bertram (aka int-e) P.S. this is what 'game' would look like without the intermediate 'Program' type, using bare continuations instead: newtype Pl m = Pl { runPl :: Pl m -- other player - Game -- current game state - m () } gameC :: Monad m = Player m () - Player m () - m () gameC first second = do g - initGame let pl1 = runPromptC ret handle first pl2 = runPromptC ret handle second ret _ = Pl $ \_ _ - return () handle :: Monad m = Request m a - (a - Pl m) - Pl m handle (Lift l) pl1 = Pl $ \pl2 g - l = \a - runPl (pl1 a) pl2 g handle Board pl1 = Pl $ \pl2 g - runPl (pl1 g) pl2 g handle (MakeMove mv) pl1 = Pl $ \pl2 g - runPl pl2 (pl1 ()) = makeMove mv g runPl pl1 pl2 = initGame ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Asynchronous exception wormholes kill modularity
Simon Marlow wrote: but they are needlessly complicated, in my opinion. This offers the same functionality: mask :: ((IO a - IO a) - IO b) - IO b mask io = do b - blocked if b then io id else block $ io unblock How does forkIO fit into the picture? That's one point where reasonable code may want to unblock all exceptions unconditionally - for example to allow the thread to be killed later. timeout t io = block $ do result - newEmptyMVar tid - forkIO $ unblock (io = putMVar result) threadDelay t `onException` killThread tid killThread tid tryTakeMVar result regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Asynchronous exception wormholes kill modularity
Simon Marlow wrote: On 09/04/2010 09:40, Bertram Felgenhauer wrote: Simon Marlow wrote: mask :: ((IO a - IO a) - IO b) - IO b How does forkIO fit into the picture? That's one point where reasonable code may want to unblock all exceptions unconditionally - for example to allow the thread to be killed later. Sure, and it works exactly as before in that the new thread inherits the masking state of its parent thread. To unmask exceptions in the child thread you need to use the restore operator passed to the argument of mask. This does mean that if you fork a thread inside mask and don't pass it the restore operation, then it has no way to ever unmask exceptions. At worst, this means you have to pass a restore value around where you didn't previously. timeout t io = block $ do result - newEmptyMVar tid - forkIO $ unblock (io = putMVar result) threadDelay t `onException` killThread tid killThread tid tryTakeMVar result This would be written timeout t io = mask $ \restore - do result - newEmptyMVar tid - forkIO $ restore (io = putMVar result) threadDelay t `onException` killThread tid killThread tid tryTakeMVar result I'm worried about the case when this function is called with exceptions already blocked. Then 'restore' will be the identity, and exceptions will continue to be blocked inside the forked thread. You could argue that this is the responsibility of the whole chain of callers (who'd have to supply their own 'restore' functions that will have to be incorporated into the 'io' action), but that goes against modularity. In my opinion there's a valid demand for an escape hatch out of the blocked exception state for newly forked threads. It could be baked into a variant of the forkIO primitive, say forkIOwithUnblock :: ((IO a - IO a) - IO b) - IO ThreadId Kind regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: breadth first search one-liner?
Ross Paterson wrote: On Mon, Mar 22, 2010 at 10:30:32AM +, Johannes Waldmann wrote: Nice! - Where's the 'nub'? A bit longer: bfs :: Eq a = (a - [a]) - a - [a] bfs f s = concat $ takeWhile (not . null) $ map snd $ iterate step ([], [s]) where step (seen, xs) = let seen' = xs++seen in (seen', nub $ [y | x - xs, y - f x, notElem y seen']) Basically the same idea: bfs next start = let go _ [] = [] go xs ys = let zs = nub (ys = next) \\ xs in ys ++ go (zs ++ xs) zs in go [start] [start] A slightly different approach is to add stage markers to the produced streams, say bfs next start = let xs = nub $ Left 0 : Right s : (xs = next') next' (Left n) = [Left (n + 1)] next' (Right s) = map Right (next s) stop (Left _ : Left _ : _) = [] stop (Left x : xs) = stop xs stop (Right x : xs) = x : stop xs in stop xs or bfs next start = lefts . takeWhile (not . null) . unfoldr (Just . span (either (const False) (const True)) . tail) $ fix (nub . (Left 0 :) . (Right start :) . (= either ((:[]) . Left . succ) (map Right . next))) This has the advantage that nub can be used directly. But it's far from beautiful. regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: breadth first search one-liner?
Bertram Felgenhauer wrote: or bfs next start = lefts . takeWhile (not . null) I copied the wrong version. This should be bfs next start = rights . concat . takeWhile (not . null) -- rest unchanged . unfoldr (Just . span (either (const False) (const True)) . tail) $ fix (nub . (Left 0 :) . (Right start :) . (= either ((:[]) . Left . succ) (map Right . next))) Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] parallel matrix multiply (dph, par/pseq)
Johannes Waldmann wrote: Hello. How can I multiply matrices (of Doubles) with dph (-0.4.0)? (ghc-6.12.1) - I was trying type Vector = [:Double:] type Matrix = [:Vector:] times :: Matrix - Matrix - Matrix times a b = mapP ( \ row - mapP ( \ col - sumP ( zipWithP (*) row col ) ) ( transposeP b ) ) a but there is no such thing as transposeP. It's possible to implement transposeP as follows, {-# LANGUAGE PArr #-} ... import qualified Data.Array.Parallel.Prelude.Int as I transposeP :: Matrix - Matrix transposeP a = let h = lengthP a w = lengthP (a !: 0) rh = I.enumFromToP 0 (h I.- 1) -- or [: 0 .. h I.- 1 :] rw = I.enumFromToP 0 (w I.- 1) -- or [: 0 .. w I.- 1 :] in if h == 0 then [: :] else mapP (\y - mapP (\x - a !: x !: y) rh) rw Maybe there is a better way? Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Are there standard idioms for lazy, pure error handling?
Duncan Coutts wrote: Another approach that some people have advocated as a general purpose solution is to use: data Exceptional e a = Exceptional { exception :: Maybe e result:: a } However it's pretty clear from the structure of this type that it cannot cope with lazy error handling in sequences. If you try it you'll find you cannot do it without space leaks. It's not all that clear. Consider this toy example (from a private discussion with Henning Thielemann a while ago), which runs in constant space: import System.IO.Unsafe import System.Environment import Control.Monad import Data.List data Exceptional e a = Exceptional { exception :: Maybe e, result :: a } ok a = Exceptional Nothing a fault e a = Exceptional (Just e) a faulty :: Int - IO (Exceptional Int [Int]) faulty 0 = return (fault 0 []) faulty 1 = return (ok []) faulty n = unsafeInterleaveIO $ do -- getChar r - faulty (n-2) return $ Exceptional (exception r) (n : result r) main = do n - readIO . head = getArgs Exceptional exc res - faulty n print $ last res when (n `mod` 3 == 0) $ print exc This works because ghc's garbage collector evaluates record selectors. (There are a simpler cases where this matters, for example last $ fst $ unzip [(a,a) | a - [1..1]] which also runs in constant space.) The approach is very fragile, though. For example, if we change main to main = do n - readIO . head = getArgs f - faulty n print $ last (result f) when (n `mod` 3 == 0) $ print (exception f) then the space leak reoccurs - doing the pattern match on the Excpeptional constructor before using the result is essential. Bad things also happen if ghc's optimiser turns the record selectors into explicit pattern matches in the worker ('faulty' in the example). Kind regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Fair diagonals
Martijn van Steenbergen wrote: Bonus points for the following: * An infinite number of singleton axes produces [origin] (and finishes computing), e.g. forall (infinite) xs. diagN (map (:[]) xs) == map (:[]) xs This can't be done - you can not produce any output before you have checked that all the lists are not empty: diag (replicate n [0] ++ [[]]) == [] Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] \Statically checked binomail heaps?
Maciej Kotowicz wrote: I'm trying to implement a binomial heaps from okaski's book [1] but as most it's possible to be statically checked for correctness of definition. How about this encoding in Haskell 98? data Tree a t = Tree { root :: a, children :: t } data Nest a t = Nest { head :: Tree a t, tail :: t } where - Tree a () is a binomial tree of order 0 - Tree a (Nest a ()) is a binomial tree of order 1 - Tree a (Nest a (Nest a ())) is a binomial tree of order 2. and so on data Heap' a t = Z' | D0' (Heap' a (Nest a t)) | D1' (Heap' a (Nest a t)) (Tree a t) With (Tree a t) representing a binomial tree of rank n, - Heap' a t represents the list of binomial trees of rank = n in a heap - Z' represents an empty list - D0' xs represents a list /without/ a tree of rank n - D1' xs x represents a list /with/ a tree of rank n, namely x. Finally, define type Heap a = Heap' a () This forces Heap to be a well-shaped binomial heap, up to the equality D0' Z' = Z'. The main difference to standard binomial heaps is the existence of the D0' nodes which represent skipped ranks. This makes the type checking much easier (no comparison of natural numbers is required). It also makes rank calculations unecessary - the rank increases implicitely as the heap is traversed. Heap order can be maintained by using a smart constructor: -- combine two binomial trees of rank n into one of rank n+1 mkTree' :: Ord a = Tree a t - Tree a t - Tree a (Nest a t) mkTree' l@(Tree a x) r@(Tree b y) | a b = Tree a (Nest r x) | True = Tree b (Nest l y) Here is insert, as an example of a non-trivial operation: insert' :: Ord a = Heap' a t - Tree a t - Heap' a t insert' Z'b = D1' Z' b insert' (D0' x) b = D1' x b insert' (D1' x a) b = D0' (x `insert'` mkTree' a b) HTH, Bertram P.S. As a refinement, one can define data Heap a = Z | D0 | D1 (Heap' a a) a This improves memory efficiency at the cost of requiring more code, like mkTree :: Ord a = a - a - Tree a a mkTree a b | a b = Tree a b | True = Tree b a insert :: Ord a = Heap a - a - Heap a insert Zb = D1 Z' b insert (D0 x) b = D1 x b insert (D1 x a) b = D0 (x `insert'` mkTree a b) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Haskell Platform - changing the global install dir
Paul Moore wrote: grep global -A7 D:\Documents and Settings\uk03306\Application Data\cabal\config install-dirs global -- prefix: D:\\Apps\\Haskell\\Cabal ^^^ You should remove the '-- '. Lines beginning with '--' are comments. So this line has no effect. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Quadratic complexity though use of STArrays
Dan Rosén wrote: What complexity does these functions have? I argue that the shuffleArr function should be O(n), since it only contains one loop of n, where each loop does actions that are O(1): generating a random number and swapping two elements in an array. However, they both have the same runnig time (roughly), and through looking at the plot it _very_ much looks quadratic. Right. In the case of shuffleArr, it's the garbage collection time that explodes. I think that the reason is that the garbage collector has to scan the whole array (which lives in the old generation) on each garbage collection, while the young generation is kept very small, making GCs very frequent. Indeed the program behaves much better with just one GC generation: ./a.out +RTS -G1 (100, 1988698, 1760732, 1.98870,1.76073) (150, 2991546, 2683592, 1.99436,1.78906) (200, 4018388, 3611451, 2.00919,1.80573) Output format: (n, time for your original shuffleArray (t1), time for Daniel's stricter version (t2), t1/n, t2/n) While with the default settings, it looks much worse: ./a.out (100, 7575850, 7958790, 7.57585, 7.95879) (150,17119397,19449044,11.41293,12.96602) (200,30687335,35125661,15.34367,17.56283) You can also increase the initial heap size, ./a.out +RTS -H250M (100, 1131828, 901863, 1.13183, 0.90186) (150, 1726737, 1427783, 1.15116, 0.95186) (200, 2324647, 1935705, 1.16232, 0.96785) Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] How to calculate de number of digits of an integer? (was: Is logBase right?)
Uwe Hollerbach wrote: Here's my version... maybe not as elegant as some, but it seems to work. For base 2 (or 2^k), it's probably possible to make this even more efficient by just walking along the integer as stored in memory, but that difference probably won't show up until at least tens of thousands of digits. Uwe ilogb :: Integer - Integer - Integer ilogb b n | n 0 = ilogb b (- n) | n b = 0 | otherwise = (up 1) - 1 where up a = if n (b ^ a) then bin (quot a 2) a else up (2*a) bin lo hi = if (hi - lo) = 1 then hi else let av = quot (lo + hi) 2 in if n (b ^ av) then bin lo av else bin av hi We can streamline this algorithm, avoiding the repeated iterated squaring of the base that (^) does: -- numDigits b n | n 0 = 1 + numDigits b (-n) numDigits b n = 1 + fst (ilog b n) where ilog b n | n b = (0, n) | otherwise = let (e, r) = ilog (b*b) n in if r b then (2*e, r) else (2*e+1, r `div` b) It's a worthwhile optimization, as timings on n = 2^100 show: Prelude T length (show n) 301030 (0.48 secs, 17531388 bytes) Prelude T numDigits 10 n 301030 (0.10 secs, 4233728 bytes) Prelude T ilogb 10 n 301029 (1.00 secs, 43026552 bytes) (Code compiled with -O2, but the interpreted version is just as fast; the bulk of the time is spent in gmp anyway.) Regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Improving MPTC usability when fundeps aren't appropriate?
Daniel Peebles wrote: I've been playing with multiparameter typeclasses recently and have written a few uncallable methods in the process. For example, in class Moo a b where moo :: a - a Another solution would be to artificially force moo to take a dummy b so that the compiler can figure out which instance you meant. That's what I've been doing in the mean time, but wouldn't it be simpler and less hackish to add a some form of instance annotation, like a type annotation, that would make it possible to specify what instance you wanted when it's ambiguous? Syntax aside, dummy arguments have a disadvantage when it comes to optimizing code, because the compiler doesn't know that the dummy argument is unused in the function; indeed you could define instances where the dummy argument is used. For this reason it's technically better to use a newtype instead: newtype Lambda t a = Lambda { unLambda :: a } and, say, class Moo a b where moo :: Lambda b (a - a) Note that we can convert between functions taking dummy arguments and such lambda types easily: lambdaToDummy :: Lambda t a - t - a lambdaToDummy a _ = unLambda a dummyToLambda :: (t - a) - Lambda t a dummyToLambda f = Lambda (f undefined) In fact, lambdaToDummy makes a great infix operator: (@@) :: Lambda t a - t - a (@@) = lambdaToDummy infixl 1 @@ Now we can write moo @@ (undefined :: x) where with dummy arguments we would write moo (undefined :: x) . The compiler will inline (@@) and lambdaToDummy (they're both small) and produce unLambda moo , that is, the plain value of type a - a that we wanted to use in the first place. All this happens after type checking and fixing the instance of Moo to use. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: [Haskell] ANNOUNCE: OpenGL 2.3.0.0
Rafael Gustavo da Cunha Pereira Pinto wrote: Sorry for all this annoyance, but I was starting to study those libraries (OpenGL, GLUT and GLFW) using Haskell and the update broke some of my code. Here is a patch that makes it compile, but then it breaks all code developed for GLFW-0.3, as all Floats need to be changed to CFloat. --- GLFW-0.3/Graphics/UI/GLFW.hs 2008-01-15 20:23:18.0 -0200 +++ GLFW.hs 2009-07-30 21:09:55.0 -0300 @@ -517,11 +517,11 @@ _GLFW_INFINITY = 10.0 :: Double -- Callback function type -type GLFWwindowsizefun = Int32 - Int32 - IO () +type GLFWwindowsizefun = CInt - CInt - IO () [snip] You should use the type aliases GLint, GLfloat, etc. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] excercise - a completely lazy sorting algorithm
Petr Pudlak wrote: Would it be possible to create a lazy selection/sorting algorithm so that getting any element of the sorted list/array by its index would require just O(n) time, and getting all the elements would still be in O(n * log n)? The (merge) sorting algorithm provided by Data.List has this property, providing the first k elements in O(n + k log(n)) time. (source at [1]) As a mental model, you can think of suspended 'merge' evaluations as internal nodes in a tree, with the two arguments as subtrees. In that model, the algorithm turns into heap sort: It builds a balanced binary tree with n external nodes, taking O(n) time -- this job is done by merge_pairs -- and then repeatedly extracts the minimum element, taking O(log(n)) time for each one. regards, Bertram [1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/src/Data-List.html#mergesort ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Type families and polymorphism
Jeremy Yallop wrote: Why does compiling the following program give an error? {-# LANGUAGE TypeFamilies, RankNTypes #-} type family TF a identity :: (forall a. TF a) - (forall a. TF a) identity x = x GHC 6.10.3 gives me: Couldn't match expected type `TF a1' against inferred type `TF a' In the expression: x In the definition of `identity': identity x = x The error message is slightly better in GHC head: Couldn't match expected type `TF a1' against inferred type `TF a' NB: `TF' is a type function, and may not be injective In the expression: x In the definition of `identity': identity x = x Dan Doel already explained how the lack of injectivity leads to a type checking error. FWIW, the same code would work with a data family, because data families are injective. regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] STM/Data Invariant related Segfault with GHC 6.10.3
Jan Schaumlöffel wrote: I just discovered that programs compiled with GHC 6.10.3 segfault when accessing a TVar created under certain conditions. This is a known bug, but it hasn't gotten much attention: http://hackage.haskell.org/trac/ghc/ticket/3049 Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Slightly off-topic: Lambda calculus
Miguel Mitrofanov wrote: Correction: I think that one can find an expression that causes name clashes anyway, I'm just not certain that there is one that would clash independent of whichever order you choose. Yes there is. Consider (\f g - f (f (f (f (f (f g)) (\l a b - l (b a)) (\x - x) which has 6 variables in total. This reduces to the normal form \a b c d e f g - g (f (e (d (c (b a) which requires 7 variables. So without alpha-conversion at least one of the original 6 variables will clash with itself. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Performance of functional priority queues
Sebastian Sylvan wrote: On Mon, Jun 15, 2009 at 4:18 AM, Richard O'Keefe o...@cs.otago.ac.nz wrote: There's a current thread in the Erlang mailing list about priority queues. I'm aware of, for example, the Brodal/Okasaki paper and the David King paper. I'm also aware of James Cook's priority queue package in Hackage, have my own copy of Okasaki's book, and have just spent an hour searching the web. One of the correspondents in that thread claims that it is provably impossible to have an efficient priority queue implementation A priority queue based on skewed binomial heaps is asymptotically optimal (O(1) for everything except deleteMin which is O(log n)), so if that's what he means by efficient then he's most definitely wrong. What about decreaseKey in a purely functional setting? I suppose it's O(log n), based on the intuition of trees with limited branching factor. Fibonacci heaps can do it in O(1), which makes a difference for Dijkstra's algorithm, for example. regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] nubBy seems broken in recent GHCs
Cale Gibbard wrote: According to the Report: nubBy:: (a - a - Bool) - [a] - [a] nubBy eq [] = [] nubBy eq (x:xs) = x : nubBy eq (filter (\y - not (eq x y)) xs) Hence, we should have that nubBy () (1:2:[]) = 1 : nubBy () (filter (\y - not (1 y)) (2:[])) = 1 : nubBy () [] = 1 : [] However in ghc-6.10.3: Prelude Data.List nubBy () [1,2] [1,2] Interesting. This was changed in response to http://hackage.haskell.org/trac/ghc/ticket/2528 | Tue Sep 2 11:29:50 CEST 2008 Simon Marlow marlo...@gmail.com | * #2528: reverse the order of args to (==) in nubBy to match nub | This only makes a difference when the (==) definition is not | reflexive, but strictly speaking it does violate the report definition | of nubBy, so we should fix it. It turns out that 'elem' differs from the report version and should have its comparison reversed. Of course that would only ever matter for broken Eq instances. However, the report also states that the nubBy function may assume that the given predicate defines an equivalence relation. http://haskell.org/onlinereport/list.html#sect17.6 So I'm not sure there's anything to be fixed here - although backing out the above patch probably won't hurt anybody. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Still having problems building a very simple Executable ....
Hi Vasili, Vasili I. Galchin wrote: I picked an exceedingly case to build an Executable: Executable QNameTest Hs-source-dirs: Swish/ Main-Is:HaskellUtils/QNameTest.hs Other-Modules: HaskellUtils.QName I'm not sure what you did; the original Swish code doesn't have any hierarchical modules. Starting with that, I could build QNameTest like this: executable QNameTest hs-source-dirs: HaskellUtils HaskellRDF HaskellRDF/HUnit main-is: QNameTest.hs other-modules: HUnitLang HUnitBase HUnitText QName QNameTest build-depends: base, haskell98 ghc-options: -main-is QNameTest The trick here is to list all used subdirectories in hs-source-dirs; the module HUnit for example will be found in HaskellRDF/HUnit/ under the name HUnit.lhs. If you have a module A.B.C then the source file should be in foo/A/B/C.hs (or lhs or some other recognized extension) and foo should be listed in hs-source-dirs. Also note the ghc-options line: it tells ghc to use QNameTest instead of Main for the main module. I also managed to produce a Swish executable after a bit of tweaking; you can find my changes in http://int-e.home.tlink.de/haskell/Swish.diff The cabal file is at the end of the diff. HTH, Bertram P.S. I took Swish code from here: http://www.ninebynine.org/RDFNotes/Swish/Intro.html#SwishLinks ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] type checking that I can't figure out ....
Michael Snoyman wrote: On Wed, Jun 3, 2009 at 8:42 AM, Daniel Fischer daniel.is.fisc...@web.dewrote: Am Mittwoch 03 Juni 2009 06:12:46 schrieb Michael Snoyman: 2. lookup does not return any generalized Monad, just Maybe (I think that should be changed). Data.Map.lookup used to return a value in any monad you wanted, I believe until 6.8 inclusive. I don't think it's going to change again soon. Is there a reason why it only returns in the Maybe monad? I often times have to write a liftMaybe function to deal with that. Here's the proposal that changed it: http://hackage.haskell.org/trac/ghc/ticket/2309 The discussion about the proposal can be found here: http://www.haskell.org/pipermail/libraries/2008-May/009698.html (There's even the suggestion of adding a function like liftMaybe to Data.Maybe, but apparently nobody turned that into a formal proposal.) Regards, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Cabal/primes
michael rice wrote: Finally got adventurous enough to get Cabal working, downloaded the primes package, and got the following error message when trying isPrime. Am I missing something here? The Data.Numbers.Primes module of the primes package does not implement 'isPrime'. The Numbers package is probably the one you want. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Stack overflow
Krzysztof Skrzętnicki wrote: 2009/5/27 Bertram Felgenhauer bertram.felgenha...@googlemail.com: I wrote: Krzysztof Skrzętnicki wrote: The code for modifying the counter: (\ msg - atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg,( atomicModifyIORef does not force the new value of the IORef. If the previous contents of the IORef is x, the new contents will be a thunk, (\ cnt - (cntMsg cnt msg,())) x Sorry, it's slightly worse than that. The contents becomes sel_0 (\ cnt - (cntMsg cnt msg, ())) x where sel_0 is basically an RTS internal version of fst. Instead of reading the new value of the IORef, you could also force the old one: atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg, msg)) = (return $!) Thanks for the tip, although it seems tricky to get it right. I wonder why there is no strict version of atomicModifyIORef? Something like this? -- | Stricter version of 'atomicModifyIORef', which prevents building -- up huge thunks in the 'IORef' due to repeated modification. -- Unlike 'atomicModifyIORef', 'atomicModifyIORef'' may block. atomicModifyIORef' :: IORef a - (a - (a, b)) - IO b atomicModifyIORef' ioref f = do res - atomicModifyIORef ioref f new - readIORef ioref new `seq` return res (The step that may block is forcing the new value - if another thread is already evaluating part of the thunk, the currently executing thread will block, waiting for the other thread to finish.) Dually there might be a strict version of IORef datatype. One interesting feature of atomicModifyIORef is that its implementation is lock-free, and never blocks (which affects exception handling): replacing the old value by the new value is done with compare-and-swap operation in a tight loop. Each iteration executes very quickly because all it does is replace the reference to the old value in the new thunk. With a strict IORef, the time window between reading the old value and storing the new value would become arbitrarily large, because you'd have to force the new value before exchanging it with the old value. So a reasonable implementation would have to use locks instead, I think, making atomicModifyIORef more expensive, and less useful in contexts that block exceptions. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Stack overflow
Krzysztof Skrzętnicki wrote: The code for modifying the counter: (\ msg - atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg,( atomicModifyIORef does not force the new value of the IORef. If the previous contents of the IORef is x, the new contents will be a thunk, (\ cnt - (cntMsg cnt msg,())) x You can try forcing the new value, say by adding readIORef ioref = (return $!) after the atomicModifyIORef. The datatype itself is strict. So where is the thunk actually accumulating? In the IORef. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Stack overflow
I wrote: Krzysztof Skrzętnicki wrote: The code for modifying the counter: (\ msg - atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg,( atomicModifyIORef does not force the new value of the IORef. If the previous contents of the IORef is x, the new contents will be a thunk, (\ cnt - (cntMsg cnt msg,())) x Sorry, it's slightly worse than that. The contents becomes sel_0 (\ cnt - (cntMsg cnt msg, ())) x where sel_0 is basically an RTS internal version of fst. Instead of reading the new value of the IORef, you could also force the old one: atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg, msg)) = (return $!) Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Issues with IO and FFIs
Jon Harrop wrote: Does anyone have any comments on the following criticism of some difficulties with FFI, including IO, in Haskell: http://groups.google.com/group/comp.lang.functional/msg/6d650c086b2c8a49?hl=en That post conflates two separate questions. 1) binding to foreign libraries that export functions that are semantically pure, but can't be imported as pure functions, for example because they use pointers. unsafePerformIO was added in the FFI spec for exactly this purpose, as you can see for yourself here: http://www.cse.unsw.edu.au/~chak/haskell/ffi/ffi/ffise5.html#x8-240005.1 2) global variables. This has been a topic of long and heated debates, with no clear winner. I don't think another debate will be fruitful. In particular, is it not always possible to write IO libraries safely in Haskell? Flame bait? Yes, it is not always possible, because some interfaces are inherently unsafe, putting the burden of using them safely on the user. unsafePerformIO itself is a prime example for this. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Link errors in Gtk2Hs are more general than I thought.
Jeff Heard wrote: I tried to get yi to run on my Mac earlier and I get the following errors: dyld: lazy symbol binding failed: Symbol not found: _cairo_quartz_font_face_create_for_atsu_font_id Referenced from: /opt/local/lib/libpangocairo-1.0.0.dylib Expected in: /opt/local/lib/libcairo.2.dylib dyld: Symbol not found: _cairo_quartz_font_face_create_for_atsu_font_id Referenced from: /opt/local/lib/libpangocairo-1.0.0.dylib Expected in: /opt/local/lib/libcairo.2.dylib This looks like the quartz backend was disabled in the cairo C library, not like a gtk2hs problem. I don't know how ports work, but http://trac.macports.org/browser/trunk/dports/graphics/cairo/Portfile defines a 'quartz' variant that enables that backend. Another idea is to reinstall pango. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] trying to download leksah ....
Vasili I. Galchin wrote: vigalc...@ubuntu:~/FTP$ darcs get http://code.haskell.org/leksah Invalid repository: http://code.haskell.org/leksah darcs failed: Failed to download URL http://code.haskell.org/leksah/_darcs/inventory : HTTP error (404?) I did a google on HTTP 404 = not found why? It's a darcs 2 repository, so you'll need to install darcs 2. See http://code.haskell.org/leksah/_darcs/format Admittedly, the error message is horrible. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] TMVar's are great but fail under ghc 6.10.1 windows
Alberto G. Corona wrote: however, It happens that fails in my windows box with ghc 6.10.1 , single core here is the code and the results: ---begin code: module Main where import Control.Concurrent.STM import Control.Concurrent import System.IO.Unsafe import GHC.Conc mtxs= unsafePerformIO $ mapM newTMVarIO $ take 5 $ repeat 0 proc i= atomically $ do unsafeIOToSTM $ putStr $ init of process ++ show i++\n As Sterling points out, unsafeIOToSTM is really unsafe. A fundamental restriction is that the IO action used with unsafeIOToSTM may not block. However, putStr may block. It's actually possible to do the logging without blocking: cut here module Main where import Control.Concurrent import Control.Concurrent.STM import GHC.Conc import Control.Concurrent.MVar import Data.IORef import Control.Monad data Logger = Logger (IORef [String]) (MVar ()) newLogger :: IO Logger newLogger = do ref - newIORef [] wake - newEmptyMVar return $ Logger ref wake logLogger :: Logger - String - IO () logLogger (Logger ref wake) msg = do atomicModifyIORef ref $ \msgs - (msg:msgs, ()) tryPutMVar wake () return () dumpLogger :: Logger - IO () dumpLogger (Logger ref wake) = forever $ do takeMVar wake msgs - atomicModifyIORef ref $ \msgs - ([], msgs) putStr $ unlines . reverse $ msgs proc log mtxs i = do let logSTM = unsafeIOToSTM . log xs' - atomically $ do logSTM $ init of process ++ show i xs - mapM takeTMVar mtxs mapM (\(mtx, x) - putTMVar mtx (x+1)) $ zip mtxs xs xs' - mapM readTMVar mtxs logSTM $ End of processs ++ show i ++ result = ++ show xs' return xs' log $ Final result of process ++ show i ++ = ++ show xs' main = do log - newLogger forkIO $ dumpLogger log mtxs - replicateM 5 $ newTMVarIO 0 mapM (forkIO . proc (logLogger log) mtxs) [1..5] threadDelay 100 And that gives reasonable results, for example: init of process 1 End of processs 1 result= [1,1,1,1,1] Final result of process 1 = [1,1,1,1,1] init of process 2 End of processs 2 result= [2,2,2,2,2] Final result of process 2 = [2,2,2,2,2] init of process 3 init of process 4 End of processs 4 result= [3,3,3,3,3] Final result of process 4 = [3,3,3,3,3] init of process 5 End of processs 5 result= [4,4,4,4,4] Final result of process 5 = [4,4,4,4,4] End of processs 3 result= [3,3,3,3,3] init of process 3 End of processs 3 result= [5,5,5,5,5] Final result of process 3 = [5,5,5,5,5] HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Definition of tail recursive wrt Folds
Ben Franksen wrote: Mark Spezzano wrote: Just looking at the definitions for foldr and foldl I see that foldl is (apparently) tail recursive while foldr is not. Why? Is it because foldl defers calling itself until last whereas foldr evaluates itself as it runs? What, strictly speaking, is the definition of ”tail recursive” as opposed to just “recursive”? An application of some function f inside another function g is in 'tail position' (or a 'tail call') if the result of applying f is the result of g. Operationally speaking, calling f is the very last thing g does. Tail calls can be optimized by a compiler (or interpreter) so that the call does not use additional stack; that is, the call can be replaced by a jump. A function is called ”tail recursive” (as opposed to just “recursive”) if the recursive call to itself is in tail position. If the compiler performs tail call optimization, tail recursive functions can work with constant stack space, similar to a (imperative) loop. Looking at a definition of foldl, e.g. foldl f z0 xs0 = lgo z0 xs0 where lgo z [] = z lgo z (x:xs) = lgo (f z x) xs you see that lgo calls itself in tail position, thus is tail recursive. In contrast, foldr can be defined as foldr k z xs = go xs where go [] = z go (y:ys) = y `k` go ys where you see that the result of go is not the recursive call to go. Instead, the result is y `k` go ys . Thus, foldr is not tail recursive. So, if you are saying that foldl defers calling itself until last whereas foldr evaluates itself as it runs then you got the right idea, I think. The point is that foldr still needs to do something (namely to apply (y `k`)) to the result of applying itself. It needs to remember to do so, and thus the stack grows linearly with the size of the list. Sorry, but that's wrong. It would be right in a strict language. In Haskell, the 'go ys' term is not evaluated straight away; it is instead turned into a suspended evaluation (a thunk) that is typically stored on the heap. The following discussion is implementation specific, with ghc in mind. Haskell itself has no notion of a stack. In fact both using foldl and using foldr can produce stack overflows: Prelude foldl (+) 0 [1..10^7] *** Exception: stack overflow Prelude foldr (+) 0 [1..10^7] *** Exception: stack overflow Let's examine why. First, consider the foldl case. For simplicity, I'll ignore the evaluation of [1..10^7]. The first few evaluation steps are foldl (+) 0 [1,2,3..100] - lgo 0 [1,2,3..100] - lgo (0+1) [2,3,4..100] - lgo ((0+1)+2) [3,4,5..10] None of these steps uses the stack - foldl is indeed tail recursive. However, the ((0+1)+2) is a thunk on the heap. Continuing, - lgo ((...(0+1)+...)+999) [1000] - lgo ((...(0+1)+...)+1000) [] - ((...(0+1)+...)+1000) Now we have to evaluate that huge thunk. This turns out to cause trouble because + (for Integers) is strict. So in order to find x+1000, the code needs the value of x first. And that's where the stack gets involved: the information of the pending addition, (?+1000) is pushed onto the stack, and evaluation proceeds with the first term. Denoting the stack by [[item1, item2, ...]], evaluation continues like this: - [[(?+1000)]] ((...(0+1)+...)+999) - [[(?+1000),(?+999)]] The stack will keep growing, until the (0+1) is reached, or the stack overflows. To make things more confusing, ghc has a strictness analyzer that sometimes manages to avoid such a thunk being built up. For an example how strictness helps see foldl' (below). Now for the foldr case. Evaluation in that case looks a bit different: foldr (+) 0 [1,2,3..1000] - go [1,2,3..1000] - 1 + go [2,3,4..1000] No stack was used so far; the go [2,3,4..100] is a thunk. Now, as above, we need to add two numbers. And that's where the stack gets involved again: - [[(1+?)]] go [2,3,4..1000] - [[(1+?)]] 2 + go [3,4,5..1000] - [[(1+?),(2+?)]] go [3,4,5..1000] - [[(1+?),(2+?),(3+?)]] go [4,5,6..1000] and so on, until we reach go [] or the stack overflows. Note that there is no reference to 'go' on the stack at all. If instead of (+), we had a lazy function like (:), the stack would not get involved in this way: foldr (:) [] [1,2,3..1000] - go [1,2,3..1000] - 1 : go [2,3,4..1000] Which is in weak head normal form, so evaluation stops here. Later, when other code examines the list, the 'go [2,3,4..1000]' thunk will get evaluated. As a final note, the stack overflow with foldl above is cured by using foldl', which is _strict_ in the accumulator. For reference, foldl' f z0 xs0 = lgo z0 xs0 where lgo z [] = z lgo z (x:xs) = let z' = f z x in z' `seq` lgo z' xs foldl' (+) 0 [1,2,3..100] - lgo 0 [1,2,3..100] - let z' = 0+1 in z' `seq` lgo z' [2,3,4..100] Now
Re: [Haskell-cafe] Performance question
hask...@kudling.de wrote: Do you think it would be feasable to replace the GHC implementation of System.Random with something like System.Random.Mersenne? There's a problem with using the Mersenne Twister: System.Random's interface has a split method: class RandomGen g where split:: g - (g, g) The Mersenne Twister is good at producing a single stream of random numbers - in fact it works by generating a whole block of random numbers in one go, then consuming the block, and only then generating the next block. I have no idea how to implement a split method that produces independent streams. Even if I did, using split a lot would likely spoil the performance benefit of the generator. (System.Random.Mersenne.Pure64 provides a RandomGen instance for PureMT, but it cheats:) split = error System.Random.Mersenne.Pure: unable to split the mersenne twister Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Data.Binary, strict reading
Neil Mitchell wrote: Hi, I want to read a file using Data.Binary, and I want to read the file strictly - i.e. when I leave the read file I want to guarantee the handle is closed. The reason is that (possibly immediately after) I need to write to the file. The following is the magic I need to use - is it all necessary, is it guaranteed correct, should I use something else? src - decodeFile _make/_make Map.size mp `seq` performGC With binary 0.5, src - decodeFile _make/_make return $! src should close the file, assuming that all the data is read from the file, thanks to this patch: Mon Aug 25 23:01:09 CEST 2008 Don Stewart d...@galois.com * WHNF the tail of a bytestring on decodeFile, will close the resource For older versions, import qualified Data.Binary.Get as Get data EOF = EOF instance Binary EOF where get = do eof - Get.isEmpty return (if eof then EOF else error EOF expected) put EOF = return () ... (src, EOF) - decodeFile _make/_make accomplishes the same effect. Btw, contrary to what Duncan said, Get is a lazy monad (lazy in its actions, that is): instance Binary EOF where get = do eof - Get.isEmpty when (not eof) error EOF expected return EOF put EOF = return () does not help, because the result (EOF) does not depend on the value returned by isEmpty. The idea of using isEmpty for closing the file is not perfect though; due to the lazy nature of Get, there's a stack overflow lurking below: main = do encodeFile w.bin [0..100 :: Int] m - decodeFile w.bin print $ foldl' (+) 0 (m :: [Int]) One idea to fix this is to force the read data before checking for EOF, as follows: data BinaryRNF a = BinaryRNF a instance (NFData a, Binary a) = Binary (BinaryRNF a) where get = (\a - rnf a `seq` BinaryRNF a) `fmap` get put (BinaryRNF a) = put a main = do encodeFile w.bin [0..100 :: Int] (BinaryRNF m, EOF) - decode `fmap` L.readFile w.bin print $ foldl' (+) 0 (m :: [Int]) HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Data.Binary, strict reading
I wrote: With binary 0.5, Or binary 0.4.3 and later. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Pickling a finite map (Binary + zlib) [was: [Haskell-cafe] Data.Binary poor read performance]
Don Stewart wrote: dons: [...] Just serialising straight lists of pairs, [...] And reading them back in, main = do [f] - getArgs m - decode `fmap` L.readFile f print (length (m :: [(B.ByteString,Int)])) print done Well, you don't actually read the whole list here, just its length: instance Binary a = Binary [a] where put l = put (length l) mapM_ put l get= do n - get :: Get Int replicateM n get To demonstrate, this works: main = do L.writeFile v (encode (42 :: Int)) m - decode `fmap` L.readFile v print (length (m :: [Int])) So instead, we should try something like this: import Control.Parallel.Strategies instance NFData B.ByteString where rnf bs = bs `seq` () main = do [f] - getArgs m - decode `fmap` L.readFile f print (rnf m `seq` length (m :: [(B.ByteString,Int)])) My timings: reading list, without rnf: 0.04s with rnf: 0.16s reading a Data.Map: 0.52s with rnf: 0.62s Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Pickling a finite map (Binary + zlib) [was: [Haskell-cafe] Data.Binary poor read performance]
Felipe Lessa wrote: On Tue, Feb 24, 2009 at 4:59 AM, Don Stewart d...@galois.com wrote: Looks like the Map reading/showing via association lists could do with further work. Anyone want to dig around in the Map instance? (There's also some patches for an alternative lazy Map serialisation, if people are keen to load maps -- happstack devs?). From binary-0.5: instance (Ord k, Binary k, Binary e) = Binary (Map.Map k e) where put m = put (Map.size m) mapM_ put (Map.toAscList m) get = liftM Map.fromDistinctAscList get instance Binary a = Binary [a] where put l = put (length l) mapM_ put l get= do n - get :: Get Int replicateM n get Can't get better, I think. We can improve it slightly (about 20% runtime in dons example [*]): instance (Ord k, Binary k, Binary e) = Binary (Map.Map k e) where get = liftM (Map.fromDistinctAscList . map strictValue) get where strictValue (k,v) = (v `seq` k, v) The point is that Data.Map.Map is strict in the keys, but not in the values of the map. In the case of deserialisation this means the values will be thunks that hang on to the Daya.Binary buffer. Now, from containers-0.2.0.0: fromDistinctAscList :: [(k,a)] - Map k a fromDistinctAscList xs = build const (length xs) xs where -- 1) use continutations so that we use heap space instead of stack space. -- 2) special case for n==5 to build bushier trees. build c 0 xs' = c Tip xs' build c 5 xs' = case xs' of ((k1,x1):(k2,x2):(k3,x3):(k4,x4):(k5,x5):xx) - c (bin k4 x4 (bin k2 x2 (singleton k1 x1) (singleton k3 x3)) (singleton k5 x5)) xx _ - error fromDistinctAscList build build c n xs' = seq nr $ build (buildR nr c) nl xs' where nl = n `div` 2 nr = n - nl - 1 buildR n c l ((k,x):ys) = build (buildB l k x c) n ys buildR _ _ _ [] = error fromDistinctAscList buildR [] buildB l k x c r zs = c (bin k x l r) zs The builds seem fine, but we spot a (length xs) on the beginning. Maybe this is the culprit? We already know the size of the map (it was serialized), so it is just a matter of exporting fromDistinctAscSizedList :: Int - [(k, a)] - Map k a Eliminating the 'length' call helps, too, improving runtime by another about 5%. The result is still a factor of 1.7 slower than reading the list of key/value pairs. Bertram [*] Notes on timings: 1) I used `rnf` for all timings, as in my previous mail. 2) I noticed that in my previous measurements, the GC time for the Data.Map tests was excessively large (70% and more), so I used +RTS -H32M this time. This resulted in a significant runtime improvement of about 30%. 3) Do your own measurements! Some code to play with is available here: http://int-e.home.tlink.de/haskell/MapTest.hs http://int-e.home.tlink.de/haskell/Map.hs ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ONeillPrimes.hs - priority queue broken?
Eugene Kirpichov wrote: Hi, I've recently tried to use the priority queue from the ONeillPrimes.hs, which is famous for being a very fast prime generator: actually, I translated the code to Scheme and dropped the values, to end up with a key-only heap implementation. However, the code didn't work quite well, and I decided to check the haskell code itself. Turns out that it is broken. module PQ where import Test.QuickCheck data PriorityQ k v = Lf | Br {-# UNPACK #-} !k v !(PriorityQ k v) !(PriorityQ k v) deriving (Eq, Ord, Read, Show) Let size Lf = 0 size (Br _ _ l r) = 1 + sizePQ l + sizePQ r be the size of the priority queue. To work, the code maintains heap order and the invariant that the left subtree is at least as large as the right one, and at most one element larger. validSize Lf = True validSize (Br _ _ l r) = validSize l validSize r 0 = d d = 1 where d = size l - size r This invariant justifies the assumption that Daniel Fischer pointed out. The code is careful to maintain this invariant, but it is broken in one place: leftrem :: PriorityQ k v - (k, v, PriorityQ k v) leftrem (Br vk vv Lf Lf) = (vk, vv, Lf) (Why not this?) leftrem (Br vk vv Lf _) = (vk, vv, Lf) leftrem (Br vk vv t1 t2) = (wk, wv, Br vk vv t t2) where (wk, wv, t) = leftrem t1 Here, the left subtree is replaced by one that is one element smaller. This breaks the invariant if the two original subtrees had equal size. The bug is easy to fix; just swap the two subtrees on the right side: leftrem (Br vk vv t1 t2) = (wk, wv, Br vk vv t2 t) where (wk, wv, t) = leftrem t1 leftrem _= error Empty heap! *PQ s [3,1,4,1,5,9,2,6,5,3,5,8] [1,1,2*** Exception: Empty heap! *PQ s [3,1,4,1,5,9,2,6,5,3,5,8] [1,1,2,3,3,4,5,5,5,6,8,9] HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ONeillPrimes.hs - priority queue broken?
Eugene Kirpichov wrote: module PQ where import Test.QuickCheck data PriorityQ k v = Lf | Br {-# UNPACK #-} !k v !(PriorityQ k v) !(PriorityQ k v) deriving (Eq, Ord, Read, Show) For the record, we can exploit the invariant that the sizes of the left and right subtrees have difference 0 or 1 to implement 'size' in better than O(n) time, where n is the size of the heap: -- Return number of elements in the priority queue. -- /O(log(n)^2)/ size :: PriorityQ k v - Int size Lf = 0 size (Br _ _ t1 t2) = 2*n + rest n t1 t2 where n = size t2 -- rest n p q, where n = size q, and size p - size q = 0 or 1 -- returns 1 + size p - size q. rest :: Int - PriorityQ k v - PriorityQ k v - Int rest 0 Lf _ = 1 rest 0 _ _ = 2 rest n (Br _ _ p1 p2) (Br _ _ q1 q2) = case r of 0 - rest d p1 q1 -- subtree sizes: (d or d+1), d; d, d 1 - rest d p2 q2 -- subtree sizes: d+1, (d or d+1); d+1, d where (d, r) = (n-1) `quotRem` 2 Of course we can reduce the cost to O(1) by annotating the heap with its size, but that is less interesting, and incurs a little overhead in the other heap operations. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: speed: ghc vs gcc
Don Stewart wrote: If we take what I usually see as the best loops GHC can do for this kind of thing: import Data.Array.Vector main = print (sumU (enumFromToU 1 (10^9 :: Int))) And compile it: $ ghc-core A.hs -O2 -fvia-C -optc-O3 We get ideal core, all data structures fused away, and no heap allocation: $wfold_s15t :: Int# - Int# - Int# $wfold_s15t = \ (ww1_s150 :: Int#) (ww2_s154 :: Int#) - case # ww2_s154 ww_s14U of wild_aWm { False - $wfold_s15t (+# ww1_s150 ww2_s154) (+# ww2_s154 1); True - ww1_s150 }; } in case $wfold_s15t 0 1 Which produces nice assembly: s16e_info: cmpq6(%rbx), %rdi jg .L2 addq%rdi, %rsi leaq1(%rdi), %rdi jmp s16e_info Note that this does the addition to the accumulator first, and then increments the counter. [snip] I wondered if we just got worse code on backwards counting loops. So translating into the obvious translation, counting up: main = print (sum0 0 1) sum0 :: Int - Int - Int sum0 acc n | n 10^9 = acc | otherwise = sum0 (acc + n) (n + 1) We start to notice something interesting: $wsum0 :: Int# - Int# - Int# $wsum0 = \ (ww_sOH :: Int#) (ww1_sOL :: Int#) - case lvl2 of wild1_aHn { I# y_aHp - case # ww1_sOL y_aHp of wild_B1 { False - letrec { $wsum01_XPd :: Int# - Int# - Int# $wsum01_XPd = \ (ww2_XP4 :: Int#) (ww3_XP9 :: Int#) - case # ww3_XP9 y_aHp of wild11_Xs { False - $wsum01_XPd (+# ww2_XP4 ww3_XP9) (+# ww3_XP9 1); True - ww2_XP4 }; } in $wsum01_XPd (+# ww_sOH ww1_sOL) (+# ww1_sOL 1); True - ww_sOH } This is odd, but it doesn't hurt the inner loop, which only involves $wsum01_XPd, and is identical to $wfold_s15t above. Checking the asm: $ ghc -O2 -fasm sQ3_info: .LcRt: cmpq 8(%rbp),%rsi jg .LcRw leaq 1(%rsi),%rax addq %rsi,%rbx movq %rax,%rsi jmp sQ3_info So for some reason ghc ends up doing the (n + 1) addition before the (acc + n) addition in this case - this accounts for the extra instruction, because both n+1 and n need to be kept around for the duration of the addq (which does the acc + n addition). Checking via C: $ ghc -O2 -optc-O3 -fvia-C Better code, but still a bit slower: sQ3_info: cmpq8(%rbp), %rsi jg .L8 addq%rsi, %rbx leaq1(%rsi), %rsi jmp sQ3_info This code is identical (up to renaming registers and one offset that I can't fully explain, but is probably related to a slight difference in handling pointer tags between the two versions of the code) to the nice assembly above. Running: $ time ./B 55 ./B 1.01s user 0.01s system 97% cpu 1.035 total Hmm, about 5% slower, are you sure this isn't just noise? If not noise, it may be some alignment effect. Hard to say. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANN: convertible (first release)
wren ng thornton wrote: John Goerzen wrote: Hi folks, I have uploaded a new package to Haskell: convertible. At its heart, it's a very simple typeclass that's designed to enable a reasonable default conversion between two different types without having to remember a bunch of functions. I once again point out that realToFrac is *wrong* for converting from Float or Double. Yes, realToFrac is just broken, or at least the Real instances of Float and Double are. Of course, with the restriction to single parameter type classes in Haskell98, it's hard to come up with anything better - we'd end up with fooToBar for all Real / Fractional pairs. realToFrac (1/0::Float) ::Double 3.402823669209385e38 realToFrac (0/0::Float) ::Double -5.104235503814077e38 realToFrac (0/0::Double) ::Float -Infinity [snip] GHC makes the mess just a bit messier. The following program prints different answers when compiled with -O or without: main = do print (realToFrac (1/0::Float) :: Double) print (realToFrac (0/0::Float) :: Double) print (realToFrac (0/0::Double) :: Float) Without -O: 3.402823669209385e38 -5.104235503814077e38 -Infinity With -O: Infinity NaN NaN The reason for this behaviour are rules replacing realToFrac by direct conversions from Float to Double or vice versa, where applicable. These two evils -- realToFrac and changing the behaviour with -O -- amount to something good: proper treatment of NaNs and Infinities. Your RealToFrac class is a definitive improvement over this situation. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Comments from OCaml Hacker Brian Hurt
Andrew Wagner wrote: I think perhaps the correct question here is not how many instances of Monoid are there?, but how many functions are written that can use an arbitrary Monoid. E.g., the fact that there are a lot of instances of Monad doesn't make it useful. There are a lot of instances of Monad because it's useful to have instances of Monad. Why? Because of http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html ! Look at all the cool stuff you can automagically do with your type just because it's an instance of Monad! I think that's the point. What can you do with arbitrary Monoids? Not much, as evidenced by http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Monoid.html One example where Monoids (in full generality) are useful is that of measurements in the Data.Sequence paper (which is sadly not implemented in the library, although it is used to maintain the length for efficient indexing), http://www.soi.city.ac.uk/~ross/papers/FingerTree.html The concept applies to any tree that represents an ordered list. The basic idea is that given a measurement for single elements, class Monoid v = Measured a v where measure :: a - v we can annotate a tree with cached measurements of the corresponding sequences, data Tree a v = Empty | Leaf v a | Node v (Tree a v) (Tree a v) measureTree :: Measured a v = Tree a v - v measureTree Empty = mzero measureTree (Leaf v _) = v measureTree (Node v _ _) = v which can be calculated easily by smart constructors: leaf :: Measured a v = a - Tree a v leaf a = Leaf (measure a) a node :: Measured a v = Tree a v - Tree a v - Tree a v node l r = Node (measureTree l `mappend` measureTree r) l r Because v is a monoid, the construction satisfies the law measureTree = mconcat . map measure . toList where toList Empty = [] toList (Leaf _ a) = [a] toList (Node _ l r) = toList l ++ toList r All usually efficient tree operations, insertion, deletion, splitting, concatenation, and so on, will continue to work, if the cached values are ignored on pattern matching and the smart constructors are used for constructing the new trees. measure or `mappend` will be called for each smart constructor use - if they take constant time, the complexity of the tree operations doesn't change. Applications include: - finding and maintaining the sum of any substring of the sequence. - maintaining minimum and maximum of the list elements - maintaining the maximal sum of any substring of the sequence (this can be done by measuring four values for each subtree: 1. the sum of all elements of the sequence 2. the maximum sum of any prefix of the sequence 3. the maximum sum of any suffix of the sequence 4. the maximum sum of any substring of the sequence) I also found the idea useful for http://projecteuler.net/index.php?section=problemsid=220 starting out with -- L system basis class Monoid a = Step a where l :: a r :: a f :: a and then providing a few instances for Step, one of which was a binary tree with two measurements. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] IORef vs TVar performance: 6 seconds versus 4 minutes
Evan Laforge wrote: On Mon, Dec 29, 2008 at 1:15 PM, Ryan Ingram ryani.s...@gmail.com wrote: Both readTVar and writeTVar are worse than O(1); they have to look up the TVar in the transaction log to see if you have made local changes to it. Right now it looks like that operation is O(n) where n is the number of TVars accessed by a transaction, so your big transaction which is just accessing a ton of TVars is likely O(n^2). So this actually brings up a tangential question I've wondered about for a while. The lock-free datastructure paper at http://research.microsoft.com/users/Cambridge/simonpj/papers/stm/lock-free-flops06.pdf shows lock vs. STM with very similar performance for single processor with STM quickly winning on multi-processors. I have not verified this, but a possible cause is that Control.Concurrent.QSem isn't efficient if there are many waiters. It should use two lists for managing the waiters (ala Okasaki). But why does it manually manage the waiters at all? MVars are fair, in ghc at least. So this should work: newtype Sem = Sem (MVar Int) (MVar Int) newSem :: Int - IO Sem newSem initial = liftM2 Sem (newMVar initial) newEmptyMVar -- | Wait for a unit to become available waitSem :: Sem - IO () waitSem (Sem sem wakeup) = do avail' - modifyMVar sem (\avail - return (avail-1, avail-1)) when (avail' 0) $ takeMVar wakeup = putMVar sem -- | Signal that a unit of the 'Sem' is available signalSem :: Sem - IO () signalSem (Sem sem wakeup) = do avail - takeMVar sem if avail 0 then putMVar wakeup (avail+1) else putMVar sem (avail+1) (I should turn this into a library proposal.) Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] How to think about this? (profiling)
Magnus Therning wrote: This behaviour by Haskell seems to go against my intuition, I'm sure I just need an update of my intuition ;-) I wanted to improve on the following little example code: foo :: Int - Int foo 0 = 0 foo 1 = 1 foo 2 = 2 foo n = foo (n - 1) + foo (n - 2) + foo (n - 3) Two more ideas: How about -- loop keeping the last three elements of the sequence -- O(n) per call, constant memory foo' :: Int - Int foo' n = go n 0 1 2 where go 0 a _ _ = a go n a b c = go (n - 1) b c (a + b + c) or -- analogue of the folklore fibonacci definition: -- fibs = 0 : 1 : zipWith (+) fibs (tail fibs) foos :: [Int] foos = 0 : 1 : 2 : zipWith3 (\a b c - a + b + c) foos (tail foos) (tail (tail foos)) [snip] Then I added a convenience function and called it like this: createArray :: Int - UArray Int Int createArray n = array (0, n) (zip [0..n] (repeat (-1))) main = do (n:_) - liftM (map read) getArgs print $ evalState (foo n) (createArray n) [snip] main = do (n:_) - liftM (map read) getArgs print $ evalState (mapM_ foo [0..n] foo n) (createArray n) Then I started profiling and found out that the latter version both uses more memory and makes far more calls to `foo`. That's not what I expected! (I suspect there's something about laziness I'm missing.) Anyway, I ran it with `n=35` and got foo n : 202,048 bytes , foo entries 100 mapM_ foo [0..n] foo n : 236,312 , foo entries 135 + 1 The number of function calls is to be expected: to evaluate foo n for the first time, you need to call foo (n-1), foo (n-2) and foo (n-3), making 4 calls per evaluated value. 36*4 = 144 is pretty close to 135. (The missing 9 calls correspond to foo 0, foo 1 and foo 2) The difference of 35 can be explained in the same way: the first version makes 35 fewer explicit calls to 'foo'. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Memoization-question
Mattias Bengtsson wrote: The program below computes (f 27) almost instantly but if i replace the definition of (f n) below with (f n = f (n - 1) * f (n -1)) then it takes around 12s to terminate. I realize this is because the original version caches results and only has to calculate, for example, (f 25) once instead of (i guess) four times. There is probably a good reason why this isn't caught by the compiler. But I'm interested in why. Anyone care to explain? GHC does opportunistic CSE, when optimizations are enabled. See http://www.haskell.org/haskellwiki/GHC:FAQ#Does_GHC_do_common_subexpression_elimination.3F (http://tinyurl.com/33q93a) I've found it very hard to predict whether this will happen or not, from a given source code, because the optimizer will transform the program a lot and the opportunistic CSE rule may apply to one of the transformed versions. It's best to make sharing explicit when you need it, as you did below. main = print (f 27) f 0 = 1 f n = let f' = f (n-1) in f' * f' Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The Knight's Tour: solutions please
Dan Doel wrote: On Monday 01 December 2008 1:39:13 pm Bertram Felgenhauer wrote: As one of the posters there points out, for n=100 the program doesn't actually backtrack if the 'loneliest neighbour' heuristic is used. Do any of our programs finish quickly for n=99? The Python one doesn't. Nothing I tried finished. Do you have any figures on how much backtracking needs to be done to find a solution for n=99 (there is a solution, right?)? Yes, there is a solution. After changing successors n b = sortWith (length . succs) . succs to successors n b = sortWith (length . (succs =) . succs) . succs in the LogicT solution, it finds one with no backtracking at all. This heuristic fails on other n though, n=8 and n=66 at least. The obvious next step, successors n b = sortWith (length . (succs =) . (succs =) . succs) . succs works without backtracking up to n=100. These improved heuristics don't come cheap though. Here are some timings for n = 100: LogicT : 0.48 user 0.00 system 0:00.48 elapsed LogicT' : 2.16 user 0.00 system 0:02.16 elapsed LogicT'': 13.84 user 0.01 system 0:13.86 elapsed Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: The Knight's Tour: solutions please
ChrisK wrote: Hmmm... it seems that n=63 is a special case. [EMAIL PROTECTED] wrote: Yes, there is a solution for n=99 and for n=100 for that matter -- which can be found under one second. I only had to make a trivial modification to the previously posted code tour n k s b | k n*n = return b | otherwise = do next - (foldr mplus mzero).map return $ successors n b s tour n (k+1) next $ insrt next k b I replaced foldl1 mplus with foldr mplus zero. The old version sees no solution to n=63 quite quickly: time nice ./fromwiki-63 63 fromwiki-63: Prelude.foldl1: empty list real 0m0.285s user 0m0.172s sys 0m0.026s That's a bug. When the list of candidates is empty at that point, the program should backtrack, not terminate. In fact there are solutions for n=63. Using the first improved heuristic from my previous mail in this thread: time ./tour2 63 14 143 148 2116 229 226 2418 553 578 571 10 551 584 573 12 549 630 643 14 547 670 665 16 545 684 679 18 543 770 765 20 541 816 867 22 539 952 995 24 537 1044 1121 26 535 1208 1231 28 533 1300 1307 30 531 494 489 32 491 404 39 34 37 ... real0m1.750s user0m1.564s sys 0m0.008s The version with the 'tour' given above does not halt after running up to 0.4 GB of RAM, so I killed it. In fact, replacing foldl1 mplus by foldr mplus mzero fixed that bug. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The Knight's Tour: solutions please
Don Stewart wrote: Lee Pike forwarded the following: Solving the Knight's Tour Puzzle In 60 Lines of Python http://developers.slashdot.org/article.pl?sid=08/11/30/1722203 Seems that perhaps (someone expert in) Haskell could do even better? Maybe even parallelize the problem? :) As one of the posters there points out, for n=100 the program doesn't actually backtrack if the 'loneliest neighbour' heuristic is used. Do any of our programs finish quickly for n=99? The Python one doesn't. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] '#' in literate haskell
John MacFarlane wrote: Can anyone explain why ghc does not treat the following as a valid literate haskell program? - test.lhs # This is a test foo = reverse . words I believe this is an artifact of ghc trying to parse cpp style line number information: foo.lhs # 123 foo.foo t = will print this error: foo.foo:124:6: parse error on input `' Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Permutations
Daniel Fischer wrote: Needs an Ord constraint: inserts :: [a] - [a] - [[a]] inserts [] ys = [ys] inserts xs [] = [xs] inserts xs@(x:xt) ys@(y:yt) = [x:zs | zs - inserts xt ys] ++ [y:zs | zs - inserts xs yt] Heh, I came up with basically the same thing. I'd call this function 'merges' - it returns all possibilities for merging two given lists. uniquePermutations :: Ord a = [a] - [[a]] uniquePermutations = foldr (concatMap . inserts) [[]] . group . sort Which can also be written as uniquePermutations = foldM merges [] . group . sort Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Histogram creation
Alexey Khudyakov wrote: Hello! I'm tryig to write efficient code for creating histograms. I have following requirements for it: 1. O(1) element insertion 2. No reallocations. Thus in place updates are needed. accumArray won't go because I need to fill a lot of histograms (hundrends) from vely long list of data (possibly millions of elements) and it will traverse input data for each histogram. That's just not true, for GHC's implementation of accumArray at least, which goes via the ST monad. It creates a mutable array, fills it, traversing the input list exactly once, and finally freezes the array and returns it. This is just what you suggest below. If you still run into performance problems, try out unboxed arrays. If that isn't enough, unsafeAccumArray from Data.Base may help. I'd try both before using the ST monad directly. It seems that I need to use mutable array and/or ST monad or something else. Sadly both of them are tricky and difficult to understand. So good examples or any other ideas greatly appreciated. http://www.haskell.org/haskellwiki/Shootout/Nsieve_Bits perhaps. There must be better examples out there. I can think of two common problems with mutable arrays and ST: 1) You need to specify a type signature for the array being created, because the compiler can't guess the MArray instance that you want. For example, from the shootout entry: arr - newArray (0,m) False :: IO (IOUArray Int Bool) In ST, this is slightly trickier, because the phantom 's' type parameter has to be mirrord in the ST*Array type constructor. You can use scoped type variables, which allow you to write {-# LANGUAGE ScopedTypeVariables #-} import Control.Monad.ST import Data.Array.ST foo :: forall s . ST s () foo = do arr - newArray (0,42) False :: ST s (STUArray s Int Bool) ... Alternatively you can define helper functions to specify just the part of the type signature that you care about. stuArray :: ST s (STUArray s i e) - ST s (STUArray s i e) stuArray = id foo :: ST s () foo = do arr - stuArray $ newArray (0,42 :: Int) False ... 2) runST $ foo bar doesn't work. You have to write runST (foo bar) But in the end it's just imperative array programming with a rather verbose syntax -- you can do only one array access per statement, and 'readArray' and 'writeArray' are rather long names. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Histogram creation
Alexey Khudyakov wrote: Hello! I'm tryig to write efficient code for creating histograms. I have following requirements for it: 1. O(1) element insertion 2. No reallocations. Thus in place updates are needed. accumArray won't go because I need to fill a lot of histograms (hundrends) from vely long list of data (possibly millions of elements) and it will traverse input data for each histogram. Sorry, Duncan is right. I misread here. My first idea would still be to use accumArray though, or rather, accum, processing the input data in chunks of an appropriate size (which depends on the histogram sizes.) But actually, the ST code isn't bad (I hope this isn't homework): import Control.Monad.ST import Control.Monad import Data.Array.ST import Data.Array.Unboxed stuArray :: ST s (STUArray s i e) - ST s (STUArray s i e) stuArray = id -- Create histograms. -- -- Each histogram is described by a pair (f, (l, u)), where 'f' maps -- a data entry to an Int index, and l and u are lower and upper bounds -- of the indices, respectively. -- mkHistograms :: [(a - Int, (Int, Int))] - [a] - [UArray Int Int] mkHistograms hs ds = runST collect where -- Why is the type signature on 'collect' required here? collect :: ST s [UArray Int Int] collect = do -- create histogram arrays of appropriate sizes histograms - forM hs $ \(_, range) - do stuArray $ newArray range 0 -- iterate through the data forM_ ds $ \d - do -- iterate through the histograms forM_ (zip hs histograms) $ \((f, _), h) - do -- update appropriate entry writeArray h (f d) . succ = readArray h (f d) -- finally, freeze the histograms and return them -- (using unsafeFreeze is ok because we're done modifying the -- arrays) mapM unsafeFreeze histograms test = mkHistograms [((`mod` 3), (0,2)), ((`mod` 5), (0,4))] [0..10] -- test returns -- [array (0,2) [(0,4),(1,4),(2,3)], -- array (0,4) [(0,3),(1,2),(2,2),(3,2),(4,2)]] Bertram P.S. Ryan is right, too - I'm not sure where I got confused there. runST $ foo didn't work in ghc 6.6; I knew that it works in ghc 6.8.3, but I thought this was changed again. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Writing an IRC bot, problems with plugins
Alexander Foremny wrote: I am writing an single server, multi channel IRC bot with the support of plugins and limited plugin communication. With the plugin system I am facing problems I cannot really solve myself. Here's an approach built completely around Data.Typeable. The fundamental idea is that a Plugin encompasses a set of interfaces of unknown types, which have Typeable instances. All we need is an operation to extract such an interface from a Plugin. data Plugin = Plugin { getInterface :: forall i. Typeable i = Maybe i } Then we can define the interfaces we want to use, for example: data BaseInterface = BaseInterface { identifier :: String, rawMessage :: (MonadIO m) = Message - PL m () } deriving Typeable Sending a message to a plugin can be implemented as sendMessage :: Plugin - Message - PL m () sendMessage p msg = do let pI :: Maybe BaseInterface pI = getInterface p case pI of Nothing - error Plugin does not support BaseInterface Just pI' - rawMessage pI' msg A more complete example follows below. Does that help? Bertram {-# LANGUAGE GADTs, Rank2Types, DeriveDataTypeable #-} module PluginTest (main) where import Data.Typeable import Data.IORef import Control.Monad.Trans import Control.Monad.State import qualified Data.Map as M -- Types -- A Plugin is just a method that returns various interfaces. data Plugin = Plugin { getInterface :: forall i. Typeable i = Maybe i } -- The basic interface. -- -- It should be made a part of Plugin, but it's a queryable interface -- in this example for demonstration purposes. data BaseInterface = BaseInterface { identifier :: String, rawMessage :: (MonadIO m) = Message - PL m () } deriving Typeable type Message = String type PL = StateT PluginConfig type PluginConfig = M.Map String Plugin -- Main -- look up a plugin by name findPlugin :: Monad m = String - PL m (Maybe Plugin) findPlugin k = get = return . M.lookup k -- register a plugin registerPlugin :: MonadIO m = Plugin - PL m () registerPlugin p = do -- note: 'getInterface' can return 'Nothing' - needs error checking let Just i = getInterface p modify (M.insert (identifier i) p) -- unregister, etc. main' :: MonadIO m = PL m () main' = do -- create two plugins (see below) and register them. a - createAPlugin registerPlugin a b - createBPlugin registerPlugin b -- extract base interfaces of a and b and send some messages -- (needs error checking) let aI, bI :: BaseInterface Just aI = getInterface a Just bI = getInterface b liftIO $ putStrLn - Sending message to A rawMessage aI dummy liftIO $ putStrLn - Sending message to B rawMessage bI Hi, here's a message from B liftIO $ putStrLn - Sending another message to A rawMessage aI dummy main :: IO () main = evalStateT main' M.empty -- Plugin A -- -- This plugin provides an additional Interface that allows to -- query and change a string value in its state. data APlugin = APlugin (IORef String) data AInterface = AInterface { aGet :: (MonadIO m) = PL m String, aPut :: (MonadIO m) = String - PL m () } deriving Typeable createAPlugin :: (MonadIO m) = PL m Plugin createAPlugin = do r - liftIO (newIORef initial state) let a = APlugin r return $ Plugin { getInterface = cast (aBase a) `mplus` cast (aInterface a) } aBase (APlugin r) = BaseInterface { identifier = A, rawMessage = msg } where msg _ = liftIO $ do s - readIORef r putStrLn (A has state ( ++ s ++ )!) aInterface :: APlugin - AInterface aInterface (APlugin r) = AInterface { aGet = liftIO (readIORef r), aPut = \v - liftIO (writeIORef r v) } -- Plugin B -- -- Plugin B knows about Plugin A and uses its additional interface for -- modifying its state createBPlugin :: (MonadIO m) = PL m Plugin createBPlugin = return $ Plugin { getInterface = cast bBase } bBase = BaseInterface { identifier = B, rawMessage = msg } where msg s = do -- find A plugin Just a - findPlugin A -- and get its additional interface let aI :: AInterface Just aI = getInterface a aPut aI s ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Anyone know why this always returns invalid texture objects?
[CCing gtk2hs-users] Jefferson Heard wrote: import Graphics.UI.Gtk import Graphics.UI.Gtk.Glade import Graphics.UI.Gtk.OpenGL import qualified Graphics.Rendering.OpenGL as GL import Graphics.Rendering.OpenGL (($=)) main = do initGUI initGL initGL may be slightly misleading - it initialises the gtkglext gtk+ extension. It does not create a GL context. GL.shadeModel $= GL.Flat GL.depthFunc $= Just GL.Less (window1,gui,dlgs) - constructGUIObject (sX, sY) - liftM (mapPair fromIntegral) . widgetGetSize . drawing_canvas $ gui -- get the canvas size for determining the part of the widget to repaint pb - pixbufNew ColorspaceRgb False 8 (round pbWidth) (round pbHeight) pixbufFill pb 0 0 0 255 pxbufs - initSubpixbufs pb texRows texCols textures - GL.genObjectNames (texRows*texCols) print textures There is no active GL context at this point. GtkGLExt creates new GL contexts for GL enabled widgets when they're realized - I think. I'm a bit fuzzy about the exact life time of the GL context. [1] After the context was created, you have to activate it before doing any GL operations. In Gtk2hs you can use the GLDrawingArea widget, which provides withGLDrawingArea for easy activation of the GL context. There's an example in the gtk2hs sources, in examples/opengl. Enabling GL for other widgets is not supported well at the moment. (There are low level bindings (using DrawWindow), but no generic binding to the higher level gtk_widget_set_gl_capability() call. Such support wouldn't be too hard to add, I think.) HTH, Bertram [1] see http://gtkglext.sourceforge.net/reference/gtkglext/gtkglext-gtkglwidget.html#gtk-widget-get-gl-context ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Array bug?
Andrew Coppin wrote: Bertram Felgenhauer wrote: Yes, it's a known bug - a conscious choice really. See http://hackage.haskell.org/trac/ghc/ticket/2120 It's somewhat ironic that this behaviour was introduced by a patch that made arrays safer to use in other respects. ...so it's *not* going to be fixed then? It's not going to be fixed by itself - the first comment for the bug report basically asks interested parties to submit a proposal for changing this. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Array bug?
Andrew Coppin wrote: Consider the following GHCi session: GHCi, version 6.8.2: http://www.haskell.org/ghc/ :? for help Prelude Data.Array.IO t - newArray ((0,0),(5,4)) 0 :: IO (IOUArray (Int,Int) Int) Prelude Data.Array.IO getBounds t ((0,0),(5,4)) Prelude Data.Array.IO Is this a known bug? Is it likely to be fixed any time soon? (I'm guessing the bug is as simple is converting indicies to integers and then checking the integers are in-range, rather than the underlying index type.) Yes, it's a known bug - a conscious choice really. See http://hackage.haskell.org/trac/ghc/ticket/2120 It's somewhat ironic that this behaviour was introduced by a patch that made arrays safer to use in other respects. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Why 'round' does not just round numbers ?
George Pollard wrote: There's also the ieee-utils package, which provides an IEEE monad with `setRound`: http://hackage.haskell.org/packages/archive/ieee-utils/0.4.0/doc/html/Numeric-IEEE-RoundMode.html Hmm, this does not work well with the threaded RTS: import Numeric.IEEE.Monad import Numeric.IEEE.RoundMode (RoundMode (..)) import Control.Monad import Control.Concurrent main = withIeeeDo $ do replicateM_ 2 $ forkIO $ forever $ putChar '.' forkIO $ do runIEEE $ do withRoundMode Downward $ forever $ do IEEE . putStr . (++ \n) . show = getRound threadDelay 100 When run with +RTS -N2 -RTS, the output randomly alternates between Downward and ToNearest - for me at least. The problem is that the setRound call will only affect one worker thread, while the RTS will sometimes migrate RTS threads from one worker to another. runIEEE really has to be executed in a bound thread (see forkOS documentation). Using `par` will also cause trouble - in fact even more. I think that conceptually, the cleanest approach is to provide separate data types for Double and Float in each of the rounding modes. This is quite expensive: basically, it means setting the rounding mode on each operation, and we would miss out on the code generator support for floating point math. (A primop for setting the rounding mode could help here, to some extent.) Maybe tracking the rounding mode per RTS thread would be a useful compromise between performance and usability for computations with mostly uniform rounding mode - this is what the Numeric.IEEE.Monad module seems to be aiming at. `par` would still be unusable with that approach though. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: is there a way to pretty print a module?
Jason Dagit wrote: Could you use haskell-src from TH and then unsafePerformIO to get the reading to work during compile time? I've done something like this in the past with Language.Haskell and TH. I described it here: http://blog.codersbase.com/2006/09/01/simple-unit-testing-in-haskell/ Maybe someone who has studied more TH knows a way to remove the unsafePerformIO. Replace tests :: [String] tests = unsafePerformIO $ by tests :: Q [String] tests = runIO $ and $(mkChecks tests) by $(mkChecks = tests) (The Q monad should really have an MonadIO instance.) HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: ghc error: requested module name differs from name found in interface file
Larry Evans wrote: On 10/20/08 12:33, Larry Evans wrote: With a file containing: module Main where import Array import Control.Functor.Fix I get: make ghc -i/root/.cabal/lib/category-extras-0.53.5/ghc-6.8.2 -c catamorphism.example.hs Yes, using -i to give paths to installed packages does not work - you really have to tell ghc about the corresponding package.conf file, using -package-conf file. See also http://www.haskell.org/ghc/docs/latest/html/users_guide/packages.html#package-databases catamorphism.example.hs:19:0: Bad interface file: /root/.cabal/lib/category-extras-0.53.5/ghc-6.8.2/Control/Functor/Fix.hi Something is amiss; requested module main:Control.Functor.Fix differs from name found in the interface file category-extras-0.53.5:Control.Functor.Fix make: *** [all] Error 1 The problem is that all modules found by -i are expected to be in the current package - which is 'main' by default. (Build tools like Cabal specify a different package name for libraries; for example the Control.Functor.Fix is in the 'category-extras-0.53.5' package here.) So, I've got to figure how to tell cabal install to install in right place :( Have you tried 'cabal install --global'? To make it stick, put 'user-install: False' in root's .cabal/config file. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] is 256M RAM insufficient for a 20 million element Int/Int map?
Bulat Ziganshin wrote: Hello Bertram, Sunday, October 19, 2008, 6:19:31 AM, you wrote: That's 5 words per elements ... that, like everything else, should be multiplied by 2-3 to account GC effect True. You can control this factor though. Two RTS options help: -c (Enable compaction for all major collections) - mostly avoids fragmentation in the old generation. -Ffactor (Control the amount of memory reserved in terms of the size of the oldest generation. The default is 2, meaning that if the oldest generation is 200MB in size, 400 MB of heap will be used) Consider this program, module Main (main) where import qualified Data.IntMap as M import Data.List (foldl') main = do loop (M.fromList [(i,0) | i - [1..500]]) 1 loop dict j = do i - readLn print $ dict M.! (i :: Int) let dict' = foldl' (\m (k, v) - M.insert k v m) dict [(,) i $! j*i | i - [j`mod`10 * 50 + 1..j`mod`10 * 50 + 50]] loop dict' (j+1) This program maintains an IntMap with 5 million entries, which means 200 MB of live data on a 32 bit computer. It updates the map a lot, too, so I think this is a fairly realistic example. Running it on a 51 line input with various RTS options [*], we get: Options Memory used Time used +RTS -c -F1.1220 MB3m22s +RTS -c -F1.2243 MB2m12s +RTS -c -F1.5306 MB1m58s +RTS -c 398 MB1m57s +RTS -F 1.1 406 MB1m43s +RTS -F 1.2 425 MB1m15s +RTS -F 1.5 483 MB1m6s none 580 MB1m11s Heap residency was around 200.5 million bytes in all runs. As expected, saving memory this way doesn't come cheap - it can dramatically increase the program's runtime. But if a program builds and slowly updates a large dictionary, playing with these options can help a lot. Bertram [*] time (seq 50; echo 0) | ./Main +RTS -sstderr -c -F1.2 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] package question/problem
Galchin, Vasili wrote: I am trying to cabal install HSQL. I am using ghc 6.8.2. I get the following error about a non-visible/hidden package (old-time-1.0.0.0): [EMAIL PROTECTED]:~$ cabal install hsql [snip] Database/HSQL.hsc:66:7: Could not find module `System.Time': it is a member of package old-time-1.0.0.0, which is hidden cabal: Error: some packages failed to install: hsql-1.7 failed during the building phase. The exception was: exit: ExitFailure 1 [EMAIL PROTECTED]:~$ This happens because Cabal invokes ghc with the -hide-all-packages flag. It then adds explicit -package flags for all packages listed as build dependencies. (The idea is to force developers to specify all their package's dependencies in the library's .cabal file.) Sadly, hsql looks unmaintained. You can try unpacking the hsql-1.7 tarball and adding the required dependencies to the build-depends: field of the .cabal file. Running 'cabal install' in the source directory will attempt to install the package. What is the history of old-time package? old-time was split out of the base package in base 3.0 (which ships with ghc 6.8.x). HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] is 256M RAM insufficient for a 20 million element Int/Int map?
Don Stewart wrote: tphyahoo: I'm trying to run a HAppS web site with a large amount of data: stress testing happstutorial.com. Well, 20 million records doesn't sound that large by today's standards, but anyway that's my goal for now. I have a standard Data.Map.Map as the base structure for one of my macid data tables (jobs), but I noticed something that is probably causing problems for me. Even a simple 20 million record with int/int key values causes an out of memory error for me in ghci, Int keys, Int values eh? Does using IntMap help? Interesting. Map Int Int, IntMap Int and [(Int, Int)] use pretty much the same amount of memory, assuming that no keys or values are shared: Map Int Int: data Map k a = Tip | Bin {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a) - all 'Tip's are shared. - For each (key, value) pair, there is one 'Bin', needing tag[*] (1 word) size (unpacked, 1 word) key, value (pointer, tag, value = 3 words each) two more pointers (1 word each) for a total of 10 words per element. IntMap Int: data IntMap a = Nil | Tip {-# UNPACK #-} !Key a | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a) - one 'Tip' per element: tag (1 word) + key (1 word) + value (3 words) - one 'Bin' per element (minus 1): tag (1 word) + prefix, mask (1 word each) + 2 pointers (1 word each) for a total of 10 words per element. [(Int, Int)]: - one '(:)' per element: - tag (1 word) + 2 pointers (1 word each) - one '(Int, Int)' per element: - tag (1 word) + key (3 words) + value (3 words) for a total of 10 words per element, again. Now if we want to save memory, we can specialise Data.Map to Int keys and values, saving 4 words per element, and encode the external leaves (Tips) in the constructor, saving another word: data IntIntMap = Tip | BinBB !Size !Int !Int IntIntMap IntIntMap | BinBT !Size !Int !Int IntIntMap | BinTB !Size !Int !Int IntIntMap | BinTT !Size !Int !Int -- sprinkle {-# UNPACK #-} as needed That's 5 words per elements. Would that be worthwhile? Bertram [*] actually, the info pointer in the Spineless, Tagless G-machine. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] 'par' - why has it the type a - b - b ?
Henning Thielemann wrote: What is the reason for implementing parallelism with 'par :: a - b - b'? Analogy to 'seq'? I'd think it's actually easier to implement than par2 below; evaluating par x y sparks a thread evaluating x, and then returns y. The analogy to 'seq' is there, of course. I thought parallelism would be introduced most naturally by a function which does two computations in parallel and puts together their results after completion. Say par2 :: (a - b - c) - (a - b - c) to be used like par2 (+) expensiveComputationA expensiveComputationB I assume that par2 can be implemented this way: par2 f x y = f x (par x y) For this to work, f has to evaluate its second argument before the first one, or the par will be useless. Try this: par2 f x y = x `par` y `par` f x y (In complete analogy to using `seq` for enforcing strictness.) HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] random colors, stack space overflow, mersenne and mersenne.pure64
Cetin Sert wrote: [snip] colorR :: RandomGen g ⇒ (RGB,RGB) → g → (RGB,g) colorR ((a,b,c),(x,y,z)) s0 = ((r,g,b),s3) where (r,s1) = q (a,x) s0 (g,s2) = q (b,y) s1 (b,s3) = q (c,z) s2 q = randomR Look closely at how you use the variable 'b'. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] two problems with Data.Binary and Data.ByteString
Tim Newsham wrote: [snip] I would have expected this to fix my problems: binEof :: Get () binEof = do more - not $ isEmpty when more $ error expected EOF decodeFully :: Binary b = B.ByteString - b decodeFully = runGet (get binEof) where a b = a = (\x - b return x) Remember that the Get monad is lazy; the result of binEof is never used, so the action is not performed. decodeFully :: Binary b = B.ByteString - b decodeFully = runGet (get binEof) where a b = a = (\x - b = \y - y `seq` return x) works, for example, and also where a b = a = (\x - b = \y - return (y `seq` x)) and where () = liftM2 (flip seq) HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] -fvia-C error
Duncan Coutts wrote: Don, this does not work: includes: SFMT.h SFMT_wrap.h install-includes: SFMT.h Sorry, that was my fault. (It does work with ghc 6.9, but that's not much of an excuse) Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Mersenne Build Problem
Dominic Steinitz wrote: I'm getting errors (see below) trying to build the tests in http://hackage.haskell.org/cgi-bin/hackage-scripts/package/mersenne-random-0.1.1 [snip] Linking Unit ... Unit.o: In function `s4Da_info': (.text+0x1b21): undefined reference to `genrand_real2' Unit.o: In function `s4RA_info': (.text+0x3e75): undefined reference to `genrand_real2' Unit.o: In function `s4S4_info': (.text+0x3f61): undefined reference to `genrand_real2' Unit.o: In function `s5su_info': (.text+0x40bc): undefined reference to `genrand_real2' /usr/local/lib/mersenne-random-0.1.1/ghc-6.9.20080517/[...] [snip] The missing symbols are inlined functions. ghc 6.9 doesn't include the header files anymore when compiling via C. (The solution is to create C wrappers around those functions. I guess I'll whip up a patch.) HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Fwd: installing happy 1.17
Duncan Coutts wrote: The immediate workarounds are: * unregister Cabal-1.5.2 Better, hide it (that's reversible) - or does that not work with cabal-install? Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] [ANN] hs-pgms 0.1 -- Programmer's Minesweeper in Haskell
Hi, I've just uploaded hs-pgms to hackage. It is a Haskell implementation of Programmer's Minesweeper [1], which allows programmers to implement minesweeper strategies and run them. (Note: ghc = 6.8 is required.) hs-pgms uses MonadPrompt to achieve a clean separation between strategies, game logic, and presentation. There are two frontends, one command line frontend which is mainly useful for collecting statistics, and a GUI frontend, using gtk2hs, for watching the strategies in action. There's a git repo, see http://repo.or.cz/w/hs-pgms.git . enjoy, Bertram [1] http://www.ccs.neu.edu/home/ramsdell/pgms/index.html ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [ANNOUNCE] git-darcs-import 0.1
Darrin Thompson wrote: On Sun, Jun 1, 2008 at 2:44 PM, Bertram Felgenhauer [EMAIL PROTECTED] wrote: I'm pleased to announce yet another tool for importing darcs repositories to git. [...] What's the appeal of this? I personally love git, but I thought all the cool kids at this school used darcs and that was that. For myself, git-darcs-import itself is an opportunity to learn more about both darcs and git. It wasn't meant to be argument in the git vs. darcs discussion, although it was inevitable that it would be seen as such. I really like darcs' concepts, but in my opinion, darcs doesn't get enough power out of the theory of patches to really shine so far. This is a hard problem, and I can't offer solutions. Ideally, you'd have semantic patches which just commute with virtually all other patches because they know what they are about. The only thing that darcs offers in that direction - besides handling conflicts, mergers and undos gracefully, which is quite useful in itself - is a keyword substitution patch type. In the meantime, I prefer git to darcs, mainly because I'm sort of attached to seeing the development history, i.e. I prefer to think of patches as (partially) ordered instead of being a cloud of patches that darcs uses as a model. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [ANNOUNCE] git-darcs-import 0.1
Thomas Schilling wrote: On 1 jun 2008, at 20.44, Bertram Felgenhauer wrote: [git-darcs-import] Nice! Do you happen to also have a darcs (or Git) repository somewhere? I've uploaded my (git) repo to repo.or.cz, see http://repo.or.cz/w/git-darcs-import.git Patches are welcome. enjoy, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] [ANNOUNCE] git-darcs-import 0.1
Hi, I'm pleased to announce yet another tool for importing darcs repositories to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in Haskell, on top of the darcs2 source code. The result is a much faster program - it can convert the complete ghc 6.9 branch (without libraries) in less than 15 minutes on my slightly dated machine (Athlon XP 2500+), which is quite fast [3]. Incremental updates work, too. The program is still rough around the edges, and there's some cosmetical work to do, especially with respect to converting author names. The program should recover from most errors, as long as nobody else modifies the destination repository. Nevertheless, it seems quite useable already. I hope somebody finds this useful. You can grab the source at http://int-e.home.tlink.de/haskell/git-darcs-import-0.1.tar.bz2 Look at the README for further information. Credits go to: David Roundy and all contributors for darcs2. The code base is surprisingly pleasant to work with. And of course, Linus Torvalds, Junio Hamano and all other git contributors. Enjoy, Bertram [1] http://repo.or.cz/w/darcs2git.git?a=shortlog [2] http://git.sanityinc.com/?p=darcs-to-git.git [3] http://nominolo.blogspot.com/2008/05/thing-that-should-not-be-or-how-to.html ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe