[Haskell-cafe] ANN: BUG FIX release of regex-tdfa-1.1.2
Hello, While occasionally and slowly updating the future version of regex-tdfa I found a bug that exists in the released 1.1.1 version. It was just a matter of passing the wrong value into a function, so was easy to fix when I figured it out. The test case triggered an impossible error call and so I know this can kill a program that uses a buggy version of this library. Please upgrade regex-tdfa to version 1.1.2 which is available on hackage at: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parallel combinator, performance advice
You create one MVar for each task in order to ensure all the tasks are done. This is pretty heavyweight. You could create a single Control.Concurrent.QSemN to count the completed tasks, starting with 0. Each task is followed by signalQSemN with a value of 1. (I would use finally). As parallel_ launches the tasks it can count their number, then it would call waitQSemN for that quantity to have finished. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parallel combinator, performance advice
Neil Mitchell wrote: Sorry, accidentally sent before finishing the response! I also noticed you sent this directly to me, not to -cafe, was that intentional? The mail/news gateway makes it look like that, but I also sent to the mailing list. You mean something like: parallel_ xs = sem - createSemapore (length xs) enqueue [x signalSemapore sem | x - xs] waitOnSemaphore sem I thought of something like this, but then the thread that called parallel_ is blocked, which means if you fire off N threads you only get N-1 executing. If you have nested calls to parallel, then you end up with thread exhaustion. Is there a way to avoid that problem? Thanks Neil Your parallel_ does not return until all operations are finished. parallel_ (x:xs) = do ys - mapM idempotent xs mapM_ addParallel ys sequence_ $ x : reverse ys By the way, there is no obvious reason to insert reverse there. What I meant was something like: para [] = return () para [x] = x para xs = do q - newQSemN 0 let wrap x = finally x (signalQSemN q 1) go [y] n = wrap x waitQSemN q (succ n) go (y:ys) n = addParallel (wrap y) go ys $! succ n go xs 0 This is nearly identical to your code, and avoid creating the MVar for each operation. I use finally to ensure the count is correct, but if a worker threads dies then bas things will happen. You can replace finally with () if speed is important. This is also lazy since the length of the list is not forced early. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parallel combinator, performance advice
Neil Mitchell wrote: I guess the nested calls to parallel_ bit is the part of the spec that makes everything much harder! Thanks Neil Yes. Much more annoying. But the problem here is generic. To avoid it you must never allow all thread to block at once. The parallel_ function is such a job, so you solved this with the 'idempotent' trick. You solution works by blocking all but 1 thread. 1a) Some worker thread 1 executes parallel_ with some jobs 1b) These get submitted the work queue 'chan' 1c) worker thread 1 starts on those same jobs, ignoring the queue 1d) worker thread 1 reaches the job being processed by thread 2 1e) worker thread 1 blocks until the jobs is finished in modifyMVar 2a) Worker thread 2 grabs a job posted by thread 1, that calls parallel_ 2b) This batch of jobs gets submitted to the work queue 'chan' 2c) worker thread 2 starts on those same jobs, ignoring the queue 1d) worker thread 2 reaches the job being processed by thread 3 1e) worker thread 2 blocks until the jobs is finished in modifyMVar 3...4...5... And now only 1 thread is still working, and it has to work in series. I think I can fix this... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Use unsafePerformIO to catch Exception?
You should ensure that the result of evaluate is in normal form, not just weak head normal form. You can do this with the Control.Parallel.Strategies module: import Control.Exception(ArithException(..),try,evaluate) import Control.Parallel.Strategies(NFData,using,rnf) import System.IO.Unsafe(unsafePerformIO) tryArith :: NFData a = a - Either ArithException a tryArith = unsafePerformIO . try . evaluate . flip using rnf test :: [Either ArithException Integer] test = map (tryArith . (div 5)) [2,1,0,5] testResult = [Right 2,Right 5,Left DivideByZero,Right 1] withPair :: Integer - (Integer,Integer) withPair x = (x,throw Overflow) main = do print (test == testResult) print (tryArith (withPair 7)) print (tryArith' (withPair 7)) in ghci *Main main main True Left arithmetic overflow Right (7,*** Exception: arithmetic overflow This rnf :: Strategy a ensures that the result of evaluate is in normal form. This means it should not have any embedded lazy thunks, so any errors from such thunks will be forced while in the scope of the try. Otherwise a complex type like the result of withPair can hide an error. Xiao-Yong Jin wrote: Hi, I just feel it is not comfortable to deal with exceptions only within IO monad, so I defined tryArith :: a - Either ArithException a tryArith = unsafePerformIO . try . evaluate and it works quite good as map (tryArith . (div 5)) [2,1,0,5] evaluates to [Right 2,Right 5,Left divide by zero,Right 1] However, I guess unsafePerformIO definitely has a reason for its name. As I read through the document in System.IO.Unsafe, I can't convince myself whether the use of 'tryArith' is indeed safe or unsafe. I know there have been a lot of discussion around unsafePerformIO, but I still can't figure it out by myself. Can someone share some thoughts on this particular use of unsafePerformIO? Is it safe or not? And why? Thanks, Xiao-Yong ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ANN: regex-tdfa 1.1.1
This is both a bug fix release and a feature release. The bug fix is a bit embarrassing, the indices were correct but the captured text was wrong in version 1.1.0. Oops. As of version 1.1.1 the following GNU extensions are recognized, all anchors: \` at beginning of entire text \' at end of entire text \ at beginning of word \ at end of word \b at either beginning or end of word \B at neither beginning nor end of word There is a newSyntax field of CompOptions that can enable/disable this syntax. These were requested by users of the Yi project. The home page is on the wiki and UPDATED a bit at http://haskell.org/haskellwiki/Regular_expressions#regex-tdfa The HADDOCK fails on hackage for now, but is up at http://darcs.haskell.org/packages/regex-unstable/regex-tdfa/doc/html/regex-tdfa/Text-Regex-TDFA.html The released code is on hackage at http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa The darcs repository is at http://darcs.haskell.org/packages/regex-unstable/regex-tdfa/ Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: doubts about runGetState in the binary package
Manlio Perillo wrote: Hi. I have some doubts about the runGetState function in the binary package. The signature is: runGetState :: Get a - LBS - Int64 - (a, LBS, Int64) however the Int64 input parameter is not documented. What value should I pass? How will be used? Thanks Manlio Perillo hackage has the code at http://hackage.haskell.org/packages/archive/binary/0.5.0.1/doc/html/src/Data-Binary-Get.html#runGetState And I have pieced together an answer at the bottom... -- | The parse state data S = S {-# UNPACK #-} !B.ByteString -- current chunk L.ByteString -- the rest of the input {-# UNPACK #-} !Int64 -- bytes read -- | The Get monad is just a State monad carrying around the input ByteString -- We treat it as a strict state monad. newtype Get a = Get { unGet :: S - (a, S) } mkState :: L.ByteString - Int64 - S mkState l = case l of L.Empty - S B.empty L.empty L.Chunk x xs - S x xs -- | Run the Get monad applies a 'get'-based parser on the input -- ByteString. Additional to the result of get it returns the number of -- consumed bytes and the rest of the input. runGetState :: Get a - L.ByteString - Int64 - (a, L.ByteString, Int64) runGetState m str off = case unGet m (mkState str off) of (a, ~(S s ss newOff)) - (a, s `join` ss, newOff) getBytes :: Int - Get B.ByteString getBytes n = do S s ss bytes - get if n = B.length s then do let (consume,rest) = B.splitAt n s put $! S rest ss (bytes + fromIntegral n) return $! consume else ... The Int64 passed to runGetState just initializes the running total of consumed bytes. The updated total is returned by runGetState. The absolute value of the Int64 is never used; it is only increased by getBytes. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ANN: regex-tdfa-1.1.0
I have just uploaded the new regex-tdfa-1.1.0 to hackage. This version is a small performance update to the old regex-tdfa-1.0.0 version. Previously all text (e.g. ByteString) being search was converted to String and sent through a single engine. The new version uses a type class and SPECIALIZE pragmas to avoid converting to String. This should make adding support for searching other Char containers easy to do. The new version includes six specialized engine loops to take advantage of obvious optimizations of the traversal. The previous version had only a couple of such engines. The new code paths have been tested for correctness and no performance degradations have shown up. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?
Thomas Hartman wrote: testPcre = ( subRegex (mkRegex (?!\n)\n(?!\n)) asdf\n \n\n\nadsf ) == asdf \n\n\nadsf quoting from the man page for regcomp: REG_NEWLINE Compile for newline-sensitive matching. By default, newline is a completely ordinary character with no special meaning in either REs or strings. With this flag, `[^' bracket expressions and `.' never match newline, a `^' anchor matches the null string after any newline in the string in addition to its normal function, and the `$' anchor matches the null string before any newline in the string in addition to its normal function. This is the carried over to Text.Regex with mkRegexWithOpts Source :: String The regular expression to compile - Bool True = '^' and '$' match the beginning and end of individual lines respectively, and '.' does not match the newline character. - Bool True = matching is case-sensitive - Regex Returns: the compiled regular expression Makes a regular expression, where the multi-line and case-sensitive options can be changed from the default settings. Or with regex-posix directly the flag is compNewline: http://hackage.haskell.org/packages/archive/regex-posix/0.94.1/doc/html/Text-Regex-Posix-Wrap.html The defaultCompOpt is (compExtended .|. compNewline). You want to match a \n that is not next to any other \n. So you want to turn off REG_NEWLINE. import Text.Regex.Compat r :: Regex r = mkRegexWithOpts (^|[^\n])\n($|[^\n]) False True -- False is important here The ^ and $ take care of matching a lone newline at the start or end of the whole text. In the middle of the text the pattern is equivalent to [^\n]\n[^\n]. When substituting you can use the \1 and \2 captures to restore the matched non-newline character if one was present. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?
Thomas Hartman wrote: testPcre = ( subRegex (mkRegex (?!\n)\n(?!\n)) asdf\n \n\n\nadsf ) == asdf \n\n\nadsf quoting from the man page for regcomp: REG_NEWLINE Compile for newline-sensitive matching. By default, newline is a completely ordinary character with no special meaning in either REs or strings. With this flag, `[^' bracket expressions and `.' never match newline, a `^' anchor matches the null string after any newline in the string in addition to its normal function, and the `$' anchor matches the null string before any newline in the string in addition to its normal function. This is the carried over to Text.Regex with mkRegexWithOpts Source :: String The regular expression to compile - Bool True = '^' and '$' match the beginning and end of individual lines respectively, and '.' does not match the newline character. - Bool True = matching is case-sensitive - Regex Returns: the compiled regular expression Makes a regular expression, where the multi-line and case-sensitive options can be changed from the default settings. Or with regex-posix directly the flag is compNewline: http://hackage.haskell.org/packages/archive/regex-posix/0.94.1/doc/html/Text-Regex-Posix-Wrap.html The defaultCompOpt is (compExtended .|. compNewline). You want to match a \n that is not next to any other \n. So you want to turn off REG_NEWLINE. import Text.Regex.Compat r :: Regex r = mkRegexWithOpts (^|[^\n])\n($|[^\n]) False True -- False is important here The ^ and $ take care of matching a lone newline at the start or end of the whole text. In the middle of the text the pattern is equivalent to [^\n]\n[^\n]. When substituting you can use the \1 and \2 captures to restore the matched non-newline character if one was present. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?
Don Stewart wrote: tphyahoo: Is there something like subRegex... something like =~ s/.../.../ in perl... for haskell pcre Regexen? I mean, subRegex from Text.Regex of course: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-compat Thanks for any advice, Basically, we should have it. Let me open the discussion with all the questions I can quickly ask: What should the subRegex function do, exactly? (Single replacement,global replacement,once per line,...) What should the replacement template be able to specify? (Can it refer to all text before a match or all text after?) (Can it access the start/stop offsets as numbers?) Should the replacement template be specified in a String? As an abstract data type or syntax tree? With combinators? What happens if the referenced capture was not made? Empty text? How will syntax errors in the template be handled (e.g. referring to a capture that does not exist in the regular expression)? Will the output text be String? ByteString? ByteString.Lazy? Seq Char? Note: String and Strict Bytestrings are poor with concatenation. Can the output text type differ from the input text type? -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Alternative to Data.Binary
Grzegorz Chrupala wrote: Hi all, Is there a serialization library other than the Data.Binary from hackage? Yes. binary-strict is one alternative: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary-strict I am using Data.Binary in a couple of projects, but I have found its stack and memory usage very hard to control. Its very common that decoding a map or list of non-trivial size uses up all available RAM, or causes a stack overflow. I don't care that much about it being extremely fast, I just want to stop worrying that if I try to read a file a few percent larger than the last time, my program will suddenly stop working. Best, -- Grzegorz ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?
Thomas Hartman wrote: Is there something like subRegex... something like =~ s/.../.../ in perl... for haskell pcre Regexen? I mean, subRegex from Text.Regex of course: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-compat Thanks for any advice, thomas. Short answer: No. This is a FAQ. The usual answer to your follow up Why not? is that the design space is rather huge. Rather than justify this statement, I will point at the complicated module: http://hackage.haskell.org/packages/archive/split/0.1.1/doc/html/Data-List-Split.html The above module is a wide range of strategies for splitting lists, which is a much simpler problem than your subRegex request, and only works on lists. A subRegex library should also work on bytestrings (and Seq). At the cost of writing your own routine you get exactly what you want in a screen or less of code, see http://hackage.haskell.org/packages/archive/regex-compat/0.92/doc/html/src/Text-Regex.html#subRegex for subRegex which is 30 lines of code. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: do nmergeIO or mergeIO preserve order?
Anatoly Yakovenko wrote: do nmergeIO or mergeIO preserve order? or not preserve order? If you have a list of operations [IO a] then the future package at http://hackage.haskell.org/cgi-bin/hackage-scripts/package/future can do this. It's 'forkPromises' function returns a Chan a which can be used to get the non-order preserving results (actually Either SomeExcption a). If you are feeling lucky you can use getChanContents and filter to get a lazy [a] which is the results as they are completed. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: ANN: Future 1.1.0 concurrency library
Don Stewart wrote: Who needs to build futures into the language -- all you need is MVars, eh? For a pure computation in Haskell one can use par (which did take changing the runtime, and arguably adding to the language). The future package I uploaded is just a clean way to get something a little like par for an IO computation, as a library. One can build many useful APIs quite cheaply using MVars. Hackage even has a few examples (many under the Concurrency heading). This API was interesting solely because it is in the C++ standard and the discussion about how the standard left out useful proposed parts of the API. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ANN: Future 1.1.0 concurrency library
Hello, As a side effect of the discussion of the new C++ future/promise features at http://lambda-the-ultimate.org/node/3221 I have implemented a Haskell package called future at http://hackage.haskell.org/cgi-bin/hackage-scripts/package/future This ought to do what C++ standard futures/promises do, plus a bit more. The main operation is forkPromise :: IO a - IO (Promise a) This sets the IO a operation running in a fresh thread. The eventual result can be accessed in many ways (non-blocking, blocking, blocking with timeout). let one :: Int; one = 1 p - forkPromise (return (one+one)) x - get y - wait x is an Int with value 2. y is an (Either SomeException Int) with value (Right 2). The useful thing about futures, as opposed to various IVar packages, is handling the case where the forked operation ends with an exception. The exception becomes the return value of the promise. The get operation rethrows it, the wait operation returns it as (Left ...). There is also an abort command to kill a promise. The dead promise may then have an exceptions as its value. The plus a bit more than C++ is the nonblocking addTodo feature. This takes a continuation function from the Either SomeException a to an IO operation. These continuation functions get queued and they are run immediately after the the forked operation completes. Once completed any new addTodo continuations run immediately. These continuations allow you to race a list of action and take the first one done, or to collect the answers as they complete into a Chan. Both of those options are demonstrated in Future.hs as racePromises and forkPromises. It should be safe to use unsafePerformIO . get or unsafePeformIO . wait to get lazy access to the result, which is itself immutable once set. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Data.Binary stack overflow with Data.Sequence String
I have collected some of the backing code to decode. This is all pasted below so we can look at it. I will not improperly guess at the cause of the problem, and be totally wrong. I observe Get is a lazy monad: Prelude Data.Binary Data.Binary.Get Data.Monoid World == runGet ((return $! undefined) return World) mempty True Prelude Data.Binary Data.Binary.Get Data.Monoid 'W' == head (runGet ((return $! undefined) = \t - return $! ('W':t)) mempty) True Prelude Data.Binary Data.Binary.Get Data.Monoid orld == tail (runGet ((return $! undefined) = \h - return $! (h:orld)) mempty) True This may have implication for building the String from Char. The get for Char uses return $! char but this is no good unless the Char is being forced, as the (return $! undefined) above shows. The instance Get [a] inherits the laziness of replicateM which is sequence. The instance (Seq.Seq e) does not force the x :: String value. And even if it did it would only force the leading (:) cons cell and not the characters themselves. The instance is strict in what passes for the spine of the Seq, not the contents, and certainly not the deep contents. You might try using newtype when deserializing ArticleDB and make a much stricter version of the code. All the relevant code (yi binary ghc): type Article = String type ArticleDB = Seq Article -- | Read in database from 'dbLocation' and then parse it into an 'ArticleDB'. readDB :: YiM ArticleDB readDB = io $ (dbLocation = r) `catch` (\_ - return empty) where r x = fmap (decode . BL.fromChunks . return) $ B.readFile x -- We read in with strict bytestrings to guarantee the file is closed, -- and then we convert it to the lazy bytestring data.binary expects. -- This is inefficient, but alas... decode :: Binary a = ByteString - a decode = runGet get instance (Binary e) = Binary (Seq.Seq e) where put s = put (Seq.length s) Fold.mapM_ put s get = do n - get :: Get Int rep Seq.empty n get where rep xs 0 _ = return $! xs rep xs n g = xs `seq` n `seq` do x - g rep (xs Seq.| x) (n-1) g instance Binary Int where put i = put (fromIntegral i :: Int64) get = liftM fromIntegral (get :: Get Int64) instance Binary Int64 where put i = put (fromIntegral i :: Word64) get = liftM fromIntegral (get :: Get Word64) instance Binary Word64 where put = putWord64be get = getWord64be instance Binary a = Binary [a] where put l = put (length l) mapM_ put l get= do n - get :: Get Int replicateM n get -- Char is serialised as UTF-8 instance Binary Char where put a | c = 0x7f = put (fromIntegral c :: Word8) | c = 0x7ff= do put (0xc0 .|. y) put (0x80 .|. z) | c = 0x = do put (0xe0 .|. x) put (0x80 .|. y) put (0x80 .|. z) | c = 0x10 = do put (0xf0 .|. w) put (0x80 .|. x) put (0x80 .|. y) put (0x80 .|. z) | otherwise = error Not a valid Unicode code point where c = ord a z, y, x, w :: Word8 z = fromIntegral (c .. 0x3f) y = fromIntegral (shiftR c 6 .. 0x3f) x = fromIntegral (shiftR c 12 .. 0x3f) w = fromIntegral (shiftR c 18 .. 0x7) get = do let getByte = liftM (fromIntegral :: Word8 - Int) get shiftL6 = flip shiftL 6 :: Int - Int w - getByte r - case () of _ | w 0x80 - return w | w 0xe0 - do x - liftM (xor 0x80) getByte return (x .|. shiftL6 (xor 0xc0 w)) | w 0xf0 - do x - liftM (xor 0x80) getByte y - liftM (xor 0x80) getByte return (y .|. shiftL6 (x .|. shiftL6 (xor 0xe0 w))) | otherwise - do x - liftM (xor 0x80) getByte y - liftM (xor 0x80) getByte z - liftM (xor 0x80) getByte return (z .|. shiftL6 (y .|. shiftL6 (x .|. shiftL6 (xor 0xf0 w return $! chr r replicateM:: (Monad m) = Int - m a - m [a] replicateM n x= sequence (replicate n x) sequence :: Monad m = [m a] - m [a] {-# INLINE sequence #-} sequence ms = foldr k (return []) ms where k m m' = do { x - m; xs - m'; return (x:xs) } -- Chris ___ Haskell-Cafe mailing list
[Haskell-cafe] major speed improvement: regex-tdfa reaches 1.0.0
Announcing the version 1.0.0 release of regex-tdfa. I am proud of this release. This is not just a bug fix release. It is a serious improvement in the asymptotic running time. The previous versions allowed bad combinations of pattern and searched text length to scale badly in the length of the text. Previously the worst case for text of length N was O(N^3). The new worst case asymptotic runtime scaled as O(N). There is never any backtracking. And the worst case storage space is independent of N. The package is on hackage at http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa The darcs repository is at http://darcs.haskell.org/packages/regex-unstable/regex-tdfa/ All non-overlapping matches are found and returned, along with all captured (parenthesized) subexpressions. The result is precisely what Posix extended regular expressions are supposed to return. To be concrete, consider example text with length of N of (2^n)+2: longInput = replicate (2^n) 'a' ++ bc And define 4 worst-case-scenario patterns. I wrote the code and I know how to kill it: wcs0 = a*b wcs1 = a*c wcs2 = (a|aa)*b wcs3 = (a|aa)*c wcs0 is easy. wcs1 causes the old code to backtrack. wcs2 causes the old code's storage to scale as O(N). wcs3 causes both backtracking and O(N) storage with the old code. The old code's time scales as O(N) for wcs0, O(N^2) for wcs1 and wcs2, and O(N^3) for wcs3. The new code is always O(N). The actual timings for the old code on my G4 laptop for wcs on 2^8 and 2^9 and 2^10 are: Reason:compare-tdfa chrisk$ time ./Test-TDFA-np wcs3 8 +RTS -sstderr 21 | head -4 ./Test-TDFA-np wcs3 8 +RTS -sstderr Test for [Char] Right [array (0,1) [(0,(257,1)),(1,(-1,0))]] 263,776,756 bytes allocated in the heap user0m1.017s sys 0m0.058s Reason:compare-tdfa chrisk$ time ./Test-TDFA-np wcs3 9 +RTS -sstderr 21 | head -4 ./Test-TDFA-np wcs3 9 +RTS -sstderr Test for [Char] Right [array (0,1) [(0,(513,1)),(1,(-1,0))]] 1,998,647,452 bytes allocated in the heap user0m7.083s sys 0m0.289s Reason:compare-tdfa chrisk$ time ./Test-TDFA-np wcs3 10 +RTS -sstderr 21 | head -4 ./Test-TDFA-np wcs3 10 +RTS -sstderr Test for [Char] Right [array (0,1) [(0,(1025,1)),(1,(-1,0))]] 15,653,277,200 bytes allocated in the heap user0m53.350s sys 0m2.056s The doubling of length is causing a nearly eight-fold time increase. The heap allocation is also increasing nearly eight-fold. The new code with the same input pattern and texts gives: Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 8 +RTS -sstderr 21 | head -4 ./Test-TDFA2-0.99.19-np2 wcs3 8 +RTS -sstderr Test for [Char] Right [array (0,1) [(0,(257,1)),(1,(-1,0))]] 2,135,324 bytes allocated in the heap user0m0.017s sys 0m0.017s Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 9 +RTS -sstderr 21 | head -4 ./Test-TDFA2-0.99.19-np2 wcs3 9 +RTS -sstderr Test for [Char] Right [array (0,1) [(0,(513,1)),(1,(-1,0))]] 3,588,656 bytes allocated in the heap user0m0.024s sys 0m0.017s Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 10 +RTS -sstderr 21 | head -4 ./Test-TDFA2-0.99.19-np2 wcs3 10 +RTS -sstderr Test for [Char] Right [array (0,1) [(0,(1025,1)),(1,(-1,0))]] 6,345,436 bytes allocated in the heap user0m0.038s sys 0m0.018s Note that the heap allocation for the 1026 character example above is 2466 times less than the old code. That was too fast to prove the scaling, so take more input: Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 20 +RTS -sstderr 21 | head -4 ./Test-TDFA2-0.99.19-np2 wcs3 20 +RTS -sstderr Test for [Char] Right [array (0,1) [(0,(1048577,1)),(1,(-1,0))]] 5,708,574,848 bytes allocated in the heap user0m26.023s sys 0m0.985s Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 21 +RTS -sstderr 21 | head -4 ./Test-TDFA2-0.99.19-np2 wcs3 21 +RTS -sstderr Test for [Char] Right [array (0,1) [(0,(2097153,1)),(1,(-1,0))]] 11,416,354,056 bytes allocated in the heap user0m52.656s sys 0m1.985s The length and time both doubled, as did the heap allocation. And the new code has searched two million characters in the time the old code searched one thousand. How about away from the worst case scenario? On the testing suite the new code is running slightly slower: Reason:compare-tdfa chrisk$ time ./Test-TDFA-np -r 1 100 /dev/null user0m4.841s sys 0m3.019s Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 -r 1 100 /dev/null user0m5.970s sys 0m3.012s So that is an increase of execution time of 14%. This small dip in performance might be reclaimable with more optimization. I think the gain in worst case performance already offsets the slight cost. The code for String is complete. The strict and lazy bytestrings and the (Seq Char) are currently using the String code for matching. This will be improved in a future release. Cheers, Chris
[Haskell-cafe] ANN: bug fix for regex-tdfa, version 0.97.4 (and regex-ast)
Hello, The regex-tdfa package has had a series of bug fix releases (0.97.1 and 2 and 3 and now 4). This 0.97.4 releases finishes fixing the bug that was only mostly fixed in the 0.97.1 release. An example of the fixed bug: Apply the regex pattern (BB(B?))+(B?) to the text . The BB in the pattern should be used twice and both B? should match nothing. My code grouped the + wrong and matched the BB once and then both the B? matched a B. The case fixed here was not initially caught because of how I search for unknown bugs. I use Arbitrary from QuickCheck to generate random patterns and strings to search, and compare regex-tdfa to another POSIX engine. Because I am on OS X, I am limited by the the native POSIX libraries bugs: this bug in regex-tdfa was triggered only when the native POSIX was also buggy. But the source of most of my unit tests is ATT research [1], and they have a libast with a POSIX implementation. I have adapted my regex-* wrapper packages to make a regex-ast Haskell interface, but the difficulties with the ATT headers prevent me from releasing this on hackage. This regex-ast has given me access to a less buggy POSIX back-end, and randomized testing has led to catching the bug fixed here (as well as a few bug reports back to ATT). So while regex-tdfa will not win many speed contests, it is the only POSIX regular expression library I have running that passes all the unit tests. [1] http://www.research.att.com/sw/download/ http://www.research.att.com/~gsf/testregex/ http://www.research.att.com/~gsf/testregex/re-interpretation.html ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: speed: ghc vs gcc
On a G4: s.hs (which does not need bang patterns) is: main = seq (sum0 (10^9) 0) (return ()) sum0 :: Int - Int - Int sum0 0 acc = acc sum0 x acc = sum0 (x-1) $! (acc+x) And s.c is (actually including 10^9, which Bulat's did not): main() { int sum=0; for(int i=1000*1000*1000; i0; i--) sum += i; } I compiled them with ghc --make -O2 s.hs -o shs gcc -o sc -std=c99 -O3 -funroll-loops s.c And timed them: $ time ./shs real0m3.309s user0m3.008s sys 0m0.026s $ time ./sc real0m0.411s user0m0.316s sys 0m0.006s So C is 9.4 times faster. And via-C did not help: $ ghc -fvia-C -optc -O3 -funroll-loops --make -O2 s.hs -o shs-via-C $ time ./shs-via-C real0m7.051s user0m3.010s sys 0m0.050s -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ANN: Bug fix to regex-tdfa, new version 0.97.3
To Haskell and Libraries and Haskell-Cafe, Whilst improving regex-tdfa I have run across new bugs. Some patterns were getting compiled wrong and others were affected by an execution bug. As this package has actual users, I wanted to make sure they get these fixes immediately. Three Cheers For QuickCheck! The new version is 0.97.3 at http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa And this version passes all the unit tests I have, including coverage for the new bugs. This is no warranty that regex-tdfa is bug free, since I made that same claim last release. For instance: I suspect 0.97.3 may be buggy if used in the optional left-associative mode. The new improved version of regex-tdfa is still a long way off. Cheers, Chris Kuklewicz ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Fastest regex package?
Eugene Kirpichov wrote: All in all, my question remains: what is the fastest way to do this kind of parsing on a lazy bytestring? Your example regular expression works the same in both Posix and Perl-ish semantics. Do you know the difference? Posix libraries look for the longest match of all possible matches. Perl-ish is left-bias and looks at the left branch first and only looks at the right branch when the left fails and it has to backtrack. So the ++ operator is a hack to try and control the backtracking of Perl regular expressions. Such a things has no meaning in Posix where the implementation details are totally different. You might try this variant of the example pattern: /foo.xml.*fooid=([0-9]+)[^0-9].*barid=([0-9]+) The [^0-9] can be used when you know that there is at least one junk character before the barid, which I suspect will always occur in a URL. I expect regex-posix to be slower than regex-pcre. I have not used the new pcre-light. I wrote regex-tdfa — it is pure haskell and not a C library wrapper. There are patterns where regex-pcre will backtrack and take exponentially more time than regex-tdfa's automaton (which is not yet ideal and may get faster). So what is the lazy bytestring with its multiple buffers doing for you when using PCRE, PCRE-light, or regex-posix? Absolutely nothing. To run against these C libraries the target text is converted to a single buffer, i.e. a CStringLen in Haskell. Thus it is morally converted into a strict bytestring. This may involve copying the logfile into a NEW strict bytestring EVERY TIME you run a match. Please Please Please convert to a strict bytestring and then run regex-pcre or pcre-light (or any of the others). regex-tdfa does not convert it into a strict bytestring, but is otherwise much slower than pcre for your simple pattern. As for regex-pcre's interfaceyou should use the API in regex-base to get a pure interface. The RegexLike functions are the pure interface for this, and the RegexContext class offers a slew of instances with useful variants. But if you have been getting to the low level IO API in regex-pcre then you probably do not need or want the RegexContext transformations. And BoyerMoore (which I think I helped optimize): this may be faster because it does not copy your whole Lazy bytestring into a Strict ByteString for each search. But you may wish to test it with a Strict ByteString as input anyway. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: evaluation semantics of bind
Gregg Reynolds wrote: getChar = \x - getChar An optimizer can see that the result of the first getChar is discarded True, so 'x' is not used, and it can be garbage collected, and may not even be created. But that data dependency is simple not the data dependency that make IO sequential. Here is code from IOBase.lhs for GHC: newtype IO a = IO (State# RealWorld - (# State# RealWorld, a #)) The # are unboxed types and thus strict, but here we can erase them for clarity: newtype IO a = IO (State RealWorld - (State RealWorld, a)) getChar is of type IO Char so that is constructor IO applied to a function from the State RealWorld to a strict pair of State RealWorld and Char. Since this is strict there is no laziness and the code must evaluate the input and output State RealWorld to ensure they are not bottom or error. Here is the rest of the plumbing in GHC: unIO :: IO a - (State# RealWorld - (# State# RealWorld, a #)) unIO (IO a) = a instance Functor IO where fmap f x = x = (return . f) instance Monad IO where {-# INLINE return #-} {-# INLINE () #-} {-# INLINE (=) #-} m k = m = \ _ - k return x= returnIO x m = k = bindIO m k fail s = failIO s failIO :: String - IO a failIO s = ioError (userError s) liftIO :: IO a - State# RealWorld - STret RealWorld a liftIO (IO m) = \s - case m s of (# s', r #) - STret s' r bindIO :: IO a - (a - IO b) - IO b bindIO (IO m) k = IO ( \ s - case m s of (# new_s, a #) - unIO (k a) new_s ) thenIO :: IO a - IO b - IO b thenIO (IO m) k = IO ( \ s - case m s of (# new_s, _ #) - unIO k new_s ) returnIO :: a - IO a returnIO x = IO (\ s - (# s, x #)) The bind operation's case statement forces the unboxed new_s :: State# RealWorld to be strictly evaluated, and this depends on the input strict s :: State# RealWorld. This data dependency of new_s on s is what forces IO statements to evaluate sequentially. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Monad explanation
All Haskell programs start as main :: IO () though... so they all get evaluated in the context of another IO () don't they? True for most cases now, but historically false. Haskell existed and people wrote programs for years before the Monad class and IO were created. A Haskell98 program can be taken to have a main :: IO (), but that is not essential. Which is the point jcc made: Well... Haskell compilers and runhaskell-style interpreters (not regular Hugs/ghci!) take the value of Main.main as `the program'. But that feels (to me --- I could be wrong) like an aspect of a particular hosted environment. REPLs can handle programs that aren't wrapped up in IO at all; and there's no reason why IO has to be the type of IO-performning-things in REPLs, either. You could just as well write a REPL that took, say, tangible values [http://haskell.org/haskellwiki/TV] as input instead, and displayed them. So it's more a matter of Haskell implementations can be given an IO value to run than that combining IO values together somehow runs them. jcc ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: evaluation semantics of bind
Jonathan Cast wrote: On Fri, 2009-02-06 at 00:51 +0100, Peter Verswyvelen wrote: On Thu, Feb 5, 2009 at 8:20 PM, ChrisK hask...@list.mightyreason.com wrote: Since this is strict there is no laziness and the code must evaluate the input and output State RealWorld to ensure they are not bottom or error. Interesting. I also thought it was the passing of the RealWorld that caused the sequencing, I never realized that the strictness played an important role here. So what would happen if it would be lazy instead of strict? What kind of craziness would occur? The order of side effects would be demand-driven, rather than order-of-statement driven. So if I said: [snip] Essentially, the program would act as if every statement was wrapped up in an unsafeInterleaveIO. jcc I do not think so. Consider http://darcs.haskell.org/packages/base/GHC/IOBase.lhs to see unsafeInterleaveIO: unsafeInterleaveIO m = unsafeDupableInterleaveIO (noDuplicate m) unsafeDupableInterleaveIO (IO m) = IO ( \ s - let r = case m s of (# _, res #) - res in (# s, r #)) And Control.Monad.State.Lazy which is the non-strict State monad: newtype State s a = State { runState :: s - (a, s) } instance Monad (State s) where return a = State $ \s - (a, s) m = k = State $ \s - let (a, s') = runState m s in runState (k a) s' And you can see that the data dependence is broken in unsafeInterleaveIO by _, and not broken in State.Lazy. But neither s nor s' are not forced to WHNF. What you can do State.Lazy is put an bottom into the state. Or you can put an unevaluated thunk into the state, and if it gets replaced later then you never paid to evaluate it. But the sequencing is retained. One could make an unsafeInterleaveIO for state: interleaveState :: State s a - State s a interleaveState (State m) = State ( \ s - let (a,_) = runState m s in (a,s) ) Now the dependency chain is broken in the above, due to the _ being ignored. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ANN: regex-posix-unittest-1.0 AND regex-posix-0.94.1 AND regex-tdfa-0.97.1
I have three announcements to make about regex-* related packages. The regex-posix-0.94.1 package update provides better semantics for multiple matches. Below version 0.94, if any match was empty the matching would stop. Now the empty match is returned and the position is incremented and the searching continues. The regex-tdfa-0.71.1 package update provides the same new multiple match semantics. It also fixes a bug I found. I know of no outstanding bugs in regex-tdfa, and version 0.71.1 now passes all the tests used in regex-posix-unittest-1.0 announced below. We should care about the correctness of our operating system libraries. To help with this, I have a NEW package to announce: regex-posix-unittest-1.0 The accompanying wiki page is http://www.haskell.org/haskellwiki/Regex_Posix This new package provides an executable called regex-posix-unittest which you can install as --user or --global. The regex-posix-unittest executable with no arguments runs a suite of unit tests, all of which are described by text files in the package, the format is documented in the wiki page. By editing the text files in the package you can add to or delete from the unit tests being run. With two arguments the program expects the text first and the pattern second and will run just that match and print all the results. How does regex-posix-unittest help us care about the OS libraries? The regex-posix distributed in the GHC bundle uses the OS C library's regex.h API. The regex-posix-unittest package will quite likely show you that your OS C library regex.h API is full of bugs. If you are on Linux, it will show you a plethora of GLIBC bugs in Posix conformance. If you are on OS X, FreeBSD, or NetBSD, it will show you many bugs including a critical bug where it fail to find a match where one actually exists. These bugs in the OS library are inherited by your sed program as well as regex-posix and Haskell. If you are on Windows, or OpenBSD, or Solaris, or anything else, then please update the wiki page at http://www.haskell.org/haskellwiki/Regex_Posix or email me with your results so I can update the wiki. You may have evil and ingenious tests of Posix extended regular expressions to add to the test suite. Adding them is easy and if you send them to me I will put them in an updated version of regex-posix-unittest. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: ANN: HTTPbis / HTTP-4000.x package available
Duncan Coutts wrote: Proxy auto-configuration files are JavaScript. It uses more or less the full JavaScript language (ECMA these days), though with a small subset of the standard library. W T F So we want a tiny naive javascript interpreter, hopefully in pure Haskell. The dumbest interpreter than can parse and evaluate the language. An separate program that just translates .pac file to something declarative is sorely needed. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Existencial quantification and polymorphic datatypes
Mauricio wrote: Hi, I'm trying, without success, to understand the difference between existencial quantification and polymorphic datatypes. Can you give me a hint, or an example where one is valid and the other is not? The first thing to ensure you know is that Haskell can have functions (usually in type classes) that have the Perl-like ability to return different types of values depending on the context. The usual example for this is fromInteger :: Num a = Integer - a. Every integer in your source code is put through this function. If the number is used in an Int context then it makes and Int, if use in a Word8 context then it makes a Word8. The other way we talk about the type of context is that this is the type demanded by the user of the value. Concretely: x :: forall a. Num a = a x = fromInteger 1 The type of 'x' is any Num type chosen by the user. The critical thing here is that fromInteger does not get to choose the type. This is bounded or constrained polymorphism, the type 'a' is polymorphic but bounded by the Num constraint. Now I can refer to ghc's manual on existential quantification: http://www.haskell.org/ghc/docs/latest/html/users_guide/data-type-extensions.html#existential-quantification So what about wanting to write a function myNum that returns some Num type that myNum gets to choose instead of the user demanding a type. We can do this with existential types, which usually is used with a data type (often a GADT) and here I call this SomeNum: {-# LANGUAGE ExistentialQuantification #-} import Int import Data.Typeable data SomeNum = forall a. (Typeable a, Num a) = SomeNum a myNum :: Integer - SomeNum myNum x | abs x 2^7 = let i :: Int8 i = fromInteger x in SomeNum i | abs x 2^31 = let i :: Int32 i = fromInteger x in SomeNum i | otherwise = SomeNum x display :: SomeNum - String display (SomeNum i) = show i ++ :: ++ show (typeOf i) main = do putStrLn (display (myNum (2^0))) putStrLn (display (myNum (2^8))) putStrLn (display (myNum (2^32))) In GHCI I see *Main main main 1 :: Int8 256 :: Int32 4294967296 :: Integer In the above you can see the polymorphism of the return type of fromInteger, it returns a Int8 or a Int32. You can see the polymorphism of the argument of show, it takes an Int8 or Int32 or Integer. The latest ghc-6.10.1 also allows avoiding use of SomeNum, see impredicative-polymorphism: http://www.haskell.org/ghc/docs/latest/html/users_guide/other-type-extensions.html#impredicative-polymorphism So this next bit is very very new: {-# LANGUAGE ExistentialQuantification, RankNTypes, ImpredicativeTypes #-} displayNum :: [forall a. Num a = Integer - a] - String displayNum converts = unlines $ concatMap withF converts where withF :: (forall b. Num b = Integer - b) - [String] withF f = [ show (f 1 :: Int8) , show (f (2^10) :: Int32) , show (f (2^32) :: Integer) ] The first argument to displayNum is a list of an existential type. Before ImpredicativeType this would have required defining and using a data type like SomeNum. So the latest ghc lets one avoid the extra data type. displayNum cannot demand any (Num b = Integer-b) function. The list holds SOME functions, but which one is unknowable to displayNum. The first argument of withF is polymorphic and while this requires RankNTypes (or Rank2Types) the type of withF is not existential. In withF the code demands three different types of results from the 'f'. This works because the return type of (f 1) is really (Num b = b) and this is polymorphic and any type 'b' which is a Num can be demanded. This can be tested with putStr $ displayNum [ fromInteger , fromInteger . (2*) , fromInteger . (min 1000) ] which passes in a list of three different conversion functions. In ghci the result is: 1 1024 4294967296 2 2048 8589934592 1 1000 1000 -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Existencial quantification and polymorphic datatypes (actually, components...)
Great, thanks! I'm enlightened :) And no one had to hit you with a stick first! But how is this: data SomeNum = forall a. SN a different from: data SomeNum = SN (forall a. a) ? At a glance they look the same to me — but only the first is accepted by ghc. There is also the GADT syntax: data SomeNum where SomeNum :: forall a. (Typeable a, Num a) = a - SomeNum which is accepted by ghc with the LANGUAGE GADTs extension. The GADT is more than simple syntactic sugar, it allows for easier use this kind of existential type. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Bug in Text.Regex.PCRE - do not accept national symbol in pattern
Alexandr, Thanks for sending me this question about unicode and regex-pcre. I will share with the mailing list. This is an encoding issue. From the haddock documentation for regex-pcre: http://hackage.haskell.org/packages/archive/regex-pcre/0.94.1/doc/html/Text-Regex-PCRE.html Using the provided CompOption and ExecOption values and if configUTF8 is True, then you might be able to send UTF8 encoded ByteStrings to PCRE and get sensible results. This is currently untested. This is a literate Haskell post so you can save with file extension .lhs and pass this to ghci. The answer is a combination of man 3 pcre and the haddock documentation for haskell-pcre and using makeRegexOpts. I show one possible way to use utf8 below, via the 'utf8-string' package from hackage. There are other ways to use the same package and other packages available. {-# LANGUAGE FlexibleContexts #-} import Text.Regex.PCRE hiding ((=~)) --import Text.Regex.PCRE.Wrap(configUtf8) import qualified Data.ByteString.UTF8 as U import qualified System.IO.UTF8 as U import Data.Bits((.|.)) Here I copied the original source for (=~) from http://hackage.haskell.org/packages/archive/regex-pcre/0.94.1/doc/html/Text-Regex-PCRE-Wrap.html#v%3A%3D~ I then editied it to create a custom (=~) that defines its own options. You can add compNoUTF8Check for performance/safety tradeoff (see man 3 pcre). makeRegexUtf8 :: (RegexMaker Regex CompOption ExecOption source) = source - Regex makeRegexUtf8 r = let co = defaultCompOpt .|. compUTF8 -- need compUTF8 flag when using makeRegexOpts -- co = defaultCompOpt .|. compUTF8 .|. compNoUTF8Check -- in makeRegexOpts co defaultExecOpt r (=~) :: (RegexMaker Regex CompOption ExecOption source,RegexContext Regex source1 target) = source1 - source - target (=~) x r = let q = makeRegexUtf8 r in match q x If you are going to use the same pattern against many different texts then you should NOT use (=~). Instead you should call makeRegexUtf8 and reuse the resulting Regex value. Otherwise you have to recompile the pattern for each match performed. Below, 're_test' was changed internally to convert the [Char] into a ByteString holding a utf8 encoded representation. The 'makeRegexOpts' and 'match' calls will then run the libpcre routines directly on the the memory that backs the ByteString. This is an optimal was to use the library. re_test :: String - String - Bool re_test re str = (U.fromString str) =~ (U.fromString re) -- test for national symbols main = do putStrLn $ If this line ends with True then your libpcre has UTF8 support: ++ show configUTF8 let pattern1,pattern2,pattern3,text :: String pattern1 = ^п.* pattern2 = ^..ив.* pattern3 = ^..$ text = привет U.putStrLn $ The 3 patterns are: ++ pattern1 ++ , ++ pattern2 ++ , and ++pattern3 U.putStrLn $ The text to be matched is ++ text putStrLn $ The length of the text to be matched is ++show (length text) putStrLn All three lines below should print True print $ re_test pattern1 text print $ re_test pattern2 text print $ re_test pattern3 text The output when I run this on my machine is If this line ends with True then your libpcre has UTF8 support: True The 3 patterns are: ^п.*, ^..ив.*, and ^..$ The text to be matched is привет The length of the text to be matched is 6 All three lines below should print True True True True ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: How to make code least strict?
Robin Green wrote: What guidelines should one follow to make Haskell code least-strict? Obviously the use of seq and bang-patterns make code more strict. Code is strict when it evaluates values to determine a pattern match. So avoiding that makes code lazier. Values are evaluated when decisions have to be make in order to choose what an expression will evaluate to. Avoiding case statements and things that de-sugar to case statements such as if then else and pattern matching. Put off examining the input values. Occasionally the use of lazy patterns, preceded by ~, can help make code both more compact and less strict. Consider that the order of pattern matching can matter as well, the simplest common case being zip: zip xs [] = [] zip [] ys = [] zip (x:xs) (y:ys) = (x,y) : zip xs ys The order of the first two lines of zip's definition affects whether zip [] (error boom) or zip (error bam) [] will be an error. This shows that least-strict is not a unique goal. For the choice I just made the zip [] (error boom) will cause an error because the first definition line of zip checks the second argument, while zip (error bam) [] will evaluate to []. The other way to reduce strictness is to be more polymorphic because this reduces what can be sensibly done with the arguments. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: some ideas for Haskell', from Python
Manlio Perillo wrote: Brandon S. Allbery KF8NH ha scritto: ...in theory. In practice GHC needs help with circular imports, and some cycles might be impossible to resolve. This is interesting. Where can I find some examples? Is this explained in the Real World Haskell book? I have no idea about RWH, but there are certainly mutual import cycles that cannot be resolved by using hs-boot files with GHC. Consider three modules A and B and C, which are A-B-C permutations of module A(A,AKBC, AKCB) import B(B,BKAC) import C(C,CKAB) data A AKBC :: Either B C AKCB :: Either C B instance Show (A,BKAC,CKAB) where ... There is no way to break the ?K?? import cycle with just hs-boot files. I had to solve this by generating helper modules. Call the data A the rank-1 declarations. Then the ?K?? are built on rank-1 types such as B and C and are rank-2 declarations. The rank-1 declarations can all be put in hs-boot files but the rank-2 declaration import cycle cannot be broken with the same hs-boot files. Some of these need to be put in separate modules. It may be possible to make a useful definition of rank-3 and higher declarations. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Comments from OCaml Hacker Brian Hurt
Dan Weston wrote: Richard Feinman once said: if someone says he understands quantum mechanics, he doesn't understand quantum mechanics. But what did he know... Well, I am a physicist and Feynman (with a y, not an i), is not talking about the linear algebra. Of course, linear algebra [1] here is used a vector space [2]. The tricky thing is that humans then measure the state. And this is confusing step that causes Feynman to say that no one understands it. But the measurement step and how it interacts with the vector space can be approximated by an algorithm [3] using ExistentialQuantification and Arrows. [1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/numeric-quest http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hmatrix http://hackage.haskell.org/cgi-bin/hackage-scripts/package/Vec http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas [2] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/vector-space [3] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/quantum-arrow -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Comments from OCaml Hacker Brian Hurt
Here is a great Monoid found in the wild story: I just implemented a library for binary message serialization that follows Google's protocol buffer format. The documentation of this was very scattered in some respects but I kept reading snippets which I have pasted below. The effect of these snippets is to document that the messages on the wire should mimic an API where they can be combined in various merge operations (right-biased, concatenation, and recursive merging), and that well-formed messages have default values for all fields (which can be set in the spec). So the code below is a well thought out collection of properties that has reinvented the wheel known as Monoid, so the Haskell API creates Monoid instances. http://code.google.com/apis/protocolbuffers/docs/encoding.html Normally, an encoded message would never have more than one instance of an optional or required field. However, parsers are expected to handle the case in which they do. For numeric types and strings, if the same value appears multiple times, the parser accepts the last value it sees. For embedded message fields, the parser merges multiple instances of the same field, as if with the Message::MergeFrom method – that is, all singular scalar fields in the latter instance replace those in the former, singular embedded messages are merged, and repeated fields are concatenated. The effect of these rules is that parsing the concatenation of two encoded messages produces exactly the same result as if you had parsed the two messages separately and merged the resulting objects. That is, this: MyMessage message; message.ParseFromString(str1 + str2); is equivalent to this: MyMessage message, message2; message.ParseFromString(str1); message2.ParseFromString(str2); message.MergeFrom(message2); This property is occasionally useful, as it allows you to merge two messages even if you do not know their types. And this at http://code.google.com/apis/protocolbuffers/docs/proto.html As mentioned above, elements in a message description can be labeled optional. A well-formed message may or may not contain an optional element. When a message is parsed, if it does not contain an optional element, the corresponding field in the parsed object is set to the default value for that field. The default value can be specified as part of the message description. For example, let's say you want to provide a default value of 10 for a SearchRequest's result_per_page value. optional int32 result_per_page = 3 [default = 10]; If the default value is not specified for an optional element, a type-specific default value is used instead: for strings, the default value is the empty string. For bools, the default value is false. For numeric types, the default value is zero. For enums, the default value is the first value listed in the enum's type definition. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Compiling regex-posix-0.93.2 on windows
Paulo: I suggest doing this more carefully. Get the source from hackage. Edit the regex-posix.cabal file to add the include and lib directories you need on Cygwin. cabal configure it. cabal build it. cabal install it. Then in an unrelated directory try and run ghci -package regex-posix. This should load regex-posix-VERSION. If this fails then I think that the cabal file needs fixing. If that works then please test it with these commands: Prelude :m + Text.Regex.Posix Prelude Text.Regex.Posix (Text.Regex.Posix.=~) ab (()|[ab])(b) :: (String,String,String,[String]) The right answer by the way is: (,ab,,[a,,b]) But on FreeBSD/NetBSD/OS X there is a bug that I have found and it prints: (a,b,,[,,b]) Which just goes to show that a mountain of QuickCheck is what is sometimes needed to catch really hard to trigger bugs. This is the other bug I reported: Prelude Text.Regex.Posix (Text.Regex.Posix.=~) XababaY (X)(aba|ab|b)*(Y) :: (String,String,String,[String]) (,XababaY,,[X,b,Y]) The above answer is impossible (what matched a next to b?), but I now think I know WTF the library code is doing wrong (I think it is matching ababa in two passes and the second pass is greedy which is a broken strategy). The right answer is: (,XababaY,,[X,aba,Y]) Cheers, Chris PS: Yes, I have reported these bug to Apple, FreeBSD, and NetBSD this month. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Comments from OCaml Hacker Brian Hurt
Thomas DuBuisson wrote: How does forcing them to learn proposed terminology such as `Appendable' help here? Learners of Haskell do still need to learn what the new word means. The contention is that 'Appendable' is an intuitive naming that people will already have a rudimentary grasp of. This as opposed to Monoid, which absolutely requires looking up for the average coder. Intuition tells me: * 'Appendable' add an element to the back of a (finite) linear collection. * There is a 'Prependable' somewhere that add the element to the front. * There is an inverse 'pop' or 'deque' operation nearby. Absolutely none of those things are true. Let's try for 'Mergeable' * mconcat joins two collections, not a collection and an element. * Is should be a split operation. The above is true for the list instance, but false in general. Look at the instances already given that violate the collection idea: Monoid Any Monoid All Monoid (Last a) Monoid (First a) Num a = Monoid (Product a) Num a = Monoid (Sum a) And I don't even see an (Ord a)=(Max a) or a Min instance. So the original article, which coined 'Appendable', did so without much thought in the middle of a long post. But it does show the thinking was about collections and there is one ONE instance of Monoid at http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Monoid.html#t%3AMonoid that is about a collection (Monoid ([] a)) that has a split operation. ONE. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Monads aren't evil? I think they are.
Henning Thielemann wrote: I have seen several libraries where all functions of a monad have the monadic result (), e.g. Binary.Put and other writing functions. This is a clear indicator, that the Monad instance is artificial and was only chosen because of the 'do' notation. I completely disagree with that example. The Put monad is, mainly, a specialized State monad. The internal state being the current fixed-size bytestring memory buffer that has been allocated and is being filled. The monad make the execution sequential so that there is only one memory buffer being filled at a time. In Put, when one memory buffer has been filled it allocates the next one to create a Lazy Bytestring. This is not to say that all M () are really monads, but just that Put () is. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: System.CPUTime and picoseconds
Neil Davies wrote: I've found the pico second accuracy useful in working with 'rate equivalent' real time systems. Systems where the individual timings (their jitter) is not critical but the long term rate should be accurate - the extra precision helps with keeping the error accumulation under control. When you are selling something (like data bandwidth) and you are pacing the data stream on a per packet basis you definitely want any error to accumulate slowly - you are in the 10^10 events per day range here. Neil Now I am posting just because I like to look at the time scales. A rate of 10^10 per Day is a period of 8.64 microseconds. If you want to slip only 1 period per year then you need a fractional accuracy of 2.74 * 10^-13. In one day this is a slip of 23.7 nanoseconds. So atomic time radio synchronization is too inaccurate. I have seen GPS receivers that claim to keep the absolute time to within 100 nanoseconds. Lennart is right that 1 picosecond accuracy is absurd compared to all the jitters and drifts in anything but an actual atomic clock in your room. But since CPUs tick faster than nanosecond the CPUTime needs better than 1 nanosecond granularity. I agree with Lennart — I also want an Integral type; it keeps the granularity constant and avoids all the pitfalls of doing math with a Double. Out of simplicity I can see why the granularity was set to 1 picosecond as it is slightly easier to specify than 100 picosecond or 10 picosecond or 1/60 nanosecond (hmmm... arcnanosecond?). Maybe Haskell should name the 1/60 nanosecond unit something clever and create a new Time submodule using it for April 1st. [ Base 60 is the real standard: http://en.wikipedia.org/wiki/Babylonian_mathematics has an 1800 B.C. tablet with the (sqrt 2) in base 60 as (1).(24)(51)(10) ] -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: [Haskell] ANN: ghci-haskeline 0.1
Haskeline is designed to remove the readline dependency, because Windows does not have readline. So rlwrap is useless there. Andrew Hunter wrote: On Mon, Jan 12, 2009 at 12:57:57PM -0800, Judah Jacobson wrote: I'm pleased to announce the first release of ghci-haskeline. This package uses the GHC API to reimplement ghci with the Haskeline library as a backend. Haskeline is a library for line input in command-line programs, similar to readline or editline, which is written in Haskell and thus (hopefully) more easily integrated into other Haskell programs. Perhaps this has already been discussed at length, in which case I apologize but, well, why provide line input editing at all? A number of languages/programs (off the top of my head: sml, most Schemes) don't; the standard method to get line editing is rlwrap. And this works (in my limited experience) quite well. The disadvantage as I see it of using editline or Haskeline or whatever is that it's going to be sutbly different than other methods; presumably, people won't like the changes in behavior. It seems to me that from a UNIX-y separation of concern view, the right thing to do (as many languages have chosen) is to /not/ provide line editing, and just let the user do that with any number of convenient tools that focus on getting/that/ right (like rlwrap.) Is there a reason we've not taken that approach? Thanks, AHH ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: System.CPUTime and picoseconds
Tony Finch wrote: The FreeBSD kernel uses a 64+64 bit fixed point type to represent time, where the integer part is a normal Unix time_t. The fractional part is 64 bits wide in order to be able to represent multi-GHz frequencies precisely. multi-GHz being a euphemism for 18.45*10^9 GHz, over 18 billion GHz. I just read through that. The granularity is 2^-64 seconds, or 5.4*^-20 seconds? That is 54 nano-pico-seconds. I can see needing better than nanosecond, and going to milli-nanoseconds like Haskell, but to jump close to pico-nano-seconds? That skips right past micro-nano-seconds and nano-nano-seconds. That's 20 million times more resolution than Haskell's picoseconds. My that was fun to write. It looks like an excellent performance hack for OS kernels. 64-bits make for simple register and cache access, the compiled code is small and quick, etc. As a portable API it is far too complicated to use. Not in the least because only FreeBSD probably has that API. Note that at 10^-20 seconds the general relativistic shift due to altitude will matter over less than the thickness of a closed laptop. Defining now that accurately has meaning localized to less then your computer's size. The warranty for the bottom of your screen will expire sooner than that of the top. Only stock traders and relativistic particles care about time intervals that short. FreeBSD — designed for the interstellar craft to tomorrow Hmm...The W and Z bosons decay the fastest with 10^-25 second lifetimes, the shortest known lifetimes that I can find. The fundamental Planck scale, the shortest amount of time in today's physics, is 5.4*10^-44 seconds. So with 80 more bits FreeBSD would be at the fundamental limit. Of course the conversion then depends on the values of h, c, and G. Now that would also be a good April Fool's joke proposal. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: System.CPUTime and picoseconds
An Double or Int64 are both 8 bytes and counts with picoseconds precision for 2.5 hours to 106 days. Going to 12 byte integer lets you count to 3.9 billion years (signed). Going to 16 byte integer is over 10^38 years. Lennart Augustsson wrote: A double has 53 bits in the mantissa which means that for a running time of about 24 hours you'd still have picoseconds. I doubt anyone cares about picoseconds when the running time is a day. The above is an unfounded claim about the rest of humanity. That's why I think a Double is a good choice, it adapts to the time scale involved. Let's compute: tTooBig :: Double tTooBig = 2^53 main = do print (tTooBig == 1+ tTooBig) The above prints True. How long does your computer have to be running before losing picosecond resolution? tHours = tTooBig / (10^12) / 60 / 60 tHours is 2.501999792983609. My laptop battery lasts longer. Nanosecond precision is lost after 106 days. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Computer time, independent of date
Mauricio wrote: patients, I wanted to be sure not to save wrong information. It wouldn't matter if the clock is saying we are on XVII century, as long as 10 seconds would never be 10.1. Chris (yes I am an experimental physicist) asks: What are the interval durations you need to measure? 0.1 second is very different from 1 ms or 0.1 μs ! (1 ms is the atomic time radio signal accuracy) (0.1 μs can be had with GPS receivers) Since they are from equipment, what is the spec? What is the error tolerance for these durations? How are these errors allowed or forbidden to accumulate? Since you say you do not trust the incoming timing then you need to be able to measure it. Do you need to monitor the timing of each and every measurement while running? Could you measure and certify the timing of the equipment before using it? Got a cheap oscilloscope? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: System.CPUTime and picoseconds
Manlio Perillo wrote: Hi. Just out of curiosity, but why Haskell 98 System.CPUTime library module uses picoseconds instead of, say, nanoseconds? At least on POSIX systems, picoseconds precision is *never* specified. I have not idea. But at a guess, I would say that 1 ns is not such a small time interval anymore. The CPU speeds are about 3 GHz, so 0.3 ns per CPU clock. Even the RAM clock in a laptop (e.g. Apple's 17 Mac Pro) is 1066 MHz, so the internal there is just under 1 ns. Whoever picked picoseconds has made it possible to talk about a single clock interval for hardware like this. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Haskell not ready for Foo [was: Re: Hypothetical Haskell job in New York]
John A. De Goes wrote: Here's hoping someone develops a native messaging framework for Haskell, which is the equal of RabbitMQ. The first thing would be to make a Haskell client library to speak AMQP (Advanced Message Queuing Protocol) on the wire. It is a very open binary standard (with defined semantics!) at http://jira.amqp.org/confluence/display/AMQP/Advanced+Message+Queuing+Protocol I would be mildly surprised if zero people were working on this. Once that is in place then the question of a Haskell Broker for AMQP arises. But I suspect that Erlang's runtime will still rule there for production use. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Tying a simple circularly STM linked list
You can use undefined or error ... : {-# LANGUAGE RecursiveDo #-} import Control.Concurrent.STM import Control.Monad.Fix -- Transactional loop. A loop is a circular link list. data Loop a = ItemLink { item :: a , prev :: TVar (Loop a) , next :: TVar (Loop a) } -- Create a new empty transactional loop. newLoop :: a - STM (TVar (Loop a)) newLoop item = do tLoop - newTVar undefined writeTVar tLoop (ItemLink item tLoop tLoop) return tLoop Hmmm.. STM does not have a MonadFix instance. But IO does: -- Use MonadFix instance of newLoopIO newLoopIO :: a - IO (TVar (Loop a)) newLoopIO item = mfix (\ tLoop - newTVarIO (ItemLink item tLoop tLoop)) But mfix (like fix) is difficult to read in large amounts, so there is mdo: -- Use RecursiveDo notation newLoopMDO :: a - IO (TVar (Loop a)) newLoopMDO item = mdo tLoop - newTVarIO (ItemLink item tLoop tLoop) return tLoop Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: ANN: bytestring-trie 0.1.1 (bugfix)
Question and suggestion: looking at http://hackage.haskell.org/packages/archive/bytestring-trie/0.1.1/doc/html/src/Data-Trie.html#Trie I am questioning your choice of foldr in fromList: -- | Convert association list into a trie. On key conflict, values -- earlier in the list shadow later ones. fromList :: [(KeyString,a)] - Trie a fromList = foldr (uncurry insert) empty -- | /O(1)/, The empty trie. {-# INLINE empty #-} empty :: Trie a empty = Empty -- | Insert a new key. If the key is already present, overrides the -- old value {-# INLINE insert #-} insert:: KeyString - a - Trie a - Trie a insert = alterBy (\_ x _ - Just x) -- | Generic function to alter a trie by one element with a function -- to resolve conflicts (or non-conflicts). alterBy :: (KeyString - a - Maybe a - Maybe a) - KeyString - a - Trie a - Trie a alterBy f_ q_ x_ | S.null q_ = mergeBy (\x y - f_ q_ x (Just y)) (singleton q_ x_) | otherwise = go q_ where -- | /O(1)/, Is the trie empty? {-# INLINE null #-} null :: Trie a - Bool null Empty = True null _ = False So it looks like the reduction is fromList - uncurry insert - alterBy - null. Let me use insert in place of uncurry insert: fromList ( (a,1) : ( (b,2) : ( (c,3) : [] ) ) ) (a,1) `insert` ( (b,2) `insert` ( (c,3) `insert` Empty ) ) ) So fromList forces the whole call chain above to be traversed until it hits the Empty. For a large input list this will force the whole list to be allocated before proceeding AND the call chain might overflow the allowed stack size in ghc. For a large trie (which is a likely use case) this is a poor situation. If you use foldl' then the input list is only forced one element at a time. A small change to the lambda that insert passes to adjustBy will retain the same semantics of earlier key wins (which are an especially good idea in the foldl' case). Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Different QSem(N) (was Re: IORef vs TVar performance: 6 seconds versus 4 minutes)
I think I can improve on your code. Bertram Felgenhauer wrote: But why does it manually manage the waiters at all? MVars are fair, in ghc at least. So this should work: data Sem = Sem (MVar Int) (MVar Int) I changed the above to be a data newSem :: Int - IO Sem newSem initial = liftM2 Sem (newMVar initial) newEmptyMVar -- | Wait for a unit to become available waitSem :: Sem - IO () waitSem (Sem sem wakeup) = do avail' - modifyMVar sem (\avail - return (avail-1, avail-1)) Threads can get out of order at this point. This order bug may be undesirable. Also, killing the thread while it waits for wakeup below would be bad. You need an exception handler and some kind of cleanup. when (avail' 0) $ takeMVar wakeup = putMVar sem -- | Signal that a unit of the 'Sem' is available signalSem :: Sem - IO () signalSem (Sem sem wakeup) = do avail - takeMVar sem if avail 0 then putMVar wakeup (avail+1) else putMVar sem (avail+1) You should change this from = do to = block $ do. (I should turn this into a library proposal.) Bertram If you do not need to take N at a time then the untested code below has no order bug and is fair. module Sem where import Control.Concurrent.MVar import Control.Monad(when,liftM2) data Sem = Sem { avail :: MVar Int-- ^ provides fast path and fair queue , lock :: MVar () }-- ^ Held while signalling the queue -- It makes no sense here to initialize with a negative number, so -- this is treated the same as initializing with 0. newSem :: Int - IO Sem newSem init | init 1 = liftM2 Sem newEmptyMVar (newMVar ()) | otherwise = liftM2 Sem (newMVar init) (newMVar ()) waitSem :: Sem - IO () waitSem (Sem sem _) = block $ do avail - takeMVar sem when (avail 1) (signalSemN (pred avail)) signalSem :: Sem - IO () signalSem = signalSemN 1 signalSemN :: Int - Sem - IO () signalSemN i (Sem sem lock) | i = 1 = return () | otherwise = withMVar lock $ \ _ - block $ do old - tryTakeMVar sem case old of Nothing - putMVar sem i Just v - putMVar sem $! succ i All waitSem block in arrival order with the takeMVar in waitSem. The signalSemN avoid conflicting by serializing on the MVar () lock. The above is quite fast so long as the semaphore holds no more than the value 1. Once it hold more than 1 the waiter must take time to add back the remaining value. Note that once threads are woken up in order, they may still go out of order blocking for the () lock when adding back the remaining value (in the presence of other signalers). The above is also exception safe. The only place it can die is during the takeMVar and this merely remove a blocked waiter. I see no way to add a fair waitSemN without changing Sem. But if I change Sem then I can make a fair waitSemN. The untested code is below: module Sem where import Control.Concurrent.MVar import Control.Monad(when,liftM3) import Control.Exception.Base data Sem = Sem { semWait :: MVar () -- for serializing waiting threads , semAvail :: MVar Int -- positive quantity available , semSignal :: MVar () -- for serializing signaling threads } newSem i | i=0 = liftM3 Sem (newMVar ()) newEmptyMVar (newMVar ()) | otherwise = liftM3 Sem (newMVar ()) (newMVar i) (newMVar ()) waitSem :: Sem - IO () waitSem = waitSemN 1 waitSemN :: Int - Sem - IO () waitSemN i sem@(Sem w a s) | i=0 = return () | otherwise = withMVar w $ \ _ - block $ do let go n = do avail - onException (takeMVar a) (signalSemN (i-n) sem) case compare avail n of LT - go $! n-avail EQ - return () GT - signalSemN (avail-n) sem go i signalSem :: Sem - IO () signalSem = signalSemN 1 signalSemN :: Int - Sem - IO () signalSemN i (Sem _ a s) | i=0 = return () | otherwise = withMVar s $ \ _ - block $ do ma - tryTakeMVar a case ma of Nothing - putMVar a i Just v - putMVar a $! v+i Trying for exception safety makes the above slightly tricky. It works by allowing only a single thread to get the semWait lock. This keeps all the arriving threads in the fair blocking queue for the semWait lock. The holder of the semWait lock then nibbles at semAvail's positive value until it is satisfied. Excess value is added back safely with signalSemN. Cheers, Chris Kuklewicz ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Using classes for a heterogeneous graph structure
I think the below code which compiles with ghc-6.10.1 should compile with ghc-6.8.3 as well. My preference is to define a GADT such as ThingMap below. Conceptually ThingMap contains two pieces of information. There is a Map to an unknown type thing and there is a dictionary which implements a Thing instance for this unknown type thing. By pattern matching (ThingMap map) in update the rest of update gets access to both pieces of information. You are guaranteed that each element of the map is the SAME type. To be able to do more stuff with it you need to add classes either as a context to the definition of class Thing or in addition to the (Thing thing) context in the ThingMap definition. Or you could use the slightly different strategy of MapTW. Here each element of the map might be a DIFFERENT underlying type (underneath ThingWrapper). The data MapThing is the older style of existential data and is, in my opinion, superseded by the GADT style used in ThingMap. {-# OPTIONS_GHC -fglasgow-exts #-} module Sample where import Data.Map(Map) import qualified Data.Map as Map class Thing thing where set_int :: thing - Integer - thing -- for wrapper wrapper :: thing - ThingWrapper wrapper thing = ThingWrapper thing instance Thing Integer where set_int me i = i -- in the generic case, this actually does something -- This really has to change -- type ThingsByString = (Thing thing) = Map.Map Integer thing -- Look at http://www.haskell.org/ghc/docs/latest/html/users_guide/data-type-extensions.html data ThingMap where ThingMap :: forall thing . (Thing thing) = Map Integer thing - ThingMap -- New GADT goodness data MapThing = forall thingish . (Thing thingish) = MapThing (Map Integer thingish) -- Old style, not as good type MapTW = Map Integer ThingWrapper update :: Integer - Integer - ThingMap - ThingMap update key value (ThingMap map) = let (Just thing) = Map.lookup key map in ThingMap $ Map.insert key (set_int thing value) map update' :: Integer - Integer - MapThing - MapThing update' key value (MapThing map) = let (Just thingie) = Map.lookup key map in MapThing $ Map.insert key (set_int thingie value) map update'' :: Integer - Integer - MapTW - MapTW update'' key value map = let (Just thingie) = Map.lookup key map in Map.insert key (set_int thingie value) map test1 = let my_map = Map.empty :: Map Integer Integer map1 = ThingMap (Map.insert 0 1 my_map) map2 = update 0 8 map1 in map2 test2 = let my_map = Map.empty :: Map Integer ThingWrapper map1 = ThingMap (Map.insert 0 (ThingWrapper (1::Integer)) my_map) map2 = update 0 8 map1 in map2 test3 = let my_map = Map.empty :: Map Integer Integer map1 = MapThing (Map.insert 0 1 my_map) map2 = update' 0 8 map1 in map2 test4 = let my_map = Map.empty :: Map Integer ThingWrapper map1 = MapThing (Map.insert 0 (ThingWrapper (1::Integer)) my_map) map2 = update' 0 8 map1 in map2 test5 = let my_map = Map.empty :: MapTW map1 = Map.insert 0 (ThingWrapper (1::Integer)) my_map map2 = update'' 0 8 map1 in map2 data ThingWrapper = forall t. (Thing t) = ThingWrapper t instance Thing ThingWrapper where set_int (ThingWrapper thing) i = wrapper $ set_int thing i wrapper thing_wrapper = thing_wrapper ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Infinite grid
Luke Palmer wrote: On Mon, Dec 29, 2008 at 3:55 PM, Martijn van Steenbergen mart...@van.steenbergen.nl mailto:mart...@van.steenbergen.nl wrote: Hello, I would like to construct an infinite two-dimensional grid of nodes, where a node looks like this: data Node = Node { north :: Node , east :: Node , south :: Node , west :: Node } in such a way that for every node n in the grid it doesn't matter how I travel to n, I always end up in the same memory location for that node. No problem: let n = Node n n n n in n But you probably want some additional data in each node, also, in which the problem becomes harder. The solution very much depends on how the data is initialized / computed / obtained. Note that one can put an IORef into the Node to allow for mutable value. Also, see if you need something akin to mkDList at http://haskell.org/haskellwiki/Tying_the_Knot Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Missing Network Functions
John Van Enk wrote: I want to group them with Network functions because traditionally these specific functions *are* in networking packages. I agree. It is true that these functions are in the arpa/inet.h header on my machine. The ntoh* and hton* are just data manipulation, just as Network.URI is all about constructing and manipulating URI data. The question I think we should answer is: Do we wish to follow the precedent of putting URI under Network and not Data? Pragmatically, programmers will expect to find these functions near the 'net ones. A network-byteorder package which has the Network.ByteOrder module is the most flexible thing to create. It can be updated and reinstalled independently of the network package. On *nix systems, the prototypes reside in netinet/in.h. On Windows they reside in winsock2.h and one must link against libws2_32. The package I have locally supports all platforms. It just does a foreign import of the function (and uses a preprocessor definition to pick the calling convention). One could make an argument to add these sorts of functions to Word and Int packages. /jve On Sat, Dec 27, 2008 at 7:14 PM, Chris Kuklewicz hask...@list.mightyreason.com mailto:hask...@list.mightyreason.com wrote: Are these going to be available on Windows or just Posix systems? And these are NOT performing network communication. They are bit manipulations. I could imagine putting them in/under the Data.Word or Data.Bits more than Network.* And they should also be made to work with the corresponding Foreign.C.Types of CShort, etc. John Van Enk wrote: While working on a project, I discovered that hton[sl] and ntoh[sl] are missing from the networking libraries. It appears there was some justification for this, but I've fallen upon a few instances where things were *not* working as expected. (Specifically when newtype'ing around HostAddress.) As such, I'm putting together a few extra functions that I think should exist in Network somewhere. I'm calling the package Network.Util (but am open to changes). Currently this only has: * htons * htonl * ntohs * ntohl Before I drop this on hackage: 1. Is there a reason I shouldn't do this? 2. Are there other suggestions for missing functions? (preferably cross-platform suggestions) Thanks all. /jve ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org mailto:Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: How does one use Text.Regex.Base.RegexLike?
Yeah! I am glad the mailing list has helped. If you get stuck or need something fancier then let me know (I wrote RegexLike). But I still don't know how to get makeRegex to work. You need it to specify options like case insensitivity, or to use functions like matchAllText. Well, the options are defined in detail by the back end module, e.g. regex-posix. Take Text.Regex.Posix.Wrap: Most importantly is the Regex data type defined in this module. Also, there is an instance of RegexMaker Regex CompOption ExecOption String. The CompOptions and ExecOptions are defined in this module, and can be passed to makeRegexOpts from the RegexMaker class that converts String/ByteString/.. to the Regex data type. The CompOptions are only settable at this compilation stage. Later you can use getExecOpts and setExecOpts to create a new Regex from the original Regex which differs in the value of ExecOptions it uses. These execution options can change after creating the Regex. The makeRegex uses the defaultCompOpt and defaultExecOpt values. The CompOptions for Text.Regex.Posix.Wrap are at http://www.haskell.org/ghc/docs/latest/html/libraries/regex-posix/Text-Regex-Posix-Wrap.html#4 and are just a newtype of the underlying c-library's bitmapped int flags. AND I made this newtype derive the same Bits instance, so this works: let r = makeRegexOpts (defaultCompOpts || compIgnoreCase) defaultExecOpts caseInSensiTIVe By binding a Regex value like this and using it several times one saves the effort of compiling the String pattern into the Regex data type. If you need something that is not exactly given by the RegexLike API then there may be a RegexContext instance which does give you what you want. WARNING: The version of regex-base you are using matters most when it comes to the RegexContext instances. The latest 0.93.1 version is at http://hackage.haskell.org/packages/archive/regex-base/0.93.1/doc/html/Text-Regex-Base-Context.html and please note the Source link in the top right corner area which helps in understanding that RegexContext is just a convenience wrapper. ghc 6.10 is distributed with the 0.72.0.2 version which is at http://hackage.haskell.org/packages/archive/regex-base/0.72.0.2/doc/html/Text-Regex-Base-Context.html and again has a link the Source. Hooray for hackage. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: What are side effects in Haskell?
Hans van Thiel wrote: I just saw somewhere that one of the purposes of monads is to capture side effects. There are a few things that have side effects. The best way to capture this is to see that there are both (1) commands whose result depends on some external state (2) commands which alter that state The (2) group has side effects, and (1) is affected by them. The order in which (1) and (2) are performed matters. commands that evaluate to different things based on environment/history/input: * reading from memory location or reference * reading from a file or network stream or sensor (e.g. keyboard or mouse) * a random or pseudo-random number generator commands which affect the environment * writing to a memory location or reference * writing to a file or network * changing the display buffer (i.e. windows on the screen) command which are both (1) and (2), often because they might fail: * creating or deleting a file * obtaining a lock or semaphore * interacting with a message queue So allocating a mutable variable and writing to it are side-effect operations which have to be carefully ordered with the corresponding reads. Things without side effects are the immutable bindings produced by let/where, and application of functions. This is independent of being lazy or strict. Note that putStr :: String - IO () is a function, and as such it is a pure value. putString ['a','b','\n'] is a pure value of IO (). Performing this command in the IO monad has side effects. The fact that functions like putStr and things with types of IO () are pure value means that they can be produced and passed around is powerful way of working. It is possible to create objects in C++/Java which encapsulate an operation, but it takes more syntax. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: How to think about this? (profiling)
Or if you don't want to pay for laziness at all you could build your memo array imperatively (but purely): import Data.Array.IArray(elems,(!),inRange) import Data.Array.MArray(newArray_,writeArray,readArray) import Data.Array.Unboxed(UArray) import Data.Array.ST(runSTUArray,STUArray) import Control.Monad(forM_) import Data.List(zipWith3) ackMemoSize :: Int ackMemoSize = 12; ackList :: [Int] ackList = 0:1:2:zipWith3 (\ i j k - i+j+k) ackList (tail ackList) (tail (tail ackList)) ackMemo :: UArray Int Int ackMemo = runSTUArray $ do -- the $ works with ghc 6.10, hooray a - newArray_ (0,ackMemoSize) writeArray a 0 0 writeArray a 1 1 writeArray a 2 2 let op i x | i ackMemoSize = return () | otherwise = do writeArray a i x y - readArray a (i-3) op (succ i) $! (2*x-y) -- could use (2*x) intead op 3 (0+1+2) return a ack :: Int - Int ack i | inRange (0,ackMemoSize) i = ackMemo ! i | otherwise = error outsize memorized range for ack test = (take (succ ackMemoSize) ackList) == (elems ackMemo) (ackList !! ackMemoSize) == (ack ackMemoSize) Which should have very good performance in building ackMemo (the first time it is used). By changing the (2*x-y) to (2*x) I think you get the sum-of-all-previous-entries behavior. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Type wildcards
You can get pretty far with the same trick oleg mentions at [1]. If you use local type signature then you can do things like this: {- ghci infers this type: *Main :t f f :: (Ord a) = Int - a - t - String -} f i j x | False = (undefined (i::Int) (isOrd j)) :: String f i j x = error not filled in isOrd :: Ord a = a - () isOrd = undefined Cheers, Chris [1] http://okmij.org/ftp/Haskell/types.html#partial-sigs ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: MVar and Par ..
Mozhgan Kabiri wrote: Hi .. Hope you are doing well . I've just joined this group. Hi. Recently, I am struggling to do some simple experiment with haskell language about parallelism and wrong answers that we can get while using a shared variable . Your goal is still unclear. Are you trying to create an example which shows unexpected answers? Are you trying to create an example which shows the expected answer? I tried to write a simple program, for example calculationg 'n=n+1' few times.And then I tried to do it in parallel by using 'par' and 'pseq' . The aim was to get the wrong answer because we have to share a variable here,and without using 'MVar' function we will get the wrong answer for the calculation . MVar is a mutable storage cell. One can use forkIO to create IO threads which race to change MVar, and if this is done badly then you can get unexpected answers. One cannot use the pure par and pseq to launch IO threads, so you cannot use par and pseq to create race conditions, so one can only get the expected answer. I don't know how to write it in parallel in order to get a wrong answer when we don't use MVar,because we have a shared variable here. I read about MVars as well,but also I don't know how to combine MVar and Par together to get the program to work. I do not immediately see how MVar and par can be sensibly combined at all. I wrote this : module Main where f :: Int - Int - Int f i n = g 1 i n where g x i n | x = i = g (x+1) i (n+1) | otherwise = n main :: IO () main = do putStrLn starting... let r = f 10 5 putStrLn (show r) putStrLn finished I want to make to work in parallel by using 'Par'.And also use MVar for this simple example to work. All of the example about MVar are a little bit complicated and I couldn't figure it that how can I write one,the same ! Can any one help me with this ? I want a simple example that I can feel the need of MVar when I run my program in parallel and while I am using a shared variable. Regards; Mozhgan I have run out of time, so I leave this part of your question to others. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: File name encodings
Duncan Coutts wrote: Yet another reason why FilePath /= String (except on Windows where it does). Duncan Well, it is not a OS issue but a FileSystem issue. OS X is a Unix, but the main filesystem is HFS+ which has Unicode names, though they use a different normalization. So FilePath == String on most OS X systems. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: ANNOUNCE: haskell-src-exts 0.4.4
Is there anyway to track down and fix why haskell-src-exts-0.4.4.1 still gets haddock: parse error in doc string when I try to get cabal to haddock the package? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: ANNOUNCE: haskell-src-exts 0.4.4
Ross Paterson wrote: On Wed, Dec 10, 2008 at 05:34:16PM +, ChrisK wrote: Is there anyway to track down and fix why haskell-src-exts-0.4.4.1 still gets haddock: parse error in doc string when I try to get cabal to haddock the package? Line numbers would be handy. In this case, the problem is the Original: line in some of the module headers. Indeed, I just found that. I wonder where in the universe this is documented. Oh well. I also had to change -- $x and -- $( comments. A darcs diff -u patch is attached. Cheers, Chris diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Build.hs new-haskell-src-exts/Language/Haskell/Exts/Build.hs --- old-haskell-src-exts/Language/Haskell/Exts/Build.hs 2008-12-10 17:46:47.0 + +++ new-haskell-src-exts/Language/Haskell/Exts/Build.hs 2008-12-10 17:46:47.0 + @@ -1,7 +1,6 @@ - -- | -- Module : Language.Haskell.Exts.Build --- Original: Language.Haskell.Syntax -- Copyright : (c) The GHC Team, 1997-2000, --(c) Niklas Broberg 2004 -- License : BSD-style (see the file LICENSE.txt) diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Lexer.hs new-haskell-src-exts/Language/Haskell/Exts/Lexer.hs --- old-haskell-src-exts/Language/Haskell/Exts/Lexer.hs 2008-12-10 17:46:47.0 + +++ new-haskell-src-exts/Language/Haskell/Exts/Lexer.hs 2008-12-10 17:46:47.0 + @@ -2,7 +2,6 @@ - -- | -- Module : Language.Haskell.Exts.Lexer --- Original: Language.Haskell.Lexer -- Copyright : (c) The GHC Team, 1997-2000 --(c) Niklas Broberg, 2004 -- License : BSD-style (see the file LICENSE.txt) @@ -83,8 +82,8 @@ | THDecQuote-- [d| | THTypQuote-- [t| | THCloseQuote -- |] -| THIdEscape (String) -- $x -| THParenEscape -- $( +| THIdEscape (String) -- dollar x +| THParenEscape -- dollar ( | THVarQuote-- 'x (but without the x) | THTyQuote -- ''T (but without the T) diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/ParseMonad.hs new-haskell-src-exts/Language/Haskell/Exts/ParseMonad.hs --- old-haskell-src-exts/Language/Haskell/Exts/ParseMonad.hs2008-12-10 17:46:47.0 + +++ new-haskell-src-exts/Language/Haskell/Exts/ParseMonad.hs2008-12-10 17:46:47.0 + @@ -2,7 +2,6 @@ - -- | -- Module : Language.Haskell.Exts.ParseMonad --- Original: Language.Haskell.ParseMonad -- Copyright : (c) The GHC Team, 1997-2000 -- License : BSD-style (see the file libraries/base/LICENSE) -- diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/ParseUtils.hs new-haskell-src-exts/Language/Haskell/Exts/ParseUtils.hs --- old-haskell-src-exts/Language/Haskell/Exts/ParseUtils.hs2008-12-10 17:46:47.0 + +++ new-haskell-src-exts/Language/Haskell/Exts/ParseUtils.hs2008-12-10 17:46:47.0 + @@ -2,7 +2,6 @@ - -- | -- Module : Language.Haskell.Exts.ParseUtils --- Original: Language.Haskell.ParseUtils -- Copyright : (c) Niklas Broberg 2004, --(c) The GHC Team, 1997-2000 -- License : BSD-style (see the file LICENSE.txt) diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Parser.ly new-haskell-src-exts/Language/Haskell/Exts/Parser.ly --- old-haskell-src-exts/Language/Haskell/Exts/Parser.ly2008-12-10 17:46:47.0 + +++ new-haskell-src-exts/Language/Haskell/Exts/Parser.ly2008-12-10 17:46:47.0 + @@ -2,7 +2,6 @@ - -- | -- Module : Language.Haskell.Exts.Parser - -- Original: Language.Haskell.Parser -- Copyright : (c) Niklas Broberg 2004, --Original (c) Simon Marlow, Sven Panne 1997-2000 -- License : BSD-style (see the file LICENSE.txt) diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Pretty.hs new-haskell-src-exts/Language/Haskell/Exts/Pretty.hs --- old-haskell-src-exts/Language/Haskell/Exts/Pretty.hs2008-12-10 17:46:47.0 + +++ new-haskell-src-exts/Language/Haskell/Exts/Pretty.hs2008-12-10 17:46:47.0 + @@ -2,7 +2,6 @@ - -- | -- Module : Language.Haskell.Exts.Pretty --- Original: Language.Haskell.Pretty -- Copyright : (c) Niklas Broberg 2004, --(c) The GHC Team, Noel Winstanley 1997-2000 -- License : BSD-style (see the file LICENSE.txt) diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Syntax.hs new-haskell-src-exts
[Haskell-cafe] Re: ANNOUNCE: haskell-src-exts 0.4.4
Niklas Broberg wrote: Fellow Haskelleers, it is my pleasure to announce the new release of the haskell-src-exts package, version 0.4.4: The full list of pragmas supported by 0.4.4 is: SOURCE, RULES, DEPRECATED, WARNING, INLINE, NOINLINE, SPECIALISE, CORE, SCC, GENERATED and UNPACK. Ah, excellent. The hprotoc program generates possibly mutually recursive modules and when it does these need fixing up by hand to add SOURCE pragmas and boot files. I could use your new package to automatically break the recursion. Interesting. Thanks, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: The Knight's Tour: solutions please
Hmmm... it seems that n=63 is a special case. [EMAIL PROTECTED] wrote: Yes, there is a solution for n=99 and for n=100 for that matter -- which can be found under one second. I only had to make a trivial modification to the previously posted code tour n k s b | k n*n = return b | otherwise = do next - (foldr mplus mzero).map return $ successors n b s tour n (k+1) next $ insrt next k b I replaced foldl1 mplus with foldr mplus zero. The old version sees no solution to n=63 quite quickly: time nice ./fromwiki-63 63 fromwiki-63: Prelude.foldl1: empty list real0m0.285s user0m0.172s sys 0m0.026s The version with the 'tour' given above does not halt after running up to 0.4 GB of RAM, so I killed it. Though having no solution may be tied to starting in the corner. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Question about fastcgi
Er, no. A fastcgi executable is (like a cgi executable) controlled by the front end web server. I run my fastcgi using Apache as the front end. The front end web server will control things like the port number. Mauricio wrote: Hi, I'm learnng to use fastcgi and, reading the examples, I see the main loop is like this: main = runFastCGI my_work However, isn't a fastcgi program supposed to choose a port where to listen to calls? For instance, in this C example: xzdev.com/nginx_fastcgi.html doesn't the line listen_socket = FCGX_OpenSocket(:8002, 2000); says it's listening to port 8002? I read the code for fastcgi, from hackage, and I can't find anything related to ports like, for instance, a default port. Am I understanding something the wrong way? Thanks, Maurício ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Question about fastcgi
I have only used this, all of these are from Haskell: pamac-cek10:~ chrisk$ cat /etc/apache2/other/httpd-fastcgi.conf IfModule mod_fastcgi.c Alias /fcgi-bin/ /Library/WebServer/FastCGI-Executables/ Directory /Library/WebServer/FastCGI-Executables/ AllowOverride None Options None Order allow,deny Allow from all SetHandler fastcgi-script Options +ExecCGI /Directory FastCgiIpcDir /tmp/fastcgi FastCgiServer /Library/WebServer/FastCGI-Executables/hw.fastcgi -pass-header Cookie FastCgiServer /Library/WebServer/FastCGI-Executables/test.fastcgi -pass-header Cookie FastCgiServer /Library/WebServer/FastCGI-Executables/xwords.fastcgi -pass-header Cookie /IfModule The above is included from the main httpd.conf which has: pamac-cek10:~ chrisk$ grep -i fast /etc/apache2/httpd.conf LoadModule fastcgi_module libexec/apache2/mod_fastcgi.so ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Efficient parallel regular expressions
The regex-tdfa package (and regex-posix) implement subexpressions capture. So if you want to match alpha beta and gamma in parallel you could write (alpha)|(beta)|(gamma) and check which subexpression has the non-empty match. This becomes slightly complicated if there are parenthesis and captures inside alpha beta or gamma. Then you need to compute the indices that are the top level captures. In particular, the regex-tdfa package (get the latest from http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa ) will create a DFA and run through the input once without backtracking. It will find the leftmost-longest match, so the order of the branches only matters if there is a tie in length. If you need to be left-biased then you need a perl-style engine, and you can use the regex-pcre or pcre-light haskell package and the PCRE library. These are obtainable from Hackage. I doubt PCRE uses a simple DFA... Cheers, Chris Martijn van Steenbergen wrote: Hello all, For my mud client Yogurt (see hackage) I'm currently working on improving the efficiency of the hooks. Right now several hooks, each consisting of a regex and an action can be active at the same time. Every time a line of input is available (usually several times a second) I run the line through all the available regexes and execute the first matching action. I figured this is not the cleverest approach and it'd be better if I |'ed all regexes into one big DFA. However, how do I then find out which of the original hooks matched and so which action to execute? As far as I know there's no way to do that with Text.Regex. Alex looks promising but is really only an executable and doesn't offer an API. I've also found mr. Jo�o Saraiva's HaLex but I don't know if that was meant to be used seriously. Does anyone have any experience with this? What's the best way to achieve this? Thanks much, Martijn. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Climbing up the shootout...
And, though I had never seen it before, the current winner for speed is ATS ( http://www.ats-lang.org/ ) which is dependently-typed functional language. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ANNOUNCE: protocol-buffers-0.2.9 for Haskell is ready
Hello one and all, Amid much rejoicing, my Haskell version of protocol-buffer is now released (version 0.2.9). What is this for? What does it do? Why? Shorter answer: It generates Haskell data types that can be converted back and forth to lazy ByteStrings that interoperate with Google's generated code in C++/Java/python. It is a pure Haskell re-implementation of the Google code at http://code.Google.com/apis/protocolbuffers/docs/overview.html which is ...a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more. Google's project produces C++, Java, and Python code. This one produces Haskell code. The release tarball (with 3 Haskell packages inside, see README in source) is at http://hackage.haskell.org/cgi-bin/hackage-scripts/package/protocol-buffers The darcs repository has moved to http://darcs.haskell.org/packages/protocol-buffers2/ You will also need a recent ghc compiler, the binary package and the utf8-string package from hackage.haskell.org (same site as mentioned above). The source compiles to 3 things: 1) the package protocol-buffers with the library API 2) the package protocol-buffers-descriptor with the descriptor.proto code 3) The 'hprotoc' executable which is a command line program similar to 'protoc'. The examples sub-directory in the code has the Haskell version of the addressbook.proto example and is compatible with Google's similar example code. The code generated from unittest.proto (and unittest_import.proto) includes messages TestAllTypes and TestAllExtensions which have been extensively tested by QuickCheck to ensure they can be wire encoded and decoded (see the tests sub-directory in the code). The user API, as exported by Text.ProtocolBuffers, allows for converting messages back and forth to the lazy ByteString type. And such messages can be merged, and the defaults accessed via the MessageAPI type class. The messages in Haskell as just regular data types and are thus immutable. Required types are simple record fields, optional types are Maybe, and repeated types are Seq (from Data.Sequence). Extensions are supported via Key data that allows access to the extension fields. Extensible messages contain an opaque ext'field entry of type ExtField that contains the map data structure to contain the extension field values. The User API allows for serializing messages as the usual series of fields. It also provides for a length prefix to be written to create delimited messages. It also provides to write a wire tag with any field number before the length and message data. This last form looks like a field on the wire, and there is a special api call to read back just the one message and its field number. This last API is similar to the one that is part of the C# API. No benchmarks have been run yet. Any suggestions? Unsupported for the moment is loading and storing unknown fields. It can be added sooner if someone has a use for this. Unsupported indefinitely is code generation for Services and Methods. I have yet to look into how this is presented in the other languages. The API to read a single message field, as mentioned above, might be extended to read any type instead of just messages. optional clever_quote { autrijus Perl: Easy things are easy, hard things are possible autrijus Haskell: Hard things are easy, the impossible just happened } Cheers! Chris Kuklewicz ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: STM and FFI
There are some examples of adding IO actions to commit and rollback events at http://www.haskell.org/haskellwiki/New_monads/MonadAdvSTM Disclaimer: I wrote this instance of the code, but have not used it much. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: [Haskell] The initial view on typed sprintf and sscanf
Matthew Brecknell wrote: Unfortunately, I don't seem to be able to make the expected fprintf function, because printf's format-dependent parameter list makes it impossible to find a place to pass the handle. Hence the C++-like () ugliness. How about this: fprintf :: Handle - F (IO ()) b - fprintf h fmt = write fmt id where write :: F a b - (IO () - a) - b write (FLit str) k = k (hPutStr h str) write FInt k = \i - k (hPutStr h (show i)) write FChr k = \c - k (hPutChar h c) write (FPP (PrinterParser pr _)) k = \x - k (hPutStr h (pr x)) write (a :^ b) k = write a (\sa - write b (\sb - k (sa sb))) *PrintScan fprintf stdout fmt5 15 1.3 '!' abc15cde1.3!*PrintScan The first thing I did last night was change String to type ShowS = String - String : intps :: F a b - (ShowS - a) - b intps (FLit str) k = k (str++) intps FInt k = \x - k (shows x) intps FChr k = \x - k (x:) intps (FPP (PrinterParser pr _)) k = \x - k (pr x ++) intps (a :^ b) k = intps a (\sa - intps b (\sb - k (sa . sb))) sprintfs :: F ShowS b - b sprintfs fmt = intps fmt id Ideally PrinterParser would display using ShowS as well: data PrinterParser a = PrinterParser (a - ShowS) (String - Maybe (a, String)) Or one could use instance witnesses via GADTs to wrap up Show: data F a b where FSR :: (Show b,Read b) = F a (b - a) But I think changing PrinterParser would result in simpler code. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Brainstorming on how to parse IMAP
Hi John, I recently posted new and fancy binary Get monads in http://article.gmane.org/gmane.comp.lang.haskell.libraries/9691 and http://article.gmane.org/gmane.comp.lang.haskell.libraries/9756 which might be of interest since network protocol are usually specified in bytes at the wire level. The latest one takes input which may or may not be complete and returns a stream (a Seq) of results. When it reaches the end of the input it pauses and asks for more. This handling of partial input might be good for network protocols where you can feed the data from the socket to the parser in pieces. (This Get monad eats lazy bytestrings). The latest MyGetW.hs allows the Get code to send a Data.Sequence.Seq of results by using yieldItem (and perhaps flushItems). This is in addition to any final result of the parser. It has the usual binary Get interface, so you can pull bytestrings and words and (fancy) any Storable off the input. I call it fancy because the monad is a transformer, and it is a MonadError/MonadPlus/Alternative, and it supports lookAhead and callCC/MonadCont and Reader/Writer/State. Whew. As for IMAP, I use imapfilter (http://imapfilter.hellug.gr/) which uses Lua. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Brainstorming on how to parse IMAP
I am glad you asked Ben, Short answer: It can return a Seq of your values. The values in the Seq are lazy, the Seq itself is finite. It can return what it has so far before it finishes parsing (or even before the rest of the input has arrived from the network). Ben Franksen wrote: ChrisK wrote: I recently posted new and fancy binary Get monads in http://article.gmane.org/gmane.comp.lang.haskell.libraries/9691 and http://article.gmane.org/gmane.comp.lang.haskell.libraries/9756 which might be of interest since network protocol are usually specified in bytes at the wire level. The latest one takes input which may or may not be complete and returns a stream (a Seq) of results. IIRC Seq is not a 'Stream' but a strict sequence? Or do you meant 'a stream (of Seq)'? I meant it returns many (Seq y), one after the other, while doing parsing in between. Cheers Ben Long answer: The complicated parser looks like this. Start with the run function: runCompGet :: (Monad m,Monoid w) = CompGet a y r w user m a - r - user - L.ByteString - m (CompResult y w user m a) This takes a CompGet and a reader state r and a user state user and the (initial) input L.ByteString (Data.ByteString.Lazy.ByteString). It evaluates to the inner monad 'm' returning a CompResult. The CompResult is a three-fold type: data CompResult y w user m a = CFailed (Seq y) !Int64 String | CFinished (Seq y) !L.ByteString !Int64 w user a | CPartial (Seq y) (Either ( m (CompResult y w user m a) ) ( Maybe L.ByteString - m (CompResult y w user m a) )) All three have (Seq y) which are the Data.Sequence.Seq of things which have been queued by yieldItem. CFailed also has the Int64 count of bytes parsed successfully and an error message String. Nothing more can be done. CFinished also has the unused tail of the input as a L.ByteString and an Int64 of the bytes consumed. And the output of the writer w, the final user state, and lastly it has the end value returned by the computation which has type 'a'. Nothing more can be done. CPartial is the intermediate result. It also carries Either: Left : the rest of the computation, currently suspended, to continue running. Right: a function from (Maybe ByteString) to the suspended computation. The Left is a result of the flushItems command and is merely a way to return the (Seq y) so far before continuing. The Right is a result of running out of input data. This allows the program to feed more input into the parser which will be appended to all the previous input. One does this by passing (Just someByteString) to the function. If the parser again runs out of data it will again return CPartial with a Right value. Alternatively, one can pass Nothing. This tells the parser that there will never ever be more input. The parser will never ask for (though it may flushItems and return a Right valued CPartial). A key thing about the (Seq y) is that yielded items are only returned once. The CPartial may be returned many times and each time it will have an empty list or fresh list of (Seq y). The values in the Seq are lazy, the Seq itself is finite. To collect all the value the caller has to concatenate all the (Seq y)'s that are returned during parsing. As for parsing, the module offers the usual BinaryParser interface (package binary-strict) and has an interface which mostly overlaps Data.Binary.Get (package binary). For example: it has getByteString and getWord64be and getStorable. You don't have to use the yieldItem command. You can just return the results in the final a return value (or the user state or the writer w value). In this situation you only get an answer when CFinished is returned (and nothing if CFailed is returned). I could not use the writer mechanism for yield-ing because the listen and pass parts of the MonadWriter class ensure it has the wrong semantics. You might wonder: *) If the parser code uses MonadPlus to give several alternatives *) The first alternative gets more input via CPartial (perhaps several times) *) The first alternative then fails *) The second alternative starts parsing from the same position the first did *) Does the second alternative see the new input passed to CPartial earlier? The answer is yes. Changes to the input stream by appending with CPartial affect the whole computation and are never rolled back. You might wonder: *) If the first alternative calls yieldItem foo *) The first alternative fails *) The second alternative calls flushItems *) Does the CPartial (Seq y) contain foo? The answer is yes. Items yielded are never rolled back. [ Doing all of this in the presence of throwError/mzero/fail and lookAhead* and callCC was interesting to code. ] Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org
[Haskell-cafe] Re: ANN: haskell-src-exts 0.3.5
Thanks for the fix. I have gotten the darcs version and I am compiling... Niklas Broberg wrote: Hi all, I'm pleased to report that haskell-src-exts is now updated to understand Template Haskell syntax (it used to understand pre-6.4 TH, but now it works with the current version). At least I hope so, I didn't have much TH code to try it on so if you find some bugs just let me know. It wasn't all that hard to fix though, so I'm pretty confident it actually works. Get the code: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/haskell-src-exts-0.3.5 darcs get http://code.haskell.org/HSP/haskell-src-exts Cheers, /Niklas ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Template Haskell and haskell-src-exts
Hi, Can one represent the ''Type template Haskell syntax of $( makeMergeable ''FileDescriptorProto ) in haskell-src.exts Language.Haskell.Exts.Syntax ? And what are the HsReify data (e.g. HsReifyType and HsReifyDecl and HsReifyFixity )? I don't see any pretty print capability to produce the ''Type so I am wondering what else I might use... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: A Monad for on-demand file generation?
Joachim Breitner wrote: * The 5th line does not have this effect. Because this gets desugared to (), the special implementation of () means that the next line still sees the same dependency state as the before the call to liftIO. You are violating the monad laws. (f k) and (f = \_ - k) should do the same thing. You might write a version of liftIO that has the effect you want, however. * A change to inFile3 causes outFile1 to be re-written, although from looking at the code, _we_ know that this is not necessary, but the ODIO monad can not tell. The programmer should have swapped the lines. Let me reverse engineer your algorithm (aside from the screwy ): Every readFile that is encountered in processing ODIO is added to a list of source files. The reading deferred to be lazy with unsafePerformIO. When a writeFile is encountered it is assumed to depend on all previously read files. If this output file already exists and is newer than all the source files, then writing it is skipped (and perhaps also the lazy reads are skipped). Otherwise, the writing is strict. I would say this is an unusual module. I rather prefer Makefile semantics, which could be improved in some ways by using a DSL in Haskell instead. The syntactic form of a file-oriented Makefile declaration is output : input1 input2 shell script more shell script And the shell script has access to the output file name, and also has access to the input names. In Haskell you could have a monadic DSL where the output name (and perhaps some explicit input names) are accessible like MonadReader. The result of running the DSL would do no IO at all but, much like a compiler, would return an IO action (the program to create the output file) and a list of inferred dependencies (an improvement over the Makefile syntax). Even if the DSL does not allow liftIO, it can still compile to various IO actions. Then you have a map from (outputname) to (dependencies,ioAction). And when outputname is demanded you can walk the dependencies to see if the timestamps are newer or older, using the ioActions to create the desired files. So perhaps to run the DSL monad you have a function like: makeRule :: DSL () - FilePath - [FilePath] - ( [FilePath], IO () ) type Depends = Map FilePath ([FilePath], IO ()) demand :: Depends - FilePath - Maybe ByteString ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: A Monad for on-demand file generation?
hen, the readFileOD could put the timestamp of the read file in a Monad-local state and the writeFileOD could, if the output is newer then all inputs listed in the state, skip the writing and thus the unsafeInterleaveIO’ed file reads are skipped as well, if they were not required for deciding the flow of the program. How is your system similar to make/Makefile or different to make/Makefile ? Are your actions more restricted? Are the semantics more imperative? Are the dependencies still explicit or are them implicit and inferred? -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Writing an 'expect'-like program with runInteractiveCommand
Are you adjusting 'System.IO.hSetBuffering' to NoBuffering for those handles? Graham Fawcett wrote: Hi folks, I would like to communicate with an external, line-oriented process, which takes a sequence of one-line commands, each returning an arbitrary number of lines, and waits for another command after each response. So, something like: sendCmd :: (Handle, Handle) - String - IO [String] ... main = do handles - connectToExternalProcess sendCmd handles do something resp - sendCmd get results -- needs strict I/O, before quit? sendCmd quit mapM_ putStrLn resp I've tried using runInteractiveCommand, and several combinations of hFlush, hWaitForInput, etc., but I can't find a combination that actually works. I know this is a sketchy description, but can anyone offer some sample code, or point me toward a program that has similar behaviour? Thanks, Graham ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Caching the Result of a Transaction?
The garbage collector never gets to collect either the action used to populate the cached value, or the private TMVar used to hold the cached value. A better type for TIVal is given below. It is a newtype of a TVal. The contents are either a delayed computation or the previously forced value. Thew newTIVal(IO) functions immediately specify the delayed action. The newEmptyTIVal(IO) functions create a private TMVar that allows the delayed action to be specified once later. Note the use of tryPutTMVar to return a Bool instead of failing, in the event that the user tries to store more that one action. When force is called, the previous action (and any private TMVar) are forgotten. The garbage collector might then be free to collect them. -- Chris -- By Chris Kuklewicz (April 2008), public domain module TIVal(TIVal,newTIVal,newTIValIO,force,cached) where import Control.Applicative(Applicative(..)) import Control.Concurrent.STM(STM,TVar,newTVar,newTVarIO,readTVar,writeTVar ,TMVar,newEmptyTMVar,newEmptyTMVarIO,tryPutTMVar,readTMVar) import Control.Monad(Monad(..),join,liftM2) import System.IO.Unsafe(unsafePerformIO) newtype TIVal a = TIVal (TVar (Either (STM a) a)) -- the non-empty versions take a computation to delay newTIVal :: STM a - STM (TIVal a) newTIVal = fmap TIVal . newTVar . Left newTIValIO :: STM a - IO (TIVal a) newTIValIO = fmap TIVal . newTVarIO . Left -- The empty versions stage things with a TMVar, note the use of join -- Plain values 'a' can be stored with (return a) newEmptyTIVal :: STM ( TIVal a, STM a - STM Bool) newEmptyTIVal = do private - newEmptyTMVar tv - newTVar (Left (join $ readTMVar private)) return (TIVal tv, tryPutTMVar private) newEmptyTIValIO :: IO ( TIVal a, STM a - STM Bool ) newEmptyTIValIO = do private - newEmptyTMVarIO tv - newTVarIO (Left (join $ readTMVar private)) return (TIVal tv, tryPutTMVar private) -- force will clearly let go of the computation (and any private TMVar) force :: TIVal a - STM a force (TIVal tv) = do v - readTVar tv case v of Right a - return a Left wait - do a - wait writeTVar tv (Right a) return a -- Conal's cached function. This is actually safe. cached :: STM a - TIVal a cached = unsafePerformIO . newTIValIO -- The instances instance Functor TIVal where f `fmap` tiv = cached (f `fmap` force tiv) instance Applicative TIVal where pure x = cached (pure x) ivf * ivx = cached (force ivf * force ivx) instance Monad TIVal where return x = cached (return x) tiv = k = cached (force tiv = force . k) instance Applicative STM where pure x = return x ivf * ivx = liftM2 ($) ivf ivx ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Wrong Answer Computing Graph Dominators
More algebraically, including 'or' for symmtry: and xs = foldr () True xs or xs = foldr (||) False xs The True and False are the (monoid) identities with respect to and || : True x == x x True == x False || x == x x || False == x And so an empty list, if defined at all, should be the identity: and [] = True or [] = False In english: and returns false when is any element of the list false? is yes or returns true when is any element of the list true? is yes Matthew Brecknell wrote: Dan Weston wrote: Here, any path means all paths, a logical conjunction: and [True, True] = True and [True ] = True and [ ] = True Kim-Ee Yeoh wrote: Hate to nitpick, but what appears to be some kind of a limit in the opposite direction is a curious way of arguing that: and [] = True. Surely one can also write and [False, False] = False and [False ] = False and [ ] = False ??? No. I think what Dan meant was that for all non-null xs :: [Bool], it is clearly true that: and (True:xs) == and xs -- (1) It therefore makes sense to define (1) to hold also for empty lists, and since it is also true that: and (True:[]) == True We obtain: and [] == True Since we can't make any similar claim about the conjuctions of lists beginning with False, there is no reasonable argument to the contrary. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: semi-closed handles
Ryan Ingram wrote: I usually use something like this instead: hStrictGetContents :: Handle - IO String hStrictGetContents h = do s - hGetContents h length s `seq` hClose h return s A small idiomatic nitpick: When I see (length s) gets computed and thrown away I wince at the wasted effort. I would prefer (finiteSpine s): finiteSpine = foldr (const id) () hStrictGetContents :: Handle - IO String hStrictGetContents h = do s - hGetContents h finiteSpine s `seq` hClose h return s finiteSpine finds the end of a finite list and will hang forever on an infinite list. One can even notice that the type of finiteSpine is Strategy [a]: import Control.Parallel.Strategies(Strategy) finiteSpine :: Strategy [a] finiteSpine = foldr (const id) () And in fact finiteSpine = seqList r0, which returns () after applying the do nothing strategy r0 to every element. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Type families and GADTs in 6.9
The length calculation looked complicated. So I reformulated it as a comparison using HasIndex. But ghc-6.8.2 was not inferring the recursive constraint on proj, so I split proj into proj_unsafe without the constraint and proj with the constraint checked only once. I also renamed ZT to Nil to be more consistent. -- works in ghc 6.8.2 {-# LANGUAGE GADTs, TypeFamilies, EmptyDataDecls, TypeOperators #-} data FZ data FS a data Fin fn where FZ :: Fin FZ FS :: Fin fn - Fin (FS fn) f0 = FZ f1 = FS f0 f2 = FS f1 -- ... etc. data Nil data t ::: ts infixr 4 ::: data Tuple ts where Nil:: Tuple Nil (:::) :: t - !(Tuple ts) - Tuple (t ::: ts) data HTrue type family Lookup ts fn :: * type instance Lookup (t ::: ts) FZ = t type instance Lookup (t ::: ts) (FS fn) = Lookup ts fn type family HasIndex ts fn :: * type instance HasIndex (t ::: ts) FZ = HTrue type instance HasIndex (t ::: ts) (FS fn) = HasIndex ts fn {-# INLINE proj #-} proj :: (HasIndex tsT fnT ~ HTrue) = Fin fnT - Tuple tsT - Lookup tsT fnT proj = proj_unsafe where proj_unsafe :: Fin fnT - Tuple tsT - Lookup tsT fnT proj_unsafe (FS fn) (_v ::: vs) = proj_unsafe fn vs proj_unsafe FZ (v ::: _vs) = v proj_unsafe _ Nil = error Cannot proj Nil in proj_unsafe fst' :: (HasIndex ts FZ ~ HTrue) = Tuple ts - Lookup ts FZ fst' = proj f0 snd' :: (HasIndex ts (FS FZ) ~ HTrue) = Tuple ts - Lookup ts (FS FZ) snd' = proj f1 pair :: Tuple (Char ::: (() ::: Nil)) pair = 'a' ::: () ::: Nil q1 :: Char q1 = fst' pair q2 :: () q2 = snd' pair {- This won't compile q2 :: () q2 = snd' ('a' ::: Nil) -} ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Longest increasing subsequence
It is late, but I was not sleepy enough, so here is my first translation of the algorithm to a functional approach... {- Quote wikipedia: http://en.wikipedia.org/wiki/Longest_increasing_subsequence L = 0 M[0] = 0 for i = 1, 2, ... n: binary search for the largest j ≤ L such that X[M[j]] X[i] (or set j = 0 if no such value exists) P[i] = M[j] if j == L or X[i] X[M[j+1]]: M[j+1] = i L = max(L, j+1) -} {- X[i] defined for i = 1,2,3… So X[0] is not defined. Now, rethink '0' as Nothing, and 1≤j≤L since X[M[0]] is also undefined. Not that after the binary search that one the three conditions holds: X[i] ≤ X[M[1]] The same or a new minimum value P[i] is created and set to Nothing If X[i] X[M[1]] then M[1] is changed to i X[M[j]] X[i] ≤ X[M[j+1]] for some jL A value greater than the minimum and equal or less than the maximum P[i] is set to Just M[j] If X[i] == X[M[j+1]] then there is no need to update M[j+1] If X[i] X[M[j+1]] then M[j+1] is changed to i X[M[L]] X[i] A new maximum value P[i] is set to Just M[L] M[L+1] is created and set to i L is set to L+1 Wikipedia is too loose. X[M[1]], X[M[2]], …, X[M[L]] is not nondecreasing, but must be strictly increasing. This is really sloppy of wikipedia. The P[i] are just a stack, create a linked list going in and pull apart on way out, will by O(N). If you do not separately track the min and max values, then the algorithm works like this: Make a map mu from the set of X[M[j]] to M[j], starting empty. Make a P as a list of Maybe Int, starting as []. Note that size mu will always by L, and starts off as 0. For i=1,2,3…: do a Data.Map.splitLookup using pivot X[i] to get (map1,m,map2) and find which of the three cases we are in: If there is a null map2 then third case: If empty mu then prepend Nothing to P. else get M[L] from snd (snd (findMax mu)), prepend Just M[L] to P. Create new my by inserting key X[i] with value i to mu. If there is a null map1 then first case (Note that mu cannot be empty): Prepend Nothing to P get min from fst (findMin mu) If X[i] min then make new mu from replacing key X[i] with value i with (Data.Map.updateMin). Otherwise this is the middle case and map1 and map2 are both non-empty. Get M[j] from (snd (findMax map1)) and prepend Just M[j] to P. If 'm' is (Just {}) then X[i] X[M[j+1]] and do not change mu else change mu to Data.Map.adjust key X[i] with value i on mu Also: keep track of the length of P, which is ultimate N, where 1≤i≤N. Each operation in the loop with index i is order log (i-1) Note that the j is never explicitly tracked. It is implicit in the order of the keys of the map mu. Once you are done, you have a maximum subsequence length of (size mu), and the stack P is just the P[i]'s in reverse order. You can get the last index i of the longest subsequence from (snd (findMax mu)) and backtrack to get the other i's by carefully popping the stack P (in a single traversal) and keeping only the indices you need until you reach a Nothing. -} ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Longest increasing subsequence
My late night suggestions were nearly correct. I have actually written the code now. Once keeping track of indices, and a second time without them: {-# LANGUAGE BangPatterns #-} -- By Chris Kuklewicz, copyright 2008, BSD3 license -- Longest increasing subsequence -- (see http://en.wikipedia.org/wiki/Longest_increasing_subsequence) import Data.List (foldl') import Data.Map (Map) import qualified Data.Map as M (empty,null,insert,findMin,findMax ,splitLookup,deleteMin,delete) type DList a = [a] - [a] lnds :: Ord a = [a] - [a] lnds = lnds_decode . lnds_fold lnds_fold :: Ord a = [a] - Map a (DList a) lnds_fold = foldl' process M.empty where -- The Map keys, in sorted order, are the input values which -- terminate the longest increasing chains of length 1,2,3,… process mu x = case M.splitLookup x mu of (_,Just {},_) - mu -- ignore x when it is already an end of a chain (map1,Nothing,map2) | M.null map2 - -- insert new maximum element x if M.null mu then M.insert x (x:) mu -- x is very first element else let !xs = snd (M.findMax mu) in M.insert x (xs . (x:)) mu | M.null map1 - -- replace minimum element with smaller x M.insert x (x:) (M.deleteMin mu) | otherwise - -- replace previous element oldX with slightly smaller x let !xs = snd (M.findMax map1) !oldX = fst (M.findMin map2) -- slightly bigger key !withoutOldX = M.delete oldX mu in M.insert x (xs . (x:)) withoutOldX lnds_decode :: Ord a = Map a (DList a) - [a] lnds_decode mu | M.null mu = [] | otherwise = snd (M.findMax mu) [] tests = [ ['b'..'m'] == (lnds $ ['m'..'s'] ++ ['b'..'g'] ++ ['a'..'c'] ++ ['h'..'k'] ++ ['h'..'m'] ++ ['d','c'..'a']) , == lnds , a == lnds a , a == lnds ba , ab == lnds ab ] Comparing to wikipedia: The X[M[1]],X[M[2]],… sequence is strictly increasing. These are the ends of the current increasing chains of length 1,2,… and they are the keys to the Map in my code. The values of the map are the subsequences themselves, in DList form. Instead of pointing to the index of the previous element I just lookup '!xs' and append '(x:)' to that. Complexity: The strictness annotations ensure that the garbage collector can destroy any unreachable DList entries. The space usage is thus O(N) and may be O(1) for certain inputs (such as the best case of never-increasing input list). A strictly increasing input list is the worst case for space usage. The naive time complexity of 'process' for the i'th input value is O(log i). This can be double checked by looking at the time complexity of everything I import from Data.Map. Peak performance could be had by (1) adding the first element before the foldl' to avoid checking for this case in process (2a) accessing the internal map structure to optimize the splitLookup-delete-insert case into a single operation (2b) Using something like a zipper to access the to-be-deleted-and-replaced element of the map The (2a) and (2b) work because we know the changed key will go into the same 'slot' of the map as the old one. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Dynamic typing makes you more productive?
Jules Bean wrote: Justin Bailey wrote: From a recent interview[1] with the guy leading Ruby development on .NET at Microsoft: You spend less time writing software than you spend maintaining software. Optimizing for writing software versus maintaining software is probably the wrong thing to do. Static typing makes it harder to maintain software because it's harder to change it. It's interesting to say that, because not only is it completely untrue, but the opposite is in fact true. I would make the following statement: Static typing makes it easier to maintain software because it's easier to change it. It depends on type inferencing and tool support. If you change a type in Java (or C++) then you will have to change its declaration at each point of use in the project. This is only easy with tool support. In Ruby or Python there are few declarations to change -- but you have to be sure it is safe. In Haskell, the type inference meas there are few declarations to change -- and it catches all the unsafe usages. If you have many declarations then you are going to wish for more tool support. When you change the type of something in a program (be it statically dynamically typed) you have to change all uses of it. If your program is dynamically typed, you have to work very hard to make sure you catch all instances, perhaps by having an enormous test suite, perhaps by having a powerful IDE with semantically aware search and replace. A common source of bugs is making a partial change in this way, where a rarely-tested code path develops a semantic bug from a 'far-away change'. If your program is statically typed, the compiler tells you all the places you need to change. Job done. Therefore I find that generally speak change/refactoring is an order of magnitude easier in haskell than, say, ruby. Jules ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Small displeasure with associated type synonyms
Tom Schrijvers wrote: Stefan, I tried lexically scoped type variables, but to no avail: instance forall a b. (C a, C b) = C (a, b) where type T (a, b) = (T a, T b) val = (val :: T a, val :: T b) The problem is ambiguity. The type checker can't determine which val function to use, i.e. which dictionary to pass to val. Assume: instance C Int where type T Int = Int val= 0 instance C Bool where type T Bool = Int val = 1 Now, if you want some val :: Int, which one do you get? The one of C Int of C Bool? Depending on the choice you may get a different result. We can't have that in a deterministic functional language. Hence the error. Adding a type signature doesn't change the matter. I don't see how your example explains this particular error. I agree Int cannot be generalized to (T Int) or (T Bool). I see Stefan's local type signature is not (val :: a) like your (val ::Int) but (val :: T a) which is a whole different beast. And (T a) is the type that ghc should assign here. The C (a,b) instance wants val :: T (a,b), The T (a,b) is declared as (T a, T b). The annotated val returns (T a, T b). One never needs the sort of Int to (T Int) generalization. So what is a better explanation or example to clarify why GHC cannot accept the original code? -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Small displeasure with associated type synonyms
Okay, I get the difference. The T a annotation in val :: T a)and val :: T a does not help choose the C a dictionary. But the val :: a- T a and val (undefined :: a) allows a to successfully choose the C a dictionary. val :: T a fixes T a but does not imply C a. (undefined :: a) fixes a and does imply C a. I now see how the functional dependency works here (which I should have tried to do in the first place -- I should have thought more and relied on the mailing list less). class C a b | a - b is here class C a where type T a = b. So only knowing T a or b does not allow a to be determined. -- Chris Tom Schrijvers wrote: I don't see how your example explains this particular error. I agree Int cannot be generalized to (T Int) or (T Bool). Generalized is not the right word here. In my example T Int, T Bool and Int are all equivalent. I see Stefan's local type signature is not (val :: a) like your (val ::Int) but (val :: T a) which is a whole different beast. Not all that different. As in my example the types T Int, T Bool and Int are equivalent, whether one writes val :: Int, val :: T Int or val :: T Bool. My point is that writing val :: T Int or val :: T Bool does not help determining whether one should pick the val implementation of instance C Int or C Bool. And (T a) is the type that ghc should assign here. As my example tries to point out, there is not one single syntactic form to denote a type. Consider the val of in the first component. Because of val's signature in the type class the type checker infers that the val expression has a type equivalent to T a2 for some a2. The type checker also expects a type equivalent to T a, either because of the type annotation or because of the left hand side. So the type checker must solve the equation T a ~ T a2 for some as yet to determine type a2 (a unification variable). This is precisely where the ambiguity comes in. The type constructor T is not injective. That means that the equation may hold for more than one value of a2 (e.g. for T Int ~ T a2, a2 could be Int or Bool). Hence, the type checker complains: Couldn't match expected type `T a2' against inferred type `T a'. Maybe you don't care what type is chosen, if multiple ones are possible. My example tried to show that this can effect the values computed by your program. So it does matter. For this particular example, it seems that the type checker does not have have more than alternative for a2 in scope. However, it is not aware of that fact. It uniformly applies the same general strategy for solving equations in all contexts. This is a trade-off in type system complexity vs. expressivity. There is an opportunity for exploring another point in the design space here. Tom -- Tom Schrijvers Department of Computer Science K.U. Leuven Celestijnenlaan 200A B-3001 Heverlee Belgium tel: +32 16 327544 e-mail: [EMAIL PROTECTED] url: http://www.cs.kuleuven.be/~toms/ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Create a list without duplicates from a list with duplicates
For Bimap is there anything like Data.Map.insertWithKey ? Stuart Cook wrote: On Sat, Feb 9, 2008 at 7:36 AM, Dan Weston [EMAIL PROTECTED] wrote: If order is important, the new bijective Data.Bimap class http://code.haskell.org/~scook0/haddock/bimap/Data-Bimap.html may be your best bet (I haven't yet tried it myself). Let me try: nub :: (Ord a) = [a] - [a] nub = map snd . Data.Bimap.toAscList . Data.Bimap.fromList . reverse . zip [1..] nub hello, world! helo, wrd! Without the call to (reverse), this would still be an order-preserving nub, except that it would preserve the relative order of the *last* occurrence of each element. Actually, this makes me wonder whether fromList's behaviour should be changed, and whether I should add a non-clobbering variant of insert. Stuart ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: weird ghci behavior with gadts and existentials
Let me add: data ExpGADT t where ExpInt :: Int - ExpGADT Int ExpChar :: Char - ExpGADT Char Which type do you think 'unHide' and 'wierd' should have: unHide h = case h of Hidden (_,e) - e wierd = unHide (Hidden (TyInt,ExpInt 3)) either: unHide :: HiddenTypeExp - ExpGADT t-- Choice 1: Polymorphic t unHide :: HiddenTypeExp - ExpGADT Int -- Choice 2: Monomorphic Int Note that TypeGADT/TyInt are irrelevant to unHide and wierd. Clearly the first choice to unHide violates the encapsulation of 't'. Clearly I cannot choose the 2nd choice, since it might be an ExpChar. So unHide is impossible to write. Chris Casinghino wrote: Hi all, I've been playing a bit with gadts and existentials lately, and I have an example where I don't quite understand the behavior of ghc. The expression in question typechecks sometimes in some versions of ghc, depending on where you write it, and not in other versions. Some other people I've asked report not getting any errors, even when using apparently one of the same versions of ghc I checked. If I create a file which contains: data TypeGADT t where TyInt :: TypeGADT Int data ExpGADT t where ExpInt :: Int - ExpGADT Int data HiddenTypeExp = forall t . Hidden (TypeGADT t, ExpGADT t) weird = case Hidden (TyInt, ExpInt 3) of Hidden (TyInt, e) - e I am able to :load it into ghci just fine (with -fglasgow-exts) with version 6.8.2. However, if I then copy the line: let weird2 = case Hidden (TyInt, ExpInt 3) of Hidden (TyInt, e) - e into ghci, I get a type error. In the HEAD version 6.9, I get a type error on the definition of weird right away when I :load the file. The type error goes away if I add the line weird :: ExpGADT Int before the definition of weird. The type error in question is this: interactive:1:46: Inferred type is less polymorphic than expected Quantified type variable `t' escapes When checking an existential match that binds e :: ExpGADT t The pattern(s) have type(s): HiddenTypeExp The body has type: ExpGADT t In a case alternative: Hidden (TyInt, e) - e In the expression: case Hidden (TyInt, ExpInt 3) of Hidden (TyInt, e) - e So, several questions. 1) Why the discrepancy in behavior between :loading the file and copying in the definition in 6.8.2. It seems like, one way or the other, this should be consistent. 2) Several other people report not getting any errors at all, even people with ghc 6.8.2, one of the versions I tested. What's the right behavior? My guess would be that this should cause no type error, even without the type annotation. The GADT pattern match against TyInt in the case statement should refine the existential type variable t to Int, so no existential type variables are escaping. Is that right? 3) Is there a bug here? Are there two bugs (one, the typing error, two, the difference between ghc and ghci)? Or, do I just not understand what is going on? Sorry for the length of this email! --Chris Casinghino ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: A handy little consequence of the Cont monad
The bit of a mess that comes from avoiding monads is (my version): import Foreign.Marshal.Array(withArray0) import Foreign.Ptr(nullPtr,Ptr) import Foreign.C.String(withCString,CString) This uses withCString in order of the supplied strings, and a difference list ([CString]-[CString]) initialized by id to assemble the [CString]. This is the laziest way to proceed. acquireInOrder :: [String] - (Ptr CString - IO a) - IO a acquireInOrder strings act = foldr (\s cs'io'a - (\cs - withCString s (\c - cs'io'a (cs . (c:)) ) ) ) (\cs - withArray0 nullPtr (cs []) act ) strings id This uses in withCString in reversed order of the supplied strings, and normal list ([CString]) initialized by [] to assemble the [CString]. This is not as lazy since it needs to go to the end of the supplied list for the first IO action. acquireInRerverseOrder :: [String] - (Ptr CString - IO a) - IO a acquireInRerverseOrder strings act = foldl (\cs'io'a s - (\cs - withCString s (\c - cs'io'a (c:cs) ) ) ) (\cs - withArray0 nullPtr cs act ) strings [] Cale Gibbard wrote: Hello, Today on #haskell, resiak was asking about a clean way to write the function which allocates an array of CStrings using withCString and withArray0 to produce a new with* style function. I came up with the following: nest :: [(r - a) - a] - ([r] - a) - a nest xs = runCont (sequence (map Cont xs)) withCStringArray0 :: [String] - (Ptr CString - IO a) - IO a withCStringArray0 strings act = nest (map withCString strings) (\rs - withArray0 nullPtr rs act) Originally, I'd written nest without using the Cont monad, which was a bit of a mess by comparison, then noticed that its type was quite suggestive. Clearly, it would be more generally useful whenever you have a bunch of with-style functions for managing the allocation of resources, and would like to turn them into a single with-style function providing a list of the acquired resources. - Cale ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Security Notice: Buffer overflow fixed in PCRE library
The PCRE library has just fixed a buffer overflow (related to UTF-8 mode). There are several haskell wrappers for the pcre library. If you use a wrapper for the PCRE library (libpcre) then you may want to upgrade the underlying library. http://pcre.org/news.txt states: News about PCRE releases Release 7.6 28-Jan-08 - The main reason for having this release so soon after 7.5 is because it fixes a potential buffer overflow problem in pcre_compile() when run in UTF-8 mode. In addition, the CMake configuration files have been brought up to date. Cheers, Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Why functional programming matters
Simon Peyton-Jones wrote: 1. Small examples of actual code. I particularly like the lazy way of counting change example (also works for picking items off a menu). The code below show 3 approaches : a function for computing the coins used in each way as a verbose list a function for computing just the total number of ways a simply Monoid that does both at once, which a pretty summary display And it has a short but user friendly main function that drives it. The method used is simple. It considers each value of coin in turn, this loop is done by the foldr. The value being folded is a list where the index into the list is an amount for which change is being made; the value at that list index is the list or count of the ways to make that amount using the coins considered so far. These exploit laziness since the returned lists are infinite and since 'result' is defined recursively for each different value of coin. The example of defining a Monoid is a clear abstraction or generalization of the first two functions. -- This demonstrates a way to find every eay to make change for a -- given total using a set of coins. -- -- By Chris Kuklewicz, Public Domain import System.Environment(getArgs) import Control.Exception as E(catch) import Control.Monad(when) import Data.List(group) import Data.Monoid(Monoid(mempty,mappend)) computeListOfWays :: [Int] - [[[Int]]] computeListOfWays coins = foldr includeValue noValues coins where noValues = [] : repeat [] includeValue value oldResult = let (unchangedResult,changedResult) = splitAt value oldResult result = unchangedResult ++ zipWith (++) changedResult (map addCoin result) addCoin = map (value:) in result computeCountOfWays :: [Int] - [Integer] computeCountOfWays coins = foldr includeValue noValues coins where noValues = 1 : repeat 0 includeValue value oldResult = let (unchangedResult,changedResult) = splitAt value oldResult result = unchangedResult ++ zipWith (+) changedResult result in result computeWays :: [Int] - [Ways] computeWays coins = foldr includeValue noValues coins where noValues = Ways [[]] 1 : repeat mempty includeValue value oldResult = let (unchangedResult,changedResult) = splitAt value oldResult result = unchangedResult ++ zipWith mappend changedResult (map addCoin result) addCoin (Ways list count) = Ways (map (value:) list) count in result data Ways = Ways [[Int]] Integer instance Monoid Ways where mempty = Ways [] 0 mappend (Ways list1 count1) (Ways list2 count2) = Ways (list1++list2) (count1+count2) instance Show Ways where show (Ways list count) = unlines (map summary list) ++ Count of Ways = ++ show count ++ \n where summary = show . map (\sub - (length sub,head sub)) . group coins_US :: [Int] coins_US = [1,5,10,25,50] coins_UK :: [Int] coins_UK = [1,2,5,10,20,50] main = do args - getArgs case args of [] - error Pass a number of cents for which to count ways of making change [x] - do n - E.catch (readIO x) (const (error The argument passed needs to be a number)) when (n0) (error The argument passed needs to be a non-negative number) print (computeWays coins_US !! n) _ - error Too many parameters, need just one number ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Why functional programming matters
Achim Schneider wrote: Don Stewart [EMAIL PROTECTED] wrote: jwlato: In addition to STM, another item that should interest serious programmers is forkIO. Lightweight threads that (unlike in Python) can use multiple cpu's. Coming from Python, I personally appreciate this. Using STM to handle concurrency issues often greatly simplifies multithreaded code. And further on this, the use of `par` in pure code to make it go multicore is way beyond what most people think is possible. I said _don't_ make me think of using par on a beowolf cluster of ps3's. Don't you guys have any scruples? Well... ghc still has a single-threaded garbage collector, so all the par threads must stop for garbage collection. So scaling to the level of a cluster would be significantly sub-linear. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: \_ - not equivalent to const $
Luke Palmer wrote: In attempting to devise a variant of cycle which did not keep its argument alive (for the purpose of cycle [1::Int..]), I came across this peculiar behavior: import Debug.Trace cycle' :: (a - [b]) - [b] cycle' xs = xs undefined ++ cycle' xs take 20 $ cycle' (const $ 1:2:3:4:trace x 5:[]) [1,2,3,4,x 5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5] Nuts. Oh, but wait: take 20 $ cycle' (\_ - 1:2:3:4:trace x 5:[]) [1,2,3,4,x 5,1,2,3,4,x 5,1,2,3,4,x 5,1,2,3,4,x 5] Hey, it worked! Can someone explain what the heck is going on here? Luke (\_ - 1:2:3:4:trace x 5:[]) literally could mean your second program, but... the 1:2:3:4:trace x 5:[] does not depend on the _ argument, and so it can be lifted outside the (\_ - ... ) and lazily evaluated once and shared between calls. Optimization in ghc do this for you. The definition const x = (\_ - x) binds 'x' outside of the _ argument, so 'x' is obviously outside (\_ - ...) and will be lazily evaluated once and shared. I see that making the binding and sharing explicit in take 20 $ cycle' (let x = 1:2:3:4:trace x 5:[] in (\_ - x)) [1,2,3,4,x 5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5] behaves like const. And pushing the binding inside the (\_ - ...) take 20 $ cycle' (\_ - let x = 1:2:3:4:trace x 5:[] in x) [1,2,3,4,x 5,1,2,3,4,x 5,1,2,3,4,x 5,1,2,3,4,x 5] behaves like your second example. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Haskell and GUI
The advice below is for Mac OS X 10.4 and below. Starting with Mac OS X 10.5 (Leopard) the DISPLAY is set for you by the operating system. Mine is currently /tmp/launch-sQZXQV/:0 which looks very strange because it is used to cause the launchd daemon to start the X server on demand (i.e. lazily). Yitzchak Gale wrote: btw, if you use GTK2HS on the Mac, don't forget to start up X Windows support, and run export DISPLAY=:0.0 in your terminal window, before you run your program. X Windows is usually in Applications/Utilities, but only if you installed it manually from the Mac OS X discs, it is not installed by default. Good news - Gtk2 now has native support for the Mac, so you probably won't need to run X starting with the next version of GTK2HS. Regard, Yitz -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Implicit parameters and Arrows/Yampa?
Could I has one question? What is the purpose of the stream function in the ArrowLoop instance? Is it just to catch an unexpected [] at runtime? 8 module Main where import Control.Arrow import Control.Arrow.Operations import Control.Arrow.Transformer.Reader -- -- Standard list/stream arrow. -- newtype SF b c = SF { runSF :: [b] - [c] } instance Arrow SF where arr f = SF (map f) SF f SF g = SF (g . f) first (SF f) = SF (uncurry zip . (f *** id) . unzip) second (SF f) = SF (uncurry zip . (id *** f) . unzip) instance ArrowLoop SF where loop (SF f) = SF $ \as - let (bs,cs) = unzip (f (zip as (stream cs))) in bs where stream ~(x:xs) = x:stream xs It looks like stream is (almost) an identity which would crash at runtime if it encountered a []. In particular it is equivalent to where stream xs = head xs:stream (tail xs) instance ArrowCircuit SF where delay x = SF (init . (x:)) -- -- Some state we want to pass around without manual plumbing. -- data AudioState = AudioState { sampleRate :: Double } runAudio state graph = proc p - (| runReader (graph - p) |) state -- -- Some unit generators for audio. -- wrap x = x - fromIntegral (floor x) -- phasor needs the sample rate phasor phase0 = proc hz - do sr - pure sampleRate readState - () rec accum - delay (wrap phase0) - wrap (accum + hz / sr) returnA - accum -- osc doesn't need to know about sample rate osc phase0 = proc hz - do phase - phasor phase0 - hz returnA - cos (2 * pi * phase) -- -- Test it out. -- main = print (runSF (runAudio (AudioState{sampleRate=1000}) (osc 0)) (replicate 10 100)) 8 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ANN: Build fixed for regex-base,posix,compat,pcre
ANNOUNCEMENT: Build fixed for regex-base, regex-posix, regex-compat, regex-pcre The changes are mainly to the Cabal build files to support ghc-6.8 and ghc-6.6 simultaneously. They definitely work with cabal version 1.2.3.0 (required for regex-pcre). The regex-base, regex-posix, and regex-compat packages also seem to work with cabal-1.2.2.0 (but not regex-pcre). The darcs repositories under http://darcs.haskell.org/packages/regex-unstable/ hold the new code. I have uploaded new versions of the most popular regex-* packages to hackage: Mon Jan 7 13:18:00 PST 2008 ChrisKuklewiczregex-pcre-0.94.1 Mon Jan 7 13:05:57 PST 2008 ChrisKuklewiczregex-posix-0.93.1 Mon Jan 7 13:04:49 PST 2008 ChrisKuklewiczregex-base-0.93.1 Mon Jan 7 06:50:54 PST 2008 ChrisKuklewiczregex-compat-0.91 I would like to thank everyone who sent me updated, and particularly Olivier Boudry who sent patches. These build fixes have only been tested with ghc-6.6.1 on Mac OS X 10.5.1 (powerpc) and with ghc-6.8.2 on Linux (x86). Happy New Year, Chris Kuklewicz ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Quanta. Was: Wikipedia on first-class object
Brandon S. Allbery KF8NH wrote: On Jan 6, 2008, at 15:02 , Ketil Malde wrote: More seriously, perhaps quantum enters into the equation in how the brain works, perhaps it is even necessary for thought. However, I get worried it's just another mystical mantra, a gratuitous factor that, lacking any theory about how and what it does, adds nothing to help understanding the issue. The brain, being real, is best modeled by a final theory that physicists have not yet (noticed) written down. how the brain works appears to be though electro- and bio- chemistry, which are best modeled/described right now by quantum mechanics. There are observable quantum correlations that cannot be described by a classical theory. So long as the processes you care about (e.g. whatever the hell consciousness is) do not use these non-classical correlations then you can create a simplified model that avoids the complexity of quantum theory. I should not get into these off-topic things... but the viewpoint that worries you is only espoused by people looking for excuses to apply their favorite mystical mantra. Quantum effects are well defined, but nonintuitive (for example, particles tunneling through a barrier). Right. Even if there are some quantum correlations that are used in the brain, then you just use a more complicated mathematical model. Nothing mystical/spooky/special about it. Final note: Quantum tunneling is only surprising if you insist on thinking about particles or photons meeting some kind of impenetrable wall (made of what?). All real barriers are made of forces exerted by other particles. So the whole idea of a barrier is flawed in this regime. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Missing join and split
Mitar wrote: Hi! On Dec 28, 2007 5:51 PM, Lihn, Steve [EMAIL PROTECTED] wrote: Since regex is involved, it is specific to (Byte)String, not a generic list. Oh, this gives me an interesting idea: making regular expressions more generic. The new regex-base API is fairly generic. If you look at the classes in regex-base's Text.Regex.RegexLike: class Extract source = RegexLike regex source where matchAll :: regex - source - [MatchArray] matchOnce :: regex - source - Maybe MatchArray matchCount :: regex - source - Int matchTest :: regex - source - Bool matchAllText :: regex - source - [MatchText source] matchOnceText :: regex - source - Maybe (source, MatchText source, source) you can see that the regex type parameter is fully abstract, and that the source being searched is also fully abstract. The reason for having those specific class methods is to allow for the instance to expose the most efficient way to do each operation. You could make an instance for string seaching (e.g. KMP or BM searching). Pretty much any search or find operation could be made into an instance of RegexLike. The main constraint is that the MatchArray/MatchText use Int indexing and the Extract instance wants to be able to do lookup with this: type MatchOffset = Int type MatchLength = Int type MatchArray = Array Int (MatchOffset, MatchLength) type MatchText source = Array Int (source, (MatchOffset, MatchLength)) class Extract source where before :: Int - source - source after :: Int - source - source empty :: source extract :: (Int, Int) - source - source One benefit is that all the RegexContext instances are implemented by using just the above class methods, so all the polymorphic match/matchM will immediately work. If there is ever a strong need for going beyond the range of Int indexing, then one could either make new variants of the classes, or add methods to the existing ones. But if you are searching over 2GB of something, then perhaps have this generic type class API is not the top priority. Would not it be interesting and useful (but not really efficient) to have patterns something like: foo :: Eq a = a - ... foo (_{4}'b') = ... which would match a list with four elements ending with an element 'b'. Or: foo (_+';'_+';'_) = ... which would match a list with embedded two ';' elements. (Last _ matched the remaining of the list.) OK, maybe guards are not the proper place to implement this as would add a possibility to make a really messy Haskell programs. But extending regular expressions to work on any list of elements with type implementing Eq would be realy powerfull. And then we could use split in many other than just text processing contexts. Of course, the problem in both cases is implementing something like regular expressions efficiently, especially on lists, but this is why there are smart people around. :-) Mitar ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Missing join and split
Albert Y. C. Lai wrote: Mitar wrote: I am really missing the (general) split function built in standard Haskell. I do not understand why there is something so specific as words and lines but not a simple split? The same goes for join. Don't forget Text.Regex.splitRegex. Which is just: matchRegexAll p str = matchM p str {- | Splits a string based on a regular expression. The regular expression should identify one delimiter. This is unsafe if the regex matches an empty string. -} splitRegex :: Regex - String - [String] splitRegex _ [] = [] splitRegex delim str = case matchRegexAll delim str of Nothing - [str] Just (firstline, _, remainder, _) - if remainder == then firstline : [] : [] else firstline : splitRegex delim remainder Inlining the matchRegexAll/matchM means this is 8 lines of code. Any given split function is very short, but there are enough design choices that I think the best library is none at all; the user can write exactly what they want in = 10 lines of code. Though now that I look at it again, I think I like splitRegex :: Regex - String - [String] splitRegex _ [] = [] splitRegex delim strIn = loop strIn where loop str = case matchM delim str of Nothing - [str] Just (firstline, _, remainder) - if null remainder then [firstline,] else firstline : loop remainder slightly better. I'll eventually update the unstable regex-compat. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: what does @ mean?.....
Nicholls, Mark wrote: Hello, I wonder if someone could answer the following… The short question is what does @ mean in mulNat a b | a = b = mulNat' a b b | otherwise = mulNat' b a a where mulNat' x@(S a) y orig | x == one = y | otherwise = mulNat' a (addNat orig y) orig The @ means an as-pattern as defined in the Haskell 98 report section 3.17.1 http://www.haskell.org/onlinereport/exps.html#3.17.1 The 'x' binds to the whole (S a) and the 'a' binds to the parameter of the constructor 's'. There is a possible performance benefit here. Consider: zeroNothing Nothing = Nothing zeroNothing (Just n) = if n == 0 then Nothing else (Just n) versus zeroNothing Nothing = Nothing zeroNothing x@(Just n) = if n == 0 then Nothing else x The first example takes apart the (Just n) and later reconstructs (Just n). Unless the compiler is fairly clever, this will cause the new (Just n) to be a new allocation instead of reusing the input value. The second form uses an at-pattern to bind 'x' to the whole input parameter and the returned 'x' will not need to be reallocaed. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Wikipedia on first-class object
This thread is obviously a source of much fun. I will play too. Cristian Baboi wrote: On Fri, 28 Dec 2007 18:32:05 +0200, Jules Bean [EMAIL PROTECTED] wrote: Cristian Baboi wrote: Let me ask you 3 simple questions. Can one use Haskell to make dynamically linked libraries (DLL on Windows, so on Linux) ? The short answer is yes. The long answer is that this is not a feature of haskell, but rather a feature of certain programs, which happen to be mostly but not entirely written in haskell, in particular, the haskell compiler. GHC can produce dynamically linked libraries. What you choose not to notice is the context in which I asked these questions. The context is: Haskell functions as first-class objects. What I am interested in the first place is dynamically linked libraries written in Haskell and used in Haskell. The interface with other languages like C come second. If yes, what is in them ? Object code, mostly. Sometimes a little data. What is the definition of an entry point in Haskell ? Haskell does not have such a concept. At all. An implementation may have such a concept. Most people on this list define Haskell as any attempt at an implementation of one of the standards which define Haskell, most recently the Hakell 98 standard. This can be nhc / yhc / ghc / hugs / winhugs / helium / jhc. Some of these compile to native code, some compile to byte code for a virtual machine. If an implementation can compile separately, then it might support dynamic libraries. If so then a specific version of that compiler will define its own implementation specific concept of an entry point. What is the semantics of those entry points ? It depends. For recent ghc versions, see its user manual: http://haskell.org/ghc/docs/latest/html/users_guide/ffi-ghc.html#ffi-library http://haskell.org/ghc/docs/latest/html/users_guide/win32-dlls.html Other note: An imperative language, such as C++ or Java, specified the binary output of any instance of the compiler. Class methods will have very specific names and addresses. In C++ you can even get the member-function pointer values and examine the byte offsets in the object. In Java one gets a very specific layout of bytecode in a class file. Haskell is a declarative language. It does not specify anything about the implementation's internals. It specifies only properties like non-strict. The fact that ghc uses lazy evaluation is merely an implementation detail, chosen as a way of satisfying the non-strict requirement of Haskell. The output of a Haskell compiler is free to take all the source code and implement things with jumps and branches that look absolutely nothing like function calls. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Missing join and split
Lihn, Steve wrote: Programmer with perl background would think split like: list of string = split regex original string Since regex is involved, it is specific to (Byte)String, not a generic list. Also it appears one would need help from Text.Regex(.PCRE) to do that. intercalate a (split a xs) = a This identity rule does not hold for perl's join/split if regex is used. Steve Well, libpcre does not have a split function. One can already write a split that uses the high level Regex API. The only reason you might want to dig into regex-pcre's Text.Regex.PCRE would be if it would help efficiency. Specifically, regex-base defines a RegexContext instance which is: ( RegexLike a b = RegexContext a b (b, b, b) ) : The text before the match, the text of the match, the text after the match So you can iteratively generate the pieces that split returns. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Doing some things right
Brian Sniffen wrote: On Dec 28, 2007 6:05 AM, Andrew Coppin [EMAIL PROTECTED] wrote: [I actually heard a number of people tell me that learning LISP would change my life forever because LISP has something called macros. I tried to learn it, and disliked it greatly. It's too messy. And what the heck is cdr ment to mean anyway? To me, LISP doesn't even seem all that different from normal languages (modulo weird syntax). Now Haskell... that's FUN!] Contents of Data Register. Macros are like Template Haskell. One example of where they're useful is programmer definition of new binding forms. That's not possible in Haskell without Templates. Macros were invented in Lisp because the syntax is so easy for machine manipulation---they don't have a tenth the complexity of Template Haskell for about the same power. -Brian There is also Likell, at http://liskell.org/ by Clemens Fruhwir, which translates Haskell source into a lisp style prefix syntax (will (all (the (parentheses. This is to allow macro-like analysis and transformation without template haskell. The main use of template haskell that I have seen mentioned on the mailing lists is to analyze a data declaration and auto-generate some new class and instance declarations. But I agree that template haskell has problems: (*) Its own syntax, as big and complicated as Haskell (*) Has historically been very poorly documented (that _might_ have changed) (*) Only works with ghc -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe