[Haskell-cafe] ANN: BUG FIX release of regex-tdfa-1.1.2

2009-05-05 Thread ChrisK
Hello,

  While occasionally and slowly updating the future version of regex-tdfa I
found a bug that exists in the released 1.1.1 version.  It was just a matter of
passing the wrong value into a function, so was easy to fix when I figured it 
out.

  The test case triggered an impossible error call and so I know this can kill
a program that uses a buggy version of this library.

  Please upgrade regex-tdfa to version 1.1.2 which is available on hackage at:

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa

Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parallel combinator, performance advice

2009-04-07 Thread ChrisK
You create one MVar for each task in order to ensure all the tasks are done.
This is pretty heavyweight.

You could create a single Control.Concurrent.QSemN to count the completed tasks,
starting with 0.

Each task is followed by signalQSemN with a value of 1.  (I would use 
finally).

As parallel_ launches the tasks it can count their number, then it would call
waitQSemN for that quantity to have finished.

-- 
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parallel combinator, performance advice

2009-04-07 Thread ChrisK
Neil Mitchell wrote:
 Sorry, accidentally sent before finishing the response! I also noticed
 you sent this directly to me, not to -cafe, was that intentional?

The mail/news gateway makes it look like that, but I also sent to the mailing 
list.

 You mean something like:

 parallel_ xs =
sem - createSemapore (length xs)
enqueue [x  signalSemapore sem | x - xs]
waitOnSemaphore sem

 I thought of something like this, but then the thread that called
 parallel_ is blocked, which means if you fire off N threads you only
 get N-1 executing. If you have nested calls to parallel, then you end
 up with thread exhaustion. Is there a way to avoid that problem?

 Thanks

 Neil

Your parallel_ does not return until all operations are finished.

 parallel_ (x:xs) = do
 ys - mapM idempotent xs
 mapM_ addParallel ys
 sequence_ $ x : reverse ys

By the way, there is no obvious reason to insert reverse there.

What I meant was something like:

 para [] = return ()
 para [x] = x
 para xs = do
   q - newQSemN 0
   let wrap x = finally x (signalQSemN q 1)
   go [y] n = wrap x  waitQSemN q (succ n)
   go (y:ys) n = addParallel (wrap y)  go ys $! succ n
   go xs 0

This is nearly identical to your code, and avoid creating the MVar for each
operation.  I use finally to ensure the count is correct, but if a worker
threads dies then bas things will happen.  You can replace finally with () if
speed is important.

This is also lazy since the length of the list is not forced early.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parallel combinator, performance advice

2009-04-07 Thread ChrisK
Neil Mitchell wrote:
 
 I guess the nested calls to parallel_ bit is the part of the spec
 that makes everything much harder!
 
 Thanks
 
 Neil

Yes.  Much more annoying.

But the problem here is generic.  To avoid it you must never allow all thread to
block at once.

The parallel_ function is such a job, so you solved this with the 'idempotent'
trick.  You solution works by blocking all but 1 thread.

1a) Some worker thread 1 executes parallel_ with some jobs
1b) These get submitted the work queue 'chan'
1c) worker thread 1 starts on those same jobs, ignoring the queue
1d) worker thread 1 reaches the job being processed by thread 2
1e) worker thread 1 blocks until the jobs is finished in modifyMVar

2a) Worker thread 2 grabs a job posted by thread 1, that calls parallel_
2b) This batch of jobs gets submitted to the work queue 'chan'
2c) worker thread 2 starts on those same jobs, ignoring the queue
1d) worker thread 2 reaches the job being processed by thread 3
1e) worker thread 2 blocks until the jobs is finished in modifyMVar

3...4...5...

And now only 1 thread is still working, and it has to work in series.

I think I can fix this...

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Use unsafePerformIO to catch Exception?

2009-03-23 Thread ChrisK
You should ensure that the result of evaluate is in normal form, not just weak 
head normal form.  You can do this with the Control.Parallel.Strategies module:



import Control.Exception(ArithException(..),try,evaluate)
import Control.Parallel.Strategies(NFData,using,rnf)
import System.IO.Unsafe(unsafePerformIO)

tryArith :: NFData a = a - Either ArithException a
tryArith = unsafePerformIO . try . evaluate . flip using rnf

test :: [Either ArithException Integer]
test =  map (tryArith . (div 5)) [2,1,0,5]

testResult = [Right 2,Right 5,Left DivideByZero,Right 1]

withPair :: Integer - (Integer,Integer)
withPair x = (x,throw Overflow)

main = do
  print (test == testResult)
  print (tryArith (withPair 7))
   print (tryArith' (withPair 7))


in ghci


*Main main
main
True
Left arithmetic overflow
Right (7,*** Exception: arithmetic overflow



This rnf :: Strategy a ensures that the result of evaluate is in normal form. 
 This means it should not have any embedded lazy thunks, so any errors from 
such thunks will be forced while in the scope of the try.


Otherwise a complex type like the result of withPair can hide an error.


Xiao-Yong Jin wrote:

Hi,

I just feel it is not comfortable to deal with exceptions
only within IO monad, so I defined


tryArith :: a - Either ArithException a
tryArith = unsafePerformIO . try . evaluate


and it works quite good as


map (tryArith . (div 5)) [2,1,0,5]


evaluates to


[Right 2,Right 5,Left divide by zero,Right 1]


However, I guess unsafePerformIO definitely has a reason for
its name.  As I read through the document in
System.IO.Unsafe, I can't convince myself whether the use of
'tryArith' is indeed safe or unsafe.

I know there have been a lot of discussion around
unsafePerformIO, but I still can't figure it out by myself.
Can someone share some thoughts on this particular use of
unsafePerformIO?  Is it safe or not?  And why?

Thanks,
Xiao-Yong


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: regex-tdfa 1.1.1

2009-03-22 Thread ChrisK

This is both a bug fix release and a feature release.

The bug fix is a bit embarrassing, the indices were correct but the captured 
text was wrong in version 1.1.0. Oops.


As of version 1.1.1 the following GNU extensions are recognized, all anchors:
\` at beginning of entire text
\' at end of entire text
\ at beginning of word
\ at end of word
\b at either beginning or end of word
\B at neither beginning nor end of word

There is a newSyntax field of CompOptions that can enable/disable this syntax.
These were requested by users of the Yi project.

The home page is on the wiki and UPDATED a bit at
http://haskell.org/haskellwiki/Regular_expressions#regex-tdfa

The HADDOCK fails on hackage for now, but is up at
http://darcs.haskell.org/packages/regex-unstable/regex-tdfa/doc/html/regex-tdfa/Text-Regex-TDFA.html

The released code is on hackage at
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa

The darcs repository is at
http://darcs.haskell.org/packages/regex-unstable/regex-tdfa/

Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: doubts about runGetState in the binary package

2009-03-19 Thread ChrisK

Manlio Perillo wrote:

Hi.

I have some doubts about the runGetState function in the binary package.
The signature is:
runGetState :: Get a - LBS - Int64 - (a, LBS, Int64)


however the Int64 input parameter is not documented.
What value should I pass?
How will be used?


Thanks  Manlio Perillo


hackage has the code at
http://hackage.haskell.org/packages/archive/binary/0.5.0.1/doc/html/src/Data-Binary-Get.html#runGetState

And I have pieced together an answer at the bottom...



-- | The parse state
data S = S {-# UNPACK #-} !B.ByteString  -- current chunk
   L.ByteString  -- the rest of the input
   {-# UNPACK #-} !Int64 -- bytes read

-- | The Get monad is just a State monad carrying around the input ByteString
-- We treat it as a strict state monad. 
newtype Get a = Get { unGet :: S - (a, S) }



mkState :: L.ByteString - Int64 - S
mkState l = case l of
L.Empty  - S B.empty L.empty
L.Chunk x xs - S x xs



-- | Run the Get monad applies a 'get'-based parser on the input
-- ByteString. Additional to the result of get it returns the number of
-- consumed bytes and the rest of the input.
runGetState :: Get a - L.ByteString - Int64 - (a, L.ByteString, Int64)
runGetState m str off =
case unGet m (mkState str off) of
  (a, ~(S s ss newOff)) - (a, s `join` ss, newOff)



getBytes :: Int - Get B.ByteString
getBytes n = do
S s ss bytes - get
if n = B.length s
then do let (consume,rest) = B.splitAt n s
put $! S rest ss (bytes + fromIntegral n)
return $! consume
else

 ...

The Int64 passed to runGetState just initializes the running total of consumed 
bytes.  The updated total is returned by runGetState.  The absolute value of the 
Int64 is never used; it is only increased by getBytes.


Cheers,
  Chris
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: regex-tdfa-1.1.0

2009-03-18 Thread ChrisK
I have just uploaded the new regex-tdfa-1.1.0 to hackage.  This version is a 
small performance update to the old regex-tdfa-1.0.0 version.


Previously all text (e.g. ByteString) being search was converted to String and 
sent through a single engine.


The new version uses a type class and SPECIALIZE pragmas to avoid converting to 
String.  This should make adding support for searching other Char containers 
easy to do.


The new version includes six specialized engine loops to take advantage of 
obvious optimizations of the traversal.  The previous version had only a couple 
of such engines.  The new code paths have been tested for correctness and no 
performance degradations have shown up.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?

2009-03-16 Thread ChrisK

Thomas Hartman wrote:


testPcre = ( subRegex (mkRegex (?!\n)\n(?!\n)) asdf\n \n\n\nadsf
 ) == asdf \n\n\nadsf


quoting from the man page for regcomp:


REG_NEWLINE   Compile for newline-sensitive matching.  By default, newline is a 
completely ordinary character with
  no special meaning in either REs or strings.  With this flag, 
`[^' bracket expressions and `.' never
  match newline, a `^' anchor matches the null string after any 
newline in the string in addition to
  its normal function, and the `$' anchor matches the null string 
before any newline in the string in
  addition to its normal function.


This is the carried over to Text.Regex with


mkRegexWithOpts Source
:: String   The regular expression to compile
- Bool  True = '^' and '$' match the beginning and end of individual 
lines respectively, and '.' does not match the newline character.
- Bool  True = matching is case-sensitive
- Regex Returns: the compiled regular expression
Makes a regular expression, where the multi-line and case-sensitive options can 
be changed from the default settings.


Or with regex-posix directly the flag is compNewline:
http://hackage.haskell.org/packages/archive/regex-posix/0.94.1/doc/html/Text-Regex-Posix-Wrap.html
 The defaultCompOpt is (compExtended .|. compNewline).

You want to match a \n that is not next to any other \n.

So you want to turn off REG_NEWLINE.


import Text.Regex.Compat

r :: Regex
r = mkRegexWithOpts (^|[^\n])\n($|[^\n]) False True  -- False is important 
here



The ^ and $ take care of matching a lone newline at the start or end of the 
whole text.  In the middle of the text the pattern is equivalent to [^\n]\n[^\n].


When substituting you can use the \1 and \2 captures to restore the matched 
non-newline character if one was present.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?

2009-03-16 Thread ChrisK

Thomas Hartman wrote:


testPcre = ( subRegex (mkRegex (?!\n)\n(?!\n)) asdf\n \n\n\nadsf
 ) == asdf \n\n\nadsf


quoting from the man page for regcomp:


REG_NEWLINE   Compile for newline-sensitive matching.  By default, newline is a 
completely ordinary character with
  no special meaning in either REs or strings.  With this flag, 
`[^' bracket expressions and `.' never
  match newline, a `^' anchor matches the null string after any 
newline in the string in addition to
  its normal function, and the `$' anchor matches the null string 
before any newline in the string in
  addition to its normal function.


This is the carried over to Text.Regex with


mkRegexWithOpts Source
:: String   The regular expression to compile
- Bool  True = '^' and '$' match the beginning and end of individual 
lines respectively, and '.' does not match the newline character.
- Bool  True = matching is case-sensitive
- Regex Returns: the compiled regular expression
Makes a regular expression, where the multi-line and case-sensitive options can 
be changed from the default settings.


Or with regex-posix directly the flag is compNewline:
http://hackage.haskell.org/packages/archive/regex-posix/0.94.1/doc/html/Text-Regex-Posix-Wrap.html
 The defaultCompOpt is (compExtended .|. compNewline).

You want to match a \n that is not next to any other \n.

So you want to turn off REG_NEWLINE.


import Text.Regex.Compat

r :: Regex
r = mkRegexWithOpts (^|[^\n])\n($|[^\n]) False True  -- False is important 
here



The ^ and $ take care of matching a lone newline at the start or end of the 
whole text.  In the middle of the text the pattern is equivalent to [^\n]\n[^\n].


When substituting you can use the \1 and \2 captures to restore the matched 
non-newline character if one was present.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?

2009-03-16 Thread ChrisK

Don Stewart wrote:

tphyahoo:

Is there something like subRegex... something like =~ s/.../.../ in
perl... for haskell pcre Regexen?

I mean, subRegex from Text.Regex of course:
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-compat

Thanks for any advice,


Basically, we should have it.


Let me open the discussion with all the questions I can quickly ask:

  What should the subRegex function do, exactly?
  (Single replacement,global replacement,once per line,...)

  What should the replacement template be able to specify?
  (Can it refer to all text before a match or all text after?)
  (Can it access the start/stop offsets as numbers?)

  Should the replacement template be specified in a String?  As an abstract 
data type or syntax tree?  With combinators?


  What happens if the referenced capture was not made?  Empty text?

  How will syntax errors in the template be handled (e.g. referring to a 
capture that does not exist in the regular expression)?


  Will the output text be String? ByteString? ByteString.Lazy? Seq Char?
  Note: String and Strict Bytestrings are poor with concatenation.

  Can the output text type differ from the input text type?

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Alternative to Data.Binary

2009-03-14 Thread ChrisK

Grzegorz Chrupala wrote:

Hi all,
Is there a serialization library other than the Data.Binary from hackage?


Yes.  binary-strict is one alternative:

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary-strict






I am using Data.Binary in a couple of projects, but I have found its stack
and memory usage very hard to control. Its very common that decoding a map
or list of non-trivial size uses up all available RAM, or causes a stack
overflow.

I don't care that much about it being extremely fast, I just want to stop
worrying that if I try to read a file a few percent larger than the last
time, my program will suddenly stop working.

Best,
--
Grzegorz



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?

2009-03-12 Thread ChrisK

Thomas Hartman wrote:

Is there something like subRegex... something like =~ s/.../.../ in
perl... for haskell pcre Regexen?

I mean, subRegex from Text.Regex of course:
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-compat

Thanks for any advice,

thomas.


Short answer: No.

This is a FAQ.  The usual answer to your follow up Why not? is that the design 
space is rather huge.  Rather than justify this statement, I will point at the 
complicated module:


http://hackage.haskell.org/packages/archive/split/0.1.1/doc/html/Data-List-Split.html

The above module is a wide range of strategies for splitting lists, which is a 
much simpler problem than your subRegex request, and only works on lists.  A 
subRegex library should also work on bytestrings (and Seq).


At the cost of writing your own routine you get exactly what you want in a 
screen or less of code, see

http://hackage.haskell.org/packages/archive/regex-compat/0.92/doc/html/src/Text-Regex.html#subRegex
for subRegex which is 30 lines of code.

Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: do nmergeIO or mergeIO preserve order?

2009-03-11 Thread ChrisK

Anatoly Yakovenko wrote:

do nmergeIO or mergeIO preserve order? or not preserve order?


If you have a list of operations [IO a] then the future package at

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/future

can do this.  It's 'forkPromises' function returns a Chan a which can be used 
to get the non-order preserving results (actually Either SomeExcption a).  If 
you are feeling lucky you can use getChanContents and filter to get a lazy 
[a] which is the results as they are completed.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: ANN: Future 1.1.0 concurrency library

2009-03-10 Thread ChrisK

Don Stewart wrote:

Who needs to build futures into the language -- all you need is MVars, eh?


For a pure computation in Haskell one can use par (which did take changing the 
runtime, and arguably adding to the language).


The future package I uploaded is just a clean way to get something a little like 
par for an IO computation, as a library.


One can build many useful APIs quite cheaply using MVars.  Hackage even has a 
few examples (many under the Concurrency heading).


This API was interesting solely because it is in the C++ standard and the 
discussion about how the standard left out useful proposed parts of the API.


Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: Future 1.1.0 concurrency library

2009-03-09 Thread ChrisK

Hello,

  As a side effect of the discussion of the new C++ future/promise features at 
http://lambda-the-ultimate.org/node/3221 I have implemented a Haskell package 
called future at


http://hackage.haskell.org/cgi-bin/hackage-scripts/package/future

This ought to do what C++ standard futures/promises do, plus a bit more.  The 
main operation is


 forkPromise :: IO a - IO (Promise a)

This sets the IO a operation running in a fresh thread.  The eventual result 
can be accessed in many ways (non-blocking, blocking, blocking with timeout).


 let one :: Int; one = 1
 p - forkPromise (return (one+one))
 x - get
 y - wait

x is an Int with value 2.
y is an (Either SomeException Int) with value (Right 2).

The useful thing about futures, as opposed to various IVar packages, is handling 
 the case where the forked operation ends with an exception.  The exception 
becomes the return value of the promise.  The get operation rethrows it, the 
wait operation returns it as (Left ...).


There is also an abort command to kill a promise.  The dead promise may then 
have an exceptions as its value.


The plus a bit more than C++ is the nonblocking addTodo feature.  This takes 
a continuation function from the Either SomeException a to an IO operation. 
These continuation functions get queued and they are run immediately after the 
the forked operation completes.  Once completed any new addTodo continuations 
run immediately.


These continuations allow you to race a list of action and take the first one 
done, or to collect the answers as they complete into a Chan.  Both of those 
options are demonstrated in Future.hs as racePromises and forkPromises.


It should be safe to use unsafePerformIO . get or unsafePeformIO . wait to 
get lazy access to the result, which is itself immutable once set.


Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Data.Binary stack overflow with Data.Sequence String

2009-03-04 Thread ChrisK
I have collected some of the backing code to decode.  This is all pasted below 
so we can look at it.  I will not improperly guess at the cause of the problem, 
and be totally wrong.


I observe Get is a lazy monad:

Prelude Data.Binary Data.Binary.Get Data.Monoid World == runGet ((return $! 
undefined)  return World) mempty

True

Prelude Data.Binary Data.Binary.Get Data.Monoid 'W' == head (runGet ((return $! 
undefined) = \t - return $! ('W':t))  mempty)

True

Prelude Data.Binary Data.Binary.Get Data.Monoid orld == tail (runGet ((return 
$! undefined) = \h - return $! (h:orld))  mempty)

True

This may have implication for building the String from Char.  The get for 
Char uses return $! char but this is no good unless the Char is being 
forced, as the (return $! undefined) above shows.


The instance Get [a] inherits the laziness of replicateM which is sequence.

The instance (Seq.Seq e) does not force the x :: String value.  And even if it 
did it would only force the leading (:) cons cell and not the characters 
themselves.  The instance is strict in what passes for the spine of the Seq, not 
the contents, and certainly not the deep contents.


You might try using newtype when deserializing ArticleDB and make a much 
stricter version of the code.


All the relevant code (yi  binary  ghc):


type Article = String
type ArticleDB = Seq Article

-- | Read in database from 'dbLocation' and then parse it into an 'ArticleDB'.
readDB :: YiM ArticleDB
readDB = io $ (dbLocation = r) `catch` (\_ - return empty)
  where r x = fmap (decode . BL.fromChunks . return) $ B.readFile x
-- We read in with strict bytestrings to guarantee the file is 
closed,
-- and then we convert it to the lazy bytestring data.binary 
expects.
-- This is inefficient, but alas...

decode :: Binary a = ByteString - a
decode = runGet get

instance (Binary e) = Binary (Seq.Seq e) where
put s = put (Seq.length s)  Fold.mapM_ put s
get = do n - get :: Get Int
 rep Seq.empty n get
  where rep xs 0 _ = return $! xs
rep xs n g = xs `seq` n `seq` do
   x - g
   rep (xs Seq.| x) (n-1) g

instance Binary Int where
put i   = put (fromIntegral i :: Int64)
get = liftM fromIntegral (get :: Get Int64)

instance Binary Int64 where
put i   = put (fromIntegral i :: Word64)
get = liftM fromIntegral (get :: Get Word64)

instance Binary Word64 where
put = putWord64be
get = getWord64be

instance Binary a = Binary [a] where
put l  = put (length l)  mapM_ put l
get= do n - get :: Get Int
replicateM n get

-- Char is serialised as UTF-8
instance Binary Char where
put a | c = 0x7f = put (fromIntegral c :: Word8)
  | c = 0x7ff= do put (0xc0 .|. y)
   put (0x80 .|. z)
  | c = 0x   = do put (0xe0 .|. x)
   put (0x80 .|. y)
   put (0x80 .|. z)
  | c = 0x10 = do put (0xf0 .|. w)
   put (0x80 .|. x)
   put (0x80 .|. y)
   put (0x80 .|. z)
  | otherwise = error Not a valid Unicode code point
 where
c = ord a
z, y, x, w :: Word8
z = fromIntegral (c   .. 0x3f)
y = fromIntegral (shiftR c 6  .. 0x3f)
x = fromIntegral (shiftR c 12 .. 0x3f)
w = fromIntegral (shiftR c 18 .. 0x7)

get = do
let getByte = liftM (fromIntegral :: Word8 - Int) get
shiftL6 = flip shiftL 6 :: Int - Int
w - getByte
r - case () of
_ | w  0x80  - return w
  | w  0xe0  - do
x - liftM (xor 0x80) getByte
return (x .|. shiftL6 (xor 0xc0 w))
  | w  0xf0  - do
x - liftM (xor 0x80) getByte
y - liftM (xor 0x80) getByte
return (y .|. shiftL6 (x .|. shiftL6
(xor 0xe0 w)))
  | otherwise - do
x - liftM (xor 0x80) getByte
y - liftM (xor 0x80) getByte
z - liftM (xor 0x80) getByte
return (z .|. shiftL6 (y .|. shiftL6
(x .|. shiftL6 (xor 0xf0 w
return $! chr r


replicateM:: (Monad m) = Int - m a - m [a]
replicateM n x= sequence (replicate n x)

sequence   :: Monad m = [m a] - m [a] 
{-# INLINE sequence #-}

sequence ms = foldr k (return []) ms
where
  k m m' = do { x - m; xs - m'; return (x:xs) }


--
Chris

___
Haskell-Cafe mailing list

[Haskell-cafe] major speed improvement: regex-tdfa reaches 1.0.0

2009-03-01 Thread ChrisK

Announcing the version 1.0.0 release of regex-tdfa.

I am proud of this release.
This is not just a bug fix release.
It is a serious improvement in the asymptotic running time.

The previous versions allowed bad combinations of pattern and searched
text length to scale badly in the length of the text.  Previously the
worst case for text of length N was O(N^3).

The new worst case asymptotic runtime scaled as O(N).
There is never any backtracking.
And the worst case storage space is independent of N.

The package is on hackage at
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa
The darcs repository is at
http://darcs.haskell.org/packages/regex-unstable/regex-tdfa/

All non-overlapping matches are found and returned, along with all
captured (parenthesized) subexpressions.  The result is precisely what
Posix extended regular expressions are supposed to return.

To be concrete, consider example text with length of N of (2^n)+2:

 longInput = replicate (2^n) 'a' ++ bc

And define 4 worst-case-scenario patterns.  I wrote the code and I
know how to kill it:

 wcs0 = a*b
 wcs1 = a*c
 wcs2 = (a|aa)*b
 wcs3 = (a|aa)*c

wcs0 is easy.
wcs1 causes the old code to backtrack.
wcs2 causes the old code's storage to scale as O(N).
wcs3 causes both backtracking and O(N) storage with the old code.

The old code's time scales as O(N) for wcs0, O(N^2) for wcs1 and wcs2,
and O(N^3) for wcs3.  The new code is always O(N).  The actual timings
for the old code on my G4 laptop for wcs on 2^8 and 2^9 and 2^10 are:

Reason:compare-tdfa chrisk$ time ./Test-TDFA-np wcs3 8 +RTS -sstderr 21  | 
head -4
./Test-TDFA-np wcs3 8 +RTS -sstderr
Test for [Char]
Right [array (0,1) [(0,(257,1)),(1,(-1,0))]]
 263,776,756 bytes allocated in the heap
user0m1.017s
sys 0m0.058s

Reason:compare-tdfa chrisk$ time ./Test-TDFA-np wcs3 9 +RTS -sstderr 21  | 
head -4
./Test-TDFA-np wcs3 9 +RTS -sstderr
Test for [Char]
Right [array (0,1) [(0,(513,1)),(1,(-1,0))]]
   1,998,647,452 bytes allocated in the heap
user0m7.083s
sys 0m0.289s

Reason:compare-tdfa chrisk$ time ./Test-TDFA-np wcs3 10 +RTS -sstderr 21  | 
head -4

./Test-TDFA-np wcs3 10 +RTS -sstderr
Test for [Char]
Right [array (0,1) [(0,(1025,1)),(1,(-1,0))]]
  15,653,277,200 bytes allocated in the heap
user0m53.350s
sys 0m2.056s

The doubling of length is causing a nearly eight-fold time increase.
The heap allocation is also increasing nearly eight-fold.

The new code with the same input pattern and texts gives:

Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 8 +RTS -sstderr 
21  | head -4

./Test-TDFA2-0.99.19-np2 wcs3 8 +RTS -sstderr
Test for [Char]
Right [array (0,1) [(0,(257,1)),(1,(-1,0))]]
   2,135,324 bytes allocated in the heap
user0m0.017s
sys 0m0.017s

Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 9 +RTS -sstderr 
21  | head -4

./Test-TDFA2-0.99.19-np2 wcs3 9 +RTS -sstderr
Test for [Char]
Right [array (0,1) [(0,(513,1)),(1,(-1,0))]]
   3,588,656 bytes allocated in the heap
user0m0.024s
sys 0m0.017s

Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 10 +RTS -sstderr 
21  | head -4

./Test-TDFA2-0.99.19-np2 wcs3 10 +RTS -sstderr
Test for [Char]
Right [array (0,1) [(0,(1025,1)),(1,(-1,0))]]
   6,345,436 bytes allocated in the heap
user0m0.038s
sys 0m0.018s

Note that the heap allocation for the 1026 character example above is
2466 times less than the old code.

That was too fast to prove the scaling, so take more input:

Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 20 +RTS -sstderr 
21  | head -4

./Test-TDFA2-0.99.19-np2 wcs3 20 +RTS -sstderr
Test for [Char]
Right [array (0,1) [(0,(1048577,1)),(1,(-1,0))]]
   5,708,574,848 bytes allocated in the heap
user0m26.023s
sys 0m0.985s

Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 wcs3 21 +RTS -sstderr 
21  | head -4

./Test-TDFA2-0.99.19-np2 wcs3 21 +RTS -sstderr
Test for [Char]
Right [array (0,1) [(0,(2097153,1)),(1,(-1,0))]]
  11,416,354,056 bytes allocated in the heap
user0m52.656s
sys 0m1.985s

The length and time both doubled, as did the heap allocation.  And the
new code has searched two million characters in the time the old code
searched one thousand.

How about away from the worst case scenario?  On the testing suite the
new code is running slightly slower:

Reason:compare-tdfa chrisk$ time ./Test-TDFA-np -r 1 100  /dev/null
user0m4.841s
sys 0m3.019s

Reason:compare-tdfa chrisk$ time ./Test-TDFA2-0.99.19-np2 -r 1 100  /dev/null
user0m5.970s
sys 0m3.012s

So that is an increase of execution time of 14%.  This small dip in
performance might be reclaimable with more optimization.  I think the
gain in worst case performance already offsets the slight cost.

The code for String is complete.  The strict and lazy bytestrings and
the (Seq Char) are currently using the String code for matching.  This
will be improved in a future release.

Cheers,
  Chris

[Haskell-cafe] ANN: bug fix for regex-tdfa, version 0.97.4 (and regex-ast)

2009-02-24 Thread ChrisK

Hello,

  The regex-tdfa package has had a series of bug fix releases (0.97.1 and 2 and 
3 and now 4).  This 0.97.4 releases finishes fixing the bug that was only mostly 
fixed in the 0.97.1 release.


  An example of the fixed bug: Apply the regex pattern (BB(B?))+(B?) to the 
text .  The BB in the pattern should be used twice and both B? should 
match nothing.  My code grouped the + wrong and matched the BB once and then 
both the B? matched a B.


  The case fixed here was not initially caught because of how I search for 
unknown bugs.  I use Arbitrary from QuickCheck to generate random patterns and 
strings to search, and compare regex-tdfa to another POSIX engine.


  Because I am on OS X, I am limited by the the native POSIX libraries bugs: 
this bug in regex-tdfa was triggered only when the native POSIX was also buggy.


  But the source of most of my unit tests is ATT research [1], and they have a 
libast with a POSIX implementation.  I have adapted my regex-* wrapper 
packages to make a regex-ast Haskell interface, but the difficulties with the 
ATT headers prevent me from releasing this on hackage.  This regex-ast has 
given me access to a less buggy POSIX back-end, and randomized testing has led 
to catching the bug fixed here (as well as a few bug reports back to ATT).


  So while regex-tdfa will not win many speed contests, it is the only POSIX 
regular expression library I have running that passes all the unit tests.


[1] http://www.research.att.com/sw/download/
http://www.research.att.com/~gsf/testregex/
http://www.research.att.com/~gsf/testregex/re-interpretation.html

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: speed: ghc vs gcc

2009-02-20 Thread ChrisK

On a G4:

s.hs (which does not need bang patterns) is:


main = seq (sum0 (10^9) 0) (return ())

sum0 :: Int - Int - Int
sum0 0 acc = acc
sum0 x acc = sum0 (x-1) $! (acc+x)


And s.c is (actually including 10^9, which Bulat's did not):


main()
{
  int sum=0;
  for(int i=1000*1000*1000; i0; i--)
  sum += i;
}


I compiled them with

ghc --make -O2 s.hs -o shs
gcc -o sc -std=c99 -O3 -funroll-loops s.c

And timed them:

$ time ./shs

real0m3.309s
user0m3.008s
sys 0m0.026s

$ time ./sc

real0m0.411s
user0m0.316s
sys 0m0.006s

So C is 9.4 times faster.

And via-C did not help:

$ ghc -fvia-C -optc -O3 -funroll-loops --make -O2 s.hs -o shs-via-C
$ time ./shs-via-C

real0m7.051s
user0m3.010s
sys 0m0.050s

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: Bug fix to regex-tdfa, new version 0.97.3

2009-02-10 Thread ChrisK

To Haskell and Libraries and Haskell-Cafe,

Whilst improving regex-tdfa I have run across new bugs.  Some patterns were 
getting compiled wrong and others were affected by an execution bug.


As this package has actual users, I wanted to make sure they get these fixes 
immediately.


Three Cheers For QuickCheck!

The new version is 0.97.3 at
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa

And this version passes all the unit tests I have, including coverage for the 
new bugs.  This is no warranty that regex-tdfa is bug free, since I made that 
same claim last release.  For instance: I suspect 0.97.3 may be buggy if used in 
the optional left-associative mode.


The new improved version of regex-tdfa is still a long way off.

Cheers,
 Chris Kuklewicz

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Fastest regex package?

2009-02-05 Thread ChrisK

Eugene Kirpichov wrote:

All in all, my question remains: what is the fastest way to do this
kind of parsing on a lazy bytestring?



Your example regular expression works the same in both Posix and Perl-ish 
semantics.   Do you know the difference?  Posix libraries look for the longest 
match of all possible matches.  Perl-ish is left-bias and looks at the left 
branch first and only looks at the right branch when the left fails and it has 
to backtrack.


So the ++ operator is a hack to try and control the backtracking of Perl 
regular expressions.  Such a things has no meaning in Posix where the 
implementation details are totally different.


You might try this variant of the example pattern:
 /foo.xml.*fooid=([0-9]+)[^0-9].*barid=([0-9]+)

The [^0-9] can be used when you know that there is at least one junk character 
before the barid, which I suspect will always occur in a URL.


I expect regex-posix to be slower than regex-pcre. I have not used the new 
pcre-light.  I wrote regex-tdfa — it is pure haskell and not a C library 
wrapper.  There are patterns where regex-pcre will backtrack and take 
exponentially more time than regex-tdfa's automaton (which is not yet ideal and 
may get faster).


So what is the lazy bytestring with its multiple buffers doing for you when 
using PCRE, PCRE-light, or regex-posix? Absolutely nothing.  To run against 
these C libraries the target text is converted to a single buffer, i.e. a 
CStringLen in Haskell.  Thus it is morally converted into a strict bytestring. 
This may involve copying the logfile into a NEW strict bytestring EVERY TIME you 
run a match.  Please Please Please convert to a strict bytestring and then run 
regex-pcre or pcre-light (or any of the others).


regex-tdfa does not convert it into a strict bytestring, but is otherwise much 
slower than pcre for your simple pattern.


As for regex-pcre's interfaceyou should use the API in regex-base to get a 
pure interface.   The RegexLike functions are the pure interface for this, and 
the RegexContext class offers a slew of instances with useful variants.  But if 
you have been getting to the low level IO API in regex-pcre then you probably 
do not need or want the RegexContext transformations.


And BoyerMoore (which I think I helped optimize): this may be faster because it 
does not copy your whole Lazy bytestring into a Strict ByteString for each 
search.  But you may wish to test it with a Strict ByteString as input anyway.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: evaluation semantics of bind

2009-02-05 Thread ChrisK

Gregg Reynolds wrote:

getChar = \x - getChar

An optimizer can see that the result of the first getChar is discarded


True, so 'x' is not used, and it can be garbage collected, and may not even be 
created.


But that data dependency is simple not the data dependency that make IO 
sequential.  Here is code from IOBase.lhs for GHC:



newtype IO a = IO (State# RealWorld - (# State# RealWorld, a #))


The # are unboxed types and thus strict, but here we can erase them for clarity:

newtype IO a = IO (State RealWorld - (State RealWorld, a))

getChar is of type IO Char so that is constructor IO applied to a function from 
the State RealWorld to a strict pair of State RealWorld and Char.


Since this is strict there is no laziness and the code must evaluate the input 
and output State RealWorld to ensure they are not bottom or error.


Here is the rest of the plumbing in GHC:


unIO :: IO a - (State# RealWorld - (# State# RealWorld, a #))
unIO (IO a) = a

instance  Functor IO where
   fmap f x = x = (return . f)

instance  Monad IO  where
{-# INLINE return #-}
{-# INLINE ()   #-}
{-# INLINE (=)  #-}
m  k  =  m = \ _ - k
return x= returnIO x

m = k = bindIO m k
fail s  = failIO s

failIO :: String - IO a
failIO s = ioError (userError s)

liftIO :: IO a - State# RealWorld - STret RealWorld a
liftIO (IO m) = \s - case m s of (# s', r #) - STret s' r

bindIO :: IO a - (a - IO b) - IO b
bindIO (IO m) k = IO ( \ s -
  case m s of 
(# new_s, a #) - unIO (k a) new_s

  )

thenIO :: IO a - IO b - IO b
thenIO (IO m) k = IO ( \ s -
  case m s of 
(# new_s, _ #) - unIO k new_s

  )

returnIO :: a - IO a
returnIO x = IO (\ s - (# s, x #))


The bind operation's case statement forces the unboxed new_s :: State# 
RealWorld to be strictly evaluated, and this depends on the input strict s :: 
State# RealWorld.  This data dependency of new_s on s is what forces IO 
statements to evaluate sequentially.


Cheers,
   Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Monad explanation

2009-02-05 Thread ChrisK
All Haskell programs start as 



main :: IO ()



though... so they all get evaluated in the context of another IO ()
don't they?



True for most cases now, but historically false.  Haskell existed and people 
wrote programs for years before the Monad class and IO were created.  A 
Haskell98 program can be taken to have a main :: IO (), but that is not essential.


Which is the point jcc made:



Well...  Haskell compilers and runhaskell-style interpreters (not
regular Hugs/ghci!) take the value of Main.main as `the program'.  But
that feels (to me --- I could be wrong) like an aspect of a particular
hosted environment.  REPLs can handle programs that aren't wrapped up in
IO at all; and there's no reason why IO has to be the type of
IO-performning-things in REPLs, either.  You could just as well write a
REPL that took, say, tangible values [http://haskell.org/haskellwiki/TV]
as input instead, and displayed them.  So it's more a matter of Haskell
implementations can be given an IO value to run than that combining IO
values together somehow runs them.

jcc


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: evaluation semantics of bind

2009-02-05 Thread ChrisK

Jonathan Cast wrote:

On Fri, 2009-02-06 at 00:51 +0100, Peter Verswyvelen wrote:

On Thu, Feb 5, 2009 at 8:20 PM, ChrisK hask...@list.mightyreason.com
wrote:
Since this is strict there is no laziness and the code must
evaluate the input and output State RealWorld to ensure they
are not bottom or error.



Interesting. I also thought it was the passing of the RealWorld that
caused the sequencing, I never realized that the strictness played an
important role here. 



So what would happen if it would be lazy instead of strict? What kind
of craziness would occur?


The order of side effects would be demand-driven, rather than
order-of-statement driven.  So if I said:

[snip]

Essentially, the program would act as if every statement was wrapped up
in an unsafeInterleaveIO.

jcc


I do not think so.  Consider 
http://darcs.haskell.org/packages/base/GHC/IOBase.lhs to see unsafeInterleaveIO:



unsafeInterleaveIO m = unsafeDupableInterleaveIO (noDuplicate  m)
unsafeDupableInterleaveIO (IO m)
  = IO ( \ s - let
   r = case m s of (# _, res #) - res
in
(# s, r #))


And Control.Monad.State.Lazy which is the non-strict State monad:


newtype State s a = State { runState :: s - (a, s) }



instance Monad (State s) where
return a = State $ \s - (a, s)
m = k  = State $ \s - let
(a, s') = runState m s
in runState (k a) s'


And you can see that the data dependence is broken in unsafeInterleaveIO by _, 
and not broken in State.Lazy.  But neither s nor s' are not forced to WHNF.


What you can do State.Lazy is put an bottom into the state.  Or you can put an 
unevaluated thunk into the state, and if it gets replaced later then you never 
paid to evaluate it.


But the sequencing is retained.  One could make an unsafeInterleaveIO for state:


interleaveState :: State s a - State s a
interleaveState (State m) = State ( \ s - let (a,_) = runState m s
   in (a,s) )


Now the dependency chain is broken in the above, due to the _ being ignored.

Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: regex-posix-unittest-1.0 AND regex-posix-0.94.1 AND regex-tdfa-0.97.1

2009-02-02 Thread ChrisK

I have three announcements to make about regex-* related packages.

The regex-posix-0.94.1 package update provides better semantics for multiple 
matches.  Below version 0.94, if any match was empty the matching would stop. 
Now the empty match is returned and the position is incremented and the 
searching continues.


The regex-tdfa-0.71.1 package update provides the same new multiple match 
semantics.  It also fixes a bug I found.  I know of no outstanding bugs in 
regex-tdfa, and version 0.71.1 now passes all the tests used in 
regex-posix-unittest-1.0 announced below.


We should care about the correctness of our operating system libraries.
To help with this, I have a NEW package to announce: regex-posix-unittest-1.0

The accompanying wiki page is http://www.haskell.org/haskellwiki/Regex_Posix

This new package provides an executable called regex-posix-unittest which you 
can install as --user or --global.


The regex-posix-unittest executable with no arguments runs a suite of unit 
tests, all of which are described by text files in the package, the format is 
documented in the wiki page.  By editing the text files in the package you can 
add to or delete from the unit tests being run.


With two arguments the program expects the text first and the pattern second and 
will run just that match and print all the results.


How does regex-posix-unittest help us care about the OS libraries?

The regex-posix distributed in the GHC bundle uses the OS C library's regex.h 
API.  The regex-posix-unittest package will quite likely show you that your OS C 
library regex.h API is full of bugs.


If you are on Linux, it will show you a plethora of GLIBC bugs in Posix 
conformance.

If you are on OS X, FreeBSD, or NetBSD, it will show you many bugs including a 
critical bug where it fail to find a match where one actually exists.


These bugs in the OS library are inherited by your sed program as well as 
regex-posix and Haskell.


If you are on Windows, or OpenBSD, or Solaris, or anything else, then please 
update the wiki page at http://www.haskell.org/haskellwiki/Regex_Posix or email 
me with your results so I can update the wiki.


You may have evil and ingenious tests of Posix extended regular expressions to 
add to the test suite.  Adding them is easy and if you send them to me I will 
put them in an updated version of regex-posix-unittest.


Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: ANN: HTTPbis / HTTP-4000.x package available

2009-01-21 Thread ChrisK

Duncan Coutts wrote:

Proxy auto-configuration files are JavaScript. It uses more or less the
full JavaScript language (ECMA these days), though with a small subset
of the standard library.


W T F

So we want a tiny naive javascript interpreter, hopefully in pure Haskell.  The 
dumbest interpreter than can parse and evaluate the language.


An separate program that just translates .pac file to something declarative is 
sorely needed.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Existencial quantification and polymorphic datatypes

2009-01-20 Thread ChrisK

Mauricio wrote:

Hi,

I'm trying, without success, to understand the difference
between existencial quantification and polymorphic
datatypes. Can you give me a hint, or an example where
one is valid and the other is not?


The first thing to ensure you know is that Haskell can have functions (usually 
in type classes) that have the Perl-like ability to return different types of 
values depending on the context.


The usual example for this is fromInteger :: Num a = Integer - a.  Every 
integer in your source code is put through this function.  If the number is used 
in an Int context then it makes and Int, if use in a Word8 context then it makes 
a Word8.


The other way we talk about the type of context is that this is the type 
demanded by the user of the value.  Concretely:


x :: forall a. Num a = a
x = fromInteger 1

The type of 'x' is any Num type chosen by the user.  The critical thing here is 
that fromInteger does not get to choose the type.  This is bounded or 
constrained polymorphism, the type 'a' is polymorphic but bounded by the Num 
constraint.


Now I can refer to ghc's manual on existential quantification:
http://www.haskell.org/ghc/docs/latest/html/users_guide/data-type-extensions.html#existential-quantification

So what about wanting to write a function myNum that returns some Num type 
that myNum gets to choose instead of the user demanding a type.


We can do this with existential types, which usually is used with a data type 
(often a GADT) and here I call this SomeNum:



{-# LANGUAGE ExistentialQuantification #-}
import Int
import Data.Typeable
data SomeNum = forall a. (Typeable a, Num a) = SomeNum a

myNum :: Integer - SomeNum
myNum x | abs x  2^7  = let i :: Int8
 i = fromInteger x
 in SomeNum i
| abs x  2^31 = let i :: Int32
 i = fromInteger x
 in SomeNum i
| otherwise = SomeNum x

display :: SomeNum - String
display (SomeNum i) = show i ++  ::  ++ show (typeOf i)

main = do
  putStrLn (display (myNum (2^0)))
  putStrLn (display (myNum (2^8)))
  putStrLn (display (myNum (2^32)))


In GHCI I see


*Main main
main
1 :: Int8
256 :: Int32
4294967296 :: Integer


In the above you can see the polymorphism of the return type of fromInteger, it 
returns a Int8 or a Int32.


You can see the polymorphism of the argument of show, it takes an Int8 or 
Int32 or Integer.


The latest ghc-6.10.1 also allows avoiding use of SomeNum, see 
impredicative-polymorphism:

http://www.haskell.org/ghc/docs/latest/html/users_guide/other-type-extensions.html#impredicative-polymorphism

So this next bit is very very new:

{-# LANGUAGE ExistentialQuantification, RankNTypes, ImpredicativeTypes #-}



displayNum :: [forall a. Num a = Integer - a] - String
displayNum converts = unlines $ concatMap withF converts
  where withF :: (forall b. Num b = Integer - b) - [String]
withF f = [ show (f 1 :: Int8)
  , show (f (2^10) :: Int32)
  , show (f (2^32) :: Integer) ]


The first argument to displayNum is a list of an existential type.  Before 
ImpredicativeType this would have required defining and using a data type like 
SomeNum.  So the latest ghc lets one avoid the extra data type.


displayNum cannot demand any (Num b = Integer-b) function.  The list holds 
SOME functions, but which one is unknowable to displayNum.


The first argument of withF is polymorphic and while this requires RankNTypes 
(or Rank2Types) the type of withF is not existential.  In withF the code demands 
three different types of results from the 'f'.  This works because the return 
type of (f 1) is really (Num b = b) and this is polymorphic and any type 'b' 
which is a Num can be demanded.


This can be tested with


  putStr $ displayNum [ fromInteger
  , fromInteger . (2*)
  , fromInteger . (min 1000) ]


which passes in a list of three different conversion functions.  In ghci the 
result is:



1
1024
4294967296
2
2048
8589934592
1
1000
1000


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Existencial quantification and polymorphic datatypes (actually, components...)

2009-01-20 Thread ChrisK

Great, thanks! I'm enlightened :)


And no one had to hit you with a stick first!



But how is this:

data SomeNum = forall a. SN a

different from:

data SomeNum = SN (forall a. a)

?


At a glance they look the same to me — but only the first is accepted by ghc.

There is also the GADT syntax:


data SomeNum where SomeNum :: forall a. (Typeable a, Num a) = a - SomeNum


which is accepted by ghc with the LANGUAGE GADTs extension.

The GADT is more than simple syntactic sugar, it allows for easier use this kind 
of existential type.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Bug in Text.Regex.PCRE - do not accept national symbol in pattern

2009-01-19 Thread ChrisK

Alexandr,

  Thanks for sending me this question about unicode and regex-pcre.  I will 
share with the mailing list. This is an encoding issue.


From the haddock documentation for regex-pcre:

http://hackage.haskell.org/packages/archive/regex-pcre/0.94.1/doc/html/Text-Regex-PCRE.html

Using the provided CompOption and ExecOption values and if configUTF8 is
True, then you might be able to send UTF8 encoded ByteStrings to PCRE and get
sensible results. This is currently untested.


This is a literate Haskell post so you can save with file extension
.lhs and pass this to ghci.

The answer is a combination of man 3 pcre and the haddock
documentation for haskell-pcre and using makeRegexOpts.  I show one
possible way to use utf8 below, via the 'utf8-string' package from
hackage.  There are other ways to use the same package and other
packages available.

 {-# LANGUAGE FlexibleContexts #-}
 import Text.Regex.PCRE hiding ((=~))
 --import Text.Regex.PCRE.Wrap(configUtf8)
 import qualified Data.ByteString.UTF8 as U
 import qualified System.IO.UTF8 as U
 import Data.Bits((.|.))


Here I copied the original source for (=~) from
http://hackage.haskell.org/packages/archive/regex-pcre/0.94.1/doc/html/Text-Regex-PCRE-Wrap.html#v%3A%3D~
I then editied it to create a custom (=~) that defines its own
options.  You can add compNoUTF8Check for performance/safety tradeoff
(see man 3 pcre).

 makeRegexUtf8 :: (RegexMaker Regex CompOption ExecOption source) = source - 
Regex
 makeRegexUtf8 r = let co = defaultCompOpt .|. compUTF8 -- need compUTF8 flag 
when using makeRegexOpts

   -- co = defaultCompOpt .|. compUTF8 .|. compNoUTF8Check 
 --
   in makeRegexOpts co defaultExecOpt r

 (=~)  :: (RegexMaker Regex CompOption ExecOption source,RegexContext Regex 
source1 target)

   = source1 - source - target
 (=~) x r = let q = makeRegexUtf8 r
in match q x

If you are going to use the same pattern against many different texts
then you should NOT use (=~).  Instead you should call makeRegexUtf8
and reuse the resulting Regex value.  Otherwise you have to recompile
the pattern for each match performed.

Below, 're_test' was changed internally to convert the [Char] into a
ByteString holding a utf8 encoded representation.  The 'makeRegexOpts'
and 'match' calls will then run the libpcre routines directly on the
the memory that backs the ByteString.  This is an optimal was to use
the library.

 re_test :: String - String - Bool
 re_test re str = (U.fromString str) =~ (U.fromString re)

 -- test for national symbols
 main = do
   putStrLn $ If this line ends with True then your libpcre has UTF8 support: 
 ++ show configUTF8

   let pattern1,pattern2,pattern3,text :: String
   pattern1 = ^п.*
   pattern2 = ^..ив.*
   pattern3 = ^..$
   text = привет
   U.putStrLn $ The 3 patterns are:  ++ pattern1 ++ ,  ++ pattern2 ++ , 
and ++pattern3

   U.putStrLn $ The text to be matched is  ++ text
   putStrLn $ The length of the text to be matched is ++show (length text)
   putStrLn All three lines below should print True
   print $ re_test pattern1 text
   print $ re_test pattern2 text
   print $ re_test pattern3 text

The output when I run this on my machine is

If this line ends with True then your libpcre has UTF8 support: True
The 3 patterns are: ^п.*, ^..ив.*, and ^..$
The text to be matched is привет
The length of the text to be matched is 6
All three lines below should print True
True
True
True

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: How to make code least strict?

2009-01-19 Thread ChrisK

Robin Green wrote:

What guidelines should one follow to make Haskell code least-strict?


Obviously the use of seq and bang-patterns make code more strict.

Code is strict when it evaluates values to determine a pattern match.  So 
avoiding that makes code lazier.  Values are evaluated when decisions have to be 
make in order to choose what an expression will evaluate to.  Avoiding case 
statements and things that de-sugar to case statements such as if then else 
and pattern matching.  Put off examining the input values.  Occasionally the use 
of lazy patterns, preceded by ~, can help make code both more compact and less 
strict.


Consider that the order of pattern matching can matter as well, the simplest 
common case being zip:


zip xs [] = []
zip [] ys = []
zip (x:xs) (y:ys) = (x,y) : zip xs ys

The order of the first two lines of zip's definition affects whether
zip [] (error boom)
or
zip (error bam) []
will be an error.  This shows that least-strict is not a unique goal.

For the choice I just made the zip [] (error boom) will cause an error 
because the first definition line of zip checks the second argument, while zip 
(error bam) [] will evaluate to [].


The other way to reduce strictness is to be more polymorphic because this 
reduces what can be sensibly done with the arguments.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: some ideas for Haskell', from Python

2009-01-19 Thread ChrisK

Manlio Perillo wrote:

Brandon S. Allbery KF8NH ha scritto:
 
...in theory. In practice GHC needs help with circular imports, and 
some cycles might be impossible to resolve.




This is interesting.
Where can I find some examples?

Is this explained in the Real World Haskell book?



I have no idea about RWH, but there are certainly mutual import cycles that 
cannot be resolved by using hs-boot files with GHC.


Consider three modules A and B and C, which are A-B-C permutations of


module A(A,AKBC, AKCB)
import B(B,BKAC)
import C(C,CKAB)

data A
AKBC :: Either B C
AKCB :: Either C B


 instance Show (A,BKAC,CKAB) where ...

There is no way to break the ?K?? import cycle with just hs-boot files.  I had 
to solve this by generating helper modules.


Call the data A the rank-1 declarations.  Then the ?K?? are built on rank-1 
types such as B and C and are rank-2 declarations.  The rank-1 declarations 
can all be put in hs-boot files but the rank-2 declaration import cycle cannot 
be broken with the same hs-boot files.  Some of these need to be put in separate 
modules.


It may be possible to make a useful definition of rank-3 and higher 
declarations.

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Comments from OCaml Hacker Brian Hurt

2009-01-16 Thread ChrisK

Dan Weston wrote:
Richard Feinman once said: if someone says he understands quantum 
mechanics, he doesn't understand quantum mechanics.


But what did he know...


Well, I am a physicist and Feynman (with a y, not an i), is not talking about 
the linear algebra.


Of course, linear algebra [1] here is used a vector space [2].  The tricky thing 
is that humans then measure the state.  And this is confusing step that causes 
Feynman to say that no one understands it.


But the measurement step and how it interacts with the vector space can be 
approximated by an algorithm [3] using ExistentialQuantification and Arrows.


[1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/numeric-quest
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hmatrix
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/Vec
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas
[2] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/vector-space
[3] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/quantum-arrow

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Comments from OCaml Hacker Brian Hurt

2009-01-16 Thread ChrisK

Here is a great Monoid found in the wild story:

I just implemented a library for binary message serialization that follows 
Google's protocol buffer format.


The documentation of this was very scattered in some respects but I kept reading 
snippets which I have pasted below.  The effect of these snippets is to document 
that the messages on the wire should mimic an API where they can be combined in 
various merge operations (right-biased, concatenation, and recursive merging), 
and that well-formed messages have default values for all fields (which can be 
set in the spec).


So the code below is a well thought out collection of properties that has 
reinvented the wheel known as Monoid, so the Haskell API creates Monoid instances.


http://code.google.com/apis/protocolbuffers/docs/encoding.html


Normally, an encoded message would never have more than one instance of an
optional or required field. However, parsers are expected to handle the case in
which they do. For numeric types and strings, if the same value appears multiple
times, the parser accepts the last value it sees. For embedded message fields,
the parser merges multiple instances of the same field, as if with the
Message::MergeFrom method – that is, all singular scalar fields in the latter
instance replace those in the former, singular embedded messages are merged, and
repeated fields are concatenated. The effect of these rules is that parsing the
concatenation of two encoded messages produces exactly the same result as if you
had parsed the two messages separately and merged the resulting objects. That
is, this:


MyMessage message;
message.ParseFromString(str1 + str2);


is equivalent to this:


MyMessage message, message2;
message.ParseFromString(str1);
message2.ParseFromString(str2);
message.MergeFrom(message2);


This property is occasionally useful, as it allows you to merge two messages
even if you do not know their types.


And this at http://code.google.com/apis/protocolbuffers/docs/proto.html


As mentioned above, elements in a message description can be labeled
optional. A well-formed message may or may not contain an optional element.
When a message is parsed, if it does not contain an optional element, the
corresponding field in the parsed object is set to the default value for that
field. The default value can be specified as part of the message description.
For example, let's say you want to provide a default value of 10 for a
SearchRequest's result_per_page value.


optional int32 result_per_page = 3 [default = 10];


If the default value is not specified for an optional element, a
type-specific default value is used instead: for strings, the default value
is the empty string. For bools, the default value is false. For numeric
types, the default value is zero. For enums, the default value is the first
value listed in the enum's type definition.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Compiling regex-posix-0.93.2 on windows

2009-01-15 Thread ChrisK

Paulo: I suggest doing this more carefully.

Get the source from hackage.

Edit the regex-posix.cabal file to add the include and lib directories you need 
on Cygwin.


cabal configure it.

cabal build it.

cabal install it.

Then in an unrelated directory try and run ghci -package regex-posix.  This 
should load regex-posix-VERSION.  If this fails then I think that the cabal file 
needs fixing.


If that works then please test it with these commands:

Prelude :m + Text.Regex.Posix
Prelude Text.Regex.Posix (Text.Regex.Posix.=~) ab (()|[ab])(b) :: 
(String,String,String,[String])


The right answer by the way is:
 (,ab,,[a,,b])

But on FreeBSD/NetBSD/OS X there is a bug that I have found and it prints:
 (a,b,,[,,b])

Which just goes to show that a mountain of QuickCheck is what is sometimes 
needed to catch really hard to trigger bugs.


This is the other bug I reported:

Prelude Text.Regex.Posix (Text.Regex.Posix.=~) XababaY (X)(aba|ab|b)*(Y) 
:: (String,String,String,[String])
(,XababaY,,[X,b,Y])


The above answer is impossible (what matched a next to b?), but I now think 
I know WTF the library code is doing wrong (I think it is matching ababa in 
two passes and the second pass is greedy which is a broken strategy).  The right 
answer is:


 (,XababaY,,[X,aba,Y])

Cheers,
  Chris

PS: Yes, I have reported these bug to Apple, FreeBSD, and NetBSD this month.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Comments from OCaml Hacker Brian Hurt

2009-01-15 Thread ChrisK

Thomas DuBuisson wrote:

How does forcing them to learn proposed terminology such as `Appendable'
help here?  Learners of Haskell do still need to learn what the new word
means.


The contention is that 'Appendable' is an intuitive naming that people
will already have a rudimentary grasp of.  This as opposed to Monoid,
which absolutely requires looking up for the average coder.


Intuition tells me:

* 'Appendable' add an element to the back of a (finite) linear collection.
* There is a 'Prependable' somewhere that add the element to the front.
* There is an inverse 'pop' or 'deque' operation nearby.

Absolutely none of those things are true.  Let's try for 'Mergeable'

* mconcat joins two collections, not a collection and an element.
* Is should be a split operation.

The above is true for the list instance, but false in general.  Look at the 
instances already given that violate the collection idea:



Monoid Any
Monoid All
Monoid (Last a)
Monoid (First a)
Num a = Monoid (Product a)
Num a = Monoid (Sum a)


And I don't even see an (Ord a)=(Max a) or a Min instance.

So the original article, which coined 'Appendable', did so without much thought 
in the middle of a long post.  But it does show the thinking was about 
collections and there is one ONE instance of Monoid at


http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Monoid.html#t%3AMonoid

that is about a collection (Monoid ([] a)) that has a split operation.

ONE.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Monads aren't evil? I think they are.

2009-01-13 Thread ChrisK

Henning Thielemann wrote:

I have seen several libraries where all functions of a monad have the
monadic result (), e.g. Binary.Put and other writing functions. This is
a clear indicator, that the Monad instance is artificial and was only
chosen because of the 'do' notation.


I completely disagree with that example.
The Put monad is, mainly, a specialized State monad.
The internal state being the current fixed-size bytestring memory buffer that 
has been allocated and is being filled.
The monad make the execution sequential so that there is only one memory buffer 
being filled at a time.
In Put, when one memory buffer has been filled it allocates the next one to 
create a Lazy Bytestring.


This is not to say that all M () are really monads, but just that Put () is.

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: System.CPUTime and picoseconds

2009-01-12 Thread ChrisK

Neil Davies wrote:
I've found the pico second accuracy useful in working with 'rate 
equivalent' real time systems. Systems where the individual timings 
(their jitter) is not critical but the long term rate should be accurate 
- the extra precision helps with keeping the error accumulation under 
control.


When you are selling something (like data bandwidth) and you are pacing 
the data stream on a per packet basis you definitely want any error to 
accumulate slowly - you are in the 10^10 events per day range here.


Neil



Now I am posting just because I like to look at the time scales.

A rate of 10^10 per Day is a period of 8.64 microseconds.

If you want to slip only 1 period per year then you need a fractional accuracy 
of 2.74 * 10^-13.  In one day this is a slip of 23.7 nanoseconds.


So atomic time radio synchronization is too inaccurate. I have seen GPS 
receivers that claim to keep the absolute time to within 100 nanoseconds.


Lennart is right that 1 picosecond accuracy is absurd compared to all the 
jitters and drifts in anything but an actual atomic clock in your room.  But 
since CPUs tick faster than nanosecond the CPUTime needs better than 1 
nanosecond granularity.  I agree with Lennart — I also want an Integral type; it 
keeps the granularity constant and avoids all the pitfalls of doing math with a 
Double.  Out of simplicity I can see why the granularity was set to 1 picosecond 
as it is slightly easier to specify than 100 picosecond or 10 picosecond or 1/60 
nanosecond (hmmm... arcnanosecond?).


Maybe Haskell should name the 1/60 nanosecond unit something clever and create 
a new Time submodule using it for April 1st.  [ Base 60 is the real standard: 
http://en.wikipedia.org/wiki/Babylonian_mathematics has an 1800 B.C. tablet with 
the (sqrt 2) in base 60 as (1).(24)(51)(10) ]


--
Chris
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: [Haskell] ANN: ghci-haskeline 0.1

2009-01-12 Thread ChrisK
Haskeline is designed to remove the readline dependency, because Windows does 
not have readline.  So rlwrap is useless there.



Andrew Hunter wrote:

On Mon, Jan 12, 2009 at 12:57:57PM -0800, Judah Jacobson wrote:

I'm pleased to announce the first release of ghci-haskeline.  This
package uses the GHC API to reimplement ghci with the Haskeline
library as a backend.  Haskeline is a library for line input in
command-line programs, similar to readline or editline, which is
written in Haskell and thus (hopefully) more easily integrated into
other Haskell programs.



Perhaps this has already been discussed at length, in which case I
apologize but, well, why provide line input editing at all?

A number of languages/programs (off the top of my head: sml, most
Schemes) don't; the standard method to get line editing is rlwrap.
And this works (in my limited experience) quite well.  The
disadvantage as I see it of using editline or Haskeline or whatever is
that it's going to be sutbly different than other methods; presumably,
people won't like the changes in behavior.

It seems to me that from a UNIX-y separation of concern view, the
right thing to do (as many languages have chosen) is to /not/ provide
line editing, and just let the user do that with any number of
convenient tools that focus on getting/that/ right (like rlwrap.)  Is
there a reason we've not taken that approach?

Thanks,
AHH


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: System.CPUTime and picoseconds

2009-01-12 Thread ChrisK

Tony Finch wrote:

The FreeBSD kernel uses a 64+64 bit fixed point type to represent time,
where the integer part is a normal Unix time_t. The fractional part is
64 bits wide in order to be able to represent multi-GHz frequencies
precisely.


multi-GHz being a euphemism for 18.45*10^9 GHz, over 18 billion GHz.

I just read through that. The granularity is 2^-64 seconds, or 5.4*^-20 seconds? 
That is 54 nano-pico-seconds.  I can see needing better than nanosecond, and 
going to milli-nanoseconds like Haskell, but to jump close to pico-nano-seconds? 
 That skips right past micro-nano-seconds and nano-nano-seconds.  That's 20 
million times more resolution than Haskell's picoseconds.  My that was fun to write.


It looks like an excellent performance hack for OS kernels.  64-bits make for 
simple register and cache access, the compiled code is small and quick, etc.


As a portable API it is far too complicated to use.  Not in the least because 
only FreeBSD probably has that API.


Note that at 10^-20 seconds the general relativistic shift due to altitude will 
matter over less than the thickness of a closed laptop.  Defining now that 
accurately has meaning localized to less then your computer's size.  The 
warranty for the bottom of your screen will expire sooner than that of the top.


Only stock traders and relativistic particles care about time intervals that 
short. FreeBSD — designed for the interstellar craft to tomorrow


Hmm...The W and Z bosons decay the fastest with 10^-25 second lifetimes, the 
shortest known lifetimes that I can find.  The fundamental Planck scale, the 
shortest amount of time in today's physics, is 5.4*10^-44 seconds.  So with 80 
more bits FreeBSD would be at the fundamental limit.  Of course the conversion 
then depends on the values of h, c, and G.


Now that would also be a good April Fool's joke proposal.

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: System.CPUTime and picoseconds

2009-01-11 Thread ChrisK
An Double or Int64 are both 8 bytes and counts with picoseconds precision for 
2.5 hours to 106 days.  Going to 12 byte integer lets you count to 3.9 billion 
years (signed).  Going to 16 byte integer is over 10^38 years.


Lennart Augustsson wrote:

A double has 53 bits in the mantissa which means that for a running
time of about 24 hours you'd still have picoseconds.  I doubt anyone
cares about picoseconds when the running time is a day.


The above is an unfounded claim about the rest of humanity.


That's why I think a Double is a good choice, it adapts to the time
scale involved.


Let's compute:


tTooBig :: Double
tTooBig = 2^53

main = do
  print (tTooBig == 1+ tTooBig)


The above prints True.  How long does your computer have to be running before 
losing picosecond resolution?



tHours = tTooBig / (10^12) / 60 / 60


tHours is 2.501999792983609.

My laptop battery lasts longer.

Nanosecond precision is lost after 106 days.

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Computer time, independent of date

2009-01-10 Thread ChrisK

Mauricio wrote:

patients, I wanted to be sure not to save wrong
information. It wouldn't matter if the clock is
saying we are on XVII century, as long as 10 seconds
would never be 10.1.



Chris (yes I am an experimental physicist) asks:

What are the interval durations you need to measure?
  0.1 second is very different from 1 ms or 0.1 μs !
  (1 ms is the atomic time radio signal accuracy)
  (0.1 μs can be had with GPS receivers)
  Since they are from equipment, what is the spec?
What is the error tolerance for these durations?
How are these errors allowed or forbidden to accumulate?

Since you say you do not trust the incoming timing then you need to be able to 
measure it.  Do you need to monitor the timing of each and every measurement 
while running?  Could you measure and certify the timing of the equipment before 
using it?  Got a cheap oscilloscope?


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: System.CPUTime and picoseconds

2009-01-10 Thread ChrisK

Manlio Perillo wrote:

Hi.

Just out of curiosity, but why Haskell 98 System.CPUTime library module 
uses picoseconds instead of, say, nanoseconds?


At least on POSIX systems, picoseconds precision is *never* specified.



I have not idea.  But at a guess, I would say that 1 ns is not such a small time 
interval anymore.  The CPU speeds are about 3 GHz, so 0.3 ns per CPU clock. 
Even the RAM clock in a laptop (e.g. Apple's 17 Mac Pro) is 1066 MHz, so the 
internal there is just under 1 ns.


Whoever picked picoseconds has made it possible to talk about a single clock 
interval for hardware like this.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Haskell not ready for Foo [was: Re: Hypothetical Haskell job in New York]

2009-01-08 Thread ChrisK

John A. De Goes wrote:


Here's hoping someone develops a native messaging framework for Haskell, 
which is the equal of RabbitMQ.




The first thing would be to make a Haskell client library to speak AMQP 
(Advanced Message Queuing Protocol) on the wire.


It is a very open binary standard (with defined semantics!) at
http://jira.amqp.org/confluence/display/AMQP/Advanced+Message+Queuing+Protocol

I would be mildly surprised if zero people were working on this.

Once that is in place then the question of a Haskell Broker for AMQP arises. 
But I suspect that Erlang's runtime will still rule there for production use.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Tying a simple circularly STM linked list

2009-01-06 Thread ChrisK

You can use undefined or error ... :


{-# LANGUAGE RecursiveDo #-}
import Control.Concurrent.STM
import Control.Monad.Fix

-- Transactional loop.  A loop is a circular link list.
data Loop a
   = ItemLink
  { item :: a
  , prev :: TVar (Loop a)
  , next :: TVar (Loop a)
  }

-- Create a new empty transactional loop.
newLoop :: a - STM (TVar (Loop a))
newLoop item = do
   tLoop - newTVar undefined
   writeTVar tLoop (ItemLink item tLoop tLoop)
   return tLoop


Hmmm.. STM does not have a MonadFix instance.  But IO does:



-- Use MonadFix instance of newLoopIO
newLoopIO :: a - IO (TVar (Loop a))
newLoopIO item = mfix (\ tLoop - newTVarIO (ItemLink item tLoop tLoop))


But mfix (like fix) is difficult to read in large amounts, so there is mdo:


-- Use RecursiveDo notation
newLoopMDO :: a - IO (TVar (Loop a))
newLoopMDO item = mdo
   tLoop - newTVarIO (ItemLink item tLoop tLoop)
   return tLoop




Cheers,
  Chris




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: ANN: bytestring-trie 0.1.1 (bugfix)

2009-01-04 Thread ChrisK

Question and suggestion:

looking at
http://hackage.haskell.org/packages/archive/bytestring-trie/0.1.1/doc/html/src/Data-Trie.html#Trie

I am questioning your choice of foldr in fromList:


-- | Convert association list into a trie. On key conflict, values
-- earlier in the list shadow later ones.
fromList :: [(KeyString,a)] - Trie a
fromList = foldr (uncurry insert) empty



-- | /O(1)/, The empty trie.
{-# INLINE empty #-}
empty :: Trie a
empty = Empty



-- | Insert a new key. If the key is already present, overrides the
-- old value
{-# INLINE insert #-}
insert:: KeyString - a - Trie a - Trie a
insert = alterBy (\_ x _ - Just x)



-- | Generic function to alter a trie by one element with a function
-- to resolve conflicts (or non-conflicts).
alterBy :: (KeyString - a - Maybe a - Maybe a)
 - KeyString - a - Trie a - Trie a
alterBy f_ q_ x_
| S.null q_ = mergeBy (\x y - f_ q_ x (Just y)) (singleton q_ x_) 
| otherwise = go q_

where



-- | /O(1)/, Is the trie empty?
{-# INLINE null #-}
null :: Trie a - Bool
null Empty = True
null _ = False


So it looks like the reduction is fromList - uncurry insert - alterBy - null.
Let me use insert in place of uncurry insert:

fromList ( (a,1) : ( (b,2) : ( (c,3) : [] ) ) )
(a,1) `insert` ( (b,2) `insert` ( (c,3) `insert` Empty ) ) )

So fromList forces the whole call chain above to be traversed until it hits the 
Empty.  For a large input list this will force the whole list to be allocated 
before proceeding AND the call chain might overflow the allowed stack size in 
ghc.  For a large trie (which is a likely use case) this is a poor situation.


If you use foldl' then the input list is only forced one element at a time.  A 
small change to the lambda that insert passes to adjustBy will retain the same 
semantics of earlier key wins (which are an especially good idea in the foldl' 
case).


Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Different QSem(N) (was Re: IORef vs TVar performance: 6 seconds versus 4 minutes)

2008-12-29 Thread ChrisK

I think I can improve on your code.

Bertram Felgenhauer wrote:

But why does it manually manage the waiters at all? MVars are fair, in
ghc at least. So this should work:

data Sem = Sem (MVar Int) (MVar Int)


I changed the above to be a data


newSem :: Int - IO Sem

newSem initial = liftM2 Sem (newMVar initial) newEmptyMVar

-- | Wait for a unit to become available

waitSem :: Sem - IO ()
waitSem (Sem sem wakeup) = do
   avail' - modifyMVar sem (\avail - return (avail-1, avail-1))


Threads can get out of order at this point.  This order bug may be 
undesirable. Also, killing the thread while it waits for wakeup below would be 
bad.  You need an exception handler and some kind of cleanup.



   when (avail'  0) $ takeMVar wakeup = putMVar sem



-- | Signal that a unit of the 'Sem' is available
signalSem :: Sem - IO ()
signalSem (Sem sem wakeup) = do
   avail - takeMVar sem
   if avail  0 then putMVar wakeup (avail+1)
else putMVar sem (avail+1)


You should change this from = do to = block $ do.



(I should turn this into a library proposal.)

Bertram


If you do not need to take N at a time then the untested code below has no 
order bug and is fair.



module Sem where

import Control.Concurrent.MVar
import Control.Monad(when,liftM2)

data Sem = Sem { avail :: MVar Int-- ^ provides fast path and fair queue
   , lock :: MVar () }-- ^ Held while signalling the queue

-- It makes no sense here to initialize with a negative number, so
-- this is treated the same as initializing with 0.
newSem :: Int - IO Sem
newSem init | init  1 = liftM2 Sem newEmptyMVar (newMVar ())
| otherwise = liftM2 Sem (newMVar init) (newMVar ())

waitSem :: Sem - IO ()
waitSem (Sem sem _) = block $ do
  avail - takeMVar sem
  when (avail  1) (signalSemN (pred avail))

signalSem :: Sem - IO ()
signalSem = signalSemN 1

signalSemN :: Int - Sem - IO ()
signalSemN i (Sem sem lock) | i = 1 = return ()
| otherwise =
  withMVar lock $ \ _ - block $ do
old - tryTakeMVar sem
case old of
  Nothing - putMVar sem i
  Just v - putMVar sem $! succ i



All waitSem block in arrival order with the takeMVar in waitSem.  The signalSemN 
 avoid conflicting by serializing on the MVar () lock.  The above is quite 
fast so long as the semaphore holds no more than the value 1.  Once it hold more 
than 1 the waiter must take time to add back the remaining value.


Note that once threads are woken up in order, they may still go out of order 
blocking for the () lock when adding back the remaining value (in the presence 
of other signalers).


The above is also exception safe.  The only place it can die is during the 
takeMVar and this merely remove a blocked waiter.


I see no way to add a fair waitSemN without changing Sem.  But if I change Sem 
then I can make a fair waitSemN.  The untested code is below:



module Sem where

import Control.Concurrent.MVar
import Control.Monad(when,liftM3)
import Control.Exception.Base

data Sem = Sem { semWait :: MVar () -- for serializing waiting threads
   , semAvail :: MVar Int -- positive quantity available
   , semSignal :: MVar () -- for serializing signaling threads
   }


newSem i | i=0 = liftM3 Sem (newMVar ()) newEmptyMVar (newMVar ())
 | otherwise = liftM3 Sem (newMVar ()) (newMVar i) (newMVar ())

waitSem :: Sem - IO ()
waitSem = waitSemN 1

waitSemN :: Int - Sem - IO ()
waitSemN i sem@(Sem w a s) | i=0 = return ()
   | otherwise = withMVar w $ \ _ - block $ do
  let go n = do
avail - onException (takeMVar a) (signalSemN (i-n) sem)
case compare avail n of
LT - go $! n-avail
EQ - return ()
GT - signalSemN (avail-n) sem
  go i

signalSem :: Sem - IO ()
signalSem = signalSemN 1

signalSemN :: Int - Sem - IO ()
signalSemN i (Sem _ a s) | i=0 = return ()
 | otherwise = withMVar s $ \ _ - block $ do
  ma - tryTakeMVar a
  case ma of Nothing - putMVar a i
 Just v - putMVar a $! v+i



Trying for exception safety makes the above slightly tricky.

It works by allowing only a single thread to get the semWait lock.  This keeps 
all the arriving threads in the fair blocking queue for the semWait lock.  The 
holder of the semWait lock then nibbles at semAvail's positive value until it is 
satisfied.  Excess value is added back safely with signalSemN.


Cheers,
  Chris Kuklewicz

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Using classes for a heterogeneous graph structure

2008-12-29 Thread ChrisK
I think the below code which compiles with ghc-6.10.1 should compile with 
ghc-6.8.3 as well.  My preference is to define a GADT such as ThingMap below.


Conceptually ThingMap contains two pieces of information.  There is a Map to an 
unknown type thing and there is a dictionary which implements a Thing instance 
for this unknown type thing.  By pattern matching (ThingMap map) in update the 
rest of update gets access to both pieces of information.  You are guaranteed 
that each element of the map is the SAME type.


To be able to do more stuff with it you need to add classes either as a context 
to the definition of class Thing or in addition to the (Thing thing) context 
in the ThingMap definition.


Or you could use the slightly different strategy of MapTW.  Here each element of 
the map might be a DIFFERENT underlying type (underneath ThingWrapper).


The data MapThing is the older style of existential data and is, in my 
opinion, superseded by the GADT style used in ThingMap.



{-# OPTIONS_GHC -fglasgow-exts #-}
module Sample where

import Data.Map(Map)
import qualified Data.Map as Map

class Thing thing where
  set_int :: thing - Integer - thing

  -- for wrapper
  wrapper :: thing - ThingWrapper
  wrapper thing = ThingWrapper thing

instance Thing Integer where
 set_int me i = i -- in the generic case, this actually does something

-- This really has to change
-- type ThingsByString = (Thing thing) = Map.Map Integer thing
-- Look at 
http://www.haskell.org/ghc/docs/latest/html/users_guide/data-type-extensions.html
data ThingMap where ThingMap :: forall thing . (Thing thing) = Map Integer thing 
- ThingMap -- New GADT goodness
data MapThing = forall thingish . (Thing thingish) = MapThing (Map Integer 
thingish) -- Old style, not as good
type MapTW = Map Integer ThingWrapper

update :: Integer - Integer - ThingMap - ThingMap
update key value (ThingMap map) =
   let (Just thing) = Map.lookup key map
   in ThingMap $ Map.insert key (set_int thing value) map

update' :: Integer - Integer - MapThing - MapThing
update' key value (MapThing map) =
   let (Just thingie) = Map.lookup key map
   in MapThing $ Map.insert key (set_int thingie value) map

update'' :: Integer - Integer - MapTW - MapTW
update'' key value map =
   let (Just thingie) = Map.lookup key map
   in Map.insert key (set_int thingie value) map

test1 =
 let my_map = Map.empty :: Map Integer Integer
 map1 = ThingMap (Map.insert 0 1 my_map)
 map2 = update 0 8 map1
 in map2

test2 =
 let my_map = Map.empty :: Map Integer ThingWrapper
 map1 = ThingMap (Map.insert 0 (ThingWrapper (1::Integer)) my_map)
 map2 = update 0 8 map1
 in map2

test3 =
 let my_map = Map.empty :: Map Integer Integer
 map1 = MapThing (Map.insert 0 1 my_map)
 map2 = update' 0 8 map1
 in map2

test4 =
 let my_map = Map.empty :: Map Integer ThingWrapper
 map1 = MapThing (Map.insert 0 (ThingWrapper (1::Integer)) my_map)
 map2 = update' 0 8 map1
 in map2

test5 =
 let my_map = Map.empty :: MapTW
 map1 = Map.insert 0 (ThingWrapper (1::Integer)) my_map
 map2 = update'' 0 8 map1
 in map2

data ThingWrapper = forall t. (Thing t) = ThingWrapper t

instance Thing ThingWrapper where
 set_int (ThingWrapper thing) i = wrapper $ set_int thing i
 wrapper thing_wrapper = thing_wrapper


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Infinite grid

2008-12-29 Thread ChrisK

Luke Palmer wrote:
On Mon, Dec 29, 2008 at 3:55 PM, Martijn van Steenbergen 
mart...@van.steenbergen.nl mailto:mart...@van.steenbergen.nl wrote:


Hello,

I would like to construct an infinite two-dimensional grid of nodes,
where a node looks like this:

data Node = Node
 { north :: Node
 , east  :: Node
 , south :: Node
 , west  :: Node
 }


in such a way that for every node n in the grid it doesn't matter
how I travel to n, I always end up in the same memory location for
that node.


No problem:

let n = Node n n n n in n

But you probably want some additional data in each node, also, in which 
the problem becomes harder.


The solution very much depends on how the data is initialized / computed / 
obtained.

Note that one can put an IORef into the Node to allow for mutable value.

Also, see if you need something akin to mkDList at 
http://haskell.org/haskellwiki/Tying_the_Knot


Cheers,
  Chris


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Missing Network Functions

2008-12-28 Thread ChrisK

John Van Enk wrote:
I want to group them with Network functions because traditionally these 
specific functions *are* in networking packages.


I agree.  It is true that these functions are in the arpa/inet.h header on my 
machine.  The ntoh* and hton* are just data manipulation, just as Network.URI is 
 all about constructing and manipulating URI data.


The question I think we should answer is: Do we wish to follow the precedent of 
putting URI under Network and not Data?


Pragmatically, programmers will expect to find these functions near the 'net 
ones.

A network-byteorder package which has the Network.ByteOrder module is the 
most flexible thing to create.  It can be updated and reinstalled independently 
of the network package.




On *nix systems, the prototypes reside in netinet/in.h. On Windows 
they reside in winsock2.h and one must link against libws2_32.


The package I have locally supports all platforms. It just does a 
foreign import of the function (and uses a preprocessor definition to 
pick the calling convention).


One could make an argument to add these sorts of functions to Word and 
Int packages.


/jve


On Sat, Dec 27, 2008 at 7:14 PM, Chris Kuklewicz 
hask...@list.mightyreason.com mailto:hask...@list.mightyreason.com 
wrote:


Are these going to be available on Windows or just Posix systems?

And these are NOT performing network communication.  They are bit
manipulations.

I could imagine putting them in/under the Data.Word or Data.Bits
more than Network.*

And they should also be made to work with the corresponding
Foreign.C.Types of CShort, etc.


John Van Enk wrote:

While working on a project, I discovered that hton[sl] and
ntoh[sl] are missing from the networking libraries. It appears
there was some justification for this, but I've fallen upon a
few instances where things were *not* working as expected.
(Specifically when newtype'ing around HostAddress.)

As such, I'm putting together a few extra functions that I think
should exist in Network somewhere. I'm calling the package
Network.Util (but am open to changes).

Currently this only has:

   * htons
   * htonl
   * ntohs
   * ntohl

Before I drop this on hackage:

  1. Is there a reason I shouldn't do this?
  2. Are there other suggestions for missing functions? (preferably

 cross-platform suggestions)

Thanks all.

/jve




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org mailto:Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe






___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: How does one use Text.Regex.Base.RegexLike?

2008-12-25 Thread ChrisK

Yeah!

I am glad the mailing list has helped.  If you get stuck or need something 
fancier then let me know (I wrote RegexLike).



But I still don't know how to get makeRegex to work. You need it to specify
options like case insensitivity, or to use functions like matchAllText.


Well, the options are defined in detail by the back end module, e.g. 
regex-posix.  Take Text.Regex.Posix.Wrap:


Most importantly is the Regex data type defined in this module.

Also, there is an instance of RegexMaker Regex CompOption ExecOption String.

The CompOptions and ExecOptions are defined in this module, and can be passed to 
 makeRegexOpts from the RegexMaker class that converts String/ByteString/.. 
to the Regex data type.  The CompOptions are only settable at this compilation 
stage.


Later you can use getExecOpts and setExecOpts to create a new Regex from the 
original Regex which differs in the value of ExecOptions it uses.  These 
execution options can change after creating the Regex.


The makeRegex uses the defaultCompOpt and defaultExecOpt values.

The CompOptions for Text.Regex.Posix.Wrap are at
http://www.haskell.org/ghc/docs/latest/html/libraries/regex-posix/Text-Regex-Posix-Wrap.html#4
and are just a newtype of the underlying c-library's bitmapped int flags.  AND I 
made this newtype derive the same Bits instance, so this works:



let r = makeRegexOpts (defaultCompOpts || compIgnoreCase)
  defaultExecOpts caseInSensiTIVe


By binding a Regex value like this and using it several times one saves the 
effort of compiling the String pattern into the Regex data type.


If you need something that is not exactly given by the RegexLike API then there 
may be a RegexContext instance which does give you what you want.


WARNING: The version of regex-base you are using matters most when it comes to 
the RegexContext instances.


The latest 0.93.1 version is at
http://hackage.haskell.org/packages/archive/regex-base/0.93.1/doc/html/Text-Regex-Base-Context.html
and please note the Source link in the top right corner area which helps in 
understanding that RegexContext is just a convenience wrapper.


ghc 6.10 is distributed with the 0.72.0.2 version which is at
http://hackage.haskell.org/packages/archive/regex-base/0.72.0.2/doc/html/Text-Regex-Base-Context.html
and again has a link the Source.

Hooray for hackage.

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: What are side effects in Haskell?

2008-12-23 Thread ChrisK

Hans van Thiel wrote:

I just saw somewhere that one of the purposes of monads is to capture
side effects.


There are a few things that have side effects.  The best way to capture this 
is to see that there are both

  (1) commands whose result depends on some external state
  (2) commands which alter that state

The (2) group has side effects, and (1) is affected by them.  The order in which 
(1) and (2) are performed matters.


commands that evaluate to different things based on environment/history/input:
  * reading from memory location or reference
  * reading from a file or network stream or sensor (e.g. keyboard or mouse)
  * a random or pseudo-random number generator

commands which affect the environment
  * writing to a memory location or reference
  * writing to a file or network
  * changing the display buffer (i.e. windows on the screen)

command which are both (1) and (2), often because they might fail:
  * creating or deleting a file
  * obtaining a lock or semaphore
  * interacting with a message queue

So allocating a mutable variable and writing to it are side-effect operations 
which have to be carefully ordered with the corresponding reads.


Things without side effects are the immutable bindings produced by let/where, 
and application of functions.  This is independent of being lazy or strict.


Note that putStr :: String - IO () is a function, and as such it is a pure 
value.  putString ['a','b','\n'] is a pure value of IO ().  Performing this 
command in the IO monad has side effects.


The fact that functions like putStr and things with types of IO () are pure 
value means that they can be produced and passed around is powerful way of 
working.  It is possible to create objects in C++/Java which encapsulate an 
operation, but it takes more syntax.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: How to think about this? (profiling)

2008-12-16 Thread ChrisK

Or if you don't want to pay for laziness at all you could build your memo array
imperatively (but purely):


import Data.Array.IArray(elems,(!),inRange)
import Data.Array.MArray(newArray_,writeArray,readArray)
import Data.Array.Unboxed(UArray)
import Data.Array.ST(runSTUArray,STUArray)
import Control.Monad(forM_)
import Data.List(zipWith3)

ackMemoSize :: Int
ackMemoSize = 12;

ackList :: [Int]
ackList = 0:1:2:zipWith3 (\ i j k - i+j+k) ackList (tail ackList) (tail (tail 
ackList))

ackMemo :: UArray Int Int
ackMemo = runSTUArray $ do -- the $ works with ghc 6.10, hooray
  a - newArray_ (0,ackMemoSize)
  writeArray a 0 0
  writeArray a 1 1
  writeArray a 2 2
  let op i x | i  ackMemoSize = return ()
 | otherwise = do
writeArray a i x
y - readArray a (i-3)
op (succ i) $! (2*x-y) -- could use (2*x) intead
  op 3 (0+1+2)
  return a

ack :: Int - Int
ack i | inRange (0,ackMemoSize) i = ackMemo ! i
  | otherwise = error outsize memorized range for ack

test = (take (succ ackMemoSize) ackList) == (elems ackMemo)
(ackList !! ackMemoSize) == (ack ackMemoSize)


Which should have very good performance in building ackMemo (the first time it 
is used).


By changing the (2*x-y) to (2*x) I think you get the sum-of-all-previous-entries 
behavior.


Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Type wildcards

2008-12-16 Thread ChrisK

You can get pretty far with the same trick oleg mentions at [1].

If you use local type signature then you can do things like this:


{-
ghci infers this type:
*Main :t f
f :: (Ord a) = Int - a - t - String
-}
f i j x | False = (undefined (i::Int) (isOrd j)) :: String
f i j x = error not filled in

isOrd :: Ord a = a - ()
isOrd = undefined



Cheers,
  Chris

[1] http://okmij.org/ftp/Haskell/types.html#partial-sigs

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: MVar and Par ..

2008-12-16 Thread ChrisK

Mozhgan Kabiri wrote:

Hi .. Hope you are doing well . I've just joined this group.


Hi.

Recently, I am struggling to do some simple experiment with haskell 
language about parallelism and wrong answers that we can get while using 
a shared variable .


Your goal is still unclear.
Are you trying to create an example which shows unexpected answers?
Are you trying to create an example which shows the expected answer?

I tried to write a simple program, for example calculationg 'n=n+1' few 
times.And then I tried to do it in parallel by using 'par' and 'pseq' . 
The aim was to get the wrong answer because we have to share a variable 
here,and without using 'MVar' function we will get the wrong answer for 
the calculation .


MVar is a mutable storage cell.  One can use forkIO to create IO threads which 
race to change MVar, and if this is done badly then you can get unexpected answers.


One cannot use the pure par and pseq to launch IO threads, so you cannot use 
par and pseq to create race conditions, so one can only get the expected answer.


I don't know how to write it in parallel in order to get a wrong answer 
when we don't use MVar,because we have a shared variable here. I read 
about MVars as well,but also I don't know how to combine MVar and Par 
together to get the program to work.


I do not immediately see how MVar and par can be sensibly combined at all.


I wrote this :

module Main where f :: Int - Int - Int f i n = g 1 i n where g x i n | 
x = i = g (x+1) i (n+1) | otherwise = n main :: IO () main = do 
putStrLn starting... let r = f 10 5 putStrLn (show r) putStrLn finished
I want to make to work in parallel by using 'Par'.And also use MVar for 
this simple example to work.
All of the example about MVar are a little bit complicated and I 
couldn't figure it that how can I write one,the same !


Can any one help me with this ? I want a simple example that I can feel 
the need of MVar when I run my program in parallel and while I am using 
a shared variable.


Regards; Mozhgan


I have run out of time, so I leave this part of your question to others.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: File name encodings

2008-12-10 Thread ChrisK

Duncan Coutts wrote:

Yet another reason why FilePath /= String (except on Windows where it
does).

Duncan


Well, it is not a OS issue but a FileSystem issue.

OS X is a Unix, but the main filesystem is HFS+ which has Unicode names, though 
they use a different normalization.


So FilePath == String on most OS X systems.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: ANNOUNCE: haskell-src-exts 0.4.4

2008-12-10 Thread ChrisK

Is there anyway to track down and fix why haskell-src-exts-0.4.4.1 still gets

haddock: parse error in doc string

when I try to get cabal to haddock the package?

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: ANNOUNCE: haskell-src-exts 0.4.4

2008-12-10 Thread ChrisK

Ross Paterson wrote:

On Wed, Dec 10, 2008 at 05:34:16PM +, ChrisK wrote:

Is there anyway to track down and fix why haskell-src-exts-0.4.4.1 still gets

haddock: parse error in doc string

when I try to get cabal to haddock the package?


Line numbers would be handy.

In this case, the problem is the Original: line in some of the module
headers.


Indeed, I just found that.  I wonder where in the universe this is documented. 
Oh well.  I also had to change -- $x and -- $( comments.  A darcs diff -u 
patch is attached.


Cheers,
  Chris

diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Build.hs 
new-haskell-src-exts/Language/Haskell/Exts/Build.hs
--- old-haskell-src-exts/Language/Haskell/Exts/Build.hs 2008-12-10 
17:46:47.0 +
+++ new-haskell-src-exts/Language/Haskell/Exts/Build.hs 2008-12-10 
17:46:47.0 +
@@ -1,7 +1,6 @@
 -
 -- |
 -- Module  :  Language.Haskell.Exts.Build
--- Original:  Language.Haskell.Syntax
 -- Copyright   :  (c) The GHC Team, 1997-2000,
 --(c) Niklas Broberg 2004
 -- License :  BSD-style (see the file LICENSE.txt)
diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Lexer.hs 
new-haskell-src-exts/Language/Haskell/Exts/Lexer.hs
--- old-haskell-src-exts/Language/Haskell/Exts/Lexer.hs 2008-12-10 
17:46:47.0 +
+++ new-haskell-src-exts/Language/Haskell/Exts/Lexer.hs 2008-12-10 
17:46:47.0 +
@@ -2,7 +2,6 @@
 -
 -- |
 -- Module  :  Language.Haskell.Exts.Lexer
--- Original:  Language.Haskell.Lexer
 -- Copyright   :  (c) The GHC Team, 1997-2000
 --(c) Niklas Broberg, 2004
 -- License :  BSD-style (see the file LICENSE.txt)
@@ -83,8 +82,8 @@
 | THDecQuote-- [d|
 | THTypQuote-- [t| 
 | THCloseQuote  -- |]
-| THIdEscape (String)   -- $x
-| THParenEscape -- $( 
+| THIdEscape (String)   -- dollar x
+| THParenEscape -- dollar ( 
 | THVarQuote-- 'x (but without the x)
 | THTyQuote -- ''T (but without the T)
 
diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/ParseMonad.hs 
new-haskell-src-exts/Language/Haskell/Exts/ParseMonad.hs
--- old-haskell-src-exts/Language/Haskell/Exts/ParseMonad.hs2008-12-10 
17:46:47.0 +
+++ new-haskell-src-exts/Language/Haskell/Exts/ParseMonad.hs2008-12-10 
17:46:47.0 +
@@ -2,7 +2,6 @@
 -
 -- |
 -- Module  :  Language.Haskell.Exts.ParseMonad
--- Original:  Language.Haskell.ParseMonad
 -- Copyright   :  (c) The GHC Team, 1997-2000
 -- License :  BSD-style (see the file libraries/base/LICENSE)
 -- 
diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/ParseUtils.hs 
new-haskell-src-exts/Language/Haskell/Exts/ParseUtils.hs
--- old-haskell-src-exts/Language/Haskell/Exts/ParseUtils.hs2008-12-10 
17:46:47.0 +
+++ new-haskell-src-exts/Language/Haskell/Exts/ParseUtils.hs2008-12-10 
17:46:47.0 +
@@ -2,7 +2,6 @@
 -
 -- |
 -- Module  :  Language.Haskell.Exts.ParseUtils
--- Original:  Language.Haskell.ParseUtils
 -- Copyright   :  (c) Niklas Broberg 2004,
 --(c) The GHC Team, 1997-2000
 -- License :  BSD-style (see the file LICENSE.txt)
diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Parser.ly 
new-haskell-src-exts/Language/Haskell/Exts/Parser.ly
--- old-haskell-src-exts/Language/Haskell/Exts/Parser.ly2008-12-10 
17:46:47.0 +
+++ new-haskell-src-exts/Language/Haskell/Exts/Parser.ly2008-12-10 
17:46:47.0 +
@@ -2,7 +2,6 @@
  -
  -- |
  -- Module  :  Language.Haskell.Exts.Parser
- -- Original:  Language.Haskell.Parser
  -- Copyright   :  (c) Niklas Broberg 2004,
  --Original (c) Simon Marlow, Sven Panne 1997-2000
  -- License :  BSD-style (see the file LICENSE.txt)
diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Pretty.hs 
new-haskell-src-exts/Language/Haskell/Exts/Pretty.hs
--- old-haskell-src-exts/Language/Haskell/Exts/Pretty.hs2008-12-10 
17:46:47.0 +
+++ new-haskell-src-exts/Language/Haskell/Exts/Pretty.hs2008-12-10 
17:46:47.0 +
@@ -2,7 +2,6 @@
 -
 -- |
 -- Module  :  Language.Haskell.Exts.Pretty
--- Original:  Language.Haskell.Pretty
 -- Copyright   :  (c) Niklas Broberg 2004,
 --(c) The GHC Team, Noel Winstanley 1997-2000
 -- License :  BSD-style (see the file LICENSE.txt)
diff -rN -u old-haskell-src-exts/Language/Haskell/Exts/Syntax.hs 
new-haskell-src-exts

[Haskell-cafe] Re: ANNOUNCE: haskell-src-exts 0.4.4

2008-12-04 Thread ChrisK

Niklas Broberg wrote:

Fellow Haskelleers,

it is my pleasure to announce the new release of the haskell-src-exts
package, version 0.4.4:



The full list of pragmas supported by 0.4.4 is: SOURCE, RULES,
DEPRECATED, WARNING, INLINE, NOINLINE, SPECIALISE, CORE, SCC,
GENERATED and UNPACK.


Ah, excellent.  The hprotoc program generates possibly mutually recursive 
modules and when it does these need fixing up by hand to add SOURCE pragmas and 
boot files.


I could use your new package to automatically break the recursion.  Interesting.

Thanks,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: The Knight's Tour: solutions please

2008-12-02 Thread ChrisK

Hmmm... it seems that n=63 is a special case.

[EMAIL PROTECTED] wrote:

Yes, there is a solution for n=99 and for n=100 for that matter --
which can be found under one second. I only had to make a trivial
modification to the previously posted code


tour n k s b | k  n*n   = return b
 | otherwise = do next - (foldr mplus mzero).map return $ 
successors n b s
  tour n (k+1) next $ insrt next k b


I replaced foldl1 mplus with foldr mplus zero.



The old version sees no solution to n=63 quite quickly:


time nice ./fromwiki-63 63
fromwiki-63: Prelude.foldl1: empty list

real0m0.285s
user0m0.172s
sys 0m0.026s


The version with the 'tour' given above does not halt after running up to 0.4 GB 
of RAM, so I killed it.


Though having no solution may be tied to starting in the corner.

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Question about fastcgi

2008-11-28 Thread ChrisK
Er, no.  A fastcgi executable is (like a cgi executable) controlled by the front 
 end web server.  I run my fastcgi using Apache as the front end.  The front 
end web server will control things like the port number.



Mauricio wrote:

Hi,

I'm learnng to use fastcgi and, reading the examples,
I see the main loop is like this:

main = runFastCGI my_work

However, isn't a fastcgi program supposed to choose
a port where to listen to calls? For instance, in this
C example:

xzdev.com/nginx_fastcgi.html

doesn't the line

listen_socket = FCGX_OpenSocket(:8002, 2000);

says it's listening to port 8002? I read the code
for fastcgi, from hackage, and I can't find anything
related to ports like, for instance, a default port.
Am I understanding something the wrong way?

Thanks,
Maurício


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Question about fastcgi

2008-11-28 Thread ChrisK

I have only used this, all of these are from Haskell:


pamac-cek10:~ chrisk$ cat /etc/apache2/other/httpd-fastcgi.conf
IfModule mod_fastcgi.c
  Alias /fcgi-bin/ /Library/WebServer/FastCGI-Executables/

  Directory /Library/WebServer/FastCGI-Executables/
AllowOverride None
Options None
Order allow,deny
Allow from all

SetHandler fastcgi-script
Options +ExecCGI
  /Directory

  FastCgiIpcDir /tmp/fastcgi
  FastCgiServer /Library/WebServer/FastCGI-Executables/hw.fastcgi 
-pass-header Cookie
  FastCgiServer /Library/WebServer/FastCGI-Executables/test.fastcgi 
-pass-header Cookie
  FastCgiServer /Library/WebServer/FastCGI-Executables/xwords.fastcgi 
-pass-header Cookie
/IfModule


The above is included from the main httpd.conf which has:

pamac-cek10:~ chrisk$ grep -i fast /etc/apache2/httpd.conf 
LoadModule fastcgi_module libexec/apache2/mod_fastcgi.so


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Efficient parallel regular expressions

2008-11-04 Thread ChrisK

The regex-tdfa package (and regex-posix) implement subexpressions capture.

So if you want to match alpha beta and gamma in parallel you could write

(alpha)|(beta)|(gamma) and check which subexpression has the non-empty match.

This becomes slightly complicated if there are parenthesis and captures inside 
alpha beta or gamma.  Then you need to compute the indices that are the top 
level captures.


In particular, the regex-tdfa package (get the latest from 
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa ) will 
create a DFA and run through the input once without backtracking.  It will find 
the leftmost-longest match, so the order of the branches only matters if there 
is a tie in length.


If you need to be left-biased then you need a perl-style engine, and you can use 
the regex-pcre or pcre-light haskell package and the PCRE library.  These are 
obtainable from Hackage.  I doubt PCRE uses a simple DFA...


Cheers,
  Chris

Martijn van Steenbergen wrote:

Hello all,

For my mud client Yogurt (see hackage) I'm currently working on
improving the efficiency of the hooks. Right now several hooks, each
consisting of a regex and an action can be active at the same time.
Every time a line of input is available (usually several times a second)
I run the line through all the available regexes and execute the first
matching action.

I figured this is not the cleverest approach and it'd be better if I
|'ed all regexes into one big DFA. However, how do I then find out which
of the original hooks matched and so which action to execute?

As far as I know there's no way to do that with Text.Regex. Alex looks
promising but is really only an executable and doesn't offer an API.
I've also found mr. Jo�o Saraiva's HaLex but I don't know if that was
meant to be used seriously.

Does anyone have any experience with this? What's the best way to
achieve this?

Thanks much,

Martijn.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Climbing up the shootout...

2008-09-22 Thread ChrisK
And, though I had never seen it before, the current winner for speed is ATS ( 
http://www.ats-lang.org/ ) which is dependently-typed functional language.



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANNOUNCE: protocol-buffers-0.2.9 for Haskell is ready

2008-09-20 Thread ChrisK

Hello one and all,

Amid much rejoicing, my Haskell version of protocol-buffer is now
released (version 0.2.9).

What is this for?  What does it do?  Why?

  Shorter answer: It generates Haskell data types that can be converted back 
and forth to lazy ByteStrings that interoperate with Google's generated code in 
C++/Java/python.


  It is a pure Haskell re-implementation of the Google code at
http://code.Google.com/apis/protocolbuffers/docs/overview.html
  which is ...a language-neutral, platform-neutral, extensible way of 
serializing structured data for use in communications protocols, data storage, 
and more.
  Google's project produces C++, Java, and Python code.  This one produces 
Haskell code.


The release tarball (with 3 Haskell packages inside, see README in source) is at
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/protocol-buffers

The darcs repository has moved to 
http://darcs.haskell.org/packages/protocol-buffers2/


You will also need a recent ghc compiler, the binary package and the
utf8-string package from hackage.haskell.org (same site as mentioned
above).

The source compiles to 3 things:
  1) the package protocol-buffers with the library API
  2) the package protocol-buffers-descriptor with the
descriptor.proto code
  3) The 'hprotoc' executable which is a command line program similar
to 'protoc'.

The examples sub-directory in the code has the Haskell version of
the addressbook.proto example and is compatible with Google's
similar example code.

The code generated from unittest.proto (and unittest_import.proto)
includes messages TestAllTypes and TestAllExtensions which have been
extensively tested by QuickCheck to ensure they can be wire encoded
and decoded (see the tests sub-directory in the code).

The user API, as exported by Text.ProtocolBuffers, allows for
converting messages back and forth to the lazy ByteString type.  And
such messages can be merged, and the defaults accessed via the
MessageAPI type class.

The messages in Haskell as just regular data types and are thus
immutable.  Required types are simple record fields, optional types
are Maybe, and repeated types are Seq (from Data.Sequence).

Extensions are supported via Key data that allows access to the
extension fields.  Extensible messages contain an opaque ext'field
entry of type ExtField that contains the map data structure to contain
the extension field values.

The User API allows for serializing messages as the usual series of
fields.  It also provides for a length prefix to be written to create
delimited messages.  It also provides to write a wire tag with any
field number before the length and message data.  This last form looks
like a field on the wire, and there is a special api call to read back
just the one message and its field number.  This last API is similar
to the one that is part of the C# API.

No benchmarks have been run yet.  Any suggestions?

Unsupported for the moment is loading and storing unknown fields.
It can be added sooner if someone has a use for this.

Unsupported indefinitely is code generation for Services and Methods.
I have yet to look into how this is presented in the other languages.

The API to read a single message field, as mentioned above, might be
extended to read any type instead of just messages.

optional clever_quote {
autrijus Perl: Easy things are easy, hard things are possible
autrijus Haskell: Hard things are easy, the impossible just
happened
}

Cheers!

  Chris Kuklewicz

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: STM and FFI

2008-09-10 Thread ChrisK

There are some examples of adding IO actions to commit and rollback events at

http://www.haskell.org/haskellwiki/New_monads/MonadAdvSTM

Disclaimer: I wrote this instance of the code, but have not used it much.

Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: [Haskell] The initial view on typed sprintf and sscanf

2008-09-02 Thread ChrisK

Matthew Brecknell wrote:

Unfortunately, I don't seem to be able to make the expected fprintf
function, because printf's format-dependent parameter list makes it
impossible to find a place to pass the handle. Hence the C++-like ()
ugliness.



How about this:

fprintf :: Handle - F (IO ()) b - 
fprintf h fmt = write fmt id  where

  write :: F a b - (IO () - a) - b
  write (FLit str) k = k (hPutStr h str)
  write FInt k = \i - k (hPutStr h (show i))
  write FChr k = \c - k (hPutChar h c)
  write (FPP (PrinterParser pr _)) k = \x - k (hPutStr h (pr x))
  write (a :^ b)  k = write a (\sa - write b (\sb - k (sa  sb)))


*PrintScan fprintf stdout fmt5 15 1.3 '!'
abc15cde1.3!*PrintScan

The first thing I did last night was change String to
type ShowS = String - String :


intps :: F a b - (ShowS - a) - b
intps (FLit str) k = k (str++)
intps FInt   k = \x - k (shows x)
intps FChr   k = \x - k (x:)
intps (FPP (PrinterParser pr _))  k = \x - k (pr x ++)
intps (a :^ b)   k = intps a (\sa - intps b (\sb - k (sa . sb)))



sprintfs :: F ShowS b - b
sprintfs fmt = intps fmt id


Ideally PrinterParser would display using ShowS as well:


data PrinterParser a
  = PrinterParser (a - ShowS) (String - Maybe (a, String))


Or one could use instance witnesses via GADTs to wrap up Show:


data F a b where
FSR  :: (Show b,Read b) = F a (b - a)


But I think changing PrinterParser would result in simpler code.

Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Brainstorming on how to parse IMAP

2008-08-05 Thread ChrisK

Hi John,

  I recently posted new and fancy binary Get monads in
http://article.gmane.org/gmane.comp.lang.haskell.libraries/9691
and
http://article.gmane.org/gmane.comp.lang.haskell.libraries/9756
which might be of interest since network protocol are usually specified in bytes 
at the wire level.


  The latest one takes input which may or may not be complete and returns a 
stream (a Seq) of results.  When it reaches the end of the input it pauses and 
asks for more.  This handling of partial input might be good for network 
protocols where you can feed the data from the socket to the parser in pieces. 
(This Get monad eats lazy bytestrings).


  The latest MyGetW.hs allows the Get code to send a Data.Sequence.Seq of 
results by using yieldItem (and perhaps flushItems).  This is in addition to any 
final result of the parser.


It has the usual binary Get interface, so you can pull bytestrings and words and 
(fancy) any Storable off the input.


  I call it fancy because the monad is a transformer, and it is a 
MonadError/MonadPlus/Alternative, and it supports lookAhead and callCC/MonadCont 
and Reader/Writer/State.  Whew.


As for IMAP, I use imapfilter (http://imapfilter.hellug.gr/) which uses Lua.

Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Brainstorming on how to parse IMAP

2008-08-05 Thread ChrisK

I am glad you asked Ben,

Short answer: It can return a Seq of your values. The values in the Seq are 
lazy, the Seq itself is finite.  It can return what it has so far before it 
finishes parsing (or even before the rest of the input has arrived from the 
network).


Ben Franksen wrote:

ChrisK wrote:

   I recently posted new and fancy binary Get monads in
http://article.gmane.org/gmane.comp.lang.haskell.libraries/9691
and
http://article.gmane.org/gmane.comp.lang.haskell.libraries/9756
which might be of interest since network protocol are usually specified in
bytes at the wire level.

   The latest one takes input which may or may not be complete and returns
   a
stream (a Seq) of results.


IIRC Seq is not a 'Stream' but a strict sequence? Or do you meant 'a stream
(of Seq)'?


I meant it returns many (Seq y), one after the other, while doing parsing in 
between.




Cheers
Ben


Long answer:

The complicated parser looks like this.  Start with the run function:

runCompGet :: (Monad m,Monoid w)
= CompGet a y r w user m a
- r - user - L.ByteString
- m (CompResult y w user m a)

This takes a CompGet and a reader state r and a user state user and the 
(initial) input L.ByteString (Data.ByteString.Lazy.ByteString).  It evaluates to 
 the inner monad 'm' returning a CompResult.


The CompResult is a three-fold type:


data CompResult y w user m a =
CFailed (Seq y) !Int64 String
  | CFinished  (Seq y) !L.ByteString !Int64 w user a
  | CPartial (Seq y) (Either ( m (CompResult y w user m a) )
 ( Maybe L.ByteString - m (CompResult y w user 
m a) ))



All three have (Seq y) which are the Data.Sequence.Seq of things which have been 
queued by yieldItem.


CFailed also has the Int64 count of bytes parsed successfully and an error 
message String.  Nothing more can be done.


CFinished also has the unused tail of the input as a L.ByteString and an Int64
of the bytes consumed.  And the output of the writer w, the final user state, 
and lastly it has the end value returned by the computation which has type 'a'. 
 Nothing more can be done.


CPartial is the intermediate result.  It also carries Either:
 Left : the rest of the computation, currently suspended, to continue running.
 Right: a function from (Maybe ByteString) to the suspended computation.

The Left is a result of the flushItems command and is merely a way to return 
the (Seq y) so far before continuing.


The Right is a result of running out of input data.  This allows the program to 
feed more input into the parser which will be appended to all the previous 
input. One does this by passing (Just someByteString) to the function.  If the 
parser again runs out of data it will again return CPartial with a Right value.


Alternatively, one can pass Nothing.  This tells the parser that there will 
never ever be more input. The parser will never ask for (though it may 
flushItems and return a Right valued CPartial).


A key thing about the (Seq y) is that yielded items are only returned once.  The 
CPartial may be returned many times and each time it will have an empty list or 
fresh list of (Seq y).  The values in the Seq are lazy, the Seq itself is 
finite.  To collect all the value the caller has to concatenate all the (Seq 
y)'s that are returned during parsing.


As for parsing, the module offers the usual BinaryParser interface (package 
binary-strict) and has an interface which mostly overlaps Data.Binary.Get 
(package binary). For example: it has getByteString and getWord64be and 
getStorable.


You don't have to use the yieldItem command.  You can just return the results 
in the final a return value (or the user state or the writer w value).  In 
this situation you only get an answer when CFinished is returned (and nothing if 
CFailed is returned).  I could not use the writer mechanism for yield-ing 
because the listen and pass parts of the MonadWriter class ensure it has the 
wrong semantics.


You might wonder:
 *) If the parser code uses MonadPlus to give several alternatives
 *) The first alternative gets more input via CPartial (perhaps several times)
 *) The first alternative then fails
 *) The second alternative starts parsing from the same position the first did
 *) Does the second alternative see the new input passed to CPartial earlier?
The answer is yes.  Changes to the input stream by appending with CPartial 
affect the whole computation and are never rolled back.


You might wonder:
  *) If the first alternative calls yieldItem foo
  *) The first alternative fails
  *) The second alternative calls flushItems
  *) Does the CPartial (Seq y) contain foo?
The answer is yes.  Items yielded are never rolled back.

[ Doing all of this in the presence of throwError/mzero/fail and lookAhead* and 
callCC was interesting to code. ]


Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org

[Haskell-cafe] Re: ANN: haskell-src-exts 0.3.5

2008-07-15 Thread ChrisK

Thanks for the fix.  I have gotten the darcs version and I am compiling...

Niklas Broberg wrote:

Hi all,

I'm pleased to report that haskell-src-exts is now updated to
understand Template Haskell syntax (it used to understand pre-6.4 TH,
but now it works with the current version). At least I hope so, I
didn't have much TH code to try it on so if you find some bugs just
let me know. It wasn't all that hard to fix though, so I'm pretty
confident it actually works.

Get the code:
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/haskell-src-exts-0.3.5
darcs get http://code.haskell.org/HSP/haskell-src-exts

Cheers,

/Niklas


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Template Haskell and haskell-src-exts

2008-07-10 Thread ChrisK

Hi,

  Can one represent the ''Type template Haskell syntax of

$( makeMergeable ''FileDescriptorProto )

in haskell-src.exts Language.Haskell.Exts.Syntax ?

And what are the HsReify data (e.g. HsReifyType and HsReifyDecl and 
HsReifyFixity )?

I don't see any pretty print capability to produce the ''Type so I am wondering 
what else I might use...


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: A Monad for on-demand file generation?

2008-07-03 Thread ChrisK

Joachim Breitner wrote:

 * The 5th line does not have this effect. Because this gets desugared
to (), the special implementation of () means that the next line
still sees the same dependency state as the before the call to liftIO.


You are violating the monad laws.  (f  k) and (f = \_ - k) should do the
same thing.  You might write a version of liftIO that has the effect you want,
however.



 * A change to inFile3 causes outFile1 to be re-written, although from
looking at the code, _we_ know that this is not necessary, but the ODIO
monad can not tell. The programmer should have swapped the lines.


Let me reverse engineer your algorithm (aside from the screwy ):

Every readFile that is encountered in processing ODIO is added to a list of
source files.  The reading deferred to be lazy with unsafePerformIO.

When a writeFile is encountered it is assumed to depend on all previously read
files.  If this output file already exists and is newer than all the source
files, then writing it is skipped (and perhaps also the lazy reads are skipped).
Otherwise, the writing is strict.



I would say this is an unusual module.  I rather prefer Makefile semantics,
which could be improved in some ways by using a DSL in Haskell instead.

The syntactic form of a file-oriented Makefile declaration is

output : input1 input2
  shell script
  more shell script

And the shell script has access to the output file name, and also has access
to the input names.

In Haskell you could have a monadic DSL where the output name (and perhaps some
explicit input names) are accessible like MonadReader.

The result of running the DSL would do no IO at all but, much like a compiler,
would return an IO action (the program to create the output file) and a list of
inferred dependencies (an improvement over the Makefile syntax).  Even if the
DSL does not allow liftIO, it can still compile to various IO actions.

Then you have a map from (outputname) to (dependencies,ioAction).  And when
outputname is demanded you can walk the dependencies to see if the timestamps
are newer or older, using the ioActions to create the desired files.

So perhaps to run the DSL monad you have a function like:

makeRule :: DSL () - FilePath - [FilePath] - ( [FilePath], IO () )

type Depends = Map FilePath ([FilePath], IO ())

demand :: Depends - FilePath - Maybe ByteString

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: A Monad for on-demand file generation?

2008-07-02 Thread ChrisK

hen, the readFileOD could put the timestamp
of the read file in a Monad-local state and the writeFileOD could, if
the output is newer then all inputs listed in the state, skip the
writing and thus the unsafeInterleaveIO’ed file reads are skipped as
well, if they were not required for deciding the flow of the program.


How is your system similar to make/Makefile or different to
make/Makefile ?

Are your actions more restricted?  Are the semantics more imperative?  Are the 
dependencies still explicit or are them implicit and inferred?


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Writing an 'expect'-like program with runInteractiveCommand

2008-05-01 Thread ChrisK

Are you adjusting 'System.IO.hSetBuffering' to NoBuffering for those handles?


Graham Fawcett wrote:

Hi folks,

I would like to communicate with an external, line-oriented process,
which takes a sequence of one-line commands, each returning an
arbitrary number of lines, and waits for another command after each
response.  So, something like:

sendCmd :: (Handle, Handle) - String - IO [String]
...

main = do
  handles - connectToExternalProcess
  sendCmd handles do something
  resp - sendCmd get results -- needs strict I/O, before quit?
  sendCmd quit
  mapM_ putStrLn resp

I've tried using runInteractiveCommand, and several combinations of
hFlush, hWaitForInput, etc., but I can't find a combination that
actually works.

I know this is a sketchy description, but can anyone offer some sample
code, or point me toward a program that has similar behaviour?

Thanks,
Graham


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Caching the Result of a Transaction?

2008-04-28 Thread ChrisK
The garbage collector never gets to collect either the action used to populate 
the cached value, or the private TMVar used to hold the cached value.


A better type for TIVal is given below.  It is a newtype of a TVal.  The 
contents are either a delayed computation or the previously forced value.


Thew newTIVal(IO) functions immediately specify the delayed action.

The newEmptyTIVal(IO) functions create a private TMVar that allows the delayed 
action to be specified once later.  Note the use of tryPutTMVar to return a Bool 
instead of failing, in the event that the user tries to store more that one action.


When force is called, the previous action (and any private TMVar) are forgotten. 
 The garbage collector might then be free to collect them.


--
Chris


-- By Chris Kuklewicz (April 2008), public domain
module TIVal(TIVal,newTIVal,newTIValIO,force,cached) where

import Control.Applicative(Applicative(..))
import Control.Concurrent.STM(STM,TVar,newTVar,newTVarIO,readTVar,writeTVar
 
,TMVar,newEmptyTMVar,newEmptyTMVarIO,tryPutTMVar,readTMVar)
import Control.Monad(Monad(..),join,liftM2)
import System.IO.Unsafe(unsafePerformIO)

newtype TIVal a = TIVal (TVar (Either (STM a) a))

-- the non-empty versions take a computation to delay

newTIVal :: STM a - STM (TIVal a)
newTIVal = fmap TIVal . newTVar . Left

newTIValIO :: STM a - IO (TIVal a)
newTIValIO = fmap TIVal . newTVarIO . Left

-- The empty versions stage things with a TMVar, note the use of join
-- Plain values 'a' can be stored with (return a)

newEmptyTIVal :: STM ( TIVal a, STM a - STM Bool)
newEmptyTIVal = do
  private - newEmptyTMVar
  tv - newTVar (Left (join $ readTMVar private))
  return (TIVal tv, tryPutTMVar private)

newEmptyTIValIO :: IO ( TIVal a, STM a - STM Bool )
newEmptyTIValIO = do
  private - newEmptyTMVarIO
  tv - newTVarIO (Left (join $ readTMVar private))
  return (TIVal tv, tryPutTMVar private)

-- force will clearly let go of the computation (and any private TMVar)

force :: TIVal a - STM a
force (TIVal tv) = do
  v - readTVar tv
  case v of
Right a - return a
Left wait - do a - wait
writeTVar tv (Right a)
return a

-- Conal's cached function. This is actually safe.

cached :: STM a - TIVal a
cached = unsafePerformIO . newTIValIO

-- The instances

instance Functor TIVal where
  f `fmap` tiv = cached (f `fmap` force tiv)

instance Applicative TIVal where
  pure x  = cached (pure x)
  ivf * ivx = cached (force ivf * force ivx)

instance Monad TIVal where
  return x  = cached (return x)
  tiv = k = cached (force tiv = force . k)

instance Applicative STM where
  pure x = return x
  ivf * ivx = liftM2 ($) ivf ivx 


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Wrong Answer Computing Graph Dominators

2008-04-18 Thread ChrisK

More algebraically, including 'or' for symmtry:

and xs = foldr () True xs
or xs = foldr (||) False xs

The True and False are the (monoid) identities with respect to  and || :

True  x == x
x  True == x

False || x == x
x || False == x

And so an empty list, if defined at all, should be the identity:

and [] = True
or [] = False

In english:
and returns false when is any element of the list false? is yes
or returns true when is any element of the list true? is yes

Matthew Brecknell wrote:

Dan Weston wrote:

Here, any path means all paths, a logical conjunction:

and [True, True] = True
and [True  ] = True
and [  ] = True


Kim-Ee Yeoh wrote: 
Hate to nitpick, but what appears to be some kind of a 
limit in the opposite direction is a curious way of arguing 
that: and [] = True.


Surely one can also write

and [False, False] = False
and [False  ] = False
and [  ] = False ???


No. I think what Dan meant was that for all non-null
xs :: [Bool], it is clearly true that:

and (True:xs) == and xs  -- (1)

It therefore makes sense to define (1) to hold also
for empty lists, and since it is also true that:

and (True:[]) == True

We obtain:

and [] == True

Since we can't make any similar claim about the
conjuctions of lists beginning with False, there
is no reasonable argument to the contrary.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: semi-closed handles

2008-04-15 Thread ChrisK

Ryan Ingram wrote:

I usually use something like this instead:

hStrictGetContents :: Handle - IO String
hStrictGetContents h = do
s - hGetContents h
length s `seq` hClose h
return s


A small idiomatic nitpick:  When I see (length s) gets computed and thrown away 
I wince at the wasted effort.  I would prefer (finiteSpine s):


finiteSpine = foldr (const id) ()

hStrictGetContents :: Handle - IO String
hStrictGetContents h = do
s - hGetContents h
finiteSpine s `seq` hClose h
return s

finiteSpine finds the end of a finite list and will hang forever on an 
infinite list.  One can even notice that the type of finiteSpine is Strategy [a]:


import Control.Parallel.Strategies(Strategy)
finiteSpine :: Strategy [a]
finiteSpine = foldr (const id) ()

And in fact finiteSpine = seqList r0, which returns () after applying the do 
nothing strategy r0 to every element.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Type families and GADTs in 6.9

2008-04-12 Thread ChrisK
The length calculation looked complicated.  So I reformulated it as a comparison 
using HasIndex.  But ghc-6.8.2 was not inferring the recursive constraint on 
proj, so I split proj into proj_unsafe without the constraint and proj with the 
constraint checked only once.  I also renamed ZT to Nil to be more consistent.



-- works in ghc 6.8.2
{-# LANGUAGE GADTs, TypeFamilies, EmptyDataDecls, TypeOperators #-}

data FZ
data FS a

data Fin fn where
  FZ :: Fin FZ
  FS :: Fin fn - Fin (FS fn)

f0 = FZ
f1 = FS f0
f2 = FS f1
-- ... etc.

data Nil
data t ::: ts

infixr 4 :::

data Tuple ts where
  Nil:: Tuple Nil
  (:::) :: t - !(Tuple ts) - Tuple (t ::: ts)

data HTrue

type family Lookup ts fn :: *
type instance Lookup (t ::: ts) FZ = t
type instance Lookup (t ::: ts) (FS fn) = Lookup ts fn

type family HasIndex ts fn :: *
type instance HasIndex (t ::: ts) FZ = HTrue
type instance HasIndex (t ::: ts) (FS fn) = HasIndex ts fn

{-# INLINE proj #-}
proj :: (HasIndex tsT fnT ~ HTrue) = Fin fnT - Tuple tsT - Lookup tsT fnT
proj = proj_unsafe  where
  proj_unsafe :: Fin fnT - Tuple tsT - Lookup tsT fnT
  proj_unsafe (FS fn) (_v ::: vs) = proj_unsafe fn vs
  proj_unsafe FZ  (v ::: _vs) = v
  proj_unsafe _   Nil = error Cannot proj Nil in proj_unsafe

fst' :: (HasIndex ts FZ ~ HTrue) = Tuple ts - Lookup ts FZ
fst' = proj f0

snd' :: (HasIndex ts (FS FZ) ~ HTrue) = Tuple ts - Lookup ts (FS FZ)
snd' = proj f1

pair :: Tuple (Char ::: (() ::: Nil))
pair = 'a' ::: () ::: Nil

q1 :: Char
q1 = fst' pair

q2 :: ()
q2 = snd' pair

{- This won't compile

q2 :: ()
q2 = snd' ('a' ::: Nil)
-}



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Longest increasing subsequence

2008-04-11 Thread ChrisK
It is late, but I was not sleepy enough, so here is my first translation of the 
algorithm to a functional approach...



{- Quote wikipedia: http://en.wikipedia.org/wiki/Longest_increasing_subsequence

 L = 0
 M[0] = 0
 for i = 1, 2, ... n:
binary search for the largest j ≤ L such that X[M[j]]  X[i] (or set j = 0 
if no such value exists)

P[i] = M[j]
if j == L or X[i]  X[M[j+1]]:
   M[j+1] = i
   L = max(L, j+1)
-}

{-
X[i] defined for i = 1,2,3…
So X[0] is not defined.
Now, rethink '0' as Nothing, and 1≤j≤L since X[M[0]] is also undefined.

Not that after the binary search that one the three conditions holds:

X[i] ≤ X[M[1]]
  The same or a new minimum value
  P[i] is created and set to Nothing
  If X[i]  X[M[1]] then M[1] is changed to i

X[M[j]]  X[i] ≤ X[M[j+1]] for some jL
  A value greater than the minimum and equal or less than the maximum
  P[i] is set to Just M[j]
  If X[i] == X[M[j+1]] then there is no need to update M[j+1]
  If X[i]  X[M[j+1]] then M[j+1] is changed to i

X[M[L]]  X[i]
  A new maximum value
  P[i] is set to Just M[L]
  M[L+1] is created and set to i
  L is set to L+1

Wikipedia is too loose.  X[M[1]], X[M[2]], …, X[M[L]] is not nondecreasing, 
but must be strictly increasing.  This is really sloppy of wikipedia.


The P[i] are just a stack, create a linked list going in and pull
apart on way out, will by O(N).

If you do not separately track the min and max values, then the
algorithm works like this:

Make a map mu from the set of X[M[j]] to M[j], starting empty.
Make a P as a list of Maybe Int, starting as [].
Note that size mu will always by L, and starts off as 0.
For i=1,2,3…:
 do a Data.Map.splitLookup using pivot X[i] to get (map1,m,map2) and find which
 of the three cases we are in:
  If there is a null map2 then third case:
If empty mu then prepend Nothing to P.
  else get M[L] from snd (snd (findMax mu)), prepend Just M[L] to P.
Create new my by inserting key X[i] with value i to mu.
  If there is a null map1 then first case (Note that mu cannot be empty):
Prepend Nothing to P
get min from fst (findMin mu)
If X[i]  min then make new mu from replacing key X[i] with value i
   with (Data.Map.updateMin).
  Otherwise this is the middle case and map1 and map2 are both non-empty.
Get M[j] from (snd (findMax map1)) and prepend Just M[j] to P.
If 'm' is (Just {}) then X[i]  X[M[j+1]] and do not change mu
  else change mu to Data.Map.adjust key X[i] with value i on mu

Also: keep track of the length of P, which is ultimate N, where 1≤i≤N.

Each operation in the loop with index i is order log (i-1)

Note that the j is never explicitly tracked.  It is implicit in the order of 
the keys of the map mu.


Once you are done, you have a maximum subsequence length of (size mu),
and the stack P is just the P[i]'s in reverse order.  You can get the
last index i of the longest subsequence from (snd (findMax mu)) and
backtrack to get the other i's by carefully popping the stack P (in a
single traversal) and keeping only the indices you need until you
reach a Nothing.

-}

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Longest increasing subsequence

2008-04-11 Thread ChrisK
My late night suggestions were nearly correct.  I have actually written the code 
now.  Once keeping track of indices, and a second time without them:



{-# LANGUAGE BangPatterns #-}
-- By Chris Kuklewicz, copyright 2008, BSD3 license
-- Longest increasing subsequence
-- (see http://en.wikipedia.org/wiki/Longest_increasing_subsequence)
import Data.List (foldl')
import Data.Map (Map)
import qualified Data.Map as M (empty,null,insert,findMin,findMax
   ,splitLookup,deleteMin,delete)

type DList a = [a] - [a]

lnds :: Ord a = [a] - [a]
lnds = lnds_decode . lnds_fold

lnds_fold :: Ord a = [a] - Map a (DList a)
lnds_fold = foldl' process M.empty where
  -- The Map keys, in sorted order, are the input values which
  --   terminate the longest increasing chains of length 1,2,3,…
  process mu x =
case M.splitLookup x mu of
  (_,Just {},_) - mu -- ignore x when it is already an end of a chain

  (map1,Nothing,map2) | M.null map2 -
-- insert new maximum element x
if M.null mu
  then M.insert x (x:) mu -- x is very first element
  else let !xs = snd (M.findMax mu)
   in M.insert x (xs . (x:)) mu

  | M.null map1 -
-- replace minimum element with smaller x
M.insert x (x:) (M.deleteMin mu)

  | otherwise -
-- replace previous element oldX with slightly smaller x
let !xs = snd (M.findMax map1)
!oldX = fst (M.findMin map2) -- slightly bigger key
!withoutOldX = M.delete oldX mu
in M.insert x (xs . (x:)) withoutOldX

lnds_decode :: Ord a = Map a (DList a) - [a]
lnds_decode mu | M.null mu = []
   | otherwise = snd (M.findMax mu) []

tests =  [ ['b'..'m'] == (lnds $ ['m'..'s'] ++ ['b'..'g'] ++ ['a'..'c'] ++ 
['h'..'k'] ++ ['h'..'m'] ++ ['d','c'..'a'])
 ,  == lnds 
 , a == lnds a
 , a == lnds ba
 , ab == lnds ab
 ]


Comparing to wikipedia:
  The X[M[1]],X[M[2]],… sequence is strictly increasing.  These are the ends of 
the current increasing chains of length 1,2,… and they are the keys to the Map 
in my code.


  The values of the map are the subsequences themselves, in DList form. 
Instead of pointing to the index of the previous element I just lookup '!xs' and 
append '(x:)' to that.


Complexity:
  The strictness annotations ensure that the garbage collector can destroy any 
unreachable DList entries.  The space usage is thus O(N) and may be O(1) for 
certain inputs (such as the best case of never-increasing input list).  A 
strictly increasing input list is the worst case for space usage.


The naive time complexity of 'process' for the i'th input value is O(log i). 
This can be double checked by looking at the time complexity of everything I 
import from Data.Map.


Peak performance could be had by
  (1) adding the first element before the foldl' to avoid checking for this 
case in process
  (2a) accessing the internal map structure to optimize the 
splitLookup-delete-insert case into a single operation
  (2b) Using something like a zipper to access the to-be-deleted-and-replaced 
element of the map
The (2a) and (2b) work because we know the changed key will go into the same 
'slot' of the map as the old one.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Dynamic typing makes you more productive?

2008-03-18 Thread ChrisK

Jules Bean wrote:
 Justin Bailey wrote:
 From a recent interview[1] with the guy leading Ruby development on
 .NET at Microsoft:

  You spend less time writing software than you spend maintaining
 software. Optimizing for writing software versus maintaining software
 is probably the wrong thing to do. Static typing makes it harder to
 maintain software because it's harder to change it.

 It's interesting to say that, because not only is it completely untrue,
 but the opposite is in fact true. I would make the following statement:

 Static typing makes it easier to maintain software because it's easier
 to change it.

It depends on type inferencing and tool support.  If you change a type in Java 
(or C++) then you will have to change its declaration at each point of use in 
the project.  This is only easy with tool support.  In Ruby or Python there are 
few declarations to change -- but you have to be sure it is safe.


In Haskell, the type inference meas there are few declarations to change -- and 
it catches all the unsafe usages.


If you have many declarations then you are going to wish for more tool support.




When you change the type of something in a program (be it statically 
dynamically typed) you have to change all uses of it. If your program is 
dynamically typed, you have to work very hard to make sure you catch all 
instances, perhaps by having an enormous test suite, perhaps by having a 
powerful IDE with semantically aware search and replace. A common source 
of bugs is making a partial change in this way, where a rarely-tested 
code path develops a semantic bug from a 'far-away change'.


If your program is statically typed, the compiler tells you all the 
places you need to change. Job done.


Therefore I find that generally speak change/refactoring is an order of 
magnitude easier in haskell than, say, ruby.


Jules


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Small displeasure with associated type synonyms

2008-03-06 Thread ChrisK

Tom Schrijvers wrote:

Stefan,


I tried lexically scoped type variables, but to no avail:

instance forall a b. (C a, C b) = C (a, b) where
  type T (a, b) = (T a, T b)
  val   = (val :: T a, val :: T b)


The problem is ambiguity. The type checker can't determine which val 
function to use, i.e. which dictionary to pass to val. Assume:


  instance C Int where
type T Int = Int
val= 0

  instance C Bool where
type T Bool = Int
val = 1

Now, if you want some val :: Int, which one do you get? The one of C Int 
of C Bool? Depending on the choice you may get a different result. We 
can't have that in a deterministic functional language. Hence the error.

Adding a type signature doesn't change the matter.


I don't see how your example explains this particular error.
I agree Int cannot be generalized to (T Int) or (T Bool).

I see Stefan's local type signature is not (val :: a) like your (val ::Int) but 
(val :: T a) which is a whole different beast.  And (T a) is the type that ghc 
should assign here.


The C (a,b) instance wants val :: T (a,b),  The T (a,b) is declared as (T a, T 
b).  The annotated val returns (T a, T b).  One never needs the sort of Int 
to (T Int) generalization.


So what is a better explanation or example to clarify why GHC cannot accept the 
original code?


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Small displeasure with associated type synonyms

2008-03-06 Thread ChrisK

Okay, I get the difference.

The T a annotation in val :: T a)and val :: T a does not help choose the 
C a dictionary.
But the val :: a- T a and val (undefined :: a) allows a to successfully 
choose the C a dictionary.


val :: T a fixes T a but does not imply C a.
(undefined :: a) fixes a and does imply C a.
I now see how the functional dependency works here (which I should have tried to 
do in the first place -- I should have thought more and relied on the mailing 
list less).


class C a b | a - b is here class C a where type T a = b.
So only knowing T a or b does not allow a to be determined.

--
Chris

Tom Schrijvers wrote:

I don't see how your example explains this particular error.
I agree Int cannot be generalized to (T Int) or (T Bool).


Generalized is not the right word here. In my example T Int, T Bool and 
Int are all equivalent.


I see Stefan's local type signature is not (val :: a) like your (val 
::Int) but (val :: T a) which is a whole different beast.


Not all that different. As in my example the types T Int, T Bool and Int 
are equivalent, whether one writes val :: Int, val :: T Int or val :: T 
Bool. My point is that writing val :: T Int or val :: T Bool does not 
help determining whether one should pick the val implementation of 
instance C Int or C Bool.



And (T a) is the type that ghc should assign here.


As my example tries to point out, there is not one single syntactic form 
to denote a type.


Consider the val of in the first component. Because of val's signature 
in the type class the type checker infers that the val expression has a 
type equivalent to T a2 for some a2. The type checker also expects a 
type equivalent to T a, either because of the type annotation or because 
of the left hand side. So the type checker must solve the equation T a ~ 
T a2 for some as yet to determine type a2 (a unification variable). This 
is precisely where the ambiguity comes in. The type constructor T is not 
injective. That means that the equation may hold for more than one value 
of a2 (e.g. for T Int ~ T a2, a2 could be Int or Bool). Hence, the type 
checker complains:


Couldn't match expected type `T a2' against inferred type `T a'.

Maybe you don't care what type is chosen, if multiple ones are possible. 
My example tried to show that this can effect the values computed by 
your program. So it does matter.


For this particular example, it seems that the type checker does not 
have have more than alternative for a2 in scope. However, it is not 
aware of that fact. It uniformly applies the same general strategy for 
solving equations in all contexts. This is a trade-off in type system 
complexity vs. expressivity.


There is an opportunity for exploring another point in the design space 
here.


Tom

--
Tom Schrijvers

Department of Computer Science
K.U. Leuven
Celestijnenlaan 200A
B-3001 Heverlee
Belgium

tel: +32 16 327544
e-mail: [EMAIL PROTECTED]
url: http://www.cs.kuleuven.be/~toms/


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Create a list without duplicates from a list with duplicates

2008-02-09 Thread ChrisK

For Bimap is there anything like Data.Map.insertWithKey ?

Stuart Cook wrote:

On Sat, Feb 9, 2008 at 7:36 AM, Dan Weston [EMAIL PROTECTED] wrote:

 If order is important, the new bijective Data.Bimap class
 http://code.haskell.org/~scook0/haddock/bimap/Data-Bimap.html
 may be your best bet (I haven't yet tried it myself).


Let me try:

  nub :: (Ord a) = [a] - [a]
  nub = map snd . Data.Bimap.toAscList . Data.Bimap.fromList . reverse
. zip [1..]

   nub hello, world!
  helo, wrd!

Without the call to (reverse), this would still be an order-preserving
nub, except that it would preserve the relative order of the *last*
occurrence of each element. Actually, this makes me wonder whether
fromList's behaviour should be changed, and whether I should add a
non-clobbering variant of insert.


Stuart


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: weird ghci behavior with gadts and existentials

2008-02-06 Thread ChrisK

Let me add:

 data ExpGADT t where
 ExpInt :: Int - ExpGADT Int
 ExpChar :: Char - ExpGADT Char

Which type do you think 'unHide' and 'wierd' should have:

 unHide h = case h of
  Hidden (_,e) - e

 wierd = unHide  (Hidden (TyInt,ExpInt 3))

either:


unHide :: HiddenTypeExp - ExpGADT t-- Choice 1: Polymorphic t



unHide :: HiddenTypeExp - ExpGADT Int  -- Choice 2: Monomorphic Int


Note that TypeGADT/TyInt are irrelevant to unHide and wierd.

Clearly the first choice to unHide violates the encapsulation of 't'.
Clearly I cannot choose the 2nd choice, since it might be an ExpChar.
So unHide is impossible to write.

Chris Casinghino wrote:

Hi all,

I've been playing a bit with gadts and existentials lately, and I
have an example where I don't quite understand the behavior of
ghc.  The expression in question typechecks sometimes in some
versions of ghc, depending on where you write it, and not in
other versions.  Some other people I've asked report not getting
any errors, even when using apparently one of the same versions
of ghc I checked.

If I create a file which contains:


data TypeGADT t where
TyInt :: TypeGADT Int

data ExpGADT t where
ExpInt :: Int - ExpGADT Int

data HiddenTypeExp =  forall t . Hidden (TypeGADT t, ExpGADT t)

weird = case Hidden (TyInt, ExpInt 3) of Hidden (TyInt, e) - e


I am able to :load it into ghci just fine (with -fglasgow-exts)
with version 6.8.2.  However, if I then copy the line:

let weird2 = case Hidden (TyInt, ExpInt 3) of Hidden (TyInt, e) - e

into ghci, I get a type error.  In the HEAD version 6.9, I get a
type error on the definition of weird right away when I :load
the file.  The type error goes away if I add the line


weird :: ExpGADT Int


before the definition of weird.

The type error in question is this:

interactive:1:46:
Inferred type is less polymorphic than expected
  Quantified type variable `t' escapes
When checking an existential match that binds
e :: ExpGADT t
The pattern(s) have type(s): HiddenTypeExp
The body has type: ExpGADT t
In a case alternative: Hidden (TyInt, e) - e
In the expression:
case Hidden (TyInt, ExpInt 3) of Hidden (TyInt, e) - e

So, several questions.

1) Why the discrepancy in behavior between :loading the file and
copying in the definition in 6.8.2.  It seems like, one way or the
other, this should be consistent.

2) Several other people report not getting any errors at all, even
people with ghc 6.8.2, one of the versions I tested.  What's the
right behavior?  My guess would be that this should cause no
type error, even without the type annotation.  The GADT pattern
match against TyInt in the case statement should refine the
existential type variable t to Int, so no existential type
variables are escaping.  Is that right?

3) Is there a bug here?  Are there two bugs (one, the typing error,
two, the difference between ghc and ghci)?  Or, do I just not
understand what is going on?

Sorry for the length of this email!

--Chris Casinghino


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: A handy little consequence of the Cont monad

2008-02-01 Thread ChrisK

The bit of a mess that comes from avoiding monads is (my version):

import Foreign.Marshal.Array(withArray0)
import Foreign.Ptr(nullPtr,Ptr)
import Foreign.C.String(withCString,CString)


This uses withCString in order of the supplied strings, and a difference list 
([CString]-[CString]) initialized by id to assemble the [CString].  This is 
the laziest way to proceed.



acquireInOrder :: [String] - (Ptr CString - IO a) - IO a
acquireInOrder strings act =
foldr (\s cs'io'a -
(\cs -
  withCString s (\c - cs'io'a (cs . (c:))
)
)
  )
  (\cs -
 withArray0 nullPtr (cs []) act
  )
  strings
  id


This uses in withCString in reversed order of the supplied strings, and normal 
list ([CString]) initialized by [] to assemble the [CString].  This is not as 
lazy since it needs to go to the end of the supplied list for the first IO action.



acquireInRerverseOrder :: [String] - (Ptr CString - IO a) - IO a
acquireInRerverseOrder strings act =
  foldl (\cs'io'a s -
  (\cs -
withCString s (\c - cs'io'a (c:cs)
  )
  )
)
(\cs -
   withArray0 nullPtr cs act
)
strings
[]



Cale Gibbard wrote:

Hello,

Today on #haskell, resiak was asking about a clean way to write the
function which allocates an array of CStrings using withCString and
withArray0 to produce a new with* style function. I came up with the
following:

nest :: [(r - a) - a] - ([r] - a) - a
nest xs = runCont (sequence (map Cont xs))

withCStringArray0 :: [String] - (Ptr CString - IO a) - IO a
withCStringArray0 strings act = nest (map withCString strings)
 (\rs - withArray0 nullPtr rs act)

Originally, I'd written nest without using the Cont monad, which was a
bit of a mess by comparison, then noticed that its type was quite
suggestive.

Clearly, it would be more generally useful whenever you have a bunch
of with-style functions for managing the allocation of resources, and
would like to turn them into a single with-style function providing a
list of the acquired resources.

 - Cale


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Security Notice: Buffer overflow fixed in PCRE library

2008-01-30 Thread ChrisK

The PCRE library has just fixed a buffer overflow (related to UTF-8 mode).
There are several haskell wrappers for the pcre library.
If you use a wrapper for the PCRE library (libpcre) then you may want to upgrade 
the underlying library.


http://pcre.org/news.txt states:


News about PCRE releases


Release 7.6 28-Jan-08
-

The main reason for having this release so soon after 7.5 is because it fixes a
potential buffer overflow problem in pcre_compile() when run in UTF-8 mode. In
addition, the CMake configuration files have been brought up to date.


Cheers,
  Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Why functional programming matters

2008-01-25 Thread ChrisK

Simon Peyton-Jones wrote:

1. Small examples of actual code.


I particularly like the lazy way of counting change example (also works for 
picking items off a menu).


The code below show 3 approaches :
 a function for computing the coins used in each way as a verbose list
 a function for computing just the total number of ways
 a simply Monoid that does both at once, which a pretty summary display
And it has a short but user friendly main function that drives it.

The method used is simple.  It considers each value of coin in turn, this loop 
is done by the foldr.  The value being folded is a list where the index into the 
list is an amount for which change is being made; the value at that list index 
is the list or count of the ways to make that amount using the coins considered 
so far.


These exploit laziness since the returned lists are infinite and since 'result' 
is defined recursively for each different value of coin.


The example of defining a Monoid is a clear abstraction or generalization of the 
first two functions.



-- This demonstrates a way to find every eay to make change for a
-- given total using a set of coins.
--
-- By Chris Kuklewicz, Public Domain
import System.Environment(getArgs)
import Control.Exception as E(catch)
import Control.Monad(when)
import Data.List(group)
import Data.Monoid(Monoid(mempty,mappend))

computeListOfWays :: [Int] - [[[Int]]]
computeListOfWays coins = foldr includeValue noValues coins
  where noValues = [] : repeat []
includeValue value oldResult =
  let (unchangedResult,changedResult) = splitAt value oldResult
  result = unchangedResult ++
   zipWith (++) changedResult (map addCoin result)
  addCoin = map (value:)
  in result

computeCountOfWays :: [Int] - [Integer]
computeCountOfWays coins = foldr includeValue noValues coins
  where noValues = 1 : repeat 0
includeValue value oldResult =
  let (unchangedResult,changedResult) = splitAt value oldResult
  result = unchangedResult ++
   zipWith (+) changedResult result
  in result

computeWays :: [Int] - [Ways]
computeWays coins = foldr includeValue noValues coins
  where noValues = Ways [[]] 1 : repeat mempty
includeValue value oldResult =
  let (unchangedResult,changedResult) = splitAt value oldResult
  result = unchangedResult ++
   zipWith mappend changedResult (map addCoin result)
  addCoin (Ways list count) = Ways (map (value:) list) count
  in result

data Ways = Ways [[Int]] Integer

instance Monoid Ways where
  mempty = Ways [] 0
  mappend (Ways list1 count1) (Ways list2 count2) = Ways (list1++list2) 
(count1+count2)

instance Show Ways where
  show (Ways list count) = unlines (map summary list) ++ Count of Ways =  ++ show count 
++ \n
where summary = show . map (\sub - (length sub,head sub)) . group


coins_US :: [Int]
coins_US = [1,5,10,25,50]

coins_UK :: [Int]
coins_UK = [1,2,5,10,20,50]

main = do
  args - getArgs
  case args of
[] - error Pass a number of cents for which to count ways of making 
change
[x] - do n - E.catch (readIO x) (const (error The argument passed needs to be 
a number))
  when (n0) (error The argument passed needs to be a non-negative 
number)
  print (computeWays coins_US !! n)
_ - error Too many parameters, need just one number


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Why functional programming matters

2008-01-24 Thread ChrisK

Achim Schneider wrote:

Don Stewart [EMAIL PROTECTED] wrote:


jwlato:

In addition to STM, another item that should interest serious
programmers is forkIO.  Lightweight threads that (unlike in Python)
can use multiple cpu's.  Coming from Python, I personally appreciate
this.  Using STM to handle concurrency issues often greatly
simplifies multithreaded code.
And further on this, the use of `par` in pure code to make it go 
multicore is way beyond what most people think is possible.



I said _don't_ make me think of using par on a beowolf cluster of
ps3's. Don't you guys have any scruples?



Well... ghc still has a single-threaded garbage collector, so all the par 
threads must stop for garbage collection.  So scaling to the level of a cluster 
would be significantly sub-linear.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: \_ - not equivalent to const $

2008-01-15 Thread ChrisK

Luke Palmer wrote:

In attempting to devise a variant of cycle which did not keep its
argument alive (for the purpose of cycle [1::Int..]), I came across
this peculiar behavior:

import Debug.Trace

cycle' :: (a - [b]) - [b]
cycle' xs = xs undefined ++ cycle' xs



take 20 $ cycle' (const $ 1:2:3:4:trace x 5:[])

[1,2,3,4,x
5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5]

Nuts.  Oh, but wait:


take 20 $ cycle' (\_ - 1:2:3:4:trace x 5:[])

[1,2,3,4,x
5,1,2,3,4,x
5,1,2,3,4,x
5,1,2,3,4,x
5]

Hey, it worked!

Can someone explain what the heck is going on here?

Luke


(\_ - 1:2:3:4:trace x 5:[]) literally could mean your second program, but...

the 1:2:3:4:trace x 5:[] does not depend on the _ argument, and so it can be 
lifted outside the (\_ - ... ) and lazily evaluated once and shared between 
calls.  Optimization in ghc do this for you.


The definition const x = (\_ - x) binds 'x' outside of the _ argument, so 'x' 
is obviously outside (\_ - ...) and will be lazily evaluated once and shared.


I see that making the binding and sharing explicit in

 take 20 $ cycle' (let x = 1:2:3:4:trace x 5:[] in (\_ - x))
 [1,2,3,4,x
 5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5]

behaves like const. And pushing the binding inside the (\_ - ...)


take 20 $ cycle' (\_ - let x = 1:2:3:4:trace x 5:[] in x)

[1,2,3,4,x
5,1,2,3,4,x
5,1,2,3,4,x
5,1,2,3,4,x
5]


behaves like your second example.

--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Haskell and GUI

2008-01-15 Thread ChrisK
The advice below is for Mac OS X 10.4 and below.  Starting with Mac OS X 10.5 
(Leopard) the DISPLAY is set for you by the operating system.  Mine is currently 
/tmp/launch-sQZXQV/:0 which looks very strange because it is used to cause the 
launchd daemon to start the X server on demand (i.e. lazily).


Yitzchak Gale wrote:

btw, if you use GTK2HS on the Mac, don't forget to
start up X Windows support, and run export DISPLAY=:0.0
in your terminal window, before you run your program.
X Windows is usually in Applications/Utilities, but only
if you installed it manually from the Mac OS X discs,
it is not installed by default.

Good news - Gtk2 now has native support for the
Mac, so you probably won't need to run X starting
with the next version of GTK2HS.

Regard,
Yitz


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Implicit parameters and Arrows/Yampa?

2008-01-07 Thread ChrisK
Could I has one question?  What is the purpose of the stream function in the 
ArrowLoop instance?  Is it just to catch an unexpected [] at runtime?



8
module Main where

import Control.Arrow
import Control.Arrow.Operations
import Control.Arrow.Transformer.Reader

--
-- Standard list/stream arrow.
--

newtype SF b c = SF { runSF :: [b] - [c] }

instance Arrow SF where
   arr f = SF (map f)
   SF f  SF g = SF (g . f)
   first  (SF f) = SF (uncurry zip . (f *** id) . unzip)
   second (SF f) = SF (uncurry zip . (id *** f) . unzip)

instance ArrowLoop SF where
   loop (SF f) = SF $ \as -
   let (bs,cs) = unzip (f (zip as (stream cs))) in bs
 where stream ~(x:xs) = x:stream xs


It looks like stream is (almost) an identity which would crash at runtime if it 
encountered a [].  In particular it is equivalent to


  where stream xs = head xs:stream (tail xs)



instance ArrowCircuit SF where
   delay x = SF (init . (x:))


--
-- Some state we want to pass around without manual plumbing.
--

data AudioState = AudioState { sampleRate :: Double }

runAudio state graph = proc p - (| runReader (graph - p) |) state


--
-- Some unit generators for audio.
--

wrap x = x - fromIntegral (floor x)

-- phasor needs the sample rate
phasor phase0 = proc hz - do
   sr - pure sampleRate  readState - ()
   rec accum - delay (wrap phase0) - wrap (accum + hz / sr)
   returnA - accum

-- osc doesn't need to know about sample rate
osc phase0 = proc hz - do
   phase - phasor phase0 - hz
   returnA - cos (2 * pi * phase)


--
-- Test it out.
--

main = print (runSF (runAudio (AudioState{sampleRate=1000}) (osc 0)) 
(replicate 10 100))


8


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: Build fixed for regex-base,posix,compat,pcre

2008-01-07 Thread ChrisK

ANNOUNCEMENT: Build fixed for regex-base, regex-posix, regex-compat, regex-pcre

The changes are mainly to the Cabal build files to support ghc-6.8 and ghc-6.6 
simultaneously.  They definitely work with cabal version 1.2.3.0 (required for 
regex-pcre).  The regex-base, regex-posix, and regex-compat packages also seem 
to work with cabal-1.2.2.0 (but not regex-pcre).


The darcs repositories under http://darcs.haskell.org/packages/regex-unstable/
hold the new code.

I have uploaded new versions of the most popular regex-* packages to hackage:

Mon Jan 7 13:18:00 PST 2008 ChrisKuklewiczregex-pcre-0.94.1
Mon Jan 7 13:05:57 PST 2008 ChrisKuklewiczregex-posix-0.93.1
Mon Jan 7 13:04:49 PST 2008 ChrisKuklewiczregex-base-0.93.1
Mon Jan 7 06:50:54 PST 2008 ChrisKuklewiczregex-compat-0.91

I would like to thank everyone who sent me updated, and particularly Olivier 
Boudry who sent patches.


These build fixes have only been tested with ghc-6.6.1 on Mac OS X 10.5.1
(powerpc) and with ghc-6.8.2 on Linux (x86).

Happy New Year,
  Chris Kuklewicz

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Quanta. Was: Wikipedia on first-class object

2008-01-06 Thread ChrisK

Brandon S. Allbery KF8NH wrote:


On Jan 6, 2008, at 15:02 , Ketil Malde wrote:


More seriously, perhaps quantum enters into the equation in how the
brain works, perhaps it is even necessary for thought.  However, I
get worried it's just another mystical mantra, a gratuitous factor
that, lacking any theory about how and what it does, adds nothing to
help understanding the issue.


The brain, being real, is best modeled by a final theory that physicists have 
not yet (noticed) written down.


how the brain works appears to be though electro- and bio- chemistry, which 
are best modeled/described right now by quantum mechanics.


There are observable quantum correlations that cannot be described by a 
classical theory.


So long as the processes you care about (e.g. whatever the hell consciousness 
is) do not use these non-classical correlations then you can create a simplified 
model that avoids the complexity of quantum theory.




I should not get into these off-topic things... but the viewpoint that 
worries you is only espoused by people looking for excuses to apply 
their favorite mystical mantra.  Quantum effects are well defined, but 
nonintuitive (for example, particles tunneling through a barrier).




Right.  Even if there are some quantum correlations that are used in the brain, 
then you just use a more complicated mathematical model.  Nothing 
mystical/spooky/special about it.


Final note:  Quantum tunneling is only surprising if you insist on thinking 
about particles or photons meeting some kind of impenetrable wall (made of 
what?).  All real barriers are made of forces exerted by other particles.  So 
the whole idea of a barrier is flawed in this regime.


--
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Missing join and split

2007-12-29 Thread ChrisK
Mitar wrote:
 Hi!

 On Dec 28, 2007 5:51 PM, Lihn, Steve [EMAIL PROTECTED] wrote:
 Since regex is involved, it is specific to (Byte)String, not a generic
 list.

 Oh, this gives me an interesting idea: making regular expressions more 
 generic.


The new regex-base API is fairly generic.

If you look at the classes in regex-base's Text.Regex.RegexLike:

class Extract source = RegexLike regex source where
 matchAll :: regex - source - [MatchArray]
 matchOnce :: regex - source - Maybe MatchArray
 matchCount :: regex - source - Int
 matchTest :: regex - source - Bool
 matchAllText :: regex - source - [MatchText source]
 matchOnceText :: regex - source - Maybe (source, MatchText source, source)

you can see that the regex type parameter is fully abstract, and that the
source being searched is also fully abstract.  The reason for having those
specific class methods is to allow for the instance to expose the most efficient
way to do each operation.

You could make an instance for string seaching (e.g. KMP or BM searching).
Pretty much any search or find operation could be made into an instance of
RegexLike.   The main constraint is that the MatchArray/MatchText use Int
indexing and the Extract instance wants to be able to do lookup with this:

type MatchOffset = Int
type MatchLength = Int
type MatchArray = Array Int (MatchOffset, MatchLength)
type MatchText source = Array Int (source, (MatchOffset, MatchLength))

class Extract source where
 before :: Int - source - source
 after :: Int - source - source
 empty :: source
 extract :: (Int, Int) - source - source

One benefit is that all the RegexContext instances are implemented by using just
the above class methods, so all the polymorphic match/matchM will immediately 
work.

If there is ever a strong need for going beyond the range of Int indexing, then
one could either make new variants of the classes, or add methods to the
existing ones.  But if you are searching over 2GB of something, then perhaps
have this generic type class API is not the top priority.

 Would not it be interesting and useful (but not really efficient) to
 have patterns something like:
 
 foo :: Eq a = a - ...
 foo (_{4}'b') = ...
 
 which would match a list with four elements ending with an element 'b'. Or:
 
 foo (_+';'_+';'_) = ...
 
 which would match a list with embedded two ';' elements. (Last _
 matched the remaining of the list.)
 
 OK, maybe guards are not the proper place to implement this as would
 add a possibility to make a really messy Haskell programs. But
 extending regular expressions to work on any list of elements with
 type implementing Eq would be realy powerfull. And then we could use
 split in many other than just text processing contexts.
 
 Of course, the problem in both cases is implementing something like
 regular expressions efficiently, especially on lists, but this is why
 there are smart people around. :-)
 
 
 Mitar

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Missing join and split

2007-12-29 Thread ChrisK
Albert Y. C. Lai wrote:
 Mitar wrote:
 I am really missing the (general) split function built in standard
 Haskell. I do not understand why there is something so specific as
 words and lines but not a simple split? The same goes for join.
 
 Don't forget Text.Regex.splitRegex.

Which is just:

 matchRegexAll p str = matchM p str

 {- | Splits a string based on a regular expression.  The regular expression
 should identify one delimiter.
 
 This is unsafe if the regex matches an empty string.
 -}
 
 splitRegex :: Regex - String - [String]
 splitRegex _ [] = []
 splitRegex delim str =
 case matchRegexAll delim str of
Nothing - [str]
Just (firstline, _, remainder, _) -
if remainder == 
   then firstline : [] : []
   else firstline : splitRegex delim remainder

Inlining the matchRegexAll/matchM means this is 8 lines of code.

Any given split function is very short, but there are enough design choices that
  I think the best library is none at all; the user can write exactly what they
want in = 10 lines of code.

Though now that I look at it again, I think I like

 splitRegex :: Regex - String - [String]
 splitRegex _ [] = []
 splitRegex delim strIn = loop strIn where
   loop str = case matchM delim str of
Nothing - [str]
Just (firstline, _, remainder) -
  if null remainder
then [firstline,]
else firstline : loop remainder

slightly better.  I'll eventually update the unstable regex-compat.

-- 
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: what does @ mean?.....

2007-12-28 Thread ChrisK
Nicholls, Mark wrote:
 Hello, I wonder if someone could answer the following…
 
 The short question is what does @ mean in
 
  
 
 mulNat a b
 
 | a = b = mulNat' a b b
 
 | otherwise = mulNat' b a a
 
 where
 
  mulNat' x@(S a) y orig
 
  | x == one = y
 
  | otherwise = mulNat' a (addNat orig y) orig
 

The @ means an as-pattern as defined in the Haskell 98 report section 3.17.1
http://www.haskell.org/onlinereport/exps.html#3.17.1

The 'x' binds to the whole (S a) and the 'a' binds to the parameter of the
constructor 's'.

There is a possible performance benefit here.  Consider:

zeroNothing Nothing = Nothing
zeroNothing (Just n) =
  if n == 0 then Nothing else (Just n)

versus

zeroNothing Nothing = Nothing
zeroNothing x@(Just n) =
  if n == 0 then Nothing else x


The first example takes apart the (Just n) and later reconstructs (Just n).
Unless the compiler is fairly clever, this will cause the new (Just n) to be a
new allocation instead of reusing the input value.  The second form uses an
at-pattern to bind 'x' to the whole input parameter and the returned 'x' will
not need to be reallocaed.

-- 
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Wikipedia on first-class object

2007-12-28 Thread ChrisK
This thread is obviously a source of much fun.  I will play too.

Cristian Baboi wrote:
 On Fri, 28 Dec 2007 18:32:05 +0200, Jules Bean [EMAIL PROTECTED]
 wrote:
 
 Cristian Baboi wrote:
 Let me ask you 3 simple questions.
 Can one use Haskell to make dynamically linked libraries (DLL on
 Windows, so on Linux) ?

 The short answer is yes.
 
 The long answer is that this is not a feature of haskell, but rather a
 feature of certain programs, which happen to be mostly but not
 entirely written in haskell, in particular, the haskell compiler. GHC
 can produce dynamically linked libraries.
 
 What you choose not to notice is the context in which I asked these
 questions.
 The context is: Haskell functions as first-class objects.
 What I am interested in the first place is dynamically linked libraries
 written in Haskell and used in Haskell.
 The interface with other languages like C come second.
 
  If yes, what is in them ?
 
 Object code, mostly. Sometimes a little data.
 
 What is the definition of an entry point in Haskell ?

Haskell does not have such a concept.  At all.  An implementation may have
such a concept.

Most people on this list define Haskell as any attempt at an implementation of
one of the standards which define Haskell, most recently the Hakell 98 standard.

This can be nhc / yhc / ghc / hugs / winhugs / helium / jhc.  Some of these
compile to native code, some compile to byte code for a virtual machine.  If an
implementation can compile separately, then it might support dynamic libraries.
 If so then a specific version of that compiler will define its own
implementation specific concept of an entry point.

 What is the semantics of those entry points ?

It depends.  For recent ghc versions, see its user manual:
http://haskell.org/ghc/docs/latest/html/users_guide/ffi-ghc.html#ffi-library
http://haskell.org/ghc/docs/latest/html/users_guide/win32-dlls.html

Other note:
  An imperative language, such as C++ or Java, specified the binary output of
any instance of the compiler.  Class methods will have very specific names and
addresses.  In C++ you can even get the member-function pointer values and
examine the byte offsets in the object.  In Java one gets a very specific layout
of bytecode in a class file.

  Haskell is a declarative language.  It does not specify anything about the
implementation's internals.  It specifies only properties like non-strict.
The fact that ghc uses lazy evaluation is merely an implementation detail,
chosen as a way of satisfying the non-strict requirement of Haskell.  The
output of a Haskell compiler is free to take all the source code and implement
things with jumps and branches that look absolutely nothing like function calls.

-- 
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Missing join and split

2007-12-28 Thread ChrisK
Lihn, Steve wrote:
 Programmer with perl background would think split like:
   list of string = split regex original string
 Since regex is involved, it is specific to (Byte)String, not a generic
 list. Also it appears one would need help from Text.Regex(.PCRE) to do
 that.
 
 intercalate a (split a xs) = a
 This identity rule does not hold for perl's join/split if regex is used.
 
 Steve  
 

Well, libpcre does not have a split function.

One can already write a split that uses the high level Regex API.  The only
reason you might want to dig into regex-pcre's Text.Regex.PCRE would be if it
would help efficiency.

Specifically, regex-base defines a RegexContext instance which is:

( RegexLike a b = RegexContext a b (b, b, b) ) : The text before the match, the
text of the match, the text after the match

So you can iteratively generate the pieces that split returns.

-- 
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Doing some things right

2007-12-28 Thread ChrisK
Brian Sniffen wrote:
 On Dec 28, 2007 6:05 AM, Andrew Coppin [EMAIL PROTECTED] wrote:
 [I actually heard a number of people tell me that learning LISP would
 change my life forever because LISP has something called macros. I
 tried to learn it, and disliked it greatly. It's too messy. And what the
 heck is cdr ment to mean anyway? To me, LISP doesn't even seem all
 that different from normal languages (modulo weird syntax). Now
 Haskell... that's FUN!]
 
 Contents of Data Register.
 
 Macros are like Template Haskell.  One example of where they're useful
 is programmer definition of new binding forms.  That's not possible in
 Haskell without Templates.  Macros were invented in Lisp because the
 syntax is so easy for machine manipulation---they don't have a tenth
 the complexity of Template Haskell for about the same power.
 
 -Brian
 

There is also Likell, at http://liskell.org/ by Clemens Fruhwir, which
translates Haskell source into a lisp style prefix syntax (will (all (the
(parentheses.  This is to allow macro-like analysis and transformation
without template haskell.

The main use of template haskell that I have seen mentioned on the mailing lists
is to analyze a data declaration and auto-generate some new class and
instance declarations.

But I agree that template haskell has problems:
  (*) Its own syntax, as big and complicated as Haskell
  (*) Has historically been very poorly documented (that _might_ have changed)
  (*) Only works with ghc

-- 
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


  1   2   >