[Haskell-cafe] Re: MonadFix

2007-12-21 Thread apfelmus

Daniel Fischer wrote:

apfelmus writes:


 | r == 0    = p : f (p:ps) q
 | p*p > n   = [n]
 | otherwise = f ps n


However, when you do the sensible thing (which Joost did) and make the intsqrt 
a parameter of the function, like in


factorize :: Integer -> [Integer]
factorize n = f n (intsqrt n) primes'
  where
    primes' = something more or less clever here
    f m sr (p:ps)
        | r == 0    = p : f q (intsqrt q) (p:ps)
        | p > sr    = if m == 1 then [] else [m]
        | otherwise = f m sr ps
      where
        (q,r) = m `quotRem` p

, then you only have the expensive intsqrt for each prime factor, and the test 
for each candidate is only one comparison instead of one multiplication and 
one comparison.


Ah thanks, sorry for not seeing it earlier. My thinking was that 
intsqrt q  is calculated multiple times (for  q , q/p, q/p^2, ...) per 
prime candidate  p  when  n  is divisible by a large power of  p  . But 
I didn't see that, in return,  intsqrt q  stays the same for candidates 
that aren't factors of  n . Compared to that,  p*p  is only calculated 
once per candidate, but then for every candidate. The former is clearly 
preferable to the latter.


In fact, thanks to lazy evaluation, the above code (testing  r==0  before 
p > sr ) evaluates  intsqrt q  at most once per prime candidate and thus 
combines both advantages.
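For reference, here are the missing pieces that make the code above runnable; the candidate list and  intsqrt  below are only illustrative choices (the Double-based square root is fine for moderate  n  but loses precision for huge Integers):

  -- candidate divisors: 2, 3 and then the numbers coprime to 6
  primes' :: [Integer]
  primes' = 2 : 3 : scanl (+) 5 (cycle [2,4])

  intsqrt :: Integer -> Integer
  intsqrt m = floor (sqrt (fromInteger m :: Double))

With these definitions,  factorize 12  yields  [2,2,3] .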



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: MonadFix

2007-12-22 Thread apfelmus

Joost Behrends wrote:

apfelmus writes:

Huh?  p > intsqrt n  is evaluated just as often as  p*p > n , with 
changing  n . Why would that be less expensive? Btw, the code above 
tests for  r==0  first, which means that the following  p*p > n  is 
tested exactly once for every prime candidate  p .


No. One point of introducing DivIter is that  intsqrt dividend  is stored
there and only recomputed when a new factor is found.


Yes, I'm sorry, it didn't occur to me that recomputing per actual prime 
factor (with multiplicities, i.e. p^5 counted 5 times) is better than 
recomputing per candidate factor (without multiplicities, i.e. p^5 
counted only once).



And concerning my cycled lists of summands such as [2,4] or [2,4,2,4,2,4,2,6,2,6]:

an easily computable function stepping through all primes can only be a
function which yields the primes plus some more odd numbers. This is what I
tried.


Yes, this scheme was my intention, too. The list  primes'  doesn't need 
to (and indeed shouldn't) be a list of actual primes, just a good guess like


  primes' = 2:[3,5..]
  primes' = 2:3:scanl (+) 5 (cycle [2,4])

or something with [2,4,2,4,2,4,2,6,2,6]. So, it's an infinite list of 
numbers that qualify as candidates for being a prime factor of  n 
(which I called "prime candidates". Not a good name, since they don't 
need to be actual prime numbers.)
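As an aside, the [2,4,2,4,2,4,2,6,2,6]-style variant corresponds to a wheel that also skips multiples of 5. A sketch (my reconstruction of the mod-30 wheel, not necessarily Joost's exact increments):

  primes' :: [Integer]
  primes' = 2 : 3 : 5 : scanl (+) 7 (cycle [4,2,4,2,4,6,2,6])
  -- 7,11,13,17,19,23,29,31,... : every prime > 5 occurs, plus a few
  -- composites like 49 that are coprime to 2, 3 and 5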



What I want to say is that using such an infinite list is a nice way to 
separate the generate-prime-factor-candidates logic from the 
test-all-those-candidates loop. It's not necessary to hard-code it with 
a predicate like



iterdivisors x | x == 0    = 3
               | x == 1    = 5
               | otherwise = iterdivisors (x-1) + ((cycle [2,4]) !! x)


(which, in this particular form, is hopelessly inefficient) or special 
step functions like



d2 :: DivIter -> DivIter
d2 x | dividend x `mod` divisor x > 0 = x { divisor = divisor x + 2 }
     | otherwise                      = found x
d4 :: DivIter -> DivIter
d4 x | dividend x `mod` divisor x > 0 = x { divisor = divisor x + 4 }
     | otherwise                      = found x
d6 :: DivIter -> DivIter
d6 x | dividend x `mod` divisor x > 0 = x { divisor = divisor x + 6 }
     | otherwise                      = found x

divisions :: DivIter -> DivIter
divisions y | or [divisor y == 3,
                  divisor y == 5]    = divisions (d2 y)
            | divisor y <= bound y   = divisions (d6$d2$d6$d4$d2$d4$d2$d4 y)
            | otherwise              = y


It's not even necessary to treat 2 in a special way like


twosect :: (Integer,[Integer]) -> (Integer,[Integer])
twosect m | odd  (fst m) = m
          | even (fst m) = twosect (div (fst m) 2, snd m ++ [2])


does, the  primes'  list handles it all.


Regards
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Printing and Referential transparency excuse

2007-12-24 Thread apfelmus

Cristian Baboi wrote:
While reading the Haskell language report I noticed that the function type 
is not an instance of class Read.


I was told that one cannot define them as an instance of class Show 
without breaking referential transparency or printing a constant.


  f :: (a -> b) -> String
  f x = "bla bla bla"

How can I define a function to do the inverse operation ?
  g :: String -> (a -> b)

This time I cannot see how referential transparency will deny it.
What's the excuse now ?


The new excuse is that a better name for  g  is

  full-fledged-compiler :: String -> (Int -> Int)

(the function returned by  g  better not have a polymorphic type). Which 
programming language should the argument  String  be written in?



Regards
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Wikipedia on first-class object

2007-12-27 Thread apfelmus

Wolfgang Jeltsch wrote:
If x doesn’t equal y, x == y is False, but if x 
equals y, x == y might be True or undefined.


x == y  may be _|_ for the False case, too, depending on its 
implementation (like first comparing all list elements on even indices 
and then comparing all list elements on odd indices). But the standard 
== for lists has indeed the stated property.
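For the standard list instance, a quick GHCi check of the stated property:

  ghci> (1:undefined) == (2:undefined :: [Int])
  False
  ghci> (1:undefined) == (1:undefined :: [Int])
  *** Exception: Prelude.undefined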



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Wikipedia on first-class object

2007-12-27 Thread apfelmus

Achim Schneider wrote:

Jonathan Cast wrote:

More importantly, we can prove that [1..] == [1..] = _|_, since

   [1..] == [1..]
= LUB (n >= 1) [1..n] ++ _|_ == [1..n] ++ _|_
= LUB (n >= 1) _|_
= _|_


As far as I understand
http://www.haskell.org/haskellwiki/Bottom
, only computations which cannot be successful are bottom, not those
that can be successful, but aren't. Kind of idealizing reality, that is.


Ah, that's only a glitch in the wording. [1..] == [1..] is still _|_ 
since it loops forever.


For more about _|_, see also

  http://en.wikibooks.org/wiki/Haskell/Denotational_semantics


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Wikipedia on first-class object

2007-12-28 Thread apfelmus

Cristian Baboi wrote:

 http://en.wikipedia.org/wiki/First-class_object

The term was coined by Christopher Strachey in the context of “functions 
as first-class citizens” in the mid-1960's.[1]


Depending on the language, this can imply:
1.  being expressible as an anonymous literal value
2.  being storable in variables
3.  being storable in data structures
4.  having an intrinsic identity (independent of any given name)
5.  being comparable for equality with other entities
6.  being passable as a parameter to a procedure/function
7.  being returnable as the result of a procedure/function
8.  being constructable at runtime
9.  being printable
10. being readable
11. being transmissible among distributed processes
12. being storable outside running processes

I'll guess that 5, 9 and 12 do not apply to Haskell functions.


Exactly, together with 10 and 11 (when the distributed processes are on 
different machines).


But there is a good reason that those things can't be done in Haskell. 
With extensional equality (two functions are considered equal if they 
yield the same result on every possible argument), number 5 is 
undecidable. Similarly, there cannot be functions


  print   :: (Int -> Int) -> String
  compile :: String -> (Int -> Int)

with

  compile . print = id

A  print  function based on an intensional representation (assembly, 
byte code, etc.) would have to distinguish extensionally equal functions


  print f ≠ print g   although   f = g

which is not allowed.


More importantly, I don't quite understand your question. If you 
definitely need 9-12 for a practical problem at hand, then you may 
want to take a look at the functional language Clean


  http://clean.cs.ru.nl/

which is similar to Haskell but offers 9-12 in some form.

In all other cases, an email thread is not a good (often not even 
successful) way to get a coherent world view on Haskell (or on 
anything else), since this necessarily involves nitpicking 
philosophical questions. In my experience, interrogating one person in 
real time and interrogating books are the best ways to do that. 
Concerning books, maybe


  The Haskell Road to Logic, Maths and Programming
  http://www.cwi.nl/~jve/HR

is for you. More books on

  http://haskell.org/haskellwiki/Books

You don't have to buy them; borrow them from a library.


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Wikipedia on first-class object

2007-12-28 Thread apfelmus

Jules Bean wrote:

Cristian Baboi wrote:

But I guess it won't work because the compiler will optimize it
and the call will disappear.


I'm not quite sure what you're trying to say here.


(That's the world view on Haskell issue I mean. I know this feeling 
with physics: what the heck are they talking about, this is just 
mathematically wrong, ill-defined or otherwise void of content?)



Cristian Baboi wrote:

How can be Clean similar to Haskell and at the same time satisfy 9-12 ?


In Clean, print has the type

   print :: (Int -> Int) -> Dynamic

but there is (hopefully) no equality on  Dynamic . But it can be stored 
in a file or something


   store :: Dynamic -> IO ()

and loaded back. Thanks to IO, we can think of the file contents as a 
non-deterministically chosen intensional representation for a value with 
extensional equality.


I don't know whether Clean really does  store  that way, it may do more 
and hence break the extensional semantics a bit.



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Difference lists and ShowS

2008-01-03 Thread apfelmus

Henning Thielemann wrote:

I can't see it. If I consider (x++y) but I do not evaluate any element of
(x++y) or only the first element, then this will need constant time. If I
evaluate the first n elements I need n computation time units. How is (.)
on difference lists faster than (++) here?


That's a very good question. Basically, the problem is: how to specify 
the time complexity of an operation under lazy evaluation?



You argue that (++) is constant time in the sense that evaluating (x ++ 
y) to WHNF is O(1) when x and y are already in WHNF. Same for (.). This 
is indeed correct but apparently fails to explain why (.) is any better 
than (++). Help!



Of course, this very paradox shows that just looking at WHNF is not 
enough. The next best description is to pretend that our language is 
strict and to consider full normal form


  x in NF, y in NF  -->  (x++y) evaluates to NF in O(length x) time

Even when x and y are not in normal form, we know that evaluating the 
expression (x ++ y) takes


  O(x ++ y) ~ O(length x) + O(x) + O(y)

time to evaluate to NF. Here, O(e) is the time needed to bring the 
expression e into NF first. This approach now explains that (++) takes 
quadratic time when used left-associatively


  O((x ++ y) ++ z) ~   O(length x + length y) + O(length x)
 + O(x) + O(y) + O(z)

instead of the expected

  O((x ++ y) ++ z) ~ O(x) + O(y) + O(z)

or something (only up to constant factors and stuff, but you get the 
idea). Note that considering NFs is still only an approximation since


  O(head (qsort xs)) ~ O(n) + O(xs)  where n = length xs

instead of the expected

  O(head (qsort xs)) ~ O(qsort xs)
 ~ O(n log n) + O(xs) where n = length xs

thanks to lazy evaluation. Also note that despite considering full 
normal forms, we can express some laziness with this by giving timings 
for an expression in different contexts like


  O(take n ys)
  O(head ys)

instead of only O(ys). Same for parameters with something like

  O(const x) ~ O(1)

instead of the anticipated O(const x) ~ O(x). (For lazy data structures, 
there are better ways to take laziness into account.)
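For reference, the difference list operations under discussion look like this (the names fromList/toList/append are mine; ShowS is the special case DList Char):

  type DList a = [a] -> [a]

  fromList :: [a] -> DList a
  fromList xs = (xs ++)

  toList :: DList a -> [a]
  toList f = f []

  -- concatenation is function composition: O(1) to build the closure,
  -- the cost is paid later when toList runs it
  append :: DList a -> DList a -> DList a
  append = (.)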




With difference lists I write

shows L . (shows T . shows R)
(shows LL . (shows LT . shows LR)) . (shows T . shows R)
((shows LLL . (shows LLT . shows LLR)) . (shows LT . shows LR)) . (shows T . shows R)

I still need to resolve three (.) until I get to the first character of
the result string, but for the subsequent characters I do not need to
resolve those dots. In the end, resolution of all (.) may need some time
but then concatenation is performed entirely right-associative. Seems to
be that this is the trick ...


So far so good, but the problem now is that analyzing (.) with full 
normal forms is doomed since this would mean to evaluate things under 
the lambda which may take less time than doing call-by-need reductions. 
Still, we somehow have


  O(x . y) ~ O(x) + O(y)

which is better than O(x ++ y) but I'm not quite sure how to make this 
exact yet.



In the end, the above O(e)s are just random doodles conjured out of the 
hat, I don't know a formalism for easy reasoning about time in a lazy 
language. Anyone any pointers? Note that the problem is already present 
for difference lists in strict languages.




Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: An interesting monad: Prompt

2008-01-04 Thread apfelmus

Felipe Lessa wrote:

Ryan Ingram wrote:
[snip]

data Prompt (p :: * -> *) :: (* -> *) where
    PromptDone :: result -> Prompt p result
    -- a is the type needed to continue the computation
    Prompt :: p a -> (a -> Prompt p result) -> Prompt p result

[snip]

runPromptM :: Monad m => (forall a. p a -> m a) -> Prompt p r -> m r
runPromptM _ (PromptDone r) = return r
runPromptM f (Prompt pa c)  = f pa >>= runPromptM f . c

[snip]

How can we prove that (runPromptM prompt === id)? I was trying to go with


You probably mean

  runPromptM id = id


runPromptM prompt (PromptDone r)
 = return r
 = PromptDone r

runPromptM prompt (Prompt pa c)
 = prompt pa >>= runPromptM prompt . c
 = Prompt pa return >>= runPromptM prompt . c
 = Prompt pa ((>>= (runPromptM prompt . c)) . return)
 = Prompt pa (runPromptM prompt . c)

 and I got stuck here. It seems obvious that we can strip out the
'runPromptM prompt' down there to finish the proof, but that doesn't
sound very nice, as I'd need to suppose what I'm trying to prove. Am I
missing something here?


You want to deduce

  runPromptM id (Prompt pa c) = Prompt pa c

from

  runPromptM id (Prompt pa c) = Prompt pa (runPromptM id . c)

by somehow assuming that  runPromptM id = id  at least when applied to 
c . If it were a problem about lists like


  foldr (:) [] = id

you could use  mathematical induction . You can do the same here, but 
you need to use coinduction. For more, see also


  http://www.cs.nott.ac.uk/~gmh/bib.html#corecursion
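For reference, the calculation above implicitly uses the Monad instance and the  prompt  function from Ryan's original post, which presumably read:

  instance Monad (Prompt p) where
      return = PromptDone
      PromptDone r >>= f = f r
      Prompt pa c  >>= f = Prompt pa ((>>= f) . c)

  prompt :: p a -> Prompt p a
  prompt pa = Prompt pa return

With these definitions, the equational steps above are just unfoldings of  prompt  and  (>>=) .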


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: US Homeland Security program language security risks

2008-01-06 Thread apfelmus

Achim Schneider wrote:

That's an interesting task: design a non-Turing-complete,
restricted language in which every expression is decidable, without
making the language unusable for usual programming problems.


Have a look at dependently typed languages like Epigram:

  http://www.e-pig.org/


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Basic question concerning the category Hask (was: concerning data constructors)

2008-01-06 Thread apfelmus

Yitzchak Gale wrote:

I wrote:

...it was recently claimed on this list that tuples
are not products in that category.


I've not been convinced yet.


I'm going to try to convince you :) The crucial problem of Haskell's 
product is that (_|_,_|_) ≠ _|_ but that the two projections


  fst :: (A,B) -> A
  snd :: (A,B) -> B

cannot distinguish between both values. But if (,) were a categorical 
product,  fst  and  snd  would completely determine it. We would have 
the universal property that for every


  f :: C -> A
  g :: C -> B

there is a _unique_ morphism

  f &&& g :: C -> (A,B)

subject to

  f = fst . (f &&& g)
  g = snd . (f &&& g)

In other words, there is a unique function

  (&&&) :: forall c . (c -> A) -> (c -> B) -> (c -> (A,B))
  f &&& g = \c -> (f c, g c)

In the particular case of  C=(A,B), f=fst  and  g=snd , the identity 
function is such a morphism which means


  fst &&& snd = id

due to uniqueness. But

  id _|_  ≠  id (_|_,_|_)

while clearly

  (fst &&& snd) _|_  =  (fst &&& snd) (_|_,_|_)
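Indeed, seq can observe the difference between the two values directly, a quick GHCi check:

  ghci> (undefined, undefined) `seq` "ok"    -- the pair constructor is there
  "ok"
  ghci> (undefined :: (Int,Int)) `seq` "ok"  -- no constructor to stop at
  *** Exception: Prelude.undefined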


Derek Elkins wrote:

 Also, there is a Haskell-specific problem at the very get-go.
The most obvious choice for the categorical composition operator
assuming the obvious choice for the arrows and objects does not work...
...This can
easily be fixed by making the categorical (.) strict in both arguments
and there is no formal problem with it being different from Haskell's
(.), but it certainly is not intuitively appealing.


Note that the problem with (.) is seq's fault (pun intended :) 
Otherwise, it would be impossible to distinguish  _|_  from its 
eta-expansion  \x._|_ .



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Concurrency questions

2008-01-07 Thread apfelmus

Andrew Coppin wrote:
2. You have to take the data out of an MVar to read it. In other words, 
only 1 thread can read an MVar at once [by design]. This isn't truly a 
problem in the current case, but it's irritating in principle that I 
can't make it so that once the cell is written, multiple threads can 
read it simultaneously...


This is also known as an I-structure, i.e. an IVar. I think you can simulate 
them via MVar with the  readMVar  function?
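A minimal sketch of that simulation (the IVar names are mine, not a library API):

  import Control.Concurrent.MVar

  newtype IVar a = IVar (MVar a)    -- a write-once cell

  newIVar :: IO (IVar a)
  newIVar = fmap IVar newEmptyMVar

  writeIVar :: IVar a -> a -> IO ()
  writeIVar (IVar m) = putMVar m    -- a second write would block

  -- readMVar takes the value and puts it straight back, so the cell
  -- stays full and any number of threads can read it concurrently
  readIVar :: IVar a -> IO a
  readIVar (IVar m) = readMVar m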



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: PROPOSAL: Some more 'Applicative' combinators

2008-01-09 Thread apfelmus

Sterling Clover wrote:
The more general question, which stems from a certain degree of 
ignorance on my part, is what uses folks have found for Alternative at 
all outside of parsing.


Alternative is the generalization of MonadPlus, so  empty  and  <|>  are 
useful for lists and Maybe at least. But I'm not sure whether


   many :: Alternative f => f a -> f [a]

and friends have any uses outside of parsing.
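For the Maybe and list cases, a quick illustration (the name lookupEither is mine):

  import Control.Applicative

  -- Maybe: first success wins
  lookupEither :: Eq k => k -> [(k,v)] -> [(k,v)] -> Maybe v
  lookupEither k xs ys = lookup k xs <|> lookup k ys

  -- lists: nondeterministic choice, i.e. concatenation:
  -- [1,2] <|> [3]  ==  [1,2,3]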


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Difference lists and ShowS

2008-01-09 Thread apfelmus

Albert Y. C. Lai wrote:

apfelmus wrote:
I don't know a formalism for easy reasoning about time in a lazy 
language. Anyone any pointers? Note that the problem is already 
present for difference lists in strict languages.


http://homepages.inf.ed.ac.uk/wadler/topics/strictness-analysis.html

especially strictness analysis aids time analysis.


Ah, of course, thanks. Together with

  D. Sands. Complexity Analysis for a Lazy Higher-Order Language.
  http://citeseer.ist.psu.edu/291589.html

for the higher-order case, a satisfactory analysis can be put together.


The formalism is basically as follows: for a function f, let f^T denote 
the time needed to execute it to weak normal form given that its 
arguments are already in weak normal form. Weak normal form = full 
normal form for algebraic values and lambda-abstraction for functions = 
what you'd expect in a strict language. Plain values = nullary 
functions. For instance


  (++)^T [] ys = 1 + ys^T = 1
  (++)^T (x:xs) ys = 1 + (x:(xs ++ ys))^T
   = 1 + (++)^T xs ys + xs^T + ys^T  -- (:) is free
   = 1 + (++)^T xs ys
 ==>
  (++)^T xs ys = O(length xs)

Substituting a function application by the function body is counted as 1 
time step, that's where the  1 +  comes from.



For difference lists, we have

  (.)^T f g = O(1)

since it immediately returns the lambda-abstraction  \x - f(g x) . Now, 
we missed something important about difference lists namely the function


  toList f = f []

that turns a difference list into an ordinary list and this function is 
O(n). In contrast, The pendant for ordinary lists, i.e. the identity 
function, is only O(1). Why is it O(n)? Well, (.) itself may be O(1) but 
it constructs a function that needs lots of time to run. In particular


  (f . g)^T [] = ((\x -> f (g x)) [])^T
   = 1 + (f (g []))^T
   = 1 + f^T (g []) + (g [])^T
   = 1 + f^T (g []) + g^T []

So, to analyze higher-order functions, we simply have to keep track of 
the size of the returned functions (more precisely, Sands uses 
cost-closures). The above reduces to


  (f . g)^T [] = 1 + f^T [] + g^T []

Since our difference lists don't care what they are prepended to

  f^T xs = f^T []

Cheating a bit with the notation, we can write

  toList^T (f . g) = 1 + toList^T f + toList^T g

This means that a difference list built out of  n  elements by  m 
applications of (.) will take  O(n + m)  time. This is the same as O(m) 
because  m >= n : our lists are concatenations of singletons. That's not 
O(n) as anticipated, but it's alright: a concatenation of  m  empty 
lists is empty but clearly takes O(m) time, so the number of 
concatenations matters.
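To see the count of concatenations show up concretely, compare the two left-nested builds below (a rough sketch, the names are mine):

  -- quadratic: every foldl step copies the accumulated prefix
  slow :: Int -> [Int]
  slow m = foldl (++) [] [ [i] | i <- [1..m] ]

  -- linear: m compositions built in O(m), then run once by applying to []
  fast :: Int -> [Int]
  fast m = foldl (.) id [ (i:) | i <- [1..m] ] []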



Since difference lists offer such a good concatenation, why not replace 
ordinary lists entirely? Well, the problem is that we have another 
function that should run fast, namely  head . For ordinary lists,


  head^T xs = O(1)

but for difference lists, we have

  (head . toList)^T f = O(m)  which is >= O(n)

in the worst case, lazy evaluation notwithstanding. How to analyze lazy 
evaluation? Wadler's approach is to add an extra argument to every 
expression which says how much of the expression is to be evaluated. 
This extra information can be encoded via projections. But I think it's 
sufficient here to let (head expr)^T symbolize the time to reduce  expr 
 to weak head normal form. For example,


  (head . toList)^T (f . g) = 1 + (head . toList)^T f

assuming that  f  is nonempty. But due to the  1 + , any left-nested 
composition like


  head . toList $ (((a . b) . c) . d) . e

still needs O(m) time. So, difference lists are no eierlegende 
wollmilchsau either.




Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Difference lists and ShowS

2008-01-10 Thread apfelmus

Achim Schneider wrote:

Henning Thielemann wrote:


apfelmus wrote:


So, difference lists are no eierlegende wollmilchsau either.



LEO's forum suggests 'swiss army knife' as translation. :-)


But you really need one with 5 differently-sized blades plus three
specialized carving blades, a USB stick, microscope, 13 kinds of torx
and Allen drivers each, a TV set (analogue/digital) with unfoldable
touchscreen, at least 3-band GSM and WiFi connectivity, hydraulic car
jack and chain saw to award it the term egg-laying woolmilkpig.


But even such knives still can't lay eggs :(


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Displaying steps in my interpreter

2008-01-11 Thread apfelmus

Victor Nazarov wrote:

   import Control.Monad.Writer

   -- use a difference list or something for better performance
   type Trace v = [Zipper v]

   whnf :: Term v -> Writer (Trace v) (Term v)
   whnf t = whnf' ([],t)
     where
     whnf' (AppL e':cxt, Lam x e) = tell [(cxt, App (Lam x e) e')]
                                    >> whnf' (cxt, subst x e' e)
     whnf' (cxt, App f e) = whnf' (AppL e:cxt, f)
     whnf' z  = return $ unwind z

The definition of  whnf  is basically left unchanged, except that a
redex is recorded via  tell  whenever a beta-reduction is about to be
performed. The recorded terms can be printed afterwards

   printTrace :: Writer (Trace v) (Term v) -> IO ()
   printTrace w = let (t, ts) = runWriter w
                  in  putStrLn . unlines . map show $ ts



Is this function lazy? Can I run this code on term without nf and
print n-first steps:


   printTraceN :: Int -> Writer (Trace v) (Term v) -> IO ()
   printTraceN n w =
     let (t, ts) = runWriter w
     in  putStrLn . unlines . map show $ take n ts



Will this work:


printTraceN 5 (read "(\x. x x x) (\x. x x x)")


Yes, it should (you can just try, right? :). That's because

   tell w >> something

is basically

   let (w', x) = something in (w ++ w', x)

Now,  something  may loop forever, but  w ++ w'  doesn't care; it 
prepends  w  no matter what  w'  is. Of course, asking for  x  (the 
normal form in our case) instead of  w ++ w'  won't succeed when 
something  loops forever.


(Caveat: this is only true for the writer monad in 
Control.Monad.Writer.Lazy , which is imported by default. The writer 
monad in  Control.Monad.Writer.Strict  intentionally behaves differently.)
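A tiny check of this laziness (assuming the default lazy writer monad):

  import Control.Monad.Writer

  loop :: Writer [Int] ()
  loop = tell [1] >> loop

  -- the trace of a non-terminating computation is still observable:
  -- take 5 (snd (runWriter loop))  ==  [1,1,1,1,1]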



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: [newbie question] Memoization automatic in Haskell?

2008-01-13 Thread apfelmus

Henning Thielemann wrote:

David Benbennick wrote:


But how can I implement memoization for a more complicated function?
For example, perhaps I want to memoize

f :: String -> Int -> Double -> String -> Bool


There was a long thread about a sophisticated technique called
blueprints, which allows you to use binary search trees as the memoizing
structure.
 http://www.haskell.org/pipermail/haskell-cafe/2006-September/018204.html


That's not what blueprints were for. You want generalized tries here

 Ralf Hinze. Generalizing generalized tries.
 http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

as Luke pointed out.
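The simplest instance of the trie idea, for a single natural-number argument (an illustrative sketch; generalized tries extend this to structured key types like the one above):

  -- memoize f :: Int -> a  by tabulating it in a lazily built list
  memoNat :: (Int -> a) -> (Int -> a)
  memoNat f = (map f [0..] !!)

  fib :: Int -> Integer
  fib = memoNat fib'
    where
      fib' 0 = 0
      fib' 1 = 1
      fib' n = fib (n-1) + fib (n-2)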


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: [newbie question] Memoization automatic in Haskell?

2008-01-13 Thread apfelmus

Luke Palmer wrote:

David Benbennick wrote:


It would be nice if I could just tell the compiler "I command you to
memoize this function", and have it produce the required code
automatically.


Tru dat!

But it's not clear what the best way for the compiler writer to do
that is.  For example, if I know the access patterns of the function,
I can design the aforementioned data structure to favor those.
Also, not every type admits memoization, for example functions.


Indeed. There are plenty of choices of data structures for memo tables 
and hash tables are not the best of them. Such choices are better left 
to the programmer.



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: MonadPrompt + Gtk2Hs = ?

2008-01-13 Thread apfelmus
?


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: practice problems?

2006-09-04 Thread apfelmus

Tamas K Papp wrote:


I am looking for small to medium sized practice problems, preferably
with solutions.


Hi Tamas,

writing a Haskell library is a very good idea yet requires some 
confidence. So for your first somewhat useful programs, you might want to
try olympiad / competition problems in Haskell. Of course, most are not
specifically oriented towards Haskell as the programming language and the
harder ones are about tricky algorithms. But they are the kind of small
yet intelligent task you are looking for. Not all have solutions but
usually, google will reveal quite a few.

The ultimate Haskell challenge is of course the ICFP contest
http://icfpcontest.org/
There is also the International ACM Programming Contest
   http://acm.uva.es/problemset/
Your country surely has some kind of high school computing science 
competition to get problems from.



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Monad laws

2006-09-08 Thread apfelmus
Deokhwan Kim wrote:

 What is the practical meaning of monad laws?
 
 (M, return, >>=) is not qualified as a category-theoretical monad, if
 the following laws are not satisfied:
 
   1. (return x) >>= f == f x
   2. m >>= return == m
   3. (m >>= f) >>= g == m >> (\x -> f x >>= g)

The 3rd law contains a typo and must be
3. (m >>= f) >>= g == m >>= (\x -> f x >>= g)

 But what practical problems can failing to satisfy them cause? In other
 words, I wonder if declaring an instance of the Monad class but not
 checking it for the monad laws may cause any problems, except for not
 being qualified as a theoretical monad?

This question is likely to be the result of an unlucky introduction to
monads where they are introduced top down: "Hear ye, a monad, this is
some mystic thing obeying the spiritual laws 1., 2. and 3.", isn't it?
It is this way that monads get the attribute "theoretical".

Asking what the practical meaning of the monad laws might be is like
asking what the practical meaning of the laws for natural number
addition could be: what does
i)  a + (b+c) == (a+b) + c mean?
How can I understand
ii)  a + 0 == a ?
What does
iii)  a + b == b + a signify?

These questions are unlikely to arise because you have an intuition of
what a natural number is: a number of bullets in a sack, coins in your
pocket, people on the mailing list etc. With this knowledge, you will
most likely have no problems explaining the laws i), ii), iii) to
somebody else and most likely you will have no doubt about *why* they
must be true.



For monads, my intuition is as follows: a value of type (M a) is an
action, something producing a value of type  a  and (or by) executing a
side-effect like drawing on the screen or screwing up the hard drive.


With the operator >>=, I can execute such actions in a specific
sequence. For the sequence, it is of course unimportant how I group my
actions: I can group actions act1 and act2 first and then postpend act3,
or I can group act2 and act3 first and then prepend them with act1.

To simplify writing down a formula corresponding to this fact, we
introduce the operator >> defined by
   act1 >> act2 = act1 >>= \x -> act2
which sequences actions but for simplicity discards the computed value x
of type a. It is only the side-effect of act1 we are interested in.

Now, the thought about grouping written down as a formula is just
   (act1 >> act2) >> act3  ==  act1 >> (act2 >> act3)
and this is the simplified version of law 3. Of course, we know that
this is coined "associativity".

The actual law 3 is just a formulation for >>= that takes proper care of
the intermediate calculation result x.


With  return x , we can create an action which computes the value x but
 has absolutely no side-effects.
This can also be stated in formulas, as Mr return explains:
1. if I am prepended to guys doing side-effects, I give them the value
x but do not take any responsibility for side-effects happening
   (return x) >>= (\y -> f y)  ==  f x
2. if I am postponed to an action which computes a value x, I don't do
any additional side-effects but just return the value I have been given
   m >>= (\x -> return x)  ==  m
which is of course equivalent to
   m >>= return  ==  m



So to answer your question:
 In other words, I wonder if declaring an instance of the Monad class
 but not checking it for the monad laws may cause any problems, except for not
 being qualified as a theoretical monad? 
A thing you declare to be an instance of the Monad class, but one that
does not fulfill the three laws above, simply does not match the
intuition behind a monad. I.e. your definitions of (>>=) and (return)
are most likely to be void of the intended meaning.
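For a concrete feel of what can go wrong, here is a deliberately broken instance (my illustration, not from the thread): it counts the number of binds and thereby violates law 1.

  newtype Count a = Count (Int, a)

  -- (with modern GHC, Functor and Applicative instances would also be needed)
  instance Monad Count where
      return x = Count (0, x)
      Count (n, x) >>= f = let Count (m, y) = f x in Count (n + m + 1, y)

Here  return x >>= f  performs one bind more than  f x , so the two are
observably different. Refactoring  (do y <- return x; f y)  into  (f x),
which every Haskell programmer does without thinking, would change the
result: exactly the kind of practical problem the laws rule out.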



Regards,
apfelmus


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: evaluate vs seq

2006-09-10 Thread apfelmus
Michael Shulman wrote:
 The GHC documentation says that (evaluate a) is not the same as (a
 `seq` return a).  Can someone explain the difference to me, or point
 me to a place where it is explained?

(evaluate a) is weaker than (a `seq` return a): when  a  is _|_,
(a `seq` return a) is _|_, whereas (evaluate a) is not _|_ but an action
that throws an exception when executed. The appended code shows the
differences.

Regards,
apfelmus


import Prelude hiding (return,catch)
import qualified Control.Monad as M
import Control.Exception

a = undefined :: ()
return = M.return :: a -> IO a

e 0 = return a
e 1 = a `seq` return a
e 2 = evaluate a

t x = e x >> return ()
-- t 0 == return ()
-- t 1 == throwIO something
-- t 2 == throwIO something
-- to check this, evaluate the expressions
-- (t x >>= print) and (t x `seq` ())

u x = e x `seq` return ()
-- u 0 == return ()
-- u 1 == undefined
-- u 2 == return ()
-- to check this, type (u x >>= print) into hugs/ghci

v x = catch (e x) (\_ -> return ()) >>= print
-- v 0 == throwIO something
-- v 1 == print ()
-- v 2 == print ()

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: evaluate vs seq

2006-09-11 Thread apfelmus
Hello Michael,

you are correct. Only

 * (a `seq` return a) = evaluate a *right now*, then produce an IO action
  which, when executed, returns the result of evaluating a.  Thus, if
  a is undefined, throws an exception right now.

is a bit misleading as there is no evaluation right now. It's better
to say that (a `seq` return a) is _|_ (bottom, i.e. undefined) when a
== _|_.

The subtle point is the difference between the action of throwing an
exception (throw some_error :: IO a) and an undefined value (_|_).
(throw ..) will cause your program to crash when executed, but it's
still a well defined value of type (IO a).

For a more detailed semantics of exceptions in Haskell, see
Tackling the awkward squad: monadic input/output, concurrency,
exceptions, and foreign-language calls in Haskell
   http://research.microsoft.com/%7Esimonpj/Papers/marktoberdorf/

I further think (evaluate x) can be implemented as follows:
   evaluate x = catch (x `seq` return x) throw


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: evaluate vs seq

2006-09-13 Thread apfelmus
Michael Shulman wrote:
 On 9/11/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
  * (a `seq` return a) = evaluate a *right now*, then produce an IO
 action
   which, when executed, returns the result of evaluating a.  Thus, if
   a is undefined, throws an exception right now.

 is a bit misleading as there is no evaluation right now. It's better
 to say that (a `seq` return a) is _|_ (bottom, i.e. undefined) when a
 == _|_.
 
 Sure... but what about when a is not _|_?  I would also like to
 understand the difference between `seq' and `evaluate' for arguments
 that are defined.  How would you describe that without talking about
 when expressions are evaluated?

Ah well, the discussion is about the denotational semantics of
Haskell. Unfortunately, I did not find a good introductory website about
this. Personally, I found the explanation from Bird and Wadler's book
about infinite lists very enlightening.

The game roughly goes as follows: denotational semantics have been
invented to explain what a recursive definition should be. I mean,
thinking of functions as things that map (mathematical) sets to sets
excludes self-referencing definitions (Russell's paradox!).

The solution is to think of functions as fixed points of an iterative
process: factorial is the fixed point of
   (fac f) n = if n == 0 then 1 else n*f(n-1)
which means
   (fac factorial) n == factorial n

Now, the iterative process goes as follows:
  factorial_0 n = _|_
  factorial_1 n = fac (factorial_0) n
= if n == 0 then 1 else _|_
  factorial_2 n = fac (factorial_1) n
= case n of
   0 -> 1
   1 -> 1
   _ -> _|_
and so on. Every time, a more refined version of the ultimate goal
"factorial" is created; one says that
  _|_ == factorial_0 <= factorial_1 <= factorial_2 <= ...
(<= means "less than or equal")
That's why _|_ is called bottom (it's the smallest thing in the hierarchy).

This was about functions. In a lazy language, normal values can show a
similar behavior. For instance, we have
  _|_  <=  1:_|_  <=  1:2:_|_  <=  1:2:3:_|_  <=  ...  <=  [1..]
That's how the infinite list [1..] is approximated. The inequalities
follow from the fact that bottom is below everything and that all
constructors (like (:)) are monotone (by definition), i.e.
  1:x <= 1:y  iff  x <= y

A function f is called *strict*, if it fulfills
  f _|_ = _|_
which means that it does not produce a constructor (information)
without knowing what its argument is.




Back to your original question, we can now talk about functions in terms
of _|_ and *data constructors*. As an example, we want to think about
the meaning of (take 2 [1..]). What should this be? I mean, one argument
is an infinite list! By tracing the definition of (take 2), one finds
  take 2 _|_ == _|_ -- (btw, (take 2) is strict)
  take 2 (1:_|_) == 1:_|_
  take 2 (1:(2:_|_)) == 1:(2:[])
  take 2 (1:(2:(3:_|_))) == 1:(2:[])
and so on. The right-hand side remains equal for all further refinements,
so we must conclude
  take 2 [1..]   == 1:(2:[]).

For the evaluate and `seq` problem, we simplify things by specializing
the polymorphic type to ([Int] -> IO [Int]). Then, we introduce two
constructors (ThrowException :: IO [Int]) and (Return :: [Int] -> IO [Int])
with the obvious meaning. The semantics of `seq` are now as follows:
  _|_    `seq` x == _|_
  []     `seq` x == x
  (y:ys) `seq` x == x
So `seq` forces its first argument. When we define
  f x = x `seq` (Return x)
we thereby get
  f _|_    == _|_
  f []     == Return []
  f (x:xs) == Return (x:xs)
To compare, the semantics of (evaluate) is
  evaluate _|_    == ThrowException =/= _|_
  evaluate []     == Return []
  evaluate (x:xs) == Return (x:xs)
That should answer your question.

Please note that this answer is actually a lie, as functions must be
monotone (halting problem, fix id == _|_), but (evaluate) is not:
  _|_ <= []
yet
  evaluate _|_ == ThrowException  =/=  evaluate [] == Return []
This can be fixed by departing from denotational semantics and giving
all (IO a)-things an operational meaning. Further explanations and
further references are found in the paper I pointed out last time.


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-14 Thread apfelmus
Bertram Felgenhauer wrote:

 splitSeq' :: Ord a => Map a () -> [(a,b)] -> ([(a,[b])], Map a [b])
 splitSeq' bp [] = ([], map (const []) bp)
 splitSeq' bp ((a,b):xs) = case lookup a bp bp of
     Just _ -> let (l, m) = splitSeq' bp xs in (l, update a (b:) bp m)
     _      -> let (bp', m) = insert a bp m'
                   (l, m')  = splitSeq' bp' xs
               in
                   ((a, b : (fromJust $ lookup a bp' m')) : l, m)
 
 splitSeq' takes a blueprint for a map with all keys seen so far, and
 a list tail. It returns the result list for all new keys, and a map
 (corresponding to the given blueprint) with the tails of the seen
 elements.
 
 In the base case it just fills the blueprint with empty lists and
 returns to the caller.
 
 If a new element is seen, insert is used to create a new blueprint,
 including the new key a, which is passed to the recursive call of
 splitSeq'. The resulting map from that call is threaded back into
 insert, which gives us a new map without the a key which matches
 the structure of the original blueprint.

Very interesting! So the map with the seen tails matches the blueprint
and therefore throws away information (the future keys) from the future.
This is what effectively allows the key-tree structure to be rebalanced.
   If one returned the tails-map with all future keys, it would be
_|_ because the key order in the tree depends on the future keys and
changes every time a new key is found.

I thought that there could only be a static solution, i.e. like the one
Ross Paterson presented where the structure (I mean the outermost
constructors) of the returned tree is determined before the future.
This obviously excludes rebalancing.

I found a static solution by trying to fit the key-tails pairs into an
infinite tails-map which is filled first come, first served:
  map ! 1 := (first key seen, tails)
  map ! 2 := (second key seen, tails)
and by keeping another key-position map around which records where each key
has been inserted, so that the future knows where to put the tails. The
point is that the structure of the tails-map is known before the future
comes, as its keys are just 1,2,3,... and so on.

It remains to construct such an infinite random access list, but this
turns out to be even easier than finite random access lists (when using
non-uniform recursion from Okasaki's chapter 10) :)

 data Imp a = Imp a (Imp (a,a)) deriving (Show)

 instance Functor Imp where
     fmap h ~(Imp x xs) = Imp (h x) (fmap (\(x,y) -> (h x, h y)) xs)

 update :: (a -> a) -> Position -> Imp a -> Imp a
 update f 1 ~(Imp x xs) = Imp (f x) xs
 update f n ~(Imp x xs) = Imp x $ update f2 (n `div` 2) xs
     where
     f2 ~(y,z) = if even n then (f y, z) else (y, f z)

Note that we can use irrefutable patterns without hesitation as there is
only one constructor.

Folding over an infinite thing is strange but here we are

 fold :: (a -> b -> b) -> Imp a -> b
 fold f ~(Imp x xs) = f x (fold f2 xs)
     where
     f2 (x,y) z = f x (f y z)

It's not so strange anymore when we realize that this can be used to
convert it into an infinite list

 toList = fold (:)

The implementation of fromList is fun as well, so I won't tell it. As
fold and unfold can be defined generically for Mu f where f is a
functor, I wonder how to deal with them in the case of non-uniform recursion.


For splitStreams, the key-position-map is needed in both directions, so
we quickly define a bijective map

 type BiMap a b = (Map.Map a b, Map.Map b a)

 switch :: BiMap a b -> BiMap b a
 switch (x,y) = (y,x)

with the usual operations (empty, member, size etc.) omitted here.

Now comes splitStreams itself:

 splitStreams :: Ord a => [(a,b)] -> [(a,[b])]
 splitStreams xs =
     takeWhile (not . null . snd) $ toList $ splitStreams' empty xs

 splitStreams' :: Ord a => BiMap a Position -> [(a,b)] -> Imp (a,[b])
 splitStreams' bimap [] =
     fmap (\i -> (findWithDefault undefined i $ switch bimap, [])) $
     fromList [1..]
 splitStreams' bimap ((a,b):xs) =
     update fun pos $ splitStreams' bimap' xs
     where
     fun ~(_,bs) = (a, b:bs)
     sz  = size bimap
     pos = findWithDefault (sz+1) a bimap
     bimap' =
         (if member a bimap then id else insert a (sz+1)) bimap

Note that update actually generates fresh constructors, so the structure
of our tails-Imp is not really static. But this is no problem as the
form of the constructors is completely known: there is only one.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-15 Thread apfelmus
Ross Paterson wrote:
 On Thu, Sep 14, 2006 at 05:22:05PM +0200, Bertram Felgenhauer wrote:
 [much subtle code]
 We can now build the splitStream function, using the following helper
 function:

 splitSeq' :: Ord a => Map a () -> [(a,b)] -> ([(a,[b])], Map a [b])
 
 This works for infinite lists if there is no balancing, but if insert does
 balancing, the top of the map will not be available until the last key
 is seen, so splitSeq' could only be used for finite chunks.  Then you'll
 need a way to put the partial answers together.  It might be possible
 to amortize the cost of that for an appropriate choice of chunk length.
 It would also cost some laziness: the chunked version would work for
 infinite lists, but wouldn't produce all the output for a partial list.

No. The point is that in
   let blueprint = empty
       (_,m) = splitSeq' blueprint $ concat $ repeat [(1,'a'),(2,'b')],
the map m contains only as many keys as there are in blueprint, i.e. not
a single one! After munching the first (1,'a'), i.e. after the first
recursive call to splitSeq', the returned map m' fulfills
   toList m' == [(1,a...)]

The trick is to throw away all keys from the future, so that there is no
need to wait for them.

Also, if your argument were correct, then the version without
balancing wouldn't work either, because insert already exchanges Leafs
for Nodes in m. So the top of the map would be unavailable until all
Leafs have been exchanged.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: foreach

2006-09-15 Thread apfelmus
Bulat Ziganshin wrote:

 because REAL code is somewhat larger than examples. try to rewrite the
 following:
 
   directory_blocks <- (`mapM` splitBy (opt_group_dir command) files_to_archive)
       ( \filesInOneDirectory -> do
           datablocks <- (`mapM` splitToSolidBlocks filesInOneDirectory)
               ( \filesInOneDataBlock -> do
 [...]

This particular snippet contains too many undefined identifiers to be
rewritten effectively, but I'm very sure that the whole program can be
restructured to great effect.

Maybe by designing a binary-block-combinator language which calculates
padding bytes and length headers automatically and figures out
scheduling for fast writing to a pipe, something like that. Eventually,
a binary parser combinator library which can read single-bit flags and
the like is a must here. It may even be possible to combine the two,
providing a bijection between abstract file tree, tar-ed blocks and
compressed binary file. Separate your concerns, ban IO as much as
possible, and any function that takes more than 15 lines is a wart.

I admit that real-world applications are not a good exercise to practice
functional programming, but once acquired, advanced functional tactics
prove very powerful. An example might be WASH/CGI, which successfully
abstracts session state over HTTP, a problem where Perl scripts are
doomed and all kinds of imperative buzz like JavaBeans and so on have
been invented but somewhat fail to solve it for non-trivial cases.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-15 Thread apfelmus
[EMAIL PROTECTED] wrote:
 type BiMap a b = (Map.Map a b, Map.Map b a)

Actually BiMap is not needed at all, it suffices to have

 splitStreams :: Ord a => [(a,b)] -> [(a,[b])]
 splitStreams xs =
     takeWhile (not . null . snd) $ toList $ splitStreams' Map.empty xs

 splitStreams' :: Ord a => Map.Map a Position -> [(a,b)] -> Imp (a,[b])
 splitStreams' map [] =
     fmap (const (undefined,[])) $ fromList [1..]
 splitStreams' map ((a,b):xs) =
     update fun pos $ splitStreams' map' xs
     where
     fun ~(_,bs) = (a, b:bs)
     sz  = Map.size map
     pos = Map.findWithDefault (sz+1) a map
     map' =
         (if Map.member a map then id else Map.insert a (sz+1)) map

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-17 Thread apfelmus
Ross Paterson wrote:

 How embarrassing.  Still, your code is quite subtle.  As penance, here's
 my explanation, separating the main idea from the knot-tying.
 
 The ingredients are a map type with an insertion function
 [...]
   insert :: Ord k => k -> a -> Map k a -> Map k a
 
 with a partial inverse
 
   uninsert :: Ord k => k -> Map k () -> Map k a -> (a, Map k a)
 
 [...]
 Applying a tupling transformation to insert+uninsert gives your version.
 
 It's interesting that these composed transformations don't seem to cost
 too much.

That the composed transformations are indeed cheap is not necessarily
disturbing. Let's call Bertram's original insert version "insertdelete"
from now on. When analyzing the magic behind it, you do

ross_analysis :: ((bp,map') -> (bp',map)) -> (bp -> bp', (bp,map') -> map)
ross_analysis insertdelete = (insert, uninsert)

Of course a tupling transformation will reverse this

tupletiple :: (bp -> bp', (bp,map') -> map) -> ((bp,map') -> (bp',map))
tupletiple (insert,uninsert) = insertdelete

But as @djinn will tell you, ross_analysis cannot be realized :) So
the original version possesses additional computational power: it can
reverse everything it's doing with bp on the map'. uninsert does not
have information about the single steps that have been done; it only
knows what should come out. From that, it would have to reconstruct
quickly what happened, if it wants to be fast.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-19 Thread apfelmus
Ross Paterson wrote:
 On Sun, Sep 17, 2006 at 01:08:04PM +0200, [EMAIL PROTECTED] wrote:
 Ross Paterson wrote:
 It's interesting that these composed transformations don't seem to cost
 too much.
 That the composed transformations are indeed cheap is not necessarily
 disturbing.
 
 I meant the composition of the transformations of the tree (update or
 reverse insert) that Bertram does for each list element.  They're cheap
 because they're bounded by the depth of the tree, and even cheaper if
 you're probing some other part of the tree.

Me too :) I tried to point out that separating uninsert from insert
increases time complexity. In general

   uninsert :: Ord k => k -> Map k () -> Map k a -> (a, Map k a)

   fst $ uninsert _ bp m == differenceWith (curry snd) bp m

which needs O(size of blueprint) time. This is to be compared with O(log
(size of blueprint)) for the combined insertdelete.

That there (likely) must be such a difference can already be seen from
the types of insertdelete and (insert,uninsert) !

But you already pointed out that something like

   insertdelete :: Ord k = k - Map shape k -
   (exists shape' . (Map shape' k, Map shape' a - (a, Map shape a))

is the best type for insertdelete. Here, it is clear that insertdelete
likely can do a fast uninsert.


Btw, why are there no irrefutable patterns for GADTs? I mean, such a sin
should be a shame for a non-strict language...


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-19 Thread apfelmus
Conor McBride wrote:

 Just imagine
 
 data Eq a b where Refl :: Eq a a
 
 coerce :: Eq a b -> a -> b
   Robert Dockins wrote:
 coerce ~Refl x = x 

 coerce undefined True :: String
 
 Bang you're dead. Or rather... Twiddle you're dead.

Mom, he's scaring me! ;)

Who says this function may not be strict in the first argument despite
the irrefutable pattern? Also, for the sarcastically inclined, death is
semantically equivalent to _|_. As is (fix id :: a -> b), too.

Alas, one can say

   dontcoerce :: Eq a (Maybe b) -> a -> Maybe b
   dontcoerce ~Refl x = Just x

and

   dontcoerce _|_ == (\x -> Just (x :: b(~Refl)))
               == \(x::_|_) -> Just (x :: _|_)
               == \_ -> Just _|_

The type of x on the right-hand side inherently depends on ~Refl and
should be _|_ if ~Refl is. Type refinement makes the types depend on
values, after all.

For our optimization problem, it's only a matter of constructors on the
right-hand side. They should pop out before we look at any arguments,
so it's safe to cry "so you just know, I'm a Just".

Maybe one can remedy things by introducing a _|_ on the type level with the
only inhabitant _|_ :: _|_. That would address the objections Ross Paterson
referenced:
 See the second restriction.  Irrefutable patterns aren't mentioned
 there, but they're the general case.  See also
 
 http://www.haskell.org/pipermail/haskell-cafe/2006-May/015609.html
 http://www.haskell.org/pipermail/glasgow-haskell-users/2006-May/010278.html

Admittedly, it's thin ice to trample on, but otherwise, for this
special splitSeq problem, one runs into the "more haste, less speed"
dilemma (I mean Wadler's paper). Bertram's lazy algorithm actually is
an online algorithm and it should remain one when making it type safe.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] irrefutable patterns for existential types / GADTs

2006-09-29 Thread apfelmus
Ross Paterson wrote:
 The story so far:
 apfelmus: why are there no irrefutable patterns for GADTs?
 Conor: because you could use them to write unsafeCoerce
 Ross: how about irrefutable patterns (or newtypes) for existential types?
 Simon: Try giving the translation into System F + (existential) data types

I'd like to add that I see no problem with

   coerce :: Eq a b -> a -> b
   coerce ~Refl x = x

as long as we have

   coerce _|_ x === _|_

The wish is that

   f = \refl -> Just . coerce refl
     = \~Refl x -> Just x

should satisfy

   f _|_ x === Just _|_
   f _|_ x =/= _|_

and likewise for any other constructor than Just.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs

2006-09-30 Thread apfelmus
 {-# OPTIONS -fglasgow-exts #-}

 module Main where

 import Data.IORef

 data T a where
   Li :: Int  -> T Int
   Lb :: Bool -> T Bool
   La :: a    -> T a

 writeInt :: T a -> IORef a -> IO ()
 writeInt v ref = case v of
     ~(Li x) -> writeIORef ref (1::Int)

 readBool :: T a -> IORef a -> IO ()
 readBool v ref = case v of
     ~(Lb x) ->
         readIORef ref >>= (print . not)

 tt :: T a -> IO ()
 tt v = case v of
     ~(Li x) -> print "OK"

 main = do
   tt (La undefined)
   ref <- newIORef undefined
   writeInt (La undefined) ref
   readBool (La undefined) ref

This code is more intricate than

  data Eq a b where Refl :: Eq a a

  coerce :: Eq a b -> a -> b
  coerce ~Refl x = x

but I think it amounts to exactly the same thing: ref and x are forced
to a particular type witnessed by the GADT.

But I think that something still can be squeezed out; strictness is not
absolutely necessary (see the upcoming mail in this thread).

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs

2006-09-30 Thread apfelmus
Here is a formulation of what exactly I require from irrefutable pattern
matches for GADTs.

The problem arose from the "Optimization problem" thread. In short,
here is a GADT-using, type-safe version of Bertram's solution (without
balancing)

 -- a binary search tree with witness about its shape
 data Map s k a where
     Leaf :: Map () k a
     Node :: k -> a -> Map s k a -> Map t k a -> Map (s,t) k a

 empty :: Map () k a
 empty = Leaf

 member :: Ord k => k -> Map s k a -> Bool
 member _ Leaf            = False
 member k (Node k' _ l r) = case compare k k' of
     LT -> member k l
     EQ -> True
     GT -> member k r

 -- a wrapper for an existential type
 data Undoer s k a where
     Undoer :: (Map t k a) -> (Map t k a -> (a,Map s k a)) -> Undoer s k a

 -- insert key element blueprint map (blueprint, result, map)
 insert :: Ord k => k -> a -> Map s k a -> Undoer s k a
 insert k a Leaf =
     Undoer (Node k a Leaf Leaf) (\(Node k a Leaf Leaf) -> (a,Leaf))
 insert k a (Node k' b (l :: Map l k a) (r :: Map r k a) :: Map s k a)
     = case compare k k' of
         LT -> case insert k a l of
             Undoer (m :: Map t k a) f ->
                 Undoer (Node k' b m r :: Map (t,r) k a)
                     (\(Node k' b' m' r' :: Map (t,r) k a) ->
                         let (a,l') = f m' in
                             (a,Node k' b' l' r' :: Map s k a))
         EQ -> error "inserting existing element"
         GT -> case insert k a r of
             Undoer (m :: Map t k a) f ->
                 Undoer (Node k' b l m :: Map (l,t) k a)
                     (\(Node k' b' l' m' :: Map (l,t) k a) ->
                         let (a,r') = f m' in
                             (a,Node k' b' l' r' :: Map s k a))


 update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a
 -- the culprit, to be defined later

 splitSeq :: Ord a => [(a,b)] -> [(a,[b])]
 splitSeq = fst . splitSeq' empty

 splitSeq' :: Ord a => Map s a [b] -> [(a,b)] -> ([(a,[b])], Map s a [b])
 splitSeq' bp [] = ([], bp)
 splitSeq' bp ((a,b):xs) = case member a bp of
     True -> let (l, m) = splitSeq' bp xs in (l, update a (b:) m)
     _    -> case insert a [] bp of
         Undoer bp' f -> let
             (rs,m)  = splitSeq' bp' xs
             (bs,m') = f m
             in ((a, b:bs) : rs, m')

To make this work in a lazy manner (it becomes an online algorithm then
and works for infinite lists), I'd like to have

 update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a
 update k f ~(Node k' a l r) = case compare k k' of
     LT -> Node k' a (update k f l) r
     EQ -> Node k' (f a) l r
     GT -> Node k' a l (update k f r)

reasoning that the Node constructor should be output before one inspects
the incoming ~Node. I thought that (l, m) = splitSeq' bp xs witnesses
that  bp  and  m  have the same shape  s, but this is the point where
the not-normalizing argument kicks in: the type of splitSeq' claims to
be a proof that  bp  and  m  have the same  s, but who knows whether it
really terminates?


So, I'm opting for a different update which is more along the lines of
Bertram's original:

 update :: Ord k => k -> (a -> a)
    -> Map s k a -> Map t k a -> Map s k a
 update k f (Node k' _ l' r') ~(Node _ a l r) = case compare k k' of
     LT -> Node k' a (update k f l' l) r
     EQ -> Node k' (f a) l r
     GT -> Node k' a l (update k f r' r)

The blueprint gives immediate witness that splitSeq' preserves shape, so
this update should be fine.



To summarize, the main problem is to get a lazy/online algorithm (the
problem here falls into the "more haste, less speed" category) while
staying more type safe.
@Conor: how does this issue look in Epigram?

Regards,
apfelmus



[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs

2006-09-30 Thread apfelmus
 But that makes it refutable! For the above, either
 
  coerce _|_ x === x
 
 or the notation is being abused.

Making a pattern irrefutable does not mean that the function in question
will become lazy:

  fromJust ~(Just x) = x

  fromJust _|_ === _|_

The point with coerce is that it looks very much like being lazy in its
first argument but in fact it is not.

 The trouble is that GADT pattern matching has an impact on types, as
 well as being a selector-destructor mechanism, and for the impact on
 types to be safe, the match must be strict.

 I think it's the extra power of GADTs to tell you more about type
 variables already in play which does the damage.

But I think that something still can be squeezed out, strictness is not
absolutely necessary. I thought something along the lines of

  f :: Eq a b -> a -> Maybe b
  f ~Refl x = Just x

with

  f _|_ x  === Just _|_

The point is that one can always output the constructor Just; it does
not inspect the type of x.

Now, I no longer think that this is feasible, as the type of (Just x)
still depends on the type of x (even if the constructor Just does not
mention it). Nevertheless, I still want to remove strictness, see my
next mail in this thread.

 For existentials, I'm not sure, but it seems to me that there's not such
 a serious issue. Isn't the only way you can use the type which allegedly
 exists to project out some dictionary/other data which is packed inside
 the existential? Won't this projection will cause a nice safe _|_
 instead of a nasty unsafe segfault?

I agree. The only practical problem I can imagine is that GHC internally
treats existentials as GADTs.

Regards,
apfelmus



[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs

2006-10-01 Thread apfelmus
 Thanks for asking!
 [...]
 Online algorithms do look like a good place to try to get some leverage
 from this, but we haven't really been in a position to experiment so
 far. I'm sure that will change. 

Well, I asked because laziness with memoization gives a definite edge in
terms of algorithmic time complexity over strictness (proved for online
algorithms in "more haste, less speed") and I wonder how this goes
together with totality and dependent types. A forthcoming ultimate
programming language should retain this edge, shouldn't it? ;-)


 It isn't necessary to perform
 constructor discrimination when it's statically known that exactly one
 constructor is possible, so those patterns can always be made
 irrefutable, with matching replaced by projection.

Thanks for pinpointing the culprit, because translated to GADTs, it
means that

whenever we know from the type index that only one constructor is
possible, it can be made irrefutable.

So this is the kind of irrefutable pattern one could safely add. In
particular, this can be weakened to

whenever we know the type index, i.e. there is no type refinement
depending on the value, an irrefutable pattern is safe.

But once no type refinement happens, the irrefutable pattern can already
be obtained via let bindings in existing Haskell! Below is a lazy
version of Bertram's solution to the "Optimization problem" following
Ross' suggestions with existentials and GADTs:

 {-# OPTIONS_GHC -fglasgow-exts #-}
 
 data Map s k a where
     Leaf :: Map () k a
     Node :: k -> a -> Map s k a -> Map t k a -> Map (s,t) k a

(Map s k a) is a tree which makes its structure explicit in the type
index s.


 unNode :: Map (s,t) k a -> (k,a,Map s k a, Map t k a)
 unNode (Node k a l r) = (k,a,l,r)

unNode knows the type index and is to be used for a lazy pattern match
  let (k,a,l,r) = unNode ..

 empty :: Map () k a
 empty = Leaf
 
 member :: Ord k => k -> Map s k a -> Bool
 member _ Leaf            = False
 member k (Node k' _ l r) = case compare k k' of
     LT -> member k l
     EQ -> True
     GT -> member k r

 data Undoer s k a where
     Undoer :: (Map t k a) -> (Map t k a -> (a,Map s k a)) -> Undoer s k a

Undoer is just (exists t. ...) wrapped into a digestible type.

 -- insert key element blueprint map (blueprint, result, map)
 insert :: Ord k => k -> a -> Map s k a -> Undoer s k a
 insert k a Leaf = Undoer (Node k a Leaf Leaf)
     (\map -> let (_,a,_,_) = unNode map in (a,Leaf))

Note how the type system proves that the function (\map -> ..) always
has to expect (map) == Node ...
Then, unNode binds k',b',... /lazily/.
Same goes for the rest of insert:

 insert k a (Node k' b (l :: Map l k a) (r :: Map r k a) :: Map s k a)
     = case compare k k' of
     LT -> case insert k a l of
         Undoer (m :: Map t k a) f ->
             Undoer (Node k' b m r :: Map (t,r) k a)
                 (\map -> let
                     (k',b',m',r') = unNode map
                     (a,l') = f m'
                     in (a,Node k' b' l' r' :: Map s k a))
     EQ -> error "inserting existing element"
     GT -> case insert k a r of
         Undoer (m :: Map t k a) f ->
             Undoer (Node k' b l m :: Map (l,t) k a)
                 (\map -> let
                     (k',b',l',m') = unNode map
                     (a,r') = f m'
                     in (a,Node k' b' l' r' :: Map s k a))
 
 -- update k fun blueprint map
 update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a -> Map s k a
 update k f (Node k' _ l' r') map = case compare k k' of
     LT -> Node k' a (update k f l' l) r
     EQ -> Node k' (f a) l r
     GT -> Node k' a l (update k f r' r)
   where
     (_,a,l,r) = unNode map

Again a lazy pattern match on (map). Note that the type system enforces
that blueprint and actual (map) have the same shape s. The type
refinement for GADTs happens in the strict pattern match on the
blueprint and this allows us to lazily match the (map).

 splitSeq :: Ord a => [(a,b)] -> [(a,[b])]
 splitSeq = fst . splitSeq' empty

 splitSeq' :: Ord a => Map s a [b] -> [(a,b)] -> ([(a,[b])], Map s a [b])
 splitSeq' bp []         = ([], bp)
 splitSeq' bp ((a,b):xs) = case member a bp of
     True -> let (l, m) = splitSeq' bp xs in (l, update a (b:) bp m)
     _    -> case insert a [] bp of
         Undoer bp' f -> let
             (rs,m)  = splitSeq' bp' xs
             (bs,m') = f m
             in ((a, b:bs) : rs, m')

To summarize, the problem can be solved without irrefutable patterns for
GADTs: the code above works for infinite lists. Yet, they are handy and
can be introduced safely in the case where the type indices are known in
advance and no type refinement happens.

Regards,
apfelmus



[Haskell-cafe] Re: question - which monad to use?

2006-10-02 Thread apfelmus
Tamas K Papp wrote:
 Hi,
 
 I have a computation where a function is always applied to the
 previous result.  However, this function may not return a value (it
 involves finding a root numerically, and there may be no zero on the
 interval).  The whole problem has a parameter c0, and the function is
 also parametrized by the number of steps that have been taken
 previously.
 
 To make things concrete,
 
 type Failmessage = Int  -- this might be something more complex
 data Result a = Root a | Failure Failmessage -- guess I could use Either too
 
 f :: Double -> Int -> Double -> Result Double
 f c0 0 _ = Root c0
 f c0 j x = {- computation using x, parameters calculated from c0 and j -}
 
 Then
 
 c1 = f c0 0 c0
 c2 = f c0 1 c1
 c3 = f c0 2 c2
 
 
 up to cn.
 
 I would like to
 
 1) stop the computation when a Failure occurs, and store that failure
 
 2) keep track of intermediate results up to the point of failure, ie
 have a list [c1,c2,c3,...] at the end, which would go to cn in the
 ideal case of no failure.
 
 I think that a monad would be the cleanest way to do this.  I think I
 could try writing one (it would be a good exercise, I haven't written
 a monad before).  I would like to know if there is a predefined one
 which would work.

There are a several ways to achieve your goal, most do not use monads.

*a) The underappreciated unfold*

unfoldr :: (a -> Maybe (b,a)) -> a -> [b]

basically iterates a function

g :: a -> Maybe (b,a)

and collects the b's until g returns Nothing.
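
For reference, its standard definition amounts to

    unfoldr g x = case g x of
        Just (b, x') -> b : unfoldr g x'
        Nothing      -> []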

With your given function f, one can define

g c0 (j,c) = case f c0 j c of
    Root c' -> Just (c',(j+1,c'))
    _       -> Nothing

and get the job done by

results = unfoldr (g c0) (0,c0)

The only problem is that the failure message is lost. You can write your
own unfold, though:

   unfoldE ::
   (a -> Either Failmessage a) -> a -> ([a], Maybe Failmessage)
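
One possible definition, as a sketch (note that with this type the
unfold only stops when a failure occurs, so in practice one would also
add a stopping condition for the success case, cf. below):

    unfoldE g x = case g x of
        Left  e  -> ([], Just e)
        Right x' -> let (xs, e) = unfoldE g x' in (x' : xs, e)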



*b) tying the knot, building an infinite list*

cs = Root c0 : [f c0 j ck | (j,Root ck) <- zip [0..] cs]

will yield

cs == [Root c0, Root c1, ..., Failure i] ++ _|_

Then, you just have to collect results:

collect xs = (failure, [ck | Root ck <- ys])
    where
    isFailure (Failure i) = True
    isFailure _           = False
    (ys, failure:_)       = break isFailure xs
results = collect cs

Note that in this case, you always have to end your values with a
failure (success as failure). Alas, you didn't mention a stopping
condition, did you?



*c) the monadic way*
This is not the preferred solution and I'll only sketch it here. It only
makes sense if you have many different f whose calling order depends
heavily on their outcomes. Basically, your monad does: 2) keep track of
results (MonadWriter) and 2) may yield an error (MonadError). Note that
you want to keep track of results even if an error is yielded, so you
end up with

type MyMonad a = ErrorT Failmessage (Writer [Double]) a

where ErrorT and Writer are from the Control.Monad.* modules.

f :: Double -> Int -> Double -> MyMonad Double
f c0 j ck = do
    {computation}
    if {screwed up}
        then fail "you too, Brutus"
        else tell {c_{k+1}}
    return {c_{k+1}}



*d) reconsider your definition of f, separate concerns *
The fact that the computation of ck depends on the iteration count j
makes me suspicious. If you are using j for convergence tests etc. only,
then it's not good.
The most elegant way is to separate concerns: first generate an infinite
list of approximations

f :: Double -> Double -> Double
f c0 ck = {c_{k+1}}

cs = iterate (f c0) c0

and then look for convergence

epsilon = 1e-12
takeUntilConvergence []  = []
takeUntilConvergence [x] = [x]
takeUntilConvergence (x:x2:xs) =
    if abs (x - x2) <= epsilon
        then [x]
        else x : takeUntilConvergence (x2:xs)

or anything else (irregular behaviour, ...). If it's difficult to tell
from the cs whether things went wrong, but easy to tell from within f
(division by almost 0.0 etc.), you can always blend the separate
concerns approach into a) and b):

-- iterate as infinite list
iterate f x0 = let xs = x0 : map f xs in xs
-- iterate as unfoldr
iterate f x0 = unfoldr g x0
where g x = let x' = f x in Just (x',x')


Regards,
apfelmus



[Haskell-cafe] Re: How would you replace a field in a CSV file?

2006-10-02 Thread apfelmus
Pete Kazmier wrote:
 import Data.ByteString.Lazy.Char8 as B hiding (map,foldr)
 import Data.List (map)
 import Data.Map as M hiding (map)
 
 -- This will be populated from a file
 dict = M.singleton (B.pack "Pete") (B.pack "Kazmier")
 
 main = B.interact $ B.unlines . map doline . B.lines
 where doline= B.join comma . mapIndex fixup . B.split ','
   comma = B.singleton ','
   fixup 3 s = M.findWithDefault s s dict
   fixup n s = s
 
 -- f is supplied the index of the current element being processed
 mapIndex :: (Int -> ByteString -> ByteString) -> [ByteString] -> [ByteString]
 mapIndex f xs = m xs 0
     where m []      _ = []
           m (x:xs') i = f i x : (m xs' $! i+1)

How about

import Data.ByteString.Lazy.Char8 as B hiding (map,foldr)
import Data.List (map)
import Data.Map as M hiding (map)

dict = M.singleton (B.pack "Pete") (B.pack "Kazmier")

main = B.interact $ B.unlines . map doline . B.lines
    where
    doline  = B.join comma . zipWith ($) fixup9 . B.split ','
    comma   = B.singleton ','
    fixup9  = fixup 9
    fixup n = replicate n id
              ++ [\s -> M.findWithDefault s s dict] ++ repeat id

Note that fixup9 is shared intentionally across different invocations of
doline. The index n starts at 0.
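
The list-of-functions trick also works with plain lists, for
illustration (edit and fix are hypothetical names of my own):

    import Data.Char (toUpper)

    fix :: String -> String
    fix = map toUpper

    -- apply fix to field n only, leave all other fields untouched
    edit :: Int -> [String] -> [String]
    edit n = zipWith ($) (replicate n id ++ [fix] ++ repeat id)

For example, edit 1 ["a","b","c"] yields ["a","B","c"].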


Also note that because (compare :: (Byte)String -> ..) takes time
proportional to the string length, the use of Map will inevitably
introduce a constant factor. But I'm still happy you didn't use arrays
or hash tables (urgh!) :)

In any case, tries are *the* natural data structure for (in)finite maps
in functional languages, see also

Ralf Hinze. Generalizing generalized tries. Journal of Functional
Programming, 10(4):327-351, July 2000
http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz


Regards,
apfelmus



[Haskell-cafe] Re: Problematic irrefutable pattern matching of existentials

2006-10-02 Thread apfelmus
[EMAIL PROTECTED] wrote:
 I have come to realize that irrefutable pattern matching of
 existentials may indeed be problematic. Let us consider the following
 existential data type
 
 data FE = forall a. Typeable a => Foo a
        | forall a. Typeable a => Bar a
 
 The following tests type and run (the latter raising the expected
 exception).
 
 test1 = case (Foo ()) of Foo x -> show $ typeOf x
 test2 = case (Bar True) of Foo x -> show $ typeOf x
 
 Let us suppose that irrefutable pattern matching of existentials is
 allowed. What would be the value of the expression
 
   case (Bar True) of ~(Foo x) -> show $ typeOf x
 then?

Interesting, interesting. Without those unsafe Typeables and further
simplified, we can say

class Foo a where
    foo :: a -> Int

instance Foo ()   where foo = const 1
instance Foo Bool where foo = const 2

data Bar = forall a . Foo a => Bar a

culprit :: Bar -> Int
culprit ~(Bar x) = foo x

Now, hugs -98 gives

> culprit (Bar (undefined :: ()))
1
> culprit (Bar (undefined :: Bool))
2

The interesting part, however, is

> culprit undefined
Program error: Prelude.undefined

> culprit (Bar undefined)
ERROR - Unresolved overloading
*** Type       : Foo a => Int
*** Expression : culprit (Bar undefined)

But both should be the same, shouldn't they? In the first case, the
implementation gets an undefined dictionary. But in the second case, the
type system complains.

If we used explicit dictionaries as in
data Bar = forall a . Bar (a -> Int, a)
the second case would yield _|_, too.
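
A minimal sketch of that variant (Bar' and culprit' are my names, to
keep it apart from the class-based version above):

    data Bar' = forall a . Bar' (a -> Int, a)

    culprit' :: Bar' -> Int
    culprit' ~(Bar' (f, x)) = f x

Here, both culprit' undefined and culprit' (Bar' undefined) yield _|_
as soon as the pair is demanded; no dictionary magic is involved.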

 One may claim that the existential pattern match also binds an
 implicit dictionary argument, which is demanded above. That
 explanation is quite unsatisfactory, because it breaks the abstraction
 of type classes.

Indeed, the above shows the subtle difference: with type classes, one
may not pass an undefined dictionary as this is an unresolved
overloading. Irrefutable patterns for existential types somehow disturb
things.


Regards,
apfelmus



[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs

2006-10-03 Thread apfelmus
John Meacham wrote:

 the reason being that id translates internally to 
 
 id = /\a . \x::a . x 
 
 since you cannot pull out an appropriate type to pass to id without
 evaluating the 'irrefutable' pattern, you end up with _|_ instead of 4.
 
 basically, allowing irrefutable matching on existentials would expose
 whether the underlying implementation performs type-erasure. even if an
 implementation does type erasure like ghc. suddenly odd things occur,
like adding an unused class constraint to a type signature might change
 the behavior of a program. (since it will suddenly have to pull out a
 dictionary)

So you mean that

id = /\a . \x :: a . x

is /strict/ in its first argument, i.e. in the /type/ a. So to speak,
type erasure is equivalent to strictness in all and every type argument.
For an irrefutable existential pattern, the type argument should be
lazy, of course. This allows you to pull out an appropriate type, only
that it's pulled out only when needed. The point is that

const c = /\a . \x :: a . c

has no problem being lazy in its type argument.


Strictness in type arguments is of course a consequence of the fact that
there is no _|_ type. A systematic way to add irrefutable pattern
matches to existentials (or even GADTs) would therefore be the
introduction of lazy type arguments (alas introduction of a type _|_).
Type erasure then becomes a form of /strictness analysis/. I remember
that Wadler's projection based strictness analysis can discover unused
values and that's what happens very often with types as they are
seldomly calculated with at runtime. So you can erase them by virtue of
your strictness analyzer.

Thinking further, this would even allow to free type classes from the
dictionary approach: an overloaded function like

   (+) = /\a . \x :: a . \y :: a . x+y

performs a case analysis on its type argument and selects the right
specialized function. In case where this type is known at compile time,
the /inliner/ will select the right specialization. This type based
dispatch is more natural for type classes than dictionaries because the
latter would in principle allow to supply different dictionaries for one
and the same type.



Regards,
apfelmus



[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs

2006-10-04 Thread apfelmus
 yeah, that is what I mean. however, since we don't have explicit type
passing in haskell, reasoning about the laziness of types would be quite
 tricky, leading to odd things like changing type signatures (without
 changing the underlying types) can change the behavior of a program.

You mean f.i. by specializing a polymorphic type via a type signature? So that

data Foo = exists a . Foo a
f ~(Foo x) = const 4 (id x)

would behave differently for (const :: a -> Int) and
(const :: (a,a) -> Int)? At least the latter does not even typecheck. I
suspect that all possible program behavior changes are already caught at
the type/implementation level: when all types are strict, there just
cannot be irrefutable patterns for existentials :)

 not to mention the optimizations that would be excluded by having to
 preserve the strictness in types behavior... worrying about it in values
 is tricky enough. :)

True. But you don't have different optimizations for types and values,
do you?

 It would actually not be difficult to add lazy types to jhc at all. I
 just don't see any use for them when dealing with haskell source. but
 another front end.. (omega or cayenne perhaps?) certainly might make use
 of them.

In Haskell, only irrefutable patterns for existentials and perhaps GADTs
come to mind. But it would be for the good health of our poor computer
programs (whaa, My mind just exploded.) :)


 again, this is precicely how jhc implements type classes. There are
 numerous benefits, in particular a single scrutinization of a type will
 determine concrete methods for _every_ class used on that type, in
 effect you get run-time specialization for free. The huge disadvantage
 is that it does not play well nice at all with separate compilation,
 this is an active area of research for me in jhc.

Mh, type classes are in essence polymorphic variants on the type level.
As types and values do not really differ, you could gain something from
existing polymorphic variant implementations like in some ML dialects. A
quick google revealed

http://caml.inria.fr/pub/papers/garrigue-polymorphic_variants-ml98.pdf

but I don't know if it helps. Polymorphic recursion makes things difficult.

As a side effect, once you got type classes separately compiled, you get
some form of polymorphic variants on the value level for free! Then,
it's only a step backward to support open data types proposed by Löh and
Hinze

http://www.informatik.uni-bonn.de/~ralf/publications/OpenDatatypes.pdf

Regards,
apfelmus



[Haskell-cafe] Re: Haskell performance (again)!

2006-10-08 Thread apfelmus
 thing only partly needed) work.

 But this doesn't use tail recursion/accumulation (which seems to me
 like a complete hack that is a mandatory price to pay for using FP)

I never paid it :) You actually fell into the trap by rewriting your
code: performance got worse! Because of laziness, your first version can
deliver the first coefficient of the sum without calculating the others.
Your tail recursive version must calculate the whole sum before returning
anything. This is because the list constructor (:) is lazy in its second
argument. addPoly1 falls into the (large -> large) category.

Of course, if you have to sum Int's over a list (which is large -> small
and all list elements are needed), then it is wise to be strict and a
seq will come in handy (foldl' (+) 0 in Haskell). On the other hand, if
you are 'or'ing Boolean values over a list ((any) in Haskell), then
laziness is the right thing. Why?
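
To illustrate with a small sketch:

    import Data.List (foldl')

    sumInts :: [Int] -> Int
    sumInts = foldl' (+) 0       -- strict accumulator, no pile of thunks

    orBools :: [Bool] -> Bool
    orBools = foldr (||) False   -- lazy, stops at the first True

Here, orBools (True : undefined) happily returns True, while any strict
traversal of the list would diverge.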

Your second version only adds overhead by introducing additional
parameters and so on, it won't get better than the minor laziness
overhead. Test both functions with hugs +s and be disappointed about
the number of reductions/cells used.

Concerning overhead in general, the simplest version will always do
and the compiler will figure out the rest with automatic strictness
analysis, inlining, unboxing, tail recursion etc. To the programmer,
tail recursion is of no importance in a lazy language.



And my last remark, again about data structures:
* (mutable) arrays and hash tables don't fit into a functional language.
Don't use them. Use Data.Map for random access.

That being said, static arrays whose entries do not change (you only
have a single shot) can be very handy for Dynamic Programming. In this
case, it is crucial that they are boxed, i.e. the entries are
calculated lazily.
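
For instance, the classic lazy dynamic programming pattern, sketched
here with memoized Fibonacci numbers (fibs is my name for it):

    import Data.Array

    fibs :: Int -> Array Int Integer
    fibs n = a
        where
        a = listArray (0,n) [f i | i <- [0..n]]
        f 0 = 0
        f 1 = 1
        f i = a!(i-1) + a!(i-2)

Because the array is boxed, each entry is a thunk that may refer back to
earlier entries, and every entry is calculated at most once.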

If you need any other data structure,
  http://haskell.org/ghc/docs/latest/html/libraries/
  http://haskell.org/haskellwiki/Libraries_and_tools/Data_structures
  http://www.eecs.tufts.edu/~rdocki01/edison.html

For (in)finite maps, tries are *the* natural data structure for
functional languages, see also
  Ralf Hinze. Generalizing generalized tries. Journal of Functional
  Programming, 10(4):327-351, July 2000
  http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz
If you are in a true need for speed, they'll outperform everything else.

Should you need a custom data structure (appendable priority search
queue interval region ... tree sequence), you can build one from the
all-rounding Finger Tree, see also
  Finger Trees: A Simple General-purpose Data Structure
  Ralf Hinze and Ross Paterson.
  in Journal of Functional Programming 16:2 (2006), pages 197-217
  http://www.soi.city.ac.uk/~ross/papers/FingerTree.pdf


Regards,
apfelmus



[Haskell-cafe] Re: optimization help

2006-10-12 Thread apfelmus
 at length in September on this list. The problem here
is that writeFile currently cannot be interleaved lazily; this has to be
simulated with appendFile. We can read files lazily, but we cannot output
them lazily.
Can this be remedied? Can there be a version of writeFile which is, in a
sense, dual to getContents?


Regards,
apfelmus



[Haskell-cafe] Re: tail-recursing through an associative list

2006-10-12 Thread apfelmus
Malcolm Wallace wrote:
 Seth Gordon [EMAIL PROTECTED] wrote:
 
 almost-entirely-functional code ... that in its first draft, took
 about three seconds to process 2,000 rows, eight minutes to process
 20,000 rows, and overflowed a 1-MB stack when processing 200,000 rows.
  Oops.
 
 Which just goes to show that your algorithm is non-linear in the size of
 the input.  I doubt that your posted snippet is the cause of the
 problem, since it is certainly linear in the AList it is given.

The linear time in the length of AList may well be the problem :)

You seem to use AList as a key-value mapping (I know the word
"associative" only as a mathematical property, please correct me). The Key
acts as key for the grouping you mentioned and the [MetaThingie] is the
actual group of MetaThingies, is that so? That means that with each call
to myFunction, the AList roughly grows by one element.

For logarithmic access times, you should use a binary search tree like
Data.Map or similar. The problem in your case could be that matchKeys is
only approximate and your keys cannot be ordered in a suitable fashion.
Then you need a clever algorithm which somehow exploits extra structure
of the keys (perhaps they are intervals, then you can use interval trees
etc.). The C++ code is likely to do some magic along these lines. In
such case, stunning functional power may come from
  Finger Trees: A Simple General-purpose Data Structure
  Ralf Hinze and Ross Paterson.
  in Journal of Functional Programming 16:2 (2006), pages 197-217
  http://www.soi.city.ac.uk/~ross/papers/FingerTree.pdf

 I profiled the code and
 observed that the most-called-upon function in the program (200
 million entries for those 20,000 rows)

 By optimisation, you can only make this function a constant factor
 faster.  You need to work out how to call it less often in the first
 place.

Almost true. I think that recursive calls are counted as proper calls, so
each top-level call to myFunction will result in a number of calls to
myFunction linear in the length of AList. Thus, "in the first place" alias
"top level" is not enough.


 type AList = [(Key, [MetaThingies])]

 myFunction :: AList -> Thingie -> AList
 myFunction [] x = [(key x, [meta x])]
 myFunction ((k, v):tail) x
     | matchKeys k (key x) =
         case tryJoining v x of
             Nothing -> (k, v) : (myFunction tail x)
             Just v' -> (k', v') : tail
         where v' = bestKey k (key x)
               -- should be (?):  where k' = bestKey k (key x)
     | otherwise = (k, v) : (myFunction tail x)


Regards,
apfelmus



[Haskell-cafe] Re: optimization help

2006-10-12 Thread apfelmus
 A better solution would be to begin output before the the whole input is
 read, thus making things more lazy. This can be done the following way:
 from the input, construct a lazy list of (date,line) pairs. Then, let
 foldM thread a map from dates to corresponding output file pointers
 through the list and, at the same time, use the file pointers to output
 the line in question via appendFile. This way, every line consumed is
 immediately dispatched to its corresponding output file and things
 should only require memory for the different dates, besides buffering.

 I tried this approach previously and it seems to be unacceptably slow.
 I thought the slowness was just due to the fact that file operations
 are slow, but I'll include my code here (cleaned up to take some of
 your previous comments into account) just in case there is something
 subtle I'm missing which is slowing down the code (B, M, and myRead
 are as above):
 
 dates' file nRows = do
  (cols, rows) <- myRead file
  dateIx <- M.lookup cols $ B.pack "Date"
  let writeDate row = B.appendFile (dataDir++fmt date) row where
      date = (B.split ',' row)!!dateIx
      fmt  = B.unpack . B.map (\x -> if x == '-' then '_' else x) .
             B.takeWhile (/= ' ')
  oldFiles <- getDirectoryContents dataDir
  mapM_ (\f -> catch (removeFile $ dataDir++f) $ const $ return ()) oldFiles
  mapM_ writeDate $ take nRows rows
 
 This code takes over 20 minutes to process 100MB on my machine.

No wonder, as this opens and closes the file on every row. The operating
system will be kept quite busy that way! In some sense, you are
outsourcing the row-collecting M.Map to the OS... Of course, you want to
open the files once and dispatch the rows to the different open handles.

Here is a version (untested) which either does the read-all-then-write
approach (group'n'write) or opens the output files simultaneously
(group'n'write2). Note also that there is no need to use an M.Map for
finding the "Date" keyword in the CSV header (using one even hurts
performance), though the effects are completely negligible.


main = do
    args <- getArgs
    case args of
        ["dates", file, nRows] -> dates file (read nRows)

dates file nRows =
    B.readFile file >>=
        group'n'write . sugarWithDates . take nRows . B.lines

sugarWithDates (header:rows) =
    map (\r -> ((B.split ',' r) !! dateIx, r)) rows
    where
    Just dateIx = Data.List.lookup (B.pack "Date") $
        zip (B.split ',' header) [0..]

formatDate = B.unpack .
    B.map (\x -> if x == '-' then '_' else x) . B.takeWhile (/= ' ')
date2filename = (dataDir ++) . formatDate

group'n'write = mapM_ writeDate . M.toList . foldl' addDate M.empty
    where
    addDate mp (date,row) =
        M.insertWith (\new old -> row:old) date [row] mp
    writeDate (date,rows) =
        B.writeFile (date2filename date) $ B.unlines rows

group'n'write2 xs =
    foldM addDate M.empty xs >>= mapM_ hClose . M.elems
    where
    addDate mp (date,row) = do
        (fp,mp) <- case M.lookup date mp of
            Just fp -> return (fp,mp)
            _       -> do
                fp <- openFile (date2filename date) WriteMode
                return (fp, M.insert date fp mp)
        B.hPut fp row
        return mp



The thing that bugs me is that one cannot separate
group'n'write2 = write2 . group
where (group) is a pure function.
I think some kind of lazy writeFile could allow this.


 thanks for your help,
No problem. :)

Regards,
apfelmus



[Haskell-cafe] Re: function result caching

2006-10-13 Thread apfelmus
Carl Witty wrote:
 Instead of using an infinite list, you can use an infinite binary tree,
 with a cached result at every node.

This, also known as a Patricia tree, is indeed the canonical answer. In
general, one can always construct an (in)finite map from keys (Ints in
the present case) to values as a trie. See also

  Ralf Hinze. Generalizing generalized tries. Journal of Functional
  Programming, 10(4):327-351, July 2000
  http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

This paper uses generic programming, but that should not be the problem
as concrete cases can always be constructed by hand in plain Haskell.
To apply this to Ints, one should view them as

type Int = [Digit]
data Digit = Zero | One

Also note that there is absolutely no balancing involved (which would
break infinite and lazy stuff).
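
A minimal sketch of such a memoizing trie for nonnegative Ints (the
names nats, index and memo are mine):

    data Tree a = Node a (Tree a) (Tree a)

    instance Functor Tree where
        fmap f (Node x l r) = Node (f x) (fmap f l) (fmap f r)

    -- the infinite tree of all naturals, arranged by their binary digits
    nats :: Tree Int
    nats = go 0 1
        where go n s = Node n (go (n+s) (2*s)) (go (n+2*s) (2*s))

    -- find the entry for n by walking along its binary representation
    index :: Tree a -> Int -> a
    index (Node x l r) n
        | n == 0    = x
        | odd n     = index l (n `div` 2)
        | otherwise = index r (n `div` 2 - 1)

    memo :: (Int -> a) -> (Int -> a)
    memo f = index (fmap f nats)

Note that the caching only pays off if the partial application is
shared, as in f' = memo f.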

Regards,
apfelmus



[Haskell-cafe] Re: optimization help

2006-10-13 Thread apfelmus
 The (almost) point-free versions run faster than my fast
 imperative version and take up significantly less heap space-- even
 the version which reads everything and then writes takes up about 1/3
 the heap space as my version.

That was not intended, though I'm very pleased :-D

  I get the impression that point-free style is a preventive measure
 against space leaks.

Ah, good point, I didn't think about it that way. Point-free style makes
somewhat sure that old values are not leaking around, as they are not
passed through the pipe.

Yet, I'm a bit astonished. I thought that when compiling with -O2,
cosmetic changes should become negligible. Perhaps the strict foldl' has
an effect?

Regards,
apfelmus



[Haskell-cafe] Re: optimization help

2006-10-14 Thread apfelmus
Paul Hudak wrote:
 In fact avoiding space leaks was one of the motivations for our moving
 to an arrow framework for FRP (now called Yampa).  Arrows amount to a
 point-free coding style, although with arrow syntax the cumbersomeness
 of programming in that style is largely alleviated.

I think that's an entirely different thing.

You changed representation of signal transformers from

newtype SF a b = SF ([a] -> [b])

to

data SF a b = SF (a -> (b, SF a b))

By enforcing a synchronous processing, you avoid leaking Signals. The
latter cannot be isomorphic to a function type (Signal a -> Signal b)
for an appropriate Signal, so this implies a point-free style as there
is no way to hold stuff of type (Signal a) in variable bindings.

This does not mean that there is no point-wise syntax for arrows, it
just means that point-wiseness cannot be achieved via variables in the
context of function application, i.e. via lambda abstraction. In fact,
the main point about Arrows is not that they are an abstraction for
computations but that they allow a point-wise syntactic sugar (which
stems from their computational being, of course)!


The optimization problem here uses (almost) one and the same
representation (pure (a -> b), sometimes packed in (a -> IO b)) and
point-free turns out to be space friendlier than point-wise. That's very
puzzling and I think ghc -O2 should eliminate this.


Regards,
apfelmus



PS: IMHO the semantics of (SF a b) need a real cleanup. (Time -> a) ->
(Time -> b) is too crude, because these map transformers cannot even be
made an instance of ArrowChoice. Also, Dirac-like peaks (that is, Events)
do not fit in.



[Haskell-cafe] Re: optimization help

2006-10-17 Thread apfelmus
 I think that you misunderstood what I said.  When we went from FRP to Yampa, 
 we 
 changed from using signals directly, i.e. Signal a, to using signal 
 functions, i.e.:
 
 SF a b = Signal a -> Signal b
 
 When we did this, we made SF abstract, and we adopted the arrow framework to 
 allow composing, etc. signal functions.  This meant that you could not get 
 your 
 hands on Signals directly, which helped to prevent space leaks.
 
 What you describe above is a change that we made in the /implementation/ of 
 signal functions (specifically, from streams to continuations), which indeed 
 is 
 an entirely different thing.

You mean that only the fact that (Signal a -> Signal b) got abstract
prevented space leaks? Can you give an example?

That the implementation with continuations avoids space leaks is clear.
The question is whether the old does when using the new primitives. In
fact, this amounts to the question whether (inject) as defined in

newtype SF' a b = SF' ([a] -> [b])
data SF a b = SF (a -> (b, SF a b))

inject :: SF a b -> SF' a b
inject sf = SF' (go sf)
    where go (SF f) (a:as) = let (b,g) = f a in b : go g as

preserves space and time complexity:

inject (sf `o` sf')  =same space time=  (inject sf) . (inject sf')

and the same for all other operations you provide besides `o`.

Regards,
apfelmus



[Haskell-cafe] Re: optimization help

2006-10-18 Thread apfelmus
Time to write actual code and do some measurements :)

The code attached at the end of the message gets compiled with -O2.

Writing a sample test file happens with (writeTest #columns #rows) like
in writeTest 4 50 (~7 seconds, ~770MB heap (=:-o), 3MB test file).
I assume the heap spaces from writeTest are everything added together as
'top' does not report any memory bursts.

The following matrix represents the performance comparisons between four
possibilities. The times get produced by calling (test #matrixrow
(filtertest #matrixcol))

         map                 transpose
++[nl]   2.76,2.87,2.83      2.72,2.88,2.96
>>       2.72,2.92,2.87      2.82,2.79,2.88

No significant differences. In a test with more rows, (>>) seems to
perform slightly better (as expected). Transpose is a bit better, too:

writeTest 10 75 (~24 seconds, ~2.8GB heap (=:-o), 15MB test file)
         map                 transpose
++[nl]   3.50,3.59,3.42      3.23,3.26,3.29,3.19
>>       3.38,3.41,3.41      3.19,3.14,3.23

Looks like my measurements somewhat disagree with yours. Odd. But note
that by discriminating the to-be-tested functionality at run-time, the
compiler gets no chance to optimize things for the particular case. So
in reality, (++[nl]) could trigger a good code transformation whereas
(>>) does not.

Also note that transpose is very lazy and is far cheaper than it looks.

Somehow, the 2.8 and 3.5 seconds are not in proportion with respect to
the inputs of 3MB and 15MB or the outputs of 590KB and 400KB (yes, the
smaller input produces a larger output). Your 13 seconds versus 90
seconds makes this even more puzzling. But it looks like writing a CSV
file is far more expensive than reading one. Maybe it's not a good idea
to call hPut very often.


 the mask (map (`elem` tags) cols) is
 only computed once (shouldn't the compiler do that automatically since
 the expression is constant?)
 [...]

 col x cols row = row !! i
  where Just i = lookup x $ zip cols [0..] 

One has to be careful,

col x cols = \row -> row !! i
    where Just i = lookup x $ zip cols [0..]

is different as it shares i across all rows. The compiler is likely not
to do this easy transformation (the full laziness transformation) for
col, because it can introduce space leaks. These are things the
programmer should have control over, so no optimization happens here. See also

http://haskell.org/haskellwiki/GHC/FAQ#When_can_I_rely_on_full_laziness.3F

I think the reason given there is wrong; it's not about efficiency but
about space leaks. The map showcase suggests that (map (`elem` tags)
cols) is computed only once, though personally, I don't rely on that (yet).

Regards,
apfelmus

 
 module CSV where
 
 import qualified Data.ByteString.Lazy.Char8 as B
 import Data.List
 import System.IO
 
 {---
   Reading and writing CSV (comma separated value) files
 -}

 readCSV :: FilePath -> IO [[B.ByteString]]
 readCSV file = do
     v <- B.readFile file
     return $ map (B.split ',') $ B.lines v

 writeCSV :: Int -> FilePath -> [[B.ByteString]] -> IO ()
 writeCSV i file tbl = do
     h <- openFile file WriteMode
     mapM_ (writeRow i h) tbl
     hClose h

 writeRow j = case j of
     1 -> \h -> mapM_ (B.hPut h) . (++ [nl]) . intersperse comma
     2 -> \h row -> (mapM_ (B.hPut h) $ intersperse comma row) >> B.hPut h nl
     where
     comma = B.singleton ','
     nl    = B.singleton '\n'
 
 {---
   Processing [[ByteString]]
 -}
 select j targs test (cols : rows) =
     narrow $ cols : filter (test cols) rows
     where
     narrow = colmap j (map snd . filter fst . zip mask)
     mask   = map (`elem` targs) cols

 colmap :: Int -> (forall a . [a] -> [a]) -> [[a]] -> [[a]]
 colmap 1 f = map f
 colmap 2 f = transpose . f . transpose

 col x cols = \row -> row !! i
     where Just i = lookup x $ zip cols [0..]

 if12 = ((== B.pack "2") .) . col (B.pack "1")
 filtertest j = select j (map B.pack ["1","2","4"]) if12

 test i f = readCSV "test.csv" >>= writeCSV i "result.csv" . f
 
 {---
   Test cases
 -}
 rotated :: Int -> [[B.ByteString]]
 rotated n = map (take n) . iterate (drop n) . concat . repeat .
     map (B.pack . show) $ [1..(n+1)]

 writeTest c r = writeCSV 1 "test.csv" . take r . rotated $ c




[Haskell-cafe] Re: Lexically scoped type variables

2006-10-18 Thread apfelmus
I think the problem comes from the fact that

let (x :: a) = rhs in expr

has two interpretations: either the programmer mainly specifies a
judgment (\Gamma |- rhs :: a) (f.i. as a guide for type inference) or he
just introduces a new variable and sees let as a sugar for

(\(x :: a) -> expr) rhs

with the viewpoint that the program is a huge expression and all
top-level declarations are just (somewhat superflous) lambdas which lead
to a final goal.


In both cases, one would assume that the main variable binder is our
good old lambda. Thus, the (::) in (\(x::a) -> ..) behaves like a
constructor and the type variable on the right hand side is just a pattern:
    f (x :: [a]) = ...
is not any different from
    f (Cons x [a]) = ...
When f is applied to an argument, it's the type checker that performs
pattern matching. Of course, one needs nonlinear type patterns:
    (f :: a -> b) $ (x :: a) = f x
So the type variables in lambdas behave like normal variables and
things like
    g :: Int -> Int
    g (x :: a) = x + 1 :: a
are entirely fine. All in all, type variables in lambdas are real variables.


The first interpretation of let amounts to the Haskell98 intuition: the code

    x :: a -> a
    x = rhs

means a judgment that x has type (forall a. a -> a). This also means that
this does not bring any new type variables into scope; only lambdas are
capable of introducing new variables. GHC 6.6 rules are not compatible
with that. They somehow admit variable bindings by judgments, only to
discover that this doesn't work for existential types.

But note that we have (x :: ...) and (x=...) as separate statements, and
we may well keep the second interpretation:

   f :: Int -> Int
   (f :: a -> a) =
       \x -> (x+1 :: a)

The intuition is that stand-alone statements (x :: ...) are exactly what
they look like: judgments for the type system. The only odd thing about
them is that they are stand-alone and do not add to the main program
expression.

IMHO, the most natural thing would be that type judgments appear on the
rhs of a let or a lambda and that type variable bindings appear on the
lhs of a let or a lambda. In the definition of f above, (f :: Int ->
Int) and (x+1 :: a) are judgments and (f :: a -> a) is a variable
binding. Any confusion between judgment and binding is just like a
confusion between constructor application and pattern.

Regards,
apfelmus




[Haskell-cafe] Re: optimization help

2006-10-18 Thread apfelmus
Hello Bulat,

sorry, I just completely forgot to write down the answer for your post.

 Can this be remedied? Can there be a version of writeFile which is, in a
 sense, dual to getContents?
 
 this can be solved in other way. here is a program that reads stdin
 and puts to stdout lines starting with '>' and to stderr the rest.
 note that our main processing function is pure:

 main = do a <- getContents
           let b = map (processing stdout stderr) (lines a)
           mapM_ (\(file,line) -> hPutStrLn file line) b

 processing file1 file2 line
   = if ">" `isPrefixOf` line
       then (file1,line)
       else (file2,line)

(processing) does grouping, but only a part of it. The task of
collecting the writes to the different files is still left to the
operating system. Furthermore, this code will have a hard time extending
to a dynamic number of files.

I imagined that there might be a way where (processing) already does the
grouping work and trashes the order in which things were read, while a
lazy write would reconstruct the order by laziness and interleave the
appropriate write calls. But I now think this is not possible. So any
grouping function like

group :: Input -> Data.Map Key [Value]

trashes the order in which different data was read and there cannot be a
reconstruction. Of course, if one uses a different grouping data
structure which still keeps track of the order of data arrival, i.e.

group :: Input -> Data.Magic.Groupy Key Value

reconstruction becomes possible. Indeed, (IO a) can be used as grouping
data structure by virtue of

insert :: Key -> Value -> IO ()
insert = appendFile

and this is too short to merit the boilerplate of an additional purely
functional abstract data structure between the grouping and the file
writing part. One simply does not gain additional clarity.


Regards,
apfelmus



[Haskell-cafe] Re: don't: a 'do' for comonads?

2006-11-09 Thread apfelmus
Donald Bruce Stewart wrote:
 As seen on #haskell, from an idea by Malcolm,
 
14:42  ?let top'n'tail = ("<pre>"++) . (++"</pre>")
14:42  lambdabot Defined.
14:43  dons  > L.top'n'tail "foo me now"
14:43  lambdabot  "<pre>foo me now</pre>"
14:43  mauke that reminds me, haskell needs don't
14:43  dons yes!
14:44  pkhuong- mm. the opposite of do, eh? do for comonads? :)
 
 So now a prize to the person who comes up with the best use for the
 identifier:
 
 don't :: ?
 
 -- Don

don't :: IO a -> a

example :: ()
example = don't (do erase "/dev/hda")


Regards,
apfelmus



[Haskell-cafe] Re: aggressiveness of functional dependencies

2006-11-09 Thread apfelmus
Nicolas Frisby wrote:

  The inferred type for rite_t1 is
  rite_t1 :: (Iso (Either Char a) (Either f' g')) => () -> Either f' g'
 
  Why isn't the inferred type of rite_t1 the same as the ascribed type
  of rite_t1'?
 
   rite_t1' :: Iso b b' => () -> Either MyChar b'
   rite_t1' () = rite t1

I think GHC does not know whether the given instance declaration

   instance ... => Iso (Either a b) (Either a' b')

even applies to the special case of (a = Char) because it mostly ignores
the preconditions on the left side of (=). Hugs is much different.
Maybe  throwing away undecidable instances will drastically change things.

 Last post until a response I promise! Another demonstration:

 bar () = runIdentity . flip runStateT 0 $ return 'c'

 Inferred signature:
   bar :: (Monad (StateT s Identity), Num s) => () -> (Char, s)

 Why not?
   bar :: Num s => () -> (Char, s)

 I am not coming up with an s that could prevent (StateT s Identity)
 from being a monad. Is there one?

The same might go on for this case. By not looking at the preconditions
in the instance declaration

instance Monad m => Monad (StateT s m)

GHC concludes that (Monad (StateT s Identity)) might or might not hold
depending on s.

Regards,
apfelmus



[Haskell-cafe] Re: aggressiveness of functional dependencies

2006-11-11 Thread apfelmus
Nicolas Frisby wrote:
 First off, thanks for the reply.
 
 I am accustomed to GHC ignoring instance contexts as you mentioned.
 It's the mostly part that I'm curious about: mostly implies there's
 some phases that don't ignore context. Is that only the checking the
 type of the method definitions and absolutely nothing else? So it
 seems...

I just said "mostly" because I don't know exactly... Though I strongly
suspect, like you, that the preconditions are only used in type
inference/checking and not for overlapping instances and similar
questions related to uniqueness of instance declarations.

 
 My project is rather committed to GHC, but could you elaborate on your
 reference to Hugs being different?

By tradition from Gofer, Hugs implements type classes more flexibly than
GHC does. I once experimented with the following code using overlapping
instances:

 data Lift a = Lift a
 type Test = Char

 class Message m o where
   send :: m -> o -> Test

 instance Message (o -> Test) o where
   send m o = m o

 instance Message m o => Message m (Lift o) where
   send m (Lift o) = send m o

 msg :: Test -> Test
 msg = id

 r1 = send msg 'a'
 r2 = send msg (Lift 'b')

It implements some kind of subtyping. GHC won't typecheck this but hugs
-98 +m will. If I remember correctly, the problem was with

   instance Message (Lift a -> Test) (Lift a)

Does this follow from the first or from the second instance declaration?
GHC ignores the precondition in the second declaration, thus believes it
follows from both and consequently rejects it. But Hugs has no problems
with that: it follows the precondition and sees that the second
declaration does not apply to the culprit because there is no instance
(Message (Lift a -> Test) a). Note that if you add this instance later on
(perhaps in a different module), things will break.

The flexibility comes at a price: Gofer's type class system was unsound,
and I don't know how close Hugs comes to this.

Regards,
apfelmus



[Haskell-cafe] Re: Difficult memory leak in array processing

2006-11-23 Thread apfelmus
Ah, yet another UndeadArray necromancer exhausting his stack of bones.
May the forces of light suggest to structure the incantation of darkness?

modifyArray arr i f =
    readArray arr i >>= \y -> writeArray arr i (f y)

accumM :: (MArray a e m, Ix i) =>
    (e -> e' -> e) -> a i e -> [(i, e')] -> m ()
accumM f arr xs = mapM_ chg xs
    where chg (i,x) = modifyArray arr i (flip f x)

twodice (x:x':xs)   = (x+x') `div` 2 : twodice xs
noise rng gen   = twodice $ randomRs rng gen

main = do
    let bnds = (0, 10000000)
    buf <- newArray_ bnds :: IO Buffer

    gen <- getStdGen
    accumM (curry snd) buf $ zip (range bnds) $ noise (2,12) gen


I absolutely don't know why there is no (accumM) function in the
standard libraries. And having the ByteString API (maybe even the
fusion) for general arrays would be very nice. Maybe (modifyArray) is
missing, too.


Regards,
apfelmus

PS: did you try
   worker low (i `seq` i-1)   ?
PPS: The strictness analyzer is likely to insert that automatically if
you compile with -O or -O2.


Niko Korhonen wrote:
 Hi everyone,
 
 I have the following code whose purpose is to add dither (noise) to a given
 array. The code looks very straightforward but apparently it has a memory
 leak somewhere. Here I try to run the algorithm for an array of 10,000,000
 integers. Ten million unboxed strict integers should equal to 40MB which
 should pose no problems to any modern system. However, the program fails
 with a stack overflow error. I'm using GHC 6.6 on Windows with 1 GB of RAM.
 
 I've tried applying seq and some other strictness tricks (such as x == x)
 pretty much everywhere on the code with no results. Could you please
 help me
 understand what is going on here? Have I misunderstood something
 critical in
 how Haskell works? Here is the relevant portion of the code:
 
 module Main where
 
 import Data.Array.IO
 import System.Random
 
 type Buffer = IOUArray Int Int
 
 -- | Triangular Probability Density Function, equivalent to a roll of two dice.
 -- The number sums have different probabilities of surfacing.
 tpdf :: (Int, Int) -> IO Int
 tpdf (low, high) = do
     first  <- getStdRandom (randomR (low, high))
     second <- getStdRandom (randomR (low, high))
     return ((first + second) `div` 2)
 
 -- | Fills an array with dither generated by the specified function.
 genSeries :: Buffer -> ((Int, Int) -> IO Int) -> (Int, Int) -> IO ()
 genSeries buf denfun lims =
     let worker low i
             | i >= low = do
                 r <- denfun lims
                 writeArray buf i r
                 worker low (i - 1)
             | otherwise = return ()
     in do
         (lo, hi) <- getBounds buf
         worker lo hi
 
 main = do
     -- This should allocate a 40 MB array
     buf <- newArray_ (0, 10000000) :: IO Buffer
     -- Fill the array with dither
     genSeries buf tpdf (2, 12)



[Haskell-cafe] Re: Difficult memory leak in array processing

2006-11-23 Thread apfelmus
Udo Stenzel wrote:
 Niko Korhonen wrote:
 I have the following code whose purpose is to add dither (noise) to a given
 array. The code looks very straightforward but apparently it has a memory 
 leak
 somewhere.
 
 No, it doesn't.  It can't, because it doesn't even compile.  After
 correcting the obvious
 
 (lo, hi) <- getBounds buf
 
 to
 
   let (lo,hi) = bounds buf

The interface changed between GHC 6.4.2 and 6.6.
But no honorable Haskell paladin would ever dare to use UndeadArrays.

 it just works and needs 40MB plus epsilon.  Your problem has to be
 somewhere else.

The strictness analyzer likes Udo more than Niko, doesn't it?

Regards,
apfelmus



[Haskell-cafe] Re: Optimizing a hash function

2006-11-26 Thread apfelmus
 Yesterday evening I had a go at porting Bob Jenkins' hash function
 (http://www.burtleburtle.net/bob/c/lookup3.c) to Haskell.

If you need hash functions, I hope that you don't need them to become a
hash table necromancer: this dark data structure simply does not fit
well into Haskell. Tries are a better alternative:

  Ralf Hinze. Generalizing generalized tries. Journal of Functional
  Programming, 10(4):327-351, July 2000
  http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

Otherwise, take care to conduct hash experiments inside Otiluke's
Resilient Sphere only.

Regards,
apfelmus



[Haskell-cafe] Re: Difficult memory leak in array processing

2006-11-27 Thread apfelmus
 nor that I would really understand the code you've posted. On the positive
 side of things, you've given me a lot to think about. Maybe in the
 fullness of time I shall return and say 'Lo! I can write leakless
 Haskell code!'. But alas, that time seems so distant now.

I tried to show how the code can be rewritten in a more declarative
manner by the use of the higher order function (accumM) which is akin to
the accumulation function (Data.Array.accum) for immutable arrays. This
removes the need for traversing the array by hand with (worker). As
Claus Reinke already pointed out, the accumulating parameter (acc) in
your full example has to be evaluated strictly or it will overflow the
stack. This is the same situation as in the famous example

   length' [] n = n
   length' (_:xs) n = length' xs (n+1)

By avoiding (worker) and using higher order functions, you can avoid
this kind of accumulating parameters altogether:

   length xs = foldr (const (+1)) 0 xs

Besides, writing things explicitly tail recursive does not help much in
Haskell.


In the following, I'm going to explain the restructured code posted.

First of all, reducing the amount of IOs is always good for code sanity.
Formulated with an emotional undertone: "IO is evil". Why do you need
IO? There are two reasons: the mutable array and the random numbers.
Random numbers clearly are a side-effect, but there is a wonderful
trick: you simply fetch a lazy infinite list of random numbers and work
with that. With System.Random, you can say

do
    -- get the default pseudo-random number generator,
    -- it already has a good random seed
    gen <- getStdGen
    let
        rs     = randomRs (2,12) gen
        result = dostuffwith rs
    return result

(rs) is an infinite list of random numbers and (dostuffwith) is a pure
function, no IO involved. In our case,

twodice (x:x':xs)   = (x+x') `div` 2 : twodice xs
noise rng gen   = twodice $ randomRs rng gen

is a combination of (rs) and (dostuffwith). (noise) simply supplies an
infinite list of random numbers to (twodice). (twodice) processes this
list by taking adjacent pairs and averaging them. Together they cover
the functionality your (tpdf) offers.
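
For example, on a fixed list instead of random input, twodice
[1,2,3,4,5,6] yields [1,3,5].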


Concerning mutable arrays,
 I can say neither that I have any idea what an 'undead array' is
UndeadArray is a bowdlerization of Unboxed Array which is the type
you're using as Buffer. They generally are to be considered evil as
well, hence the renaming. Their only use is to store huge amounts of
data, like their most prominent example (ByteString). If you want to
add noise to an actual audio channel, then they can be adequate. But if
you only want statistics about random numbers, they're completely out of
place. In that case,

countNumber k xs = length $ filter (k==) xs
main = do
    gen <- getStdGen
    print $ countNumber 7 $ take 10000000 $ noise (2,12) gen

will do everything you need.


If mutable arrays are really unavoidable (hence this is only for
necromancers), the use of higher order functions is mandatory. One
useful higher order function is (accum) from Data.Array. The adaptation to
the mutable case uses the helper function

modifyArray arr i f =
    readArray arr i >>= \y -> writeArray arr i (f y)

which applies f to the array element at position i.
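For example, modifyArray arr i (+1) increments the element at index i.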

accumM f arr xs = mapM_ chg xs
where chg (i,x) = modifyArray arr i (flip f x)

takes an array and an association list and accumulates pairs from the
list into the array with the accumulating function f (documentation
from Data.Array.accum). For example if

arr[0] == 0, arr[1] == 1, arr[2] == 2, arr[3] == 3

then

brr = accum (+) arr [(1,2),(1,3),(2,3)]

yields

brr[0] == 0, brr[1] == 6, brr[2] == 5, brr[3] == 3

As another example, (accum (curry snd) arr xs) replaces the array
entries by those listed in xs.
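
A small demonstration of (accumM) at work, wrapped in runSTUArray so
that the mutation stays local (a sketch assuming GHC's Data.Array.ST;
modifyArray and accumM are the definitions from above, restated to keep
the block self-contained):

   import Data.Array.ST (runSTUArray, newListArray, readArray, writeArray)
   import Data.Array.Unboxed (UArray)

   modifyArray arr i f = readArray arr i >>= \y -> writeArray arr i (f y)

   accumM f arr xs = mapM_ chg xs
       where chg (i,x) = modifyArray arr i (flip f x)

   demo :: UArray Int Int
   demo = runSTUArray $ do
       arr <- newListArray (0,3) [0,1,2,3] -- arr[0..3] = 0,1,2,3
       accumM (+) arr [(1,2),(1,3),(2,3)]  -- same pairs as in the example
       return arr                          -- elems demo == [0,6,5,3]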

Finally, countNumber can be expressed as a fold over the array. In
general, every higher order function for lists can be translated to one
for arrays.


 However, this necromancy business really does sound like an exciting new
 career prospect. Interesting job opportunities, respect of the
 community, flexible hours and extremely loyal peers and other
 commandlings that will literally work just for the Brain Food.
 
 Regards,
 Nik The Blak, Necromancer of the Glorious Forces of Evil

This is indeed very tempting :) Though I suspect that the glory of
forces built on (IO a) will be very limited.


Regards,
apfelmus, Golden Delicious of the Shining Bulbs of Light



[Haskell-cafe] Re: Difficult memory leak in array processing

2006-11-29 Thread apfelmus
 I personally find doing higher order functions with IO extremely
 difficult, and the resulting compiler errors are often downright scary.
 But this is probably a direct consequence my rather limited
 understanding of the monadic operators that the do notion hides. It
 seems that one has to be /very/ good in Haskell before one can do IO
 effectively. Hmm.

Well, as good as a necromancer has to be :) But seriously, I think the
most mind-boggling point about IO actions is that they are higher
order, too. I for myself have taken the point of view that values of
type (IO a) are just plans that will be executed when the main Haskell
program has done its work. The program's task is to assemble the main
plan by sequencing smaller plans with (>>=).

 But back to the point, using Haskell lists for representing a largish
 buffer of, say, 16-bit samples that is constantly being refreshed from
 hard disk and sent into the sound system for playback would probably be
 inefficient beyond imagination.

Lists are indeed not suited for massive byte crunching.

If they fit into a black box, you can of course outsource things into C.
In case you were writing a track scheduler à la iTunes, the black box
most likely would be a single function (playSoundFile) which does the
job of handling data from disk to the sound card.

Actual sound processing with Haskell functions is more involved. As
already said, specifying loops over the samples as higher order
functions will save you a lot of headaches. The point is that one just
cannot start writing Haskell code and hope that it will run as fast as a
tight loop in C. Instead, one should do aggressive optimizations at a
few critical points only. And these are exactly the higher order loops.

I have to admit that (accumM) is not very fast because it is able to
change data at arbitrary indexes which therefore have to be kept around.
Most often, you want to process each index exactly once which is better
expressed as a (map) or a (fold). As Bulat and Duncan already said,
Data.ByteString does exactly this for arrays of bytes. Using it or the
generalization promised by Duncan is likely to be the best way to go.

On the implementation level, lazy evaluation is in the way when
crunching bytes. So be sure to tell GHC to make things strict and to
unbox and inline everything it can put its hands on by giving
appropriate command line options. As others already pointed out, the
details are on
http://haskell.org/haskellwiki/Performance
As a first test, you may want to compile your original code with -O3
-optc-O3 -funfolding-use-threshold=16 and explore what happens.
GHC does a good job with strictness analysis and it's of no use to drown
your code in strictness annotations. Of course, some well placed ones
may hint GHC about things it overlooked.
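
As an illustration of a well placed annotation (my sketch, not the
original code), forcing the accumulator of a loop is usually the one
`seq` that matters:

   -- force the running sum at every step; without the `seq`, the
   -- accumulator builds a long chain of thunks before being demanded
   sumStrict :: [Double] -> Double
   sumStrict = go 0
       where
       go acc []     = acc
       go acc (x:xs) = let acc' = acc + x in acc' `seq` go acc' xs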

To mention yet another approach, the image synthesis library Pan
http://conal.net/pan/
pretends that the programmer writes ordinary Haskell functions, but
under the hood, it's a tiny programming language that gets compiled to
C++. Of course, the main point is that the image synthesis combinators
are strictly less powerful than full Haskell. But as the full power is
unneeded in that context, this doesn't hurt.

While there are audio related projects,
http://haskell.org/haskellwiki/Libraries_and_tools/Music_and_sound
I don't know whether they focus on speed.

Regards,
apfelmus



[Haskell-cafe] Re: Difficult memory leak in array processing

2006-11-30 Thread apfelmus
Duncan Coutts wrote:
 On Wed, 2006-11-29 at 20:27 +0100, [EMAIL PROTECTED] wrote:
 
 On the implementation level, lazy evaluation is in the way when
 crunching bytes.
 
 Something I rather enjoyed when hacking on the ByteString lib is finding
 that actually lazy evaluation is great when crunching bytes, though you
 do need to know exactly when to use it.
 
 Lazy ByteStrings rely on lazy evaluation of course. Demanding a lazy
 ByteString alternates between strictly filling in big chunks of data in
 memory with lazily suspending before producing the next chunk.
 
 As many people have observed before, FP optimisation is to a great
 extent about thinking more carefully about a better evaluation order for
 a computation and making some bits stricter and some bits lazier to get
 that better evaluation order.

I completely agree. My statement was not well formulated, I actually
meant that the overhead implied by lazy evaluation occurring at every
single byte to be crunched is in the way. In this case, the cost is too
high to pay off as the bytes are most likely consumed anyway. The
detailed account keeping about every byte (is it _|_ or not?) is
unnecessary for a (map) which invariably does look at every byte. The
situation is already different for a (fold), though:

any p = foldr (\x b -> p x || b) False

Here, the computation may stop at any position in the list.
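
To see the early exit in action, a tiny sketch (my example): the fold
stops at the first element satisfying the predicate, even on an
infinite list:

   anyR :: (a -> Bool) -> [a] -> Bool
   anyR p = foldr (\x b -> p x || b) False

   -- True, and terminates: (||) never evaluates b once (p x) is True
   earlyExit :: Bool
   earlyExit = anyR (> 10) [1..]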

In a sense, lazy ByteStrings just reduce the cost of lazy evaluation /
byte ratio by grouping bytes strictly. Bookkeeping becomes cheaper
because one doesn't look up so often. Of course, with a stricter fold,
(any) gets more costly. The aim is to make the former ratio smaller
while not raising the latter too much. One may say that ByteString makes
explicit what the Optimistic Haskell Compiler aimed to make implicit.

IMHO, lazy evaluation is always the better choice (in theory). In
practice, the only problem with lazy evaluation is the overhead (which
hurts mostly at (large -> small)), which is *not* a consequence of "no
free lunch" but stems from the fact that current machine architecture is
not very friendly to purely functional things. In a sense, the natural
complexity measure in Haskell is the number of reductions in 'hugs +s'
whereas the natural complexity measure on RAM machines is the number of
operations in 0xDEADBEAF-arithmetic. Unfortunately, it's the latter
which is inside Intel.

Regards,
apfelmus



[Haskell-cafe] Re: Generate 50 random coordinates

2006-12-02 Thread apfelmus
Huazhi (Hank) Gong wrote:
 Hello,all
 
 My intention is to generate 50 random coordinates like (x,y).
 
 myrand :: Int
 myrand = randomRIO(1::Int, 100)
 
 rf=[(myrand, myrand) | a - [1..50]]
 
 My short program is like this. However, GHCi says that the return type of
 randomRIO is IO a while the function defined by me is Int. Since I only need
 an integral type as my coordinate, could you tell me how to fix this?

Taral wrote:
 do
    let myrand = randomRIO (1 :: Int, 100)
    rf <- replicateM 50 (liftM2 (,) myrand myrand)

Jason Dagit wrote:
 When we look at the type of randomRIO we see:
 randomRIO :: forall a. (Random a) => (a, a) -> IO a
 
 You're giving it a tuple of Int, so we can substitute Int for 'a' in
 that type signature:
 myrand :: IO Int
 

 rf=[(myrand, myrand) | a - [1..50]]
 
 Here you are creating a list of tuples.  We see from above that the
 type of the tuples would be (IO Int, IO Int), so rf :: [(IO Int, IO
 Int)].  This is because we have not run the IO action to generate the
 Int yet.

 My short program is like this. However, GHCi says that the return type of
 randomRIO is IO a while the function defined by me is Int. Since I only need
 an integral type as my coordinate, could you tell me how to fix this?
 
 Your type signature tries to make a claim that myrand has type Int,
 but the compiler will disagree because of that pesky IO type. 

Dons wrote:
 Try initialising your random generator in 'main' , creating a list of
 infinite randoms, take the number you need, then feed that to the
 functions that need the list:
 
 import System.Random
 import Text.Printf
 import Data.Word
 
 main = do
     g <- newStdGen          -- initialise a random generator
     let (a,b) = split g     -- create two separate generators
         as = randoms a      -- one infinite list of randoms
         bs = randoms b      -- another
         rs = zip as bs      -- an infinite list of pairs
     dump (take 50 rs)       -- take 50, and consume them


-- The IO --

Who rides so late through the bits and the bytes?
It's Haskell with his child Hank;
He has the boy type safe in his arm,
He holds him pure, he holds him warm.

My son, what makes you hide your face in fear? -
Father, don't you see the IO?
The IO with randomRIO? -
My son, it's a wisp of the outside world. -

You dear child, do come along with me!
Such lovely replicateMs I'll do with you;
Many colorful liftM2s are to be done,
My Taral does have many a golden suggestion!

My father, my father, and do you not hear
What the IO promises me so softly? -
Be quiet, stay quiet, my child;
I know he won't treat you good. -

Don't you come along with me, my fine boy?
My Jason shall do explain to you so nicely.
My Jason will do tutor you to understand 'return',
And he'll do help you and do show you and do guide you to >>=.

My father, my father, and do you not read over there
IO's minions in that dark post? -
My son, my son, I see it most definitely:
It's the imperative paradigm looking so grey.

I do love you; I'm charmed by your beautiful mind;
And if you're not willing, then I'll do use imperative force!
My father, my father, now he's grabbing hold of me!
IO has done, IO did do me harm! -

Haskell shudders, he rides swiftly,
He holds in his arms the moaning child.
He reaches Dons' stronghold with effort and urgency.
With the following code, the child will not fall:

   import System.Random

   randPairs :: (RandomGen g, Random a) => (a,a) -> g -> [(a,a)]
   randPairs range gen = zip as bs
     where
       (a,b) = split gen      -- create two separate generators
       as = randomRs range a  -- one infinite list of randoms
       bs = randomRs range b  -- another

   seed   = 13561956 :: Int
   mygen  = mkStdGen seed

   coords :: [(Int,Int)]
   coords = take 50 $            -- 50 random coordinates derived
       randPairs (1,100) mygen   -- from the random seed above





Regards,
apfelmus

PS: As you may have guessed, any similarity with living people is either
randomRIO or accidental ... I hope that you accept my apologies for the
latter.



[Haskell-cafe] Re: (a - [b]) - [a - b] ?

2006-12-06 Thread apfelmus
 while pondering over the four fours problem, I wondered: Is there a
 function of type
   (a -> [b]) -> [a -> b]

 It looks a bit like sequence when applied in the ((->) a) Monad:
   sequence :: [a -> b] -> a -> [b]
 but I was looking for the other direction.
 
 I came up with:
   \g -> map (\n a -> g a !! n) [1..]
 which has the desired type and functionality, but it looks rather
 inelegant and messy. Any better ideas?

While you showed that there exists

unsequence :: (a -> [b]) -> [a -> b]   ,

I doubt that it's very useful. The point is that (sequence) is not
surjective: (a -> [b]) has strictly more functions than [a -> b]. An
(a -> [b]) has the freedom to adjust the length of the resulting list
depending on the argument a, whereas an [a -> b] hasn't.

(sequence) converts an Applicative Functor to a Monad but there is no
canonical other way round. In other words,

    unsequence . sequence === id
but sequence . unsequence =/= id
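
Spelled out in code (a sketch of both directions in the ((->) a) monad;
note the [0..] where the quoted code had [1..], so that index 0 is not
skipped):

   sequence' :: [a -> b] -> (a -> [b])
   sequence' fs = \a -> map ($ a) fs

   unsequence :: (a -> [b]) -> [a -> b]
   unsequence g = map (\n a -> g a !! n) [0..]

On infinite lists, (unsequence . sequence') recovers every component,
but (sequence' . unsequence) cannot, for the reason above.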


Regards,
apfelmus






[Haskell-cafe] Re: How to combine Error and IO monads?

2006-12-07 Thread apfelmus
Cat Dancer wrote:
 I have a program that performs a series of IO operations, each which
 can result in an error or a value.  If a step returns a value I
 usually want to pass that value on to the next step, if I get an error
 I want to do some error handling but usually want to skip the
 remaining steps.

 Thus I have a lot of functions with return types like IO (Either
 String x), where x might be (), Integer, or some other useful value
 type, and a lot of case statements like 

You are on the right track. The point is that (IO (Either String a)) is
a Monad, too. This allows you to write the ever repeating case
statements once and forall:

   newtype ErrorIO a = ErrorIO { runErrorIO :: IO (Either String a) }

   instance Monad ErrorIO where
       return x = ErrorIO $ return (Right x)
       f >>= g  = ErrorIO $ do
           ex <- runErrorIO f
           case ex of
               Left e  -> return (Left e)
               Right x -> runErrorIO (g x)

It happens that you can parametrize this on IO:

   newtype ErrorT m a = ErrorT (m (Either String a))
   type    ErrorIO a  = ErrorT IO a

   instance Monad m => Monad (ErrorT m) where ... -- same as above

And you just rediscovered monad transformers.
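
For concreteness, a hypothetical step written against the ErrorT just
defined (my example; getLine and reads stand in for real IO steps):

   -- read a line and require a positive number
   readPositive :: ErrorT IO Integer
   readPositive = ErrorT $ do
       s <- getLine
       return $ case reads s of
           [(n, "")] | n > 0 -> Right n
           _                 -> Left "not a positive number"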


Regards,
apfelmus

PS: In the special case of IO, you can also use exceptions. But using
ErrorT is better style.



[Haskell-cafe] Re: Why so slow?

2006-12-11 Thread apfelmus
Lyle Kopnicky wrote:
 The code below is using way more RAM than it should. It seems to only
 take so long when I build the 'programs' list - the actual
 reading/parsing is fast. For a 5MB input file, it's using 50MB of RAM!
 Any idea how to combat this?

(ethereal voice)
... Children of Amaunator ... heed my words ... in no long, the world
... will perish, will ... crumble under the blackened forces of IO ...
but there is ... hope ... i can feel that ...
ensure :: (a -> Bool) -> String -> a -> a
ensure b s x = if b x then x else error s
... or switching to ... monadic parser combinators ... like
Text.ParserCombinators.Parsec ... can hold strong the light for ...
another aeon or two ...



Regards,
apfelmus



[Haskell-cafe] Re: Well-typed functions with nonexisting signatures [Was: type variable question]

2006-12-16 Thread apfelmus
[EMAIL PROTECTED] wrote:
 Is there a canonical example example that exhibits this behavior?
 Yes, it was discussed back in 2003. Here's the canonical form:
 
 g :: (Show a, Show b) => a -> b
 g = undefined
 
 --h :: Show a => b -> a
 h x = g (h x)
 
 Both Hugs and GHC (in the pure Haskell98 mode!) are happy with h. 
 Both compilers infer the type for h, shown in the comment. If you give
 h the type signature -- the one that is inferred by the compilers
 themselves -- both compilers complain.

Strangely, ghci accepts it on the command line:

$ ghci
[...]
  GHC Interactive, version 6.4.2, for Haskell 98.
[...]
Prelude> let g :: (Show a, Show b) => a -> b; g = undefined; h :: Show a
=> b -> a; h x = g (h x) in h 1
*** Exception: Prelude.undefined

but not from a file to be loaded. Hugs +98 does not accept it on command
line as expected. What's going on?

Regards,
apfelmus



[Haskell-cafe] Re: Seeking advice on a style question

2006-12-27 Thread apfelmus
Steve Schafer wrote:
 In my text/graphics formatting work, I find myself doing a lot of
 pipeline processing, where a data structure will undergo a number of
 step-by-step transformations from input to output. For example, I have a
 function that looks like this (the names have been changed to protect
 the innocent--and to focus on the structure):
 
  process :: a -> b -> c -> d -> e
  process x1 x2 x3 x4 = 
let y01   = f01 x1 x2 x3;
y02   = f02 x1;
y03   = f03 y02;
y04   = f04 y03;
y05   = f05 x1 y01 y04;
y06   = f06 x2 y05;
(y07,y08) = f07 y01 y06;
y09   = f08 y07;
(y10,y11) = f09 x2 x4 y09 y08;
y12   = f10 y10;
y13   = f11 y12;
y14   = f12 x1 x2 x3 y01 y13;
y15   = f13 y14;
y16   = f14 y15 y11
in y16
 [...]
 In principle, it could be
 managed with a bunch of nested StateT monads, but my attempts to do so
 seem to get so caught up in the bookkeeping that I lose the advantages
 mentioned above.
 [...]
 So here's the question: Is there a reasonable way to express this kind
 of process (where I suppose that reasonable means in keeping with
 Haskell-nature) that preserves the advantages mentioned above, but
 avoids having to explicitly pass all of the various bits of state
 around?

To me, it looks more like MonadReader than MonadState because I have the
impression that x1 and x2 are environments to fetch something from.
(Btw, MonadReader is best used as an Applicative Functor, but that's a
different story).

But in general, it's futile trying to simplify things without knowing
their meaning: names are *important*. I assume that your proper goal is
not to structure pipeline processes in full generality, but to simplify
the current one at hand.

Even if you wanted to simplify the general structure, I think you'd have
to make the types of the different yk explicit. Otherwise, the problem
is underspecified and/or one has to assume that they're all different
(modulo some equalities implied by type correctness).


Regards,
apfelmus



[Haskell-cafe] Re: Seeking advice on a style question

2006-12-29 Thread apfelmus
 version is much clearer than any diagram with boxes
and arrows. At least, you should keep names like 'questions', 'bands'
and 'pages'. Only cumbersome derivations like 'groupedBands' or names
with additional ticks are really redundant.


Regards,
apfelmus



[Haskell-cafe] Re: Seeking advice on a style question

2006-12-31 Thread apfelmus
 layout of
 the parent question is not generated by a simple traversal over its
 children, for example. So, for all of the processing that follows, a
 parent question (one with child questions) looks just like any other
 question, and any parent question-specific details remain hidden inside.

Again, I'd say that the algorithm and now more than ever the meaning
dictates the data structure. Assuming that processing children of
different parents is independent and that processing children of the
same parent is *not* independent, I'd group families right together in
a data structure. Whether it's a simple traversal (I interpret this as
independent?) or not, at some point you have to mess with the whole
group at once anyway, so you can put it together right now.

 validateQuestionContent :: [Question] -> [Question]
 Uh, I think the type is plain wrong. Doesn't the name suggest 'Question
 -> Bool' and a fatal error when a question content is invalid?
 
 No. The idea is to never fail to assemble the questionnaire. If there is
 a question with invalid content, then it is replaced by a dummy question  
 [...]

Ah, of course you are right, I didn't think of enhanced error
processing. I guess that (validateQuestionContent) is not a filter,
because you have to check non-local parent-child relations as well? If
so, then I suggest grouping them beforehand to make it a filter.

  (numberedQuestions,questionCategories) = numberQuestions pagemaster 
 questions;
 Another piece of miscellaneous information contained within the
 pagemaster is the starting question number.

You can still automatically number questions in dependence of a first
number by overloading the (Num) class:

   newtype RelativeInteger = RI { unRI :: Integer -> Integer }
   instance Num RelativeInteger where ...

   mkAbsolute :: Integer -> RelativeInteger -> Integer
   mkAbsolute pointOfReference relint = unRI relint pointOfReference

 (Some questionnaires start
 with a question number other than 1 because there is a post-processing
 step where various front ends are pasted onto variable back
 ends--another example of where a hierarchical approach would have made
 more sense, but couldn't be adopted because the database people couldn't
 cope.)

Uh, that doesn't sound good. I assume that the post-processing is not
implemented in Haskell? Otherwise, you could incorporate this stuff into
(process) and choose suitable interfaces. IMHO, dealing with some
modestly expressive interface which still only offers medium abstraction
(like object orientation) is a pain in a type system as powerful as
Haskell's.

  bands' = resolveCrossReferences bands questionCategories;

 Questions are cross-referenced by question number. For example, question
 4 might be in the "Sales" category, while question 22 might be "Detailed
 Sales". The last item of question 22 might be "Total; should equal the
 value reported in (4)". In order to make the layouts as reusable as
 possible, rather than hard-coding (4) in that last item in (22), there
 is a tag that looks something like this:
 
 <text>Total; should equal the value reported in <question-ref
 category="Sales"/>.</text>

Fine, though I don't see exactly why this isn't done after the questions
have been transformed to printable things but before they are
distributed across pages. So the references cannot refer to page
numbers, yet must be processed after transforming questions to rectangles?

  groupedBands = groupBands bands';
 (can't guess on that)
 
 In order to implement widow/orphan control, not every band is allowed to
 start a new page (keep with previous and keep with next, in effect).
 Before being handed off to the paginator, the bands are grouped so that
 each group of bands begins with a band that _is_ allowed to start a
 page, followed by the next n bands that aren't allowed to start a page.
 Each grouped band is then treated by the paginator as an indivisible
 entity. (At this point, the grouped bands could be coalesced into single
 bands, but doing so adds a bit of unnecessary overhead to the rendering
 phase.)

Maybe (paginate) can be given a type along the lines of

   paginate :: Rectangle a => [a] -> Pages a

and perhaps you could merge several bands into a single rectangle simply
by saying

   instance Rectangle [Band] where ...



To conclude, I think that (process) can be roughly factorized as follows:

   process = buildPages . questions2rectangles . expandMacros

Now, you get 2/3 of TeX or another desktop publishing system for free,
you only have to replace (questions2rectangles) by (text2rectangles).


Regards,
apfelmus

Footnote:
* Well, it is possible to recover insert, but only by introducing a
contradiction into the logic of types with the help of (undefined):
insert x bp = foo x (bp, (undefined :: map'))
This is clearly unsafe and heavily depends on the implicit knowledge
that the returned (BluePrint') ignores its arguments.


[Haskell-cafe] Re: Possible (GHC or HGL) bug or ??

2007-01-01 Thread apfelmus
Calvin Smith wrote:
 This basically works, in that it does exactly what I want in Hugs, but
 GHC sometimes pauses partway through rendering, and does not continue
 rendering until I type any key (except space, which exits) or
 unfocus/refocus the window, or move the mouse pointer across the window.

 Sometimes, more often the first time in a GHCI session, it renders
 completely with no pauses, and it seems to pause more and more if I
 evaluate main, then close the window, evaluate again in the same GHCI
 session, repeatedly. The same pausing behavior is observed in a
 GHC-compiled executable.
 
 When the problem occurs, there is a message to the console that says:
 thread blocked indefinitely.

I can reproduce this on OS X with ghc-6.4.2, X11-1.1 and HGL-3.1. The
console message is rare but I also got it once. This looks like a bug in
HGL, perhaps some issue with polling the event queue in a threaded fashion.

 p.s. Any stylistic or other comments about the code welcome too.

The infinite list of colors is a very good idea.

It might also be a good idea not to mess with trigonometry when creating
the snowflake. These things can be put into a single function (rotate)
which rotates a point around the origin by a specified number of
degrees. The following code demonstrates this. Note that the resulting
snowflake has slightly different proportions than your original one, but
it shouldn't be a problem to adjust this.

module Main where

import Graphics.SOE

main = runGraphics $ do
    w <- openWindow "Snowflake Fractal" (600, 600)
    drawInWindow w $ snowflake (300,300) 200 (cycle $ enumFrom Blue)
    spaceClose w

spaceClose w = do
    k <- getKey w
    if k == ' ' then closeWindow w else spaceClose w

rotate :: Double -> Point -> Point
rotate deg (x,y) = (truncate $ c*x' - s*y', truncate $ s*x' + c*y')
    where
    (x',y') = (fromIntegral x, fromIntegral y)
    rad     = deg * pi / 180
    (s,c)   = (sin rad, cos rad)

translate :: (Int, Int) -> Point -> Point
translate (dx,dy) (x,y) = (x + dx, y + dy)

minSize = 2 :: Int

snowflake :: Point -> Int -> [Color] -> Graphic
snowflake _   h _ | h <= minSize = emptyGraphic
snowflake pos h (c:cs) = overGraphics $
    map (\pos -> snowflake pos (h `div` 3) cs) (mkPoints corners)
    ++ map (withColor c . polygon . mkPoints) [triangle1, triangle2]
    where
    -- things get specified by their angle
    -- with respect to the y-axis
    mkPoints  = map $ translate pos . flip rotate (0,h)
    triangle1 = [0, 120, 240]
    triangle2 = map (180+) triangle1
    corners   = map (60*) [0..5]

Also note that I eschewed (drawInWindow) in favor of (overGraphics), but
I think that SOE will introduce that at some point, too.

A minor hint is to use Double instead of Float. It doesn't really
matter, but today's computers internally favor Double (double precision
floating point number).


Regards,
apfelmus



[Haskell-cafe] Re: Composing functions with runST

2007-01-04 Thread apfelmus
Neil Mitchell wrote:
 As for beginner issues with rank-2 types, I've been learning Haskell
 for years now, and have never felt the need for a rank-2 type. If the
 interface for some feature requires rank-2 types I'd call that an
 abstraction leak in most cases. It certainly means that you can't
 properly Hoogle for it, can't compile it with Yhc, can't do full type
 inference etc.

I think that the term "abstraction leak" is too harsh. In some sense,
you may as well call strong typing an abstraction leak because one
can do the stuff as well in a dynamically typed language and adding strong
typing means that you can't compile it with current compilers, you need
to implement type checking/inference etc. Of course, this analogy has
flaws as higher rank types go to the edge of computability whereas
strong typing can be implemented.

Concerning interfaces, higher rank types offer crucial static checking
that cannot be achieved without them. The prominent example is ST. The
next example is the parsing library frisby. In both cases, it would be
easy to wreak havoc in case the interface would not use higher rank
types. The same analogy as above applies: one uses strong typing because
one does not want to wreak havoc. I would not call this an abstraction
leak.

Concerning implementation, higher rank types are even more widespread:
almost everything involving continuations needs them: ReadP, Exceptions
(as opposed to Either), deforestation etc. In fact, it is quite possible
to throw away algebraic data types altogether and build everything you
need with higher rank types. A prominent example is

   [a] ~= (forall b . (a -> b -> b) -> b -> b)
       ~= (forall b . (Maybe (a,b) -> b) -> b)

The denotational semantics do not change, but the time and space
behavior is much different.
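
The first of these encodings can be written down directly (a sketch; GHC
needs the RankNTypes extension for the higher rank field):

   {-# LANGUAGE RankNTypes #-}

   -- Church-encoded lists: the algebraic data type is replaced
   -- by its own fold
   newtype ChurchList a = ChurchList
       { fold :: forall b . (a -> b -> b) -> b -> b }

   nil :: ChurchList a
   nil = ChurchList (\_ z -> z)

   cons :: a -> ChurchList a -> ChurchList a
   cons x xs = ChurchList (\f z -> f x (fold xs f z))

   toList :: ChurchList a -> [a]
   toList xs = fold xs (:) []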

Perhaps the above remarks misinterpret your statement and you meant
abstraction leak in the sense that, because higher rank types are
available, the interface author used them without thinking whether the
same effect can be achieved in ordinary Haskell. Alas, such problems are
not tied to higher rank types: proper interface design is an art and
does not come for free, not to mention interface documentation[1]. One
could easily berserk: why does this library use String and doesn't
abstract it with a type class? Why does that interface only provide IO,
why isn't this available as a library of pure functions? What do these
obviously crappy semantics mean? In this case, higher rank types are a
symptom, not the problem. If one wants to cure the problem by
disallowing the symptom, then I suggest to also erase the symptom IO.
Thoroughly.

Of course, the drawbacks of higher rank types you mentioned remain. In
the case of hoogleability, I'm confident that it is possible to
implement them, it's only that someone has to think hard about it.
Implementing higher rank types in YHC is even harder but not impossible.
Sure, type inference is the most difficult thing, and one has to accept
glitches and drawbacks to make it work. Compared to these difficulties,
I think that the remark

 So I can't just tell someone who's just starting to learn Haskell that
 f $ g y is equivalent to f (g y); I have to say those two are
 *almost always* equivalent, but if you use $ and the compiler complains
 about not being able to match the expected and the inferred type and a
 type signature in the error message has the word 'forall', try rewriting
 that expression without the $ and see if it compiles.  Eeeww.

posted in this thread is too harsh. That's life, every language has its
flaws and glitches: parts of the layout rule, pattern guards, I want a
better records system, views, generic programming, etc. But, when code
has to be finished, those glitches or annoying things are best countered
with a shrug: they are not life-threatening. A programming language with
nonexistent type system and ugly semantics is. And much to our joy,
Haskell is far from this.

In that sense, dear reader of this post, just rewrite that expression
without $ and see if it compiles. The complier authors don't want to
annoy you, it's just that the exact reasons why this cannot yet be put
to work are damn hard. You are encouraged to learn about System F to get
a grasp of what is going on, but spending this one $ will be much cheaper.


Regards,
apfelmus

[1] Concerning library documentation, I think that literate Haskell
sources have the drawback that they are either tied to TeX
(\begin{code}..\end{code}) or that every line has to start with a '>'.
I'd suggest to add a <code>..</code> or something else. The point is
that while (La)TeX can be cranked up to a publishing system, it is not
suited for many tasks such as media-dependent processing. TeX is a macro
language, not a structured document type. And for the strongly typed
Haskell programmer used to referential transparency, macros are a nightmare.


[Haskell-cafe] Re: Composing functions with runST

2007-01-04 Thread apfelmus
Yitzchak Gale wrote:
 Well, it turns out that using Data.Sequence or Data.IntMap
 to shuffle a list becomes prohibitive if you might have
 more than about 10^5 elements in your list. So in that
 case you will need to use a mutable array, and you now
 need ST.
 [..]

 Wouldn't it be nice if instead you could just write:

 shuffle :: (RandomGen g, MonadState g m) => [a] -> m [a]
 shuffle = stToState . shuffleST
 
 and then just use that directly inside a calculation that
 is otherwise purely non-ST?
 
 It seems, what you really want is
 shuffleST :: RandomGen g => [a] -> StateT g ST [a]
 
 Actually, I tried that. It didn't help - it was just one more
 layer I had to peel away to get at the ST inside.
 
 There seems to be no way to avoid the fact that you
 think about state in two very different ways in these
 two monads. Every program is written in either one style
 or the other. Occasionally, you require an isolated use
 of the opposite style, and I am looking for ways of simplifying
 the resulting mess. StateT st ST and MonadST look like
 ways of combining the two, but in practice I find that they
 just seem to get in the way.

I don't get what exactly you want.

If you want to carry your state named MyState (f.i. type MyState =
(Cards, Players)) around in a monad, you use Control.Monad.State MyState.

If (and only if) you have an algorithm (like depth-first search) that
carries an array as state around (nodes already visited) and you know
that this array is used in a single threaded fashion, it might be worth
to update the array in place. For that, you use Control.Monad.ST and
Data.Array.ST and you can be confident that the state carrying has been
strictness analyzed and fine tuned to match the machine. In short, you
get updates in place without selling your soul to IO, runST is your
protection from evil and will keep you pure. ST does not really have
more uses than this one (besides being the foundation for IO). For more
info on ST, see
   http://research.microsoft.com/Users/simonpj/Papers/state-lasc.ps.gz
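
A minimal sketch of the pattern (my example, not from the paper):
updates in place on the inside, a pure function on the outside, with
runST guaranteeing that the mutation cannot leak:

   import Control.Monad.ST (ST, runST)
   import Data.Array.ST (STUArray, newArray, writeArray, getElems)

   -- mark the given indices in a mutable boolean array
   mark :: Int -> [Int] -> [Bool]
   mark n ks = runST $ do
       arr <- newArray (0, n-1) False :: ST s (STUArray s Int Bool)
       mapM_ (\k -> writeArray arr k True) ks
       getElems arr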

Note that you can now achieve the array thing as well with
Data.Array.Diff. This is a purely functional interface to an array type
that uses destructible updates internally and keeps a history to become
persistent. However, I doubt that an array makes a difference over
Data.IntMap for all but the most special cases.


 I am starting to be convinced that the only way to
 write the function I want is by using unsafeRunST. 
 Or type it as
 
 stToState :: MonadState st m => (st -> ST s (a, st)) -> m a
 
 and then write in the documentation that the
 user is require to write
 
 do
  r <- newSTRef x
  ...
  y <- readSTRef r
  return (z, y)
 
 by hand every time. Yuck.

If the programmer needs to adhere to a policy, the type system may well
enforce it for him. No unsafeRunST. It's far better to struggle with the
safety device than to discover the hard way that running without it will
directly lead into the debugging hell.


Regards,
apfelmus



[Haskell-cafe] Re: Seeking advice on a style question

2007-01-07 Thread apfelmus
,
 then that's 65 pages that they have to inspect. But if the system were
 to produce ten 11-page questionnaires, even though the first five pages
 of each questionnaire are generated from exactly the same data using
 exactly the same software, that's 110 pages that they have to inspect.

X-)

 
 Thanks for all of the discussion. I think I have a lot to ponder

May the λ guide your path ;) And of course, you can always outsource
some pondering to the mailing list.

Regards,
apfelmus



[Haskell-cafe] Re: State monad strictness - how?

2007-01-11 Thread apfelmus
. Strictness is
needed because it exerts control over memory (and time). That's
something best left to the compiler. Of course, the compiler's job is
not to turn a O(n^2) algorithm into one that runs in O(n) time, this is
something only the programmer can do. But the compiler's job is to
figure out the `seq`s, fusions and inline definitions because I am too
lazy to mark them explicitly.


Regards,
apfelmus



[Haskell-cafe] Re: advice on architecting a library (and haskell software in general)

2007-01-12 Thread apfelmus
Yang wrote:
 hi all, i'm looking for advice on how to architect a simple widget
 library for vty/hscurses, but these beginner-level questions should
 apply to haskell applications in general. input/requests on any aspect
 of my design would be greatly appreciated - in return perhaps you'll
 get something you'd actually want to use!

 i have studied in detail various haskell curses/vty apps, including
 yi, hmp3, riot, and hscurses.widgets/contactmanager. my immediate goal
 is to produce a set of composable widgets and a flexible framework
 that - as much as possible - reduces the amount of code a user has to
 write. my eventual goal is to make a functional gui library, along the
 lines of fruit/wxfruit/fg. to this end i've also read their
 literature, but i still do not understand enough about
 arrows/yampa/afrp.

Currently, the design of a functional UI library (be it graphical or for
tty) is still an open research problem.

Somehow, the arrow based approaches like Fruit etc. are not yet
satisfying. A predecessor to this approach is FranTk. The early library
Fudgets is in essence based on arrows but precedes their invention. The
most recent development in this direction is Phooey.

In the mean time, the medium level libraries wxHaskell and gtk2hs have
gathered acceptance, mostly because they implement a full widget set
while still being reasonably succinct. A predecessor is HToolkit drawing
from ObjectIO. Despite being close to their imperative cousins, they
already supersede them in expressiveness. They make an excellent base
for further experiments, be it for arrows (wxFruit) or other recent
approaches like PropLang.

You have two options:

* Search for the grail. Most likely, this doesn't involve much coding
but much searching and has the risk of not finding it. But as the tty
doesn't have many different widgets, you can concentrate on the high
level ideas, i.e. how to get rid of IO and IORefs. Pumping your brain
with ideas - as you already do - surely helps.

* Implement an existing design. This way, you'll actually program
something. I'd propose to implement something medium level along the
lines of wxHaskell that can later be utilized in a high level approach.
Maybe you can even create a cross platform interface, i.e. one that
works for a tty and for graphical UIs at the same time. The author of
HToolkit wrote a proposal on how to transparently enclose Windows, Mac
and Gnome.


 try to use it for a real application that has little to do with
 parsing or other purported strengths of the language.

Well, Haskell's only strength is that it gives you enough power to
express your ideas, i.e. to compose great things from small ones. The
strength of your ideas is your business :) In this sense, monadic parser
combinators are not an inherent strength of the language, they happen to
be powerful by themselves.


Regards,
apfelmus



[Haskell-cafe] Re: different performance of equivalent expression

2007-01-13 Thread apfelmus
 I've run into a strange effect that I can not explain. I have a simple
 expression that can be written in two equivalent ways. However, one way
 gives much performance gain over the other. Here is an example:
 
 -- apply function many times (tail-recursive)
 many n f x = if n == 0 then x else many (n-1) f $! (f x)
 
 -- first adder function
 adder1 = let r = many 500 sin 1.0 in \x -> x + r
 
 -- second adder function
 adder2 = \x -> x + many 500 sin 1.0

 If you run the program, it thinks for some seconds performing math, and
 then prints 4 results immediately. But with the adder2 function, it
 performs the calculation in every call, which can be seen visually.

 It seems that the compiler is able to cache the long computation in
 some way in the first case, but not in the second.

This is indeed the case and entirely reasonable. The effect is called
memoization, a key ingredient to lazy evaluation. To simplify the
explanation, let's take the examples

adder1 = let r = many 5000 (1+) 0 in \x -> x + r
adder2 = \x -> let s = many 5000 (1+) 0 in x + s

The evaluation of (adder1 3) proceeds as follows:

adder1 3
  = (let r = many 5000 (1+) 0 in \x -> x + r) 3
  = let r = many 5000 (1+) 0 in 3 + r
  = let r = 5000 in 3 + r
  = 5003

Now, (r) will be updated with the result (5000) after it has been
calculated and subsequent access to (r) will retrieve the updated value
as in

adder1 4
  = (let r = 5000 in \x -> x + r) 4
  = let r = 5000 in 4 + r
  = 5004

Every incarnation of (adder1) shares the same r. For (adder2), things
are different. Here, (s) will be updated as well, but different
incarnations of (adder2) do not share the same (s) because (s) is only
in scope after supplying some argument (x). Hence, every (adder2 3) and
(adder2 4) (re)calculates its own (s).

 I always thought that let a = b in x + a is just syntactic sugar for
 x + b. Is it wrong?

This is correct but doesn't apply to the case above. The question here
is whether

let a = b in \x -> x + a
  and
\x -> let a = b in x + a

are equivalent. Considering the result, they are. But considering
running time and memory footprint, they are not. The first trades memory
for speed, the second trades speed for memory. In general, the compiler
is reluctant to switch between those two versions, i.e. it does not
perform much common subexpression elimination or let floating (see GHC
manual). The choice must be up to the programmer.


Regards,
apfelmus



[Haskell-cafe] Re: strange performance of expression evaluators

2007-01-13 Thread apfelmus
 I've done some more experiments. The following program defines simple
 arithmetic expression with indexed variables. I've written four
 different ways to evaluate them:
  - eval1 is a simple monadic evaluator
  - eval2 is the obvious straightforward implementation
  - compile1 is an attempt to perform compilation
  - compile2 is refined compile1 with sub-expressions explicitly
  separated via let binding.
 
 Test evaluates the same expression in 100 different environments.
 The results are:
   - eval1 - 17.47 sec
   - eval2 - 3.71 sec
   - compile1 - 3.79 sec
   - compile2 - 3.74 sec
 
 These figures are completely mysterious to me.
1) I expected eval1 and eval2 to perform equally. In fact, eval1 is
4.7 times slower than eval2.
2) I expected compile{1,2} to perform much faster than eval{1,2}.
   However, the compilation attempt does not give any speed up at
   all.

Your intention is that (compile2 test) should analyze the expression
tree of (test) only once when evaluating it for different environments.

I'm not sure whether the constructors (Add), (Mul) etc. get replaced
once and for all by (+) and (*) or whether this really matters, because
(eval2), (compile1) and (compile2) have the same running time. I think
that memoization (as explained in my previous post) only takes place for
values not of function type, i.e. partially evaluated functions aren't
memoized. It may also be that the compiler optimizes things for the
concrete expression (test) you gave in your code. So supplying the
expression interactively could show a difference between (eval2),
(compile1) and (compile2).

Ironically, (eval1) does compile as much as you intend (compile2) to
do. But it looks like the overhead imposed by appealing to
Control.Monad.Reader doesn't get inlined away completely.

Currently, you don't do much work per expression, it just gets
evaluated. To take advantage of memoization, you need to do more
expensive analysis on a per expression basis. For example, you might
want to precalculate stuff that doesn't depend on the environment:

   data ConstVar a = Const a | Var (Env -> a)

   eval :: ConstVar a -> Env -> a
   eval (Const x) = const x
   eval (Var f)   = f

   -- addition, multiplication etc. do precalculation
   -- when the constituents are known beforehand
   instance Num a => Num (ConstVar a) where
       (Const x) + (Const y) = Const (x + y)
       x + y = Var (\e -> eval x e + eval y e)
       ...

   data Expr a = Value a | Variable Name
               | Add (Expr a) (Expr a) | Mul (Expr a) (Expr a)

   compile :: Num a => Expr a -> ConstVar a
   compile (Value c)    = Const c
   compile (Variable v) = Var (\e -> e ! v)
   compile (Add x y)    = (compile x) + (compile y)
   compile (Mul x y)    = (compile x) * (compile y)

   testexpr = (Mul (Value 1) (Value 2)) `Add` (Variable 1)
   test = eval . compile $ testexpr

Of course, this can be improved. For instance, it currently does not
know about the associative law like in

(Add (Value 1) (Add (Value 2) (Variable 1)))

Now, it is clear that analyzing the expression again and again every
time it needs to be evaluated (interpretation) is wasted work.


Regards,
apfelmus

PS:

 data Expr = Const !Value | Var !Int
   | Add !Expr !Expr | Sub !Expr !Expr | Mul !Expr !Expr

You'd better leave out the strictness annotations (!).



[Haskell-cafe] Re: Article review: Category Theory

2007-01-19 Thread apfelmus
Ulf Norell wrote:
 In the section on the category laws you say that the identity morphism
 should satisfy
 
   f . idA = idB . f
 
 This is not strong enough. You need
 
   f . idA = f = idB . f
 
 Unfortunately, fixing this means that the category Hask is no longer a
 category since
 
   _|_ . id = \x -> _|_ =/= _|_

Neil Mitchell wrote:
 Isn't _|_ = \x - _|_?
 
 _|_ `seq` () = _|_
 (\x -> _|_) `seq` () = ()
 
 Whether this is the fault of seq or not is your call...

Subtle, subtle.

From the point of view of denotational semantics, the functions (x
\mapsto _|_) and _|_ are the same, as equality and the semantic
approximation order are defined point-wise. Usually, the morphisms of
some category arising from a (non-normalizing or polymorphic) lambda
calculus are given by such partial functions.

The key point about lambda calculi is that the external morphism sets
can be internalized, i.e. represented as objects of the category
themselves. So, the set of morphisms from 'Integer' to 'Integer' can be
represented by the type 'Integer - Integer'. But, as the example with
`seq` shows, this is not entirely true. Apparently, Haskell represents
function types in a boxed way, i.e. they are lifted by an extra _|_:

   newtype ClosureInt2Int = Closure (Integer -> Integer)#

Thus, Hask is not a category, at least not as defined in the article.
The problem is that either the morphisms or the morphism composition
('.') are not internalized correctly in Haskell.

Regards,
apfelmus



[Haskell-cafe] Re: seq (was: Article review: Category Theory)

2007-01-20 Thread apfelmus
Lennart Augustsson wrote:
 And this is why some of us think that adding polymorphic seq to Haskell
 was a mistake. :(

To ease the pain, (oca)ML has the same problem/feature: function types
are lifted:

  let rec f (x : int) : int = f x ;;
  let g y = let x = 1 / 0 in f ;;
  let const y = 1 ;;

  # const f ;;
  - : int = 1
  # const (g 1) ;;
  Exception: Division_by_zero.

The reason is, of course, that one cannot be strict in a function
argument (taking _|_ = \x -> _|_) because this is undecidable (and
nonsense with extensional equality).

But because the ML-equivalent of (.) is strict, it still does a correct
internalization of morphism composition.

Regards,
apfelmus



[Haskell-cafe] Re: [Haskell] [Fwd: Re: Computer Language Shootout]

2007-01-25 Thread apfelmus
Neil Mitchell wrote:
 That would be very neat. Another neat trick would be generalising
 optimisations so that there are fewer and more independent passes,
 this would make it easier to understand (and is what I was working on
 for Yhc).

Well, it's the nature of repeatedly applying local reductions that you
will neither really know what's going on nor truly understand how to
improve it. One particular example is the GHC rules mechanism. It's much
better than doing nothing, but it often fails to fuse yet another list.

I think that one can only achieve deterministic optimization by
carefully choosing and exploiting the type system of the intermediate
language. For instance, short cut fusion and strictness analysis can be
expressed as type inference. If you have an intermediate language where
things like boxing and forcing of thunks is explicit and typed, you have
a chance to eliminate such expensive constructs by type inference and
conventional inlining magic.

One point is that local optimization is hard to extend across the
boundaries imposed by recursion and the fixed point combinator. But I
can imagine that switching to a core language with non-totality (lifted
types) manifest in the types, like System F with a fixed point combinator

   fix :: Pointed a => (a -> a) -> a

is key to crossing this boundary.

 Yhc has intermediate code that is substantially more Haskell like, and
 with the command:
 
 loadCore "file.ycr" >>= writeFile "file.html" . coreHtml
 
 You can generate an active Html document that lets you explore the
 Core in a more interactive way - including hyperlinks for function
 names, syntax highlighting etc.
 
 i.e: http://www.cs.york.ac.uk/fp/yhc/Roman.html
 
 All of these things make playing with Yhc Core much more fun than
 playing with GHC Core. There is absolutely no reason you couldn't add
 all these things to GHC Core, then perhaps you'd save some time when
 it does come to the examine core level.

Wow, the core looks really cool! One look and you see it all. I would
even rename the local variables to single letters like a,b,c because the
cryptic numbers are quite hard to track. This is something coreHtml can
do. Also, I'd add more color, like making punctuation marks grey, but
that's a very personal taste.


Concerning the shootout, it deserves much laud for it is certainly not
an easy task to set it up for so many languages and to keep it running. But
I'm not very fond of the benchmarks themselves.

The goal seems to be to measure how fast different languages can execute
a hard wired algorithm which specifies memory management and data
layout. While this is a valid goal, I don't think it is a worthy one. It
simply does not get to the interesting facts, namely how fast the
natural algorithms for each language are. It just compares highly
artificial algorithm implementations.

Random examples are the nsieves ("count the prime numbers from 2 to M
[...] create a sequence of M boolean flags") and the k-nucleotide
("define a procedure/function to update a hashtable of k-nucleotide
keys") benchmarks. Both boolean flags and hash tables are completely
alien to Haskell. The former would be naturally implemented by a list of
primes, the latter naturally with a generalized trie.

Of course, disallowing unboxed arrays will certainly move Haskell down
the ranking. But what do we gain from unlimited array necromancy? Do we
get a fair account on how fast and short day to day Haskell really is?
And how to tweak the compilers to make day to day Haskell an even more
pleasant experience? I think not. (Do not misunderstand me, ByteStrings
are fine for it is the purely functional style that counts).

And sorry, but using the number of gzipped bytes for comparing the code
length is just ridiculous.


Regards,
apfelmus



[Haskell-cafe] OOP parametric polymorphism

2007-01-28 Thread apfelmus
Donald Bruce Stewart wrote:
 deliverable:
 ...In the tradition of the letters of an ignorant newbie...

 What's the consensus on the OOP in Haskell *now*?  There're some
 libraries such as OOHaskell, O'Haskell, and Haskell 98's own qualified
 type system with inheritance.

 If I have GHC, which way to do anything OOP-like is considered right 
 today?
 
 Using existentials and typeclasses to do some OO things wouldn't be
 considered unidiomatic (particularly, using existentials to package up
 interfaces to values).
 
 In general though, using a functional approach will produce better
 (simpler) Haskell code, and make it more likely others will understand it.
 Personally, I run in fear from OO Haskell ;)

Instead of OOP, Haskell uses (parametric) polymorphism which is more
powerful than OOP. For instance, the function

   length :: [a] -> Int

or, with an explicit forall

   length :: forall a . [a] -> Int

counts the number of elements in a list [a], regardless of what type
a those elements have. Moreover, it is guaranteed that length does
not inspect or change the elements, because it must work for all types
a the same way (this is called parametricity). Another example is

   map :: (a -> b) -> [a] -> [b]

which maps a function over all elements in the list.

In addition, Haskell has type classes (which are similar to interfaces
in OOP). The most basic example is

   class Eq a where
      (==) :: a -> a -> Bool

Thus, you have an equality test available on all types that are
instances of this class. For example, you can test whether two Strings
are equal, because String is an instance of Eq. More generally, you say
whether two lists are equal if you know how to test elements for equality:

   instance Eq a => Eq [a] where
      []     == []     = True
      (x:xs) == (y:ys) = (x == y) && (xs == ys)
      _      == _      = False
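
Usage is then fully compositional (a small sketch): equality on nested
lists comes for free from the instances above:

   test1 = "hello" == "world"          -- False, via Eq Char and Eq [a]
   test2 = [[1,2],[3]] == [[1,2],[3]]  -- True, built up from Eq Int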



The important thing I want to point out in this post is that parametric
polymorphism is indeed more powerful than OOP: already a concept like Eq
is impossible to implement in OOP. The problem is best illustrated with
the class Ord (*), which provides an ordering relation. Let's
concentrate on the smaller than function

   (<) :: Ord a => a -> a -> Bool

Can I create an interface that expresses the same thing?

   public interface Comparable {
       boolean smaller_than(?? y);
   }

No, because there is no type I can attribute to the second argument of
smaller_than. The problem is that I can only compare to values of the
*same* type, i.e. the type which implements the interface.

Can I create a class the expresses the same thing?

   public class Comparable {
       boolean smaller_than(Comparable y);
   }

This seems like a solution, but it is not. The problem is subtyping: if
I make integers and strings members of this class, I would be able to
compare the number 1 against the string "hello", which should be
reported as a type error.

I have no formal proof, but I think that the (<) function cannot be
expressed in a type correct way in OOP. AFAIK, only Java Generics can
express the requirement we want:

   interface Ord<A> {
       boolean smaller_than(A x, A y);
   }

   class String implements Ord<String> { ... }

But Generics are a considerable extension to OOP. In fact, there is
nothing really object oriented in here anymore, we're just on our way to
parametric polymorphism.


My final remark is about what this means for the existential quantifier
in Haskell. Because of the injection

   inject :: forall a . a -> (exists a . a)

the existential quantifier can be thought of as implementing some form
of subtyping, i.e. (exists a . a) is a supertype of every a. The point
now is: given

   type ExistsOrd = exists a . Ord a => a

there is *no*

   instance Ord ExistsOrd where ...

because we could compare arbitrary subtypes of ExistsOrd then. In the
end, the existential quantifier has limited use for data abstraction,
it's forall that makes things happen.



Regards,
apfelmus


(*) We don't consider Eq because given a test on type equality, we can
generalize the signature of (==)

   (==) :: (Eq a, Eq b) => a -> b -> Bool

Indeed, this is what OOP equality does.



[Haskell-cafe] Re: Channel9 Interview: Software Composability and the Future of Languages

2007-01-28 Thread apfelmus
Michael T. Richter wrote:
 And in, you know, the real world of programming you don't
 face mathematical problems as your bread and butter.

Can you prove that? ;)

 You face problems
 in a messy world of too-short deadlines, too-few resources,
 too-poorly-communicated requirements and too-many-hours work.

In its essence, the way of mathematics is to solve an infinite number
of problems at once by generalizing and simplifying them until they read
1 == 1. But that's exactly the kind of stuff you need: thanks to
generalization, you already implemented all requirements before the
customer can even conceive them and thanks to simplification, needed
resources and hours of work shrink to reasonable amounts resulting in
deadlines becoming harmless :)

Well, I mean it seriously.
- Coding a complicated configuration system, no doubt baroque, can it be
simplified by basing it on a simple but powerful macro language with
simple and sane semantics? Can it be based on the lambda calculus? Is
this general enough? Maybe you want to assure that every macro terminates?
- Coding a graphical user interface with lots of forms, can they be
reduced to their essence and generated automatically from a data type?
Perhaps in the style of Functional Forms
(www.st.cs.ru.nl/papers/2005/eves2005-FFormsIFL04.pdf)? Are they general
enough? If they require immediate feedback or interactive error
checking, may tangible values (http://conal.net/papers/Eros/) be an idea
to base on?
- Coding a dynamic website and having to control caching and data
mining, can this logic be separated out, restricting yourself to a
programming model that allows this to happen transparently? Can
you plunder Google's Map Reduce model for that
(www.cs.vu.nl/~ralf/MapReduce/paper.pdf)?
- Coding data base access or a package management system, can data
integrity be assured by again restricting yourself to a less general
programming model? Like Software Transactional Memory? Or is it just
enough to use strong typing and a simple yet clever data structure
(http://www.galois.com/cufp/slides/2006/CliffordBeshers.pdf)?

The structures behind the repetitions, the generalizations to rule them
all, the simplifications to find them, they all lie there. But they may
resist discovery and you may need cleverness and, well, mathematics to
find them. The point about Haskell is that its type system is pure and
rich enough to enable you to actually express the proof, the insight as
a program. Only few programming languages can do that. And you know:
computers and Haskell itself are products of mathematics.


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Paths to tree

2007-01-29 Thread apfelmus
 of 'insertMaybe :: (v -> v -> v) -> v
 -> Maybe v -> Maybe v'.


Now what about 'MapString v', how do we get this? Well, your
implementation corresponds to the choice

  type MapString v = [(String,v)]

But in our case, we can apply the same trick again! We have 'String =
[Char]' and given an implementation of

  data MapChar v = ...

we can use exactly the same code from 'MapPath v' to implement
'MapString v'! (This reuse can be abstracted into a type class, but I'll
not cover that here.) Of course, we need 'MapChar v' now. But yet again,
we can think of Char as

  Char ^= Int ^= [Bool]

where the '[Bool]' means the list of digits in binary representation.
So, given 'MapBool v', we can implement 'MapChar v' with yet again the
same code that we used for the preceding finite maps! Fortunately, the
recursion ends here because a finite map for 'Bool'eans is just the pair

  type MapBool v = (Maybe v, Maybe v)
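
As a quick sanity check, here is lookup for this pair type (the function
name is mine):

  -- uses:  type MapBool v = (Maybe v, Maybe v)
  lookupBool :: Bool -> MapBool v -> Maybe v
  lookupBool False (f, _) = f
  lookupBool True  (_, t) = t

The two slots are simply the values for the two possible keys False and
True.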



In case your head does not yet hurt too much :), further information
about tries in Haskell and a detailed explanation of why the code above
works can be found in

  Ralf Hinze. Generalizing generalized tries. Journal of Functional
  Programming, 10(4):327-351, July 2000
  http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz


Regards,
apfelmus


 import Data.Tree
 import Control.Monad
 
 data ArcData = ArcData
  { name :: String
  } deriving Show
 
 type ArcTree = Tree ArcData
 type ArcForest = Forest ArcData
 
 pathsToForest :: [[String]] -> ArcForest
 pathsToForest paths = mergeForest $ concat $ map pathToTree paths
 
 
 mergeForest :: ArcForest - ArcForest
 mergeForest [] = []
 mergeForest (x:xs) = merge x (mergeForest xs)
  where
merge :: ArcTree -> ArcForest -> ArcForest
merge tree [] = [tree]
merge tree (y:ys) =
  if sameTreeName tree y
then
  merge
tree
{ subForest = mergeForest ((subForest tree) ++ (subForest y))
}
ys
else
  (y:merge tree ys)
 
 treeName :: ArcTree -> String
 treeName tree = name $ rootLabel $ tree
 
 sameTreeName :: ArcTree -> ArcTree -> Bool
 sameTreeName treeLeft treeRight = treeName treeLeft == treeName treeRight
 
 pathToTree :: [String] -> ArcForest
 pathToTree [] = []
 pathToTree (name:subpath) =
  [ Node
{ rootLabel = ArcData { name = name }
, subForest = pathToTree subpath
}
  ]
 
 prettyPrint' :: ArcForest -> [String]
 prettyPrint' [] = []
 prettyPrint' (x:xs) =
[name $ rootLabel $ x] ++ (map ("  " ++) (prettyPrint' $ subForest x))
 ++
  prettyPrint' xs
 
 prettyPrint :: ArcForest -> IO ()
 prettyPrint forest = do
  forM_ (prettyPrint' forest) putStrLn

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Paths to tree

2007-01-30 Thread apfelmus
 appears to work.  I also find the union function incredibly easy to
 understand. I only hope I got it right.

 union :: (Ord k) => Trie k v -> Trie k v -> Trie k v
 union (Trie k0 v0) (Trie k1 v1) = Trie k0 v
  where
v = Map.unionWith union v0 v1

Well, once you found such a really concise function, it can only be
correct :)

While it is not relevant in your case, note that 'union' can be extended
to be applicable with the recursive trick. But the extension suggests
itself by noting that you used 'Map.unionWith' instead of 'Map.union'.
So, you could do

unionWith :: (Ord k) => (v -> v -> v)
          -> Trie k v -> Trie k v -> Trie k v
unionWith f (Trie k0 v0) (Trie k1 v1) = Trie (f k0 k1) v
    where
    v = Map.unionWith (unionWith f) v0 v1

union = unionWith (\_ y -> y)
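
For reference, the Trie type these definitions assume, as far as it can
be reconstructed from the pattern matches; your actual definition may
well differ, e.g. by storing a Maybe v at each node:

  import qualified Data.Map as Map

  -- a node value of type v plus a map from keys to child tries
  data Trie k v = Trie v (Map.Map k (Trie k v))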



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Haskell Cookbook?

2007-01-31 Thread apfelmus
Alexy Khrabrov wrote:
 Reflecting why it's harder to pick up
 Haskell than say Ruby or Python, here's what I found -- those
 languages deal with a typical domain available to any programmer --
 his own computer/system/shell.  The artifacts are files, directories,
 timestamps, etc.  The stuff every programmer understands in their
 sleep.  APIs.  So I loved the shell-script beautification thread.

Note that there is an inherent difficulty with files, directories,
pipes: they are not purely functional in style.

While it's unavoidable that

writeFile :: FilePath -> String -> IO ()

is in IO, I think that

readFile :: FilePath -> IO String

would not need to be in IO if the file system were persistent (not
ephemeral).

Pipes are an ugly way to implement lazy evaluation. For instance, take
the pipe "ls -l | grep '*.hs'". To achieve the effect that grep
immediately feeds on the data that ls -l spits out, both programs are
run concurrently, and blocking keeps them in step. This does not
generalize so well to multiple data sources, something lazy evaluation
has no problems with.

And one of the main problems is that current OSes see programs as things
of type String -> IO String. Of course, pure and strongly typed
programs would be preferred. So, instead of calling

    system ("ghc -O2 -c " ++ show filepath)

we want

GHC.desugar  :: GHC.Haskell' -> GHC.Core
GHC.optimize :: GHC.Core -> GHC.Core
GHC.assemble :: GHC.Core -> OS.Program


In a sense, current OSes are not ready for Haskell yet.


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: ANNOUNCE: The Monad.Reader - Issue 6

2007-01-31 Thread apfelmus
Wouter Swierstra wrote:
 I pleased to announce that the latest issue of The Monad.Reader is now
 available:
 [...]

Hooray, the long-awaited new issue of my favorite breakfast reading is out!

 * Bernie Pope - Getting a Fix from the Right Fold
 [...]

Concerning the strictness of dwBackwards, it suffices to make the
pattern match on (ys,xs) irrefutable:

dwBackwards predicate = fst . dwPairs predicate

dwPairs :: (a -> Bool) -> [a] -> ([a], [a])
dwPairs predicate = foldr combine base
    where
    combine next ~(ys, xs)
        | predicate next = (ys, next:xs)
        | otherwise      = (next:xs, next:xs)
    base = ([], [])
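
As a quick check (example mine): if I read the fold correctly,
dwBackwards behaves like dropWhile, e.g.

  dwBackwards (< 3) [1,2,3,1]  ==  [3,1]
  dropWhile   (< 3) [1,2,3,1]  ==  [3,1]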


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: GADTs: the plot thickens?

2007-01-31 Thread apfelmus
 Its partial inverse, thickening has the spec
 
 thicken i i = Nothing
 thicken i (thin i j) = Just j
 
 thicken :: Fin (S n) -> Fin (S n) -> Maybe (Fin n)
 thicken Fz  Fz  = Nothing
 thicken Fz  (Fs j)  = Just j
 thicken (Fs i)  Fz  = Just Fz
 thicken (Fs i)  (Fs j)  = fmap Fs (thicken i j) 

 [...]

 So you really need a safety check there. Another way to do it, crudely
 but avoiding the run time numbers, is this
 
 thicken :: Fin (S n) -> Fin (S n) -> Maybe (Fin n)
 thicken Fz  Fz  = Nothing
 thicken Fz  (Fs j)  = Just j
 thicken (Fs Fz)  Fz  = Just Fz
 thicken (Fs Fz)  (Fs j)  = fmap Fs (thicken Fz j)
 thicken (Fs (Fs i))  Fz  = Just Fz
 thicken (Fs (Fs i))  (Fs j)  = fmap Fs (thicken (Fs i) j)

Isn't the problem simply that in the former 'thicken', the compiler
cannot guarantee that the case

   thicken (Fs i)  Fz  = Just Fz

never shows up when we have 'thicken :: Fin (S Z) -> ...'? I mean
the subtle point about Fin is that 'Fz' is the only inhabitant of 'Fin
(S Z)'. At least, this is what Epigram can deduce. But due to _|_, every
constructor that is declared may appear in the form of

   Fs _|_

That's why only the latter 'thicken' can be correct.
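
For reference, the Fin type under discussion is the usual GADT of
bounded naturals, roughly:

  {-# LANGUAGE GADTs, EmptyDataDecls #-}

  data Z        -- type-level zero
  data S n      -- type-level successor

  -- Fin n represents the naturals smaller than n; ignoring _|_,
  -- Fz is the only inhabitant of Fin (S Z).
  data Fin n where
      Fz :: Fin (S n)
      Fs :: Fin n -> Fin (S n)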


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Channel9 Interview: Software Composability and the Future of Languages

2007-01-31 Thread apfelmus
Bulat Ziganshin wrote:
 there are also many other similar issues, such as lack of good syntax
 for for, while, break and other well-known statements,
 
 On the other hand you have an ability to define your own control
 structures.
 
 i have a lot, but their features are limited, both in terms of
 automatic lifting and overall syntax. let's consider
 
 while (hGetBuf h buf bufsize == bufsize)
   crc := updateCrc crc buf bufsize
   break if crc==0
   print crc

I guess that the crc is a simple fold over the single bytes:

  crc xs = foldl' (\crc word8 -> crc `xor` word8) 0 xs

You do not need xs to be an inefficient String, Data.ByteString.Lazy
gives you single-byte access (f.i. via a fold) but internally reads stuff
in a chunked way, just like you now do manually with hGetBuf. Lazy
evaluation is very handy for separating the two concerns of
reading the chunks of bytes and presenting them in a non-chunked way.
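
For instance, a whole-file checksum in this style might look like the
following sketch (with a plain xor standing in for the real updateCrc
step):

  import qualified Data.ByteString.Lazy as B
  import Data.Bits (xor)
  import Data.Word (Word8)

  -- B.readFile reads the file in chunks internally, but the strict
  -- left fold sees one continuous sequence of bytes.
  checksum :: FilePath -> IO Word8
  checksum file = fmap (B.foldl' xor 0) (B.readFile file)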

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Let's welcome the Ruby hackers!

2007-02-02 Thread apfelmus
Alexis wrote:
 In contrast with other IT-related communities i've experienced, 
 i've found the Haskell community (both here and on IRC) to generally be 
 helpful, good-humoured and mercifully lacking in flames and alpha 
 behaviours. :-)

I have to reject this claim because there are quite many alphas in here.

For instance, ∀α.α notoriously tries to creep into every discussion, just
because he thinks that he is principally more general than the others.
Of course, he's a blatant liar.

Another well-known troll is ∀α.α -> α. While at least not throwing in
contradictory posts, he greatly overestimates his role. Most often, you
can just elide his contributions as he only repeats prior arguments.
Sometimes, he even signs his posts with the pseudonym (∀α.α -> α) ->
(∀α.α -> α) to rise in rank, but this is easily seen through.

The list once tried to employ alpha-conversion to get rid of them. But
the only effect was that now, the betas annoy us as well! A particularly
persistent offspring is ∀α.α -> β -> β, giving rise to much debate at
regular intervals: he managed to subvert parametricity. Also, the
mischievous ∀αβ.α -> β even plotted with evil IO to get the attention he
thinks he deserves.

In the end, the alphas and betas are noisy braggarts, talking very long
about what they want to do without doing anything at all. It's the
lambdas who do all the real work. Fortunately, they most often don't
need the signature from their alpha bosses.


Regards,
apfelmus

PS: This mail is best viewed with Unicode (UTF-8).

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: (a -> [b]) vs. [a -> b]

2007-02-03 Thread apfelmus
Chad Scherrer wrote:
 Unfortunately, I was trying to give a simplification of the real
 problem, where the monad is STM instead of []. Based on apfelmus's
 observation of why they can't be isomorphic, I'm guessing I'm out of
 luck.
 
 http://www.haskell.org/pipermail/haskell-cafe/2006-December/020041.html
 
 So in reality, I'm trying to construct something like
 f :: (a -> STM b) -> STM (a -> b)

Indeed, such an f most likely does not exist. What is the task you tried
to solve with the help of f? I guess that either there is a way without
it, or it just cannot be solved for mathematical reasons.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Generalizing three programs

2007-02-05 Thread apfelmus
Andrew Wagner wrote:
 Hi everyone,
 I've got an interesting problem here I'm trying to solve. Actually,
 I've got several problems which seem to have a very similar structure.
 I want to find a way to abstract them to solve other problems which
 can be thought about in the same way. Here they are:
 http://hpaste.org/307
 http://hpaste.org/308
 http://hpaste.org/309
 
 Note that these are just sketches, the programs aren't done yet. The
 general structure of the problem is that an object enters a system,
 interacts with different parts of the system, and eventually leaves,
 and we want to monitor some statistics about the interaction, so that
 we can then make some changes, and run it again, and hopefully improve
 it. Thanks in advance for any suggestions!

I'm unsure whether it's a good idea to simulate the situations; I'd
prefer a more denotational approach. To that end,

http://haskell.org/haskellwiki/Research_papers/Data_structures#Probablistic_Programming

may help. Also, I think that in all three problems, the interesting
probability distributions (like the time a random customer has to wait
at the register) - perhaps depending on a chosen scheduling strategy -
can be calculated without sampling. At least, the sampling can be
integrated transparently into the probabilistic functional programming
framework cited above.

Besides, it's not specified what efficiency means in the grocery store
problem. The mean time a customer has to wait is not the only possible
cost measure. Customers have different processing times and one could
weight the mean wait time with that, so that people buying few items have
much shorter waiting times than people with several full shopping carts.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization fun

2007-02-10 Thread apfelmus
Creighton Hogg wrote:
 Hello Haskell-ers,
 So a friend and I were thinking about making code faster in Haskell, and I
 was wondering if there was a way to improve the following method of
 generating the list of all prime numbers.  It takes about 13 seconds to
 run, meanwhile my friend's C version took 0.1.
 I'd love to learn a bit more about how to optimize Haskell code.

Of course, the best optimization is a better algorithm. In case this is
what you're after, have a look at

   Colin Runciman, Lazy Wheel Sieves and Spirals of Primes
   http://citeseer.ist.psu.edu/runciman97lazy.html

While Haskell makes it possible to express very complicated algorithms
in simple and elegant ways, you have to expect to pay a constant factor
(roughly 2x-10x) when competing against the same algorithm in low-level C.

 -- Naive way to calculate prime numbers, testing each new n to see
 -- if it has prime factors less than sqrt(n).
 import Data.List
 primes = 2:(foldr (\x y -> if isPrime x then x:y else y) [] [3..])
   where isPrime x = foldl' (\z y -> z && (if x `mod` y == 0 then False else True))
                            True (take (floor $ sqrt $ fromIntegral x) primes)

Your code has two glitches and a serious flaw: foldl' is strict but not
fast, use foldr instead. Premature strictness is the root of all evil :)

To see what went wrong, I take the freedom to rewrite the code as

  primes    = 2 : filter isPrime [3..]
  isPrime x = all' (\p -> x `mod` p /= 0) . take sqrtn $ primes
    where sqrtn = floor $ sqrt $ fromIntegral x
  all' prop = foldl' (\z y -> z && prop y) True


The first and most minor glitch is the missing type signature. Every
Haskell compiler will default your integers to arbitrary precision Integers:

   > :t primes
   primes :: [Integer]

I doubt that your C friend uses arbitrary precision arithmetic, so you'd
better write down

  primes  :: [Int]
  isPrime :: Int -> Bool


The second glitch is that you 'take sqrtn primes'. This is not the same
as 'takeWhile (<= sqrtn) primes', i.e. taking primes as long as they are
smaller than the square root of n. I guess you know that this results in
far fewer primes taken.


The glitches may have been unintentional, but the flaw intentionally
degrades performance: you should not use a strict all' but the lazy

  all prop = foldr (\y z -> prop y && z) True

from the Prelude. The point is that the lazy version stops inspecting
the elements of the remaining list whenever (prop y) turns False for the
first time. This is because  is lazy:

  False  x = False

for whatever x we supply. For example, take the list

  [True, False, True, True] ++ replicate 100 True

Here, all returns False after inspecting the first two elements while
all' inspects every one of the 104 list elements just to return False
afterwards. As every second number is even, your all' is busy wasting
time like crazy.
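
Putting the pieces together - the Int signatures, a takeWhile instead of
take, and the lazy all - the sketch becomes (testing p*p <= x directly,
which avoids the floating point square root):

  primes :: [Int]
  primes = 2 : filter isPrime [3..]

  isPrime :: Int -> Bool
  isPrime x = all (\p -> x `mod` p /= 0)
            $ takeWhile (\p -> p*p <= x) primes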


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization fun

2007-02-10 Thread apfelmus
You're right, 'fix' is *not* a fix for non-termination, this is better
fixed in the type system (with the right fixed points or you're in a fix) ;)

Fixed regards,
apfelmus

Lennart Augustsson wrote:
 This is actually a pretty good algorithm.  And also a rather subtle one
 when it comes to termination. :)

 Of course, the best optimization is a better algorithm. In case this is
 what you're after, have a look at

Colin Runciman, Lazy Wheel Sieves and Spirals of Primes
http://citeseer.ist.psu.edu/runciman97lazy.html

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Foldr tutorial, Inspired by Getting a Fix from a Fold

2007-02-12 Thread apfelmus
Bernie Pope wrote:
 Lennart Augustsson wrote:
 Sure, but we also have

 para f e xs = snd $ foldr (\ x ~(xs, y) -> (x:xs, f x xs y)) ([], e) xs
 Nice one.

"Nice one" is a euphemism, it's exactly solution one :)

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Foldr tutorial, Inspired by Getting a Fix from a Fold

2007-02-13 Thread apfelmus
Lennart Augustsson wrote:
 para f e xs = snd $ foldr (\ x ~(xs, y) -> (x:xs, f x xs y)) ([], e) xs

 I thought solution one was missing the ~ ?

Yes, that's irrefutably right ;) I mean solution one modulo the laziness
bug.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Strange memory consumption problems in something that should be tail-recursive

2007-02-13 Thread apfelmus
Jefferson Heard wrote:
 Argh, bitten by the scheme bug! Right -- NO tail recursion...  So that leaves 
 me with some rather non-intuitive strategies for achieving execution time 
 efficiency.  Anyone care to point me in the direction of a document on 
 efficiency in Haskell?

Besides, proper tail recursion in Haskell needs strictness annotations,
but the best way is to forget the two words "tail recursive" altogether :)

It always helps to do a rough calculation of how much time you have to
expect it to run. Processing 1TB with a 1GHz processor and 16 = 2^4
machine instructions per byte in the inner loop (must be quite short,
the loop) takes

 2^40 / (2^30 / 16) = 2^14 seconds ~ 4.5 hours

Of course, these 4.5 hours are quite sensitive to the 2^4 factor and
might well be 3 or 9 hours. Assuming that you ran alex on a String, the
reported 36 hours are entirely reasonable, in the sense of alex not
being overly slow.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Is lazyness make big difference?

2007-02-15 Thread apfelmus
Allan Clark wrote:
 For me one of the best examples of this is that of logging within a
 compiler.
 Consider a compiler which operates roughly as such
 
 compileProgram :: String -> IO ()
 compileProgram program =
  let (log,  abstract)            = parse program
      (log2, typedProgram)        = typeCheck abstract log
      (log3, convertedProgram)    = convertToIntermediate typedProgram log2
      (log4, convertedToAssembly) = convertToAssembly convertedProgram log3
  in do writeFile "a.asm" (show convertedToAssembly)
        writeFile "a.log" (show log4)
 
 Now each of the intermediate transforming calls will produce some
 logging information which is added to the current log.

It's a bit OT (off thread) but I think that it's better to appeal to the
monoid structure of logs

  let
(log,  abstract)= parse program
(log2, typedProgram)= typeCheck abstract
(log3, convertedProgram)= convertToIntermediate typedProgram
(log4, convertedToAssembly) = convertToAssembly convertedProgram
  in show (mconcat [log,log2,log3,log4])

i.e. to use Monad.Writer instead of Monad.State. The point is that for
example 'typedProgram' does not really depend on the contents of 'log',
but the dependencies in your code don't express this. One should switch from

  Log -> (a, Log)

to

  (a, Log -> Log)

or even

  (a, Log)

if Log already has a natural monoid structure.
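
A small self-contained sketch of the Writer variant, with dummy phases
standing in for the real parse and typeCheck (names and types mine):

  import Control.Monad.Writer

  type Log = [String]   -- any monoid will do

  parse :: String -> Writer Log Int
  parse s = tell ["parsed"] >> return (length s)

  typeCheck :: Int -> Writer Log Int
  typeCheck n = tell ["type checked"] >> return n

  compile :: String -> (Int, Log)
  compile program = runWriter (parse program >>= typeCheck)

Each phase only tells its own log messages; no phase can depend on the
log written so far, which makes the independence explicit.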


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Is lazyness make big difference?

2007-02-15 Thread apfelmus
Nick wrote:
 The question is the following: how big the gap between strict languages
 with lazy constructs and Haskell? Does the default lazyness have
 irrefutable advantage over default strictness?

Laziness is needed to achieve true compositionality. This point is
elaborated in

  John Hughes. Why functional programming matters
  http://haskell.org/haskellwiki/Research_papers#Overview

I also think that the laziness in Haskell is already so implicit that
90% of the Haskell code written so far will simply break irreparably if
you experimentally remove it.


By the way, lazy evaluation is strictly more powerful than eager
evaluation (in a pure language, that is) with respect to asymptotic
complexity:

  Richard Bird, Geraint Jones and Oege de Moor.
  More Haste, Less Speed.
  http://web.comlab.ox.ac.uk/oucl/work/geraint.jones/morehaste.html


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Is lazyness make big difference?

2007-02-16 Thread apfelmus
Nick wrote:
    main = print primes
    primes  = 2:filter is_prime [3,5..]
    is_prime n = all (\p -> n `mod` p /= 0) (takeWhile (\p -> p*p <= n) primes)
 
 We can rewrite this in strict languages with lazy constructs. For
 example, in Scala (of course stream is not only lazily evaluated thing
 there)
 
def main(args: Array[String]): Unit = {
val n = Integer.parseInt(args(0))
System.out.println(primes(ints(2)) take n toList)
}
 
def primes(nums: Stream[Int]): Stream[Int] =
Stream.cons(nums.head,
    primes ((nums tail) filter (x => x % nums.head != 0)) )
 
def ints(n: Int): Stream[Int] =
Stream.cons(n, ints(n+1))

Aha, I finally recovered some of the examples from which the claim
Laziness is needed to achieve true compositionality stems.


The first is already present in your example above and also showed up
some time ago in the thread Optimisation fun. The point is that the
function 'all' used in

   is_prime n  = all (\p -> n `mod` p /= 0)
                     (takeWhile (\p -> p*p <= n) primes)

works only because we have lazy *Bool*eans. Your Scala version
accidentally (?) circumvents it by using a different algorithm, namely

   primes'  = sieve [2..]
   sieve (x:xs) = x : filter (\y -> y `mod` x /= 0) (sieve xs)

Thanks to laziness, 'all' stops as soon as one element does not fulfill
the condition. True compositionality allows us to define

   all p = foldr (&&) True . map p

and get the lazy behavior. You cannot reuse a strict (&&) in such a way.
Of course, given some support for lazy constructs, you could define a
lazy version of (&&) just as you define a lazy version of lists (called
Streams), but not having laziness as default means that you have to
think about whether your function is intended to be re-used (= you have
to provide lazy interface) or not *before* you write your function.


The second folklore example is lazy mergesort:

  mergesort []  = []
  mergesort xs  = foldtree1 merge $ map return xs

  foldtree1 f [x] = x
  foldtree1 f xs  = foldtree1 f $ pairs xs
 where
 pairs []= []
 pairs [x]   = [x]
 pairs (x:x':xs) = f x x' : pairs xs

  merge [] ys = ys
  merge xs [] = xs
  merge (x:xs) (y:ys) =
      if x <= y then x:merge xs (y:ys) else y:merge (x:xs) ys

The point about this 'mergesort' is that while it sorts a complete list
in O(N log N) time, it may return the minimum element in O(N) time
already. Thus, we can be bold and reuse 'mergesort' as in

  minimum = head . mergesort

and still get the desired O(N) asymptotic complexity.

Note: The function 'foldtree1' folds the elements of a list as if they
were in a binary tree:

  foldtree1 f [1,2,3,4,5,6,7,8]
 ==
  ((1 `f` 2) `f` (3 `f` 4)) `f` ((5 `f` 6) `f` (7 `f` 8))

The O(N) stuff works because 'foldtree1' constructs this expression in
O(N + N/2 + N/4 + N/8 + ..) = O(N) time. I'm not entirely sure, but I
think that the more common 'splitAt (length xs `div` 2)' and 'deal
(x:x':xs) = (x:..,x':..)' approaches both take O(N log N) time for the
same task. This makes them unusable for the point here. Besides, only
'foldtree1' can easily be transformed into a proof for dependent types,
but that's another story told by Conor McBride in 'Why dependent types
matter'.


There has been another example circulating on #haskell. I think it was
something with

  substrings = concatMap tails . inits

but I can't remember it now. Cale, can you help?

Anyway, the point is that with domain specific embedded languages, the
re-usability without time penalties is crucial. So far, only default
laziness can achieve this.

 I also think that the laziness in Haskell is already so implicit that
 90% of the Haskell code written so far will simply break irreparably
 if you experimentally remove it.
   
 Yes, I understand, that the present Haskell code heavily bases on laziness,
 but I'm going into the problem in general: how much I get,
 if I switch from default strictness to default laziness in my
 hypothetical language L? Or, from other side,
 how much I throw away in the reverse case?

Yes, what I meant with "laziness in Haskell is already so implicit" is
that the re-use I exemplified above happens subconsciously. So indeed,
it looks like - and only looks like - one could easily turn a lazy
language into a strict one. Isn't that the good thing about laziness
that nobody notices it in the code?


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: stateful walk through a tree?

2007-02-19 Thread apfelmus
David Tolpin wrote:
 I am looking for help in design of a stateful tree walker.

I think that you can use Data.Traversable to great effect. Related to
that are Control.Applicative and Data.Foldable. The papers that are
mentioned in the Haddocks explain what these modules do and what they
are useful for.
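
For example, a stateful walk that numbers the nodes of a Data.Tree tree
in traversal order, with mapAccumL threading the counter (example mine):

  import Data.Traversable (mapAccumL)
  import Data.Tree (Tree)

  -- pair every node with its index; the accumulator plays the role
  -- of the mutable counter, but stays purely functional
  number :: Tree a -> Tree (Int, a)
  number = snd . mapAccumL (\n x -> (n + 1, (n, x))) 0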

 How would I do that in Haskell? I'd like to avoid using mutable variables for 
 now
 (mostly for didactic puproses).

Well, Haskell doesn't have mutable variables as LISP or ML do. In the
end, avoiding mutable variables is more useful for non-didactic purposes :)

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

