Beginners Digest, Vol 27, Issue 36

beginners-request Wed, 15 Sep 2010 09:18:14 -0700

Send Beginners mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        [email protected]


You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  A question on seq (Daniel Fischer)
   2. Re:  Padding List with Zeros (Henry Olders)
   3.  Re: Padding List with Zeros (Ertugrul Soeylemez)
   4. Re:  Padding List with Zeros (Daniel Fischer)
   5. Re:  A question on seq (Klaus Gy)
   6.  try, seq, and IO (Jeroen van Maanen)


----------------------------------------------------------------------

Message: 1
Date: Wed, 15 Sep 2010 12:22:14 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] A question on seq
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain;  charset="iso-8859-1"

On Tuesday 14 September 2010 23:26:47, Klaus Gy wrote:
> Hi!
>
> Inspred from the discussion
> http://www.haskell.org/pipermail/beginners/2010-February/003396.html ,
> I just try to understand the seq function a bit better. When I compare
> these two functions:
>
> firstSum :: Num a => a -> [a] -> a
> firstSum s []     = s
> firstSum s (x:xs) = seq help (firstSum help xs)
>   where help      = x + s
>
> secondSum :: Num a => [a] -> a
> secondSum []      = 0
> secondSum (x:xs)  = seq help (x + help)
>   where help      = secondSum xs
>
> What should be the difference?

That depends on the types at which you use them.
For types like Int, Integer, Double, Float, Word, ..., evaluation to WHNF, 
what seq does, is complete evaluation.
So for these types, firstSum keeps a completely evaluated accumulation 
parameter, runs through the list and delivers the result when the end is 
reached. It corresponds closely to

int firstSum(int s, intList xs){
    if (empty(xs)) return s;
    s += xs.head;
    return firstSum(s,xs.tail);
}

where an intList has an int field `head', a pointer `tail' to its tail and 
empty(xs) would be (xs == NULL) in C, (xs == null) in Java.
Since that function is tail-recursive, it doesn't need to allocate new 
stack-frames and thus can run in constant (stack) space (if the Haskell 
version doesn't it's a bug, in C, you'd probably have to tell gcc to
-foptimize-sibling-calls, then it should do fine, in Java, I don't know of 
a VM that does tail-call optimisation - but then, I don't know much about 
Java).

So for these, firstSum is well behaved, pretty much the best you can get.

secondSum is different, the seq there says evaluate the sum of the tail and 
add that to x. Of course, for Int &c, to add x to the sum of the tail, the 
latter has to be evaluated anyway, so the seq is rather pointless.
secondSum is almost equivalent to

thirdSum :: Num a => [a] -> a
thirdSum = foldr (+) 0

and it more or less corresponds to the imperative version

int secondSum(intList xs){
    if (empty(xs)) return 0;
    int tailsum = secondSum(xs.tail);
    return (xs.head + tailsum);
}

This is not tail-recursive, so it needs O(length xs) stack and marches 
twice through the list, so to say, once to the end, building the chain of 
recursive calls and back to the front adding.

So for Int &c, firstSum is vastly superior in space behaviour (always uses 
constant stack space and if the list isn't held in memory by other 
references, also constant heap space).

Things become different when you work with lazy number types.
firstSum, being tail-recursive, can't deliver anything until it has reached 
the end of the list. Keeping the accumulator evaluated to WHNF doesn't mean 
much for lazy types, so the accumulator may well build up huge thunks (but, 
for lazy types, evaluating a huge thunk can still run in small stack space, 
so that's not necessarily catastrophic).

secondSum, on the other hand, can start delivering before the list has been 
completely traversed (depends on the behaviour of (+)).
But if it can, it can probably do even better if you don't seq on the sum 
of the tail, so for those types, thirdSum would be the better option.

Example for lazy number type:

data Peano
    = Zero
    | Succ Peano
      deriving (Eq, Show)

instance Num Peano where
    Zero + n = n
    (Succ m) + n = Succ (m + n)
    -- other methods
    fromInteger n
        | n <= 0    = Zero
        | otherwise = Succ (fromInteger (n-1))

instance Ord Peano where
    compare Zero Zero = EQ
    compare Zero _      = LT
    compare _     Zero = GT
    compare (Succ m) (Succ n) = compare m n


Now try

list :: [Peano]
list = 4:replicate (10^5) 0

thirdSum list > 3
secondSum list > 3
firstSum list > 3

and then increase the length of the list (10^6, 10^7 instead of 10^5).

> In my opinion both functions do not
> return a complete unevaluated thunk (secondSum returns a thunk of the
> form (THUNK + b) where THUNK contains a single numeral value). But it
> seems to me that the first function computes the output somehow linear
> in the sense that it does just a set of substitutions while the second
> functions has to create a tree to handle all the recursive calls of
> seq (sorry, my terminology is for sure a bit poor).

Well, it's a flat tree, it's

secondSum (x:xs) =
    case secondSum xs of
      s -> x+s

where the `case' is supposed to have core semantics, i.e. evaluates the 
expression to WHNF.

So

secondSum [1,2,3]
~> case secondSum [2,3] of
        s1 -> 1 + s1
~> case case secondSum [3] of { s2 -> 2 + s2 } of
        s1 -> 1 + s1
~> case case case secondSum [] of { s3 -> 3 + s3 } of { s2 -> 2 + s2 } of
        s1 -> 1 + s1
~> case case case 0 of { s3 -> 3 + s3 } of { s2 -> 2 + s2} of
        s1 -> 1 + s1
~> case case 3 of { s2 -> 2 + s2 } of
        s1 -> 1 + s1
~> case 5 of
        s1 -> 1 + 5
~> 6

> So I would say the
> first function delivers a better performance. (In the discussion I
> mentioned, the second function was not discussed in this form with the
> seq implementation). Am I right with my thoughts?

Pretty much.

>
> Thanks, fweth



------------------------------

Message: 2
Date: Wed, 15 Sep 2010 09:15:49 -0400
From: Henry Olders <[email protected]>
Subject: Re: [Haskell-beginners] Padding List with Zeros
To: Lorenzo Isella <[email protected]>
Cc: "[email protected]" <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii


On 2010-09-14, at 19:35 , Lorenzo Isella wrote:

> Dear All,
> I still have to find my way with immutable lists and list comprehension.
> Consider the following lists
> 
> A=[0,10,20,30,40,50]
> B=[0,10,50] (i.e. B is a subset of list A; list A is already ordered in 
> increasing order and so is B).
> C=[2,1,-5] i.e. there is a corresponding element in C for every element 
> in B.
> 
> Now, I would like to define a new list D having length equal to the 
> length of A. The elements of D in the position of the elements of A in 
> common with B are equal to the corresponding entries in C, whereas the 
> other ones are zero i.e.
> D=[2,1,0,0,0,-5]. How can I achieve that? The first thought that comes 
> to my mind is to define a list of zeros which I would modify according 
> to my needs, but that is not allowed...
> Many thanks
> 
> Lorenzo

Being a real Haskell newby, I can figure out a one-line solution in Python, but 
I don't know how to do something similar in Haskell, or even if it's possible. 
Please correct me if I'm wrong, but there does not seem to be a dictionary type 
in Haskell, and I am not aware of how to specify an inline if...else inside a 
list comprehension. I would really appreciate it if someone could show me how 
to do something similar to this Python statement in Haskell.

>>> A=[0,10,20,30,40,50]
>>> B=[0,10,50]
>>> C=[2,1,-5]
>>> [dict(zip(B,C))[a] if a in B else 0 for a in A]
[2, 1, 0, 0, 0, -5]

Henry


------------------------------

Message: 3
Date: Wed, 15 Sep 2010 16:09:21 +0200
From: Ertugrul Soeylemez <[email protected]>
Subject: [Haskell-beginners] Re: Padding List with Zeros
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=US-ASCII

Henry Olders <[email protected]> wrote:

> Being a real Haskell newby, I can figure out a one-line solution in
> Python, but I don't know how to do something similar in Haskell, or
> even if it's possible. Please correct me if I'm wrong, but there does
> not seem to be a dictionary type in Haskell, and I am not aware of how
> to specify an inline if...else inside a list comprehension. I would
> really appreciate it if someone could show me how to do something
> similar to this Python statement in Haskell.
>
> >>> A=[0,10,20,30,40,50]
> >>> B=[0,10,50]
> >>> C=[2,1,-5]
> >>> [dict(zip(B,C))[a] if a in B else 0 for a in A]
> [2, 1, 0, 0, 0, -5]

There is a dictionary type, but it has a more general name than in
Python, which is quite common for Haskell data structures.  It's called
'Map' and is defined in the Data.Map module.  In general you want to
import that module qualified to avoid name clashes:

  import qualified Data.Map as M

Then you can use my solution from my other post, which is exactly
equivalent to your Python solution:

  substSubset :: (Num b, Ord a) => [a] -> [b] -> [a] -> [b]
  substSubset ys zs =
    map (\x -> M.findWithDefault 0 x (M.fromList $ zip ys zs))

I prefer to use higher order functions rather than list comprehensions,
because of better composability.  But you can just as well use a list
comprehension here:

  substSubset :: (Num b, Ord a) => [a] -> [b] -> [a] -> [b]
  substSubset ys zs xs =
    [ M.findWithDefault 0 x (M.fromList $ zip ys zs) | x <- xs ]

The findWithDefault function from Data.Map does this comparison
implicitly, but you can also do this instead:

  substSubset :: (Num b, Ord a) => [a] -> [b] -> [a] -> [b]
  substSubset ys zs xs =
    let dict = M.fromList (zip ys zs) in
    [ case M.lookup x dict of Just z -> z; Nothing -> 0 | x <- xs ]

If course a more direct translation is also possible, but you really
don't want that, because pattern matching is not only more elegant, but
also faster and doesn't require an Eq constraint.  Also note that your
Python code does two lookups per element, while mine does only one.

Anyway, it's really better to learn to use higher order functions
properly.  This greatly enhances both composability and code
readability.


Greets,
Ertugrul


-- 
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/




------------------------------

Message: 4
Date: Wed, 15 Sep 2010 16:23:49 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Padding List with Zeros
To: [email protected]
Cc: Lorenzo Isella <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain;  charset="iso-8859-1"

On Wednesday 15 September 2010 15:15:49, Henry Olders wrote:
> On 2010-09-14, at 19:35 , Lorenzo Isella wrote:
> > Dear All,
> > I still have to find my way with immutable lists and list
> > comprehension. Consider the following lists
> >
> > A=[0,10,20,30,40,50]
> > B=[0,10,50] (i.e. B is a subset of list A; list A is already ordered
> > in increasing order and so is B).
> > C=[2,1,-5] i.e. there is a corresponding element in C for every
> > element in B.
> >
> > Now, I would like to define a new list D having length equal to the
> > length of A. The elements of D in the position of the elements of A in
> > common with B are equal to the corresponding entries in C, whereas the
> > other ones are zero i.e.
> > D=[2,1,0,0,0,-5]. How can I achieve that? The first thought that comes
> > to my mind is to define a list of zeros which I would modify according
> > to my needs, but that is not allowed...
> > Many thanks
> >
> > Lorenzo
>
> Being a real Haskell newby, I can figure out a one-line solution in
> Python, but I don't know how to do something similar in Haskell, or even
> if it's possible. Please correct me if I'm wrong, but there does not
> seem to be a dictionary type in Haskell, and I am not aware of how to
> specify an inline if...else inside a list comprehension. I would really
> appreciate it if someone could show me how to do something similar to
> this Python statement in Haskell.
>

import Data.Maybe

> >>> A=[0,10,20,30,40,50]
> >>> B=[0,10,50]
> >>> C=[2,1,-5]

These have to be lowercase in Haskell, of course :)

> >>> [dict(zip(B,C))[a] if a in B else 0 for a in A]

map (fromMaybe 0 . (`lookup` zip b c)) a

or, as a list comprehension,

[fromMaybe 0 (lookup v dic) | let dic = zip b c, v <- a]

Slightly more verbose than the Python.

But this doesn't deal with multiple entries (istr that was mentioned 
previously in this thread), for

a = [0, 10, 10, 20, 30 , 40, 50]
b = [0, 10, 10, 50]
c = [2, 1, 3, -5]

neither would produce

[2, 1, 3, 0, 0, 0, -5]

which I believe would be the desired behaviour.

>
> [2, 1, 0, 0, 0, -5]
>
> Henry



------------------------------

Message: 5
Date: Wed, 15 Sep 2010 16:55:16 +0200
From: Klaus Gy <[email protected]>
Subject: Re: [Haskell-beginners] A question on seq
To: Daniel Fischer <[email protected]>
Cc: [email protected]
Message-ID:
        <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1

Thank You both very much for the quick and helpful answers! The
mailing list seems to be really supportive for learning Haskell (:

2010/9/15, Daniel Fischer <[email protected]>:
> On Tuesday 14 September 2010 23:26:47, Klaus Gy wrote:
>> Hi!
>>
>> Inspred from the discussion
>> http://www.haskell.org/pipermail/beginners/2010-February/003396.html ,
>> I just try to understand the seq function a bit better. When I compare
>> these two functions:
>>
>> firstSum :: Num a => a -> [a] -> a
>> firstSum s []     = s
>> firstSum s (x:xs) = seq help (firstSum help xs)
>>   where help      = x + s
>>
>> secondSum :: Num a => [a] -> a
>> secondSum []      = 0
>> secondSum (x:xs)  = seq help (x + help)
>>   where help      = secondSum xs
>>
>> What should be the difference?
>
> That depends on the types at which you use them.
> For types like Int, Integer, Double, Float, Word, ..., evaluation to WHNF,
> what seq does, is complete evaluation.
> So for these types, firstSum keeps a completely evaluated accumulation
> parameter, runs through the list and delivers the result when the end is
> reached. It corresponds closely to
>
> int firstSum(int s, intList xs){
>     if (empty(xs)) return s;
>     s += xs.head;
>     return firstSum(s,xs.tail);
> }
>
> where an intList has an int field `head', a pointer `tail' to its tail and
> empty(xs) would be (xs == NULL) in C, (xs == null) in Java.
> Since that function is tail-recursive, it doesn't need to allocate new
> stack-frames and thus can run in constant (stack) space (if the Haskell
> version doesn't it's a bug, in C, you'd probably have to tell gcc to
> -foptimize-sibling-calls, then it should do fine, in Java, I don't know of
> a VM that does tail-call optimisation - but then, I don't know much about
> Java).
>
> So for these, firstSum is well behaved, pretty much the best you can get.
>
> secondSum is different, the seq there says evaluate the sum of the tail and
> add that to x. Of course, for Int &c, to add x to the sum of the tail, the
> latter has to be evaluated anyway, so the seq is rather pointless.
> secondSum is almost equivalent to
>
> thirdSum :: Num a => [a] -> a
> thirdSum = foldr (+) 0
>
> and it more or less corresponds to the imperative version
>
> int secondSum(intList xs){
>     if (empty(xs)) return 0;
>     int tailsum = secondSum(xs.tail);
>     return (xs.head + tailsum);
> }
>
> This is not tail-recursive, so it needs O(length xs) stack and marches
> twice through the list, so to say, once to the end, building the chain of
> recursive calls and back to the front adding.
>
> So for Int &c, firstSum is vastly superior in space behaviour (always uses
> constant stack space and if the list isn't held in memory by other
> references, also constant heap space).
>
> Things become different when you work with lazy number types.
> firstSum, being tail-recursive, can't deliver anything until it has reached
> the end of the list. Keeping the accumulator evaluated to WHNF doesn't mean
> much for lazy types, so the accumulator may well build up huge thunks (but,
> for lazy types, evaluating a huge thunk can still run in small stack space,
> so that's not necessarily catastrophic).
>
> secondSum, on the other hand, can start delivering before the list has been
> completely traversed (depends on the behaviour of (+)).
> But if it can, it can probably do even better if you don't seq on the sum
> of the tail, so for those types, thirdSum would be the better option.
>
> Example for lazy number type:
>
> data Peano
>     = Zero
>     | Succ Peano
>       deriving (Eq, Show)
>
> instance Num Peano where
>     Zero + n = n
>     (Succ m) + n = Succ (m + n)
>     -- other methods
>     fromInteger n
>         | n <= 0    = Zero
>         | otherwise = Succ (fromInteger (n-1))
>
> instance Ord Peano where
>     compare Zero Zero = EQ
>     compare Zero _      = LT
>     compare _     Zero = GT
>     compare (Succ m) (Succ n) = compare m n
>
>
> Now try
>
> list :: [Peano]
> list = 4:replicate (10^5) 0
>
> thirdSum list > 3
> secondSum list > 3
> firstSum list > 3
>
> and then increase the length of the list (10^6, 10^7 instead of 10^5).
>
>> In my opinion both functions do not
>> return a complete unevaluated thunk (secondSum returns a thunk of the
>> form (THUNK + b) where THUNK contains a single numeral value). But it
>> seems to me that the first function computes the output somehow linear
>> in the sense that it does just a set of substitutions while the second
>> functions has to create a tree to handle all the recursive calls of
>> seq (sorry, my terminology is for sure a bit poor).
>
> Well, it's a flat tree, it's
>
> secondSum (x:xs) =
>     case secondSum xs of
>       s -> x+s
>
> where the `case' is supposed to have core semantics, i.e. evaluates the
> expression to WHNF.
>
> So
>
> secondSum [1,2,3]
> ~> case secondSum [2,3] of
>         s1 -> 1 + s1
> ~> case case secondSum [3] of { s2 -> 2 + s2 } of
>         s1 -> 1 + s1
> ~> case case case secondSum [] of { s3 -> 3 + s3 } of { s2 -> 2 + s2 } of
>         s1 -> 1 + s1
> ~> case case case 0 of { s3 -> 3 + s3 } of { s2 -> 2 + s2} of
>         s1 -> 1 + s1
> ~> case case 3 of { s2 -> 2 + s2 } of
>         s1 -> 1 + s1
> ~> case 5 of
>         s1 -> 1 + 5
> ~> 6
>
>> So I would say the
>> first function delivers a better performance. (In the discussion I
>> mentioned, the second function was not discussed in this form with the
>> seq implementation). Am I right with my thoughts?
>
> Pretty much.
>
>>
>> Thanks, fweth
>
>


------------------------------

Message: 6
Date: Wed, 15 Sep 2010 18:17:57 +0200
From: Jeroen van Maanen <[email protected]>
Subject: [Haskell-beginners] try, seq, and IO
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1

The past year I have been working on a port of my machine learning project 
named LExAu from Java to Haskell. I'm still very glad I took the jump, because 
the complexity curve appears to be log shaped rather than exp shaped. In one 
year I almost got to the functionality that had taken me five years to produce 
in Java (of course it helped a lot that I had a working prototype this time).

There is one thing that still bothers me though: when I write seq or $! it 
doesn't seem to have any effect!

Currently I am trying to add some exception handling to help me debug the 
system, but the code that I managed to produce depends on the logging statement 
to produce the desired result. :-( It looks like this, and only works when I 
uncomment the line '-- logger "Check sum": [...]', otherwise the exception is 
caught by the try around the body of the thread that this code runs in:

         do logger "Received update" [showString label, logs update]
            result <-
              try $!
                do maybeUpdatedModel <- return $ f update startModel
                   theCheckSum <- return $ liftM checkSum maybeUpdatedModel
--                   logger "Check sum" [showString label, shows theCheckSum]
                   return $! seq theCheckSum maybeUpdatedModel
            maybeNextModel <-
              case result of
                Right theMaybeNextModel -> return theMaybeNextModel
                Left exception ->
                  do let exc :: SomeException
                         exc = exception
                     logger "Exception" [showString label, shows exception]
                     return Nothing
            logger "Maybe next model" [showString label, logs maybeNextModel]

For more context see:

  
http://lexau.svn.sourceforge.net/viewvc/lexau/branches/totem/src/LExAu/Pipeline/Concurrent.hs?revision=326&view=markup

after line 241.

Can someone explain why a few showStrings a shows and a putStrLn are more 
effective in forcing the check sum to be computed (which necessarily evaluates 
the complete updated model and reveals the lurking exception) than the seq on 
the line just below the logging statement?

Cheers,  Jeroen



------------------------------

_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners


End of Beginners Digest, Vol 27, Issue 36
*****************************************

Beginners Digest, Vol 27, Issue 36

Reply via email to