Beginners Digest, Vol 9, Issue 12

beginners-request Wed, 11 Mar 2009 03:06:30 -0700

Send Beginners mailing list submissions to
        beginners@haskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        beginners-requ...@haskell.org


You can reach the person managing the list at
        beginners-ow...@haskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1.  Re: Integer factorization (Christian Maeder)
   2.  Re: folds -- help! (Heinrich Apfelmus)
   3.  Re: Integer factorization (Heinrich Apfelmus)
   4.  Re: folds -- help! (7stud)
   5. Re:  folds -- help! (John Dorsey)
   6. Re:  Simple data summarization (Andy Elvey)
   7. Re:  Re: Integer factorization (Francesco Bochicchio)


----------------------------------------------------------------------

Message: 1
Date: Tue, 10 Mar 2009 18:24:31 +0100
From: Christian Maeder <christian.mae...@dfki.de>
Subject: [Haskell-beginners] Re: Integer factorization
To: "Sergey V. Mikhanov" <ser...@mikhanov.com>
Cc: beginners <beginners@haskell.org>
Message-ID: <49b6a24f.2070...@dfki.de>
Content-Type: text/plain; charset=ISO-8859-1

On
http://www.haskell.org/haskellwiki/Prime_numbers

"primeFactors" should do what you want (although I don't like the
the pattern matching on "1")

Cheers Christian

Sergey V. Mikhanov wrote:
>    Hello,
> 
> I am solving a problem of finding the largest prime factor of
> 600851475143 (Project Euler is great for learning new language!), and
> came with the following solution (it uses the most ineffective
> approach to finding prime numbers, however is able to factor the above
> number in fraction of second):
> 
> factors :: Integer -> [Integer]
> 
> factors n = doFactors n (eratosthenesFilter [1..n])
> 
> doFactors n primes
>     | null newPrimes = []
>     | otherwise      =
>         let topPrime = head newPrimes in
>         if topPrime == n
>             then [topPrime]
>             else topPrime : (doFactors (n `quot` topPrime) primes)
>     where
>         newPrimes = snd $ break (\x -> (n `rem` x) == 0) primes
> 
> eratosthenesFilter :: [Integer] -> [Integer]
> 
> eratosthenesFilter [] = []
> eratosthenesFilter (num : nums)
>     | num == 1  = eratosthenesFilter nums
>     | otherwise = num : (eratosthenesFilter $ doFilter num nums)
>     where
>         doFilter num nums = filter (\x -> x > num && (x `rem` num) /= 0) nums
> 
> What would you do different (including stylistic changes)? What are
> the general comments about efficiency (not of the algorithm, but of
> the implementation: for example, is it fine to use break at every
> invocation of doFactors?) and elegance of the solution?
> 
> Thanks and regards,
> Sergey


------------------------------

Message: 2
Date: Tue, 10 Mar 2009 19:31:50 +0100
From: Heinrich Apfelmus <apfel...@quantentunnel.de>
Subject: [Haskell-beginners] Re: folds -- help!
To: beginners@haskell.org
Message-ID: <gp6bkk$r1...@ger.gmane.org>
Content-Type: text/plain; charset=UTF-8

7stud wrote:
> This is an example that shows how foldl and foldr work (from RWH p.93-94):
> 
> foldl (+) 0 (1:2:3:[])
>    == foldl (+) (0 + 1)             (2:3:[])
>    == foldl (+) ((0 + 1) + 2)       (3:[])
>    == foldl (+) (((0 + 1) + 2) + 3) []
>    ==           (((0 + 1) + 2) + 3)
> 
> 
> foldr (+) 0 (1:2:3:[])
>    ==  1 +           foldr (+) 0 (2:3:[])
>    ==  1 + (2 +      foldr (+) 0 (3:[])
>    ==  1 + (2 + (3 + foldr (+) 0 []))
>    ==  1 + (2 + (3 + 0))
> 
> The book says on p.94:
> 
> -----
> The difference between foldl and foldr should be clear from looking at where
> the parentheses and the empty list elements show up.  With foldl, the empty
> list element is on the left, and all the parentheses group to the left.
> With foldr, the zero value is on the right, and the parentheses group to the
> right.
> ----
> 
> Huh?  With foldl, the only empty list element I see is on the right.

A fold like  foldr f z  is best understood as a function that replaces
each  (:)  with  f  and each  []  with  z . See also the diagrams on

  http://en.wikipedia.org/wiki/Fold_(higher-order_function)

>From this point of view, z "corresponds to the empty list".


> Initially, it looked to me like they did the same thing, and that the only 
> difference was the way they called step.

They are only the same when the operation  f  is associative, i.e. if it
satisfies

  f x (f y z) = f (f x y) z


> But then RWH explains that you would never use foldl in practice because it
> thunks the result, which for large lists can overwhelm the maximum memory
> alloted for a thunk.  But it appears to me the same thunk problem would 
> occur with foldr.  So why is foldr used in practice but not foldl?

See also

  http://en.wikibooks.org/wiki/Haskell/Performance_Introduction#Space


Regards,
apfelmus

--
http://apfelmus.nfshost.com



------------------------------

Message: 3
Date: Tue, 10 Mar 2009 19:50:11 +0100
From: Heinrich Apfelmus <apfel...@quantentunnel.de>
Subject: [Haskell-beginners] Re: Integer factorization
To: beginners@haskell.org
Message-ID: <gp6cn0$uv...@ger.gmane.org>
Content-Type: text/plain; charset=ISO-8859-1

Sergey V. Mikhanov wrote:
> 
> factors :: Integer -> [Integer]
> factors n = doFactors n (eratosthenesFilter [1..n])
> 
> doFactors n primes
>     | null newPrimes = []
>     | otherwise      =
>         let topPrime = head newPrimes in
>         if topPrime == n
>             then [topPrime]
>             else topPrime : (doFactors (n `quot` topPrime) primes)
>     where
>         newPrimes = snd $ break (\x -> (n `rem` x) == 0) primes
> 
> eratosthenesFilter :: [Integer] -> [Integer]
> eratosthenesFilter [] = []
> eratosthenesFilter (num : nums)
>     | num == 1  = eratosthenesFilter nums
>     | otherwise = num : (eratosthenesFilter $ doFilter num nums)
>     where
>         doFilter num nums = filter (\x -> x > num && (x `rem` num) /= 0) nums
> 
> What would you do different (including stylistic changes)? What are
> the general comments about efficiency (not of the algorithm, but of
> the implementation: for example, is it fine to use break at every
> invocation of doFactors?) and elegance of the solution?

Stylistically, one usually uses shorter variable names in Haskell. Also,
the guards in  doFactors  are better expressed as pattern matching and
the if can be turned into guards.

    factors :: Integer -> [Integer]
    factors n = go n $ eratosthenes [2..n]
        where
        go n []              = []
        go n (p:ps)
           | n `mod` p == 0  = p : go (n `div` p) ps
           | otherwise       = go n ps


    eratosthenes :: [Integer] -> [Integer]
    eratosthenes []     = []
    eratosthenes (p:ps) = p : erathostenes ps'
       where
       ps' = filter (\x -> x > p && (x `mod` p) /= 0) ps


Other than that, efficiency is best understood as algorithmic
efficiency; there are not straightforward "tweaks" that give you the
warm fuzzy feeling of "writing efficient code".


Regards,
apfelmus

--
http://apfelmus.nfshost.com



------------------------------

Message: 4
Date: Tue, 10 Mar 2009 19:48:52 +0000 (UTC)
From: 7stud <bbxx789_0...@yahoo.com>
Subject: [Haskell-beginners] Re: folds -- help!
To: beginners@haskell.org
Message-ID: <loom.20090310t191633-...@post.gmane.org>
Content-Type: text/plain; charset=utf-8

Kurt Hutchinson <kelanslists <at> gmail.com> writes:
> You're thinking of a slightly different 'empty' here. You're thinking
> of what happens when you reach the end of the list, after folding it
> all, and there are no more elements to fold. In this example, you're
> right that the acc parameter is 6. But what if the list you *first
> gave to foldr* was empty? Then it would evaluate to 0, the initial
> seed value.
> 

Ok.

>> if f can start delivering the result without looking at its second argument,
>> you can start consuming the result before the fold has traversed the whole
>> list.
> >>
> >
> > Ok, that isn't clearly illustrated by the example in the book:
> >
> >> > foldl (+) 0 (1:2:3:[])
> >> > Â  Â == foldl (+) (0 + 1) Â  Â  Â  Â  Â  Â  (2:3:[])
> >> > Â  Â == foldl (+) ((0 + 1) + 2) Â  Â  Â  (3:[])
> >> > Â  Â == foldl (+) (((0 + 1) + 2) + 3) []
> >> > Â  Â == Â  Â  Â  Â  Â  (((0 + 1) + 2) + 3)
> >> >
> >> >
> >> > foldr (+) 0 (1:2:3:[])
> >> > Â  Â == Â 1 + Â  Â  Â  Â  Â  foldr (+) 0 (2:3:[])
> >> > Â  Â == Â 1 + (2 + Â  Â  Â foldr (+) 0 (3:[])
> >> > Â  Â == Â 1 + (2 + (3 + foldr (+) 0 []))
> >> > Â  Â == Â 1 + (2 + (3 + 0))
> >> >
> >
> > In that example, it doesn't look like anything in foldr can be evaluated
> > until the whole fold has been completed.
> 
> You're right, that example doesn't show how you could start using the
> result without fully evaluating the fold, since addition doesn't give
> partial results. The concat example is better in that regard.
> 

Ok. I guess I did understand something.  Therefore, I think 
the example in the book that uses addition with foldl and foldr is TERRIBLE.  
I think the sections on folds in RWH are a catastrophe and 
need to be rewritten.  The authors need to get rid of the "bit twiddling" 
example and provide an example like concat to clearly show the 
difference between foldl and foldr.  I think the example using foldl 
and foldr with addition should be a secondary example to demonstrate 
that sometimes it doesn't matter whether you use foldl or foldr.


> >> Common examples are things like
> >>
> >> concat = foldr (++) [],
> >> so
> >> concat [l1,l2,l3,l4,l5] = l1 ++ (foldr (++) [] [l2,l3,l4,l5])
> >> and the start (l1) can be used before further reducing the fold,
> >>
> >
> > So does haskell store a thunk for everything to the right of l1?
> > You said that when using foldr you can start "consuming" the beginning of 
> > the before the whole result is reduced. Â I don't quite get that.
> 
> A thunk is used as a stand-in for most calculations before the result
> is actually calculated. That way, if you never try to use the result,
> the calculation never needs to be done, and that means less work. As
> an example that ties to the concat example above, say your program
> only wanted to test if the result of the concat fold was an empty
> list. The function 'null' takes a list and evaluates to True or False,
> based on whether the list is empty or not. So:
> 
> someFunc xs = null ( concat xs )
>   where
>   concat ys = foldr (++) [] ys
> 
> The 'null' function only needs to test whether the list that is the
> result of foldConcat has at least one element. Let's say l1 has an
> element. So it's kind of evaluated like this:
>   someFunc [ l1, l2, l3, l4, l5 ]
>   null (concat [ l1, l2, l3, l4, l5 ] )
>   null ( l1 ++ ( thunk with rest of fold ))
>   False
> 
> The rest of the fold doesn't need to be evaluated, since the beginning
> part is enough for 'null' to tell that the result would have at least
> one element (because l1 does). That's one way foldr can be used to
> start consuming the result before the entire fold is done. 
>

Ok.  The terminology is just a little confusing for me. When you say
"start consuming before the fold is done" it sounds to me like
somehow you are reading partial results off the front of the result
before foldr returns the entire result.  But as you explained above
what you actually mean is that foldr will return the entire result--
with part of the result being a thunk, and subsequently you can do 
operations on the result that may never require the thunked part to 
be evaluated.


> It depends
> completely on the accFunc: if it can return partial results, like
> concat, then you can start consuming the result before a full
> evaluation. Some accFunc's can't return partial results, like regular
> addition. In that case, it's probably better to use foldl' (note the
> apostrophe), which will force the thunks to be evaluated as they are
> generated and so use less memory. foldl is the same as foldl' except
> [foldl] does generate thunks, and then evaluates them all at the end of the
> fold, so it uses a bunch of memory to store the thunks in the
> meantime, which usually isn't useful.
> 
> Kurt
> 

Thanks to you and Daniel for the great explanations.  I'm feeling a lot
better about folds now.


....
....
......
....
...
...
....
....
....
....
...
...
...
....



------------------------------

Message: 5
Date: Tue, 10 Mar 2009 16:54:22 -0400
From: John Dorsey <hask...@colquitt.org>
Subject: Re: [Haskell-beginners] folds -- help!
To: beginners@haskell.org
Message-ID: <20090310205422.ga1...@colquitt.org>
Content-Type: text/plain; charset=us-ascii

Adrian Neumann wrote:

> Notice that there is no difference between
> 
> foldr g a
> foldl f a
> 
> (for appropriate g and f) if g and f are strict in both arguments.  

Be careful... as apfelmus noted elsewhere in this thread, that's not (in
general) true.

Prelude> foldr (^) 2 [3,5]
847288609443
Prelude> foldl (^) 2 [3,5]
32768

The reason?  Integer exponentiation (^) isn't associative and
commutative.  So the first is (3 ^ (5^2)) = 3^25, while the second is
((2 ^ 3) ^ 5) = 2^15.

Cheers,
John



------------------------------

Message: 6
Date: Wed, 11 Mar 2009 18:17:42 +1300
From: Andy Elvey <andy.el...@paradise.net.nz>
Subject: Re: [Haskell-beginners] Simple data summarization
To: Patrick LeBoutillier <patrick.leboutill...@gmail.com>
Cc: beginners@haskell.org
Message-ID: <49b74976.1010...@paradise.net.nz>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi Patrick! 

Thanks very much - that's great!  
Many thanks also to Roland and Thomas - your solutions are great too! 

Although I'm still new to Haskell, I love its power and elegance.  It 
does have a bit of a learning curve, mainly because its *so* powerful 
that its a bit like trying to fly a jumbo jet..... ;)  

Only one more question. If I wanted to do a crosstab (say, with 
ethnicity down the left-hand side, and gender across the top), how could 
that be done?  
In other words, the output would look like this -

                         F      M 
NZ European    xxx   xxx
NZ Maori          xxx   xxx 

- where the x's are the totals for each category (NZ European F), (NZ 
European M), (NZ Maori, F), NZ Maori, M). 
I think it would involve zipWith and Array, but beyond that, I find it 
hard to think this through in two dimensions.... :) 
Crosstab code would be *great* to play around with!

Thanks again - bye for now -
 - Andy


Patrick LeBoutillier wrote:
> Andy,
>
> I came up with this solution that works like you described:
>
>
> import Data.List.Split
>
> mysplit = wordsBy (==',')
>
> toPairs :: [String] -> [(String, [String])]
> toPairs (header:rows) = foldr f (initPairs header) $ splitRows rows
>     where f row acc = zipWith (\f (h,r) -> (h,f:r)) row acc
>           initPairs header = map (\h -> (h, [])) $ mysplit header
>           splitRows rows = map (mysplit) rows
>
> summarizeByWith :: String -> (Int -> Int -> Int) -> [(String,
> [String])] -> (String, Int)
> summarizeByWith var agg pairs = case (lookup var pairs) of
>     Just vals -> (var, foldl agg 0 $ map (read) vals)
>     otherwise -> ("", 0)
>
> main = interact (show . summarizeByWith "Books" (+) . toPairs . lines)
>
>
> However in my opinion a solution like that proposed by Roland is
> preferable since it can process the input line by line instead of
> storing it all in memory. It seems also simpler and propably more
> efficient.
>
> However it was interesting hacking at your algorithm because it made
> me realize how you can use lists of pairs (association lists) in
> haskell where you might have used hash tables in another language.
>
>
> Cheers,
>
> Patrick
>
>
>
> On Tue, Mar 10, 2009 at 4:33 AM, Andy Elvey <andy.el...@paradise.net.nz> 
> wrote:
>   
>> Hi all -
>> In the process of learning Haskell I'm wanting to do some simple data
>> summarization.
>> ( Btw, I'm looking at putting any submitted code for this in the "cookbook"
>> section of
>> the Haskell wiki.  Imo it would be very useful there as a "next step" up
>> from just reading
>> in a file and printing it out.  )
>> This would involve reading in a delimited file like this - ( just a
>> contrived example of how many books
>> some people own ) -
>>
>> Name,Gender,Age,Ethnicity,Books
>> Mary,F,14,NZ European, 11
>> Brian,M,13,NZ European, 6
>> Josh,M,12,NZ European, 14
>> Regan,M,14,NZ Maori, 9
>> Helen,F,15,NZ Maori, 17
>> Anna,F,14,NZ European, 16
>> Jess,F,14,NZ Maori, 21
>>
>> .... and doing some operations on it. As you can see, the file has column
>> headings - I prefer to be able to manipulate data with
>> headings (as it is what I do a lot of at work, using another programming
>> language).
>>
>> I've tried to break the problem down into small parts as follows. a) Read
>> the file into a list of pairs.
>> The first element of the pair would be the column heading.
>> The second will be a list containing the data.
>> For example, ("Name",  [Mary,  Brian,  Josh,  Regan, ..... ]  )
>> b) Select a numeric variable to summarise ( "Books" in this example) c) Do a
>> fold to summarize the variable. I think a left-fold would be the one to use
>> here, but I may
>> be wrong....
>>
>> After looking through previous postings on this list, I found some code
>> which is somewhat similar to what I'm after (although the data it was
>> crunching is very different).  This is what I've come up with so far -
>>
>> summarize [] = []
>> summarize ls = let
>>       byvariable = head ls
>>       numeric_variable = last ls
>>       sum = foldl (+) 0 $ numeric_variable
>>
>>   in (byvariable, sum) : sum ls
>>
>> main = interact (unlines . map show . summarize . lines)
>> I think this might be a useful start, but I still need to read the data into
>> a list of pairs as mentioned, and I'm unsure as to how to
>> do that.
>> Many thanks in advance for any help received.  As mentioned, I'm sure that
>> examples like this could be very useful to other beginners, so I'm keen to
>> make sure that any help given is made maximum use of (by putting any code on
>> the Haskell wiki). - Andy
>>
>> _______________________________________________
>> Beginners mailing list
>> Beginners@haskell.org
>> http://www.haskell.org/mailman/listinfo/beginners
>>
>>     
>
>
>
>   



------------------------------

Message: 7
Date: Wed, 11 Mar 2009 11:06:22 +0100
From: Francesco Bochicchio <bieff...@gmail.com>
Subject: Re: [Haskell-beginners] Re: Integer factorization
To: Heinrich Apfelmus <apfel...@quantentunnel.de>
Cc: beginners@haskell.org
Message-ID:
        <a6e7dd140903110306u60e6ed8fud4255b189d542...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

2009/3/10 Heinrich Apfelmus <apfel...@quantentunnel.de>

> Sergey V. Mikhanov wrote:
> >
>

... some code ...


>
> >
> > What would you do different (including stylistic changes)? What are
> > the general comments about efficiency (not of the algorithm, but of
> > the implementation: for example, is it fine to use break at every
> > invocation of doFactors?) and elegance of the solution?
>
> Stylistically, one usually uses shorter variable names in Haskell.


<beginner rant>
Sometime too short peraphs?  At least, this is one of the things that slows
down my understanding of code posted
on this list on or on various haskell tutorial. In any other language I
know, programmers learn to give meaningful names to variable and functions,
so when one reads a program, one can use the name to remember what the
function does. Then one cames to haskell ...
I guess the short names comes from mathematic background, but  still ...
haskell is already very succint - even more so
when you use pointfree programming - and if one also uses names like a,e,i (
look at Array function definitions ), ...
</beginner rant>

Rant apart, I notice that in my own excercises I tend to shorten names, so
maybe there is a reason for that.
Nevertheless readability tends to be a big issue in languages used in IT
industry, and my feeling is that haskell
tends to err on the laconic side of the balance.


Out of curiosity, there is any reason why you called the auxiliary function
'go' ?

Ciao
-------
FB
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
http://www.haskell.org/pipermail/beginners/attachments/20090311/a2dae0f4/attachment.htm

------------------------------

_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners


End of Beginners Digest, Vol 9, Issue 12
****************************************

Beginners Digest, Vol 9, Issue 12

Reply via email to