Beginners Digest, Vol 27, Issue 25

beginners-request Sat, 11 Sep 2010 09:28:31 -0700

Send Beginners mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."

Today's Topics:

   1. Re:  Simple Math Function (Daniel Fischer)
   2. Re:  Again on list manipulations (Henry Olders)
   3. Re:  Again on list manipulations (Alex Rozenshteyn)
   4. Re:  Stack space overflow with foldl' (Daniel Fischer)
   5.  Re: randomize the order of a list (Heinrich Apfelmus)
   6.  Arrays in Haskell (Lorenzo Isella)
   7. Re:  Arrays in Haskell (Henry Olders)

----------------------------------------------------------------------

Message: 1
Date: Fri, 10 Sep 2010 19:45:41 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Simple Math Function
To: [email protected]
Cc: Lorenzo Isella <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain;  charset="iso-8859-1"

On Friday 10 September 2010 18:35:45, Lorenzo Isella wrote:
> Dear All,
> I am trying to translate some codes of mine into haskell. It looks like
> Haskell can shorten quite a bit the implementation of mathematical
> formulas. Let us say you have two list of real numbers; let us call them
> y and t. The list t is increasing (a sort of time).
> Now consider the couples (t_a, y_a) and (t_b,y_b) where
> t_a<t_b i.e.  two (non-simultaneous) values of t and the corresponding
> y_a and y_b. There are no assumptions on y.
> What I would like to write is a function which checks the existence
> (t_c,y_c), where y_a<y_c<y_b (i.e. y_c is some intermediate time) such
> that

I suppose that should've been t_a < t_c < t_b since you speak of times.

>
> y_c<y_b+(y_a-y_b)*(t_b-t_c)/(t_b-t_a).
>
> I have already implemented this in other languages, but I ended up with
> quite lengthy functions (what if a list is empty, what if it has fewer
> than 2-3 elements and so on...).
> Also, it would be nice to iterate this function on all possible
> combinations (t_a, y_a) and (t_b,y_b), t_a<t_b I can form from my data.
> I am not asking someone to 'do my homework', rather I would like to see
> how much more compact things can get in haskell.
> Many thanks
>
> Lorenzo

-- Event has time and value, probably some other name would be more 
appropriate
type Event = (Double, Double)

combinations :: (Event -> Event -> Event -> Bool) -> [Event]
             -> [((Event, Event), [Event])]
combinations _ [] = []
combinations cond (e:es) = sat e [] es ++ combinations cond es
  where
    sat _ _ [] = []
    sat start between (now : later) = 
        ((start, now), filter (cond start now) between)
          : sat start (now : between) later

gives you a list of all pairs of events together with all intermediate 
events satisfying the condition.
To check whether for a pair of events there are any intermediate events 
satisfying the condition, use (not . null . snd). To find out how many, 
(length . snd).

If you have many events, however, using a list would probably be too 
inefficient, then instead of a [(Double, Double)], you'd better use a
Data.Map.Map Double Double
(or use a newtype for the time if you calculate the times in different 
ways, in floating point math, sin (2*x) == 2 * sin x * cos x isn't always 
true).

------------------------------

Message: 2
Date: Fri, 10 Sep 2010 13:48:17 -0400
From: Henry Olders <[email protected]>
Subject: Re: [Haskell-beginners] Again on list manipulations
To: Lorenzo Isella <[email protected]>
Cc: "[email protected]" <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii

On 2010-09-10, at 10:55 , Lorenzo Isella wrote:

> Dear All,
> I know this must be a one-liner, but it am banging my head against the wall.
> I am trying my hands at list manipulation in Haskell and a lot of useful 
> function are making my life easier but I cannot achieve something really 
> simple.
> Let us say you have the lists ml and sel
> 
> ml=[23,44,55,8,98]
> and
> sel=[1,2] .
> 
> Now, I would like simply to get a new list whose entries are the 
> elements of ml in position sel. In my case things might be a little more 
> complicated because all the elements are Integer and not Int (I noticed 
> that sometimes this means I have to resort to generic functions).
> Cheers

What about nested list comprehensions:

[x|(x,y)<-zip ml [x `elem` sel | x <- [1..maximum sel]], y]

Henry

------------------------------

Message: 3
Date: Fri, 10 Sep 2010 13:55:17 -0400
From: Alex Rozenshteyn <[email protected]>
Subject: Re: [Haskell-beginners] Again on list manipulations
To: Lorenzo Isella <[email protected]>
Cc: [email protected]
Message-ID:
        <[email protected]>
Content-Type: text/plain; charset="utf-8"

You're right, I meant (!!).  I've been playing with Array values recently,
and (!) is used there.

As has already been stated, genericIndex in the Data.List module will help
with the Integer problem, as will fromInteger.

As has also already been mentioned, lists + indexing = slow.  If you care
about efficiency, and you need indexing, you might want to look into
Data.Array.IArray if you have fixed bounds or Data.Sequence if you don't.

On Fri, Sep 10, 2010 at 11:12 AM, Lorenzo Isella
<[email protected]>wrote:

> On 09/10/2010 05:03 PM, Alex Rozenshteyn wrote:
>
>  alternatively (and more cleanly),
>>  > results = map (ml !) sel
>>
>
> Thanks Alex. I am left with this problem only: the entries of sel are (in
> my code) Integer rather than Int so...I get an error message when compiling
> sine "!!" (I believe you left out a "!") expects an Int instead of an
> Integer.
> Any idea about how to fix this without converting sel from Integer to Int?
> Many thanks
>
> Lorenzo
>

-- 
          Alex R
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
http://www.haskell.org/pipermail/beginners/attachments/20100910/64596012/attachment-0001.html

------------------------------

Message: 4
Date: Fri, 10 Sep 2010 20:37:21 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Stack space overflow with foldl'
To: Mathew de Detrich <[email protected]>
Cc: [email protected], Ryan Prichard <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain;  charset="utf-8"

On Friday 10 September 2010 17:52:59, Mathew de Detrich wrote:
> Shouldn't this be reported as a bug?

I'm not sure, but:
http://hackage.haskell.org/trac/ghc/ticket/4301

------------------------------

Message: 5
Date: Sat, 11 Sep 2010 13:14:24 +0200
From: Heinrich Apfelmus <[email protected]>
Subject: [Haskell-beginners] Re: randomize the order of a list
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed

John Dorsey wrote:
> Gabriel Scherer wrote:
> 
>>> Thanks for the link apfelmus, it's fairly interesting. The key to
>>> making it work is the weighting of lists during merging based on their
>>> lengths. I wonder if other sort algorithm can be adapted in such a
>>> way, while preserving uniformity. Quicksort for example : is it enough
>>> to choose the result position of the pivot randomly, and then placing
>>> elements on either side with a probability of 1/2 ?
> 
> to which Heinrich Apfelmus answered:
> 
>> Interesting question! Adapting quick sort is not that easy, though.
>>
>> First, you can skip choosing the pivot position because it is already  
>> entailed by the choices of elements left and right to it.
> 
> I don't think this is true...
> 
>> Second, probability 1/2 won't work, it doesn't give a uniform  
>> distribution.
> 
> .... because of this.
> 
> In fact, it appears to me that the proposed modification to quicksort is
> uniform and simple.  Why do you think otherwise?

Why should it be uniform just because it looks nice? Looks can be
deceiving, you need a mathematical proof to be certain.

Embarrassingly, the analysis in my previous message is wrong, though.
Here an actually correct assessment of the algorithm. Or rather, of the
two algorithms; the results are different depending on whether you use a
pivot *element* or just a pivot *position*. It will turn out that the
former is not uniform, while, to my surprise, the latter is uniform!

Let's being with some code for the algorithms, so we know what exactly
we are analyzing here. First, the  partition  function which divides a
list into two parts where each element has a chance of 1/2 of landing in
either part:

     partition :: [a] -> Random ([a],[a])
     partition = go ([],[])
         where
         go (ls,rs) []     = return (ls,rs)
         go (ls,rs) (x:xs) = do
             b <- uniform [True,False]   -- flip a coin
             if b
                 then go (x:ls,rs) xs    -- element goes left
                 else go (ls,x:rs) xs    -- element goes right

Now, algorithm A which puts the pivot element between the two parts:

     quickshuffleA :: [a] -> Random [a]
     quickshuffleA []     = return []
     quickshuffleA [x]    = return [x]
     quickshuffleA (x:xs) = do
         (ls, rs) <- partition xs
         sls <- quickshuffleA ls
         srs <- quickshuffleA rs
         return (sls ++ [x] ++ srs)

And then algorithm B which splits the list into two parts without
putting a pivot element in between.

     quickshuffleB :: [a] -> Random [a]
     quickshuffleB []     = return []
     quickshuffleB [x]    = return [x]
     quickshuffleB xs     = do
         (ls, rs) <- partition xs
         sls <- quickshuffleB ls
         srs <- quickshuffleB rs
         return (sls ++ srs)

Note that algorithm B does not necessarily terminate, since it repeats
itself if  ls  or  rs  become empty by chance!

Analysis of algorithm A:

Imagine the course of the algorithm from beginning to end. We want to
keep track of the set P of permutations that are still possible as
outcomes during each step. Before the algorithm starts, every
permutation is still possible. Then, for the first call of  partition ,
imagine that the set of possible permutations is divided into  2^(n-1)
disjoint classes of the form

   l x rrr...rr
   l x rrr...rr   -- different  l  than the previous one
   ...
   ll x rr...rr
   ...
   lll x r...rr
   ...
   lll...ll x r

where the  l  and  r  denote elements from the parts  ls  and  rs
respectively. The call to the  partition  function picks one of these
classes at random, with a uniform distribution.

However, the problem is that these classes contain different amounts of
permutations! Namely, the class

                       ll..ll  x  rr..rr
     k elements on the left           n-1-k elements on the right

contains  k! * (n-1-k)!  permutations. So, to be uniform, the  partition
  function would have to return each class with a probability
proportional to its size. But this is not the case, so algorithm A
cannot return of a uniform distribution of permutations.

Another way to convince yourself of the non-uniformity of algorithm A is
to actually calculate the distribution for small  n  by using one of the
probabilistic functional programming packages on Hackage:

    http://hackage.haskell.org/package/ProbabilityMonads
    http://hackage.haskell.org/package/probability

Analysis of algorithm B:

As before, we can imagine that the first call  partition  splits the set
of possible permutations into 2^n classes. But this time, the classes
are no longer disjoint, so the previous analysis does not apply!

Without the pivot element, the situation has become much more symmetric,
though, and that's the reason why algorithm B gives a uniform
distribution. In particular, imagine that we somehow manage to calculate
the probability that  quickshuffleB [1,2,3,4]  will return the trivial
permutation [1,2,3,4] (we will perform that calculation in a moment,
too). But since the algorithm is highly symmetric, the same calculation
also applies to, say, the permutation [3,4,1,2], and all the other
permutation as well! For instance, for the result [1,2,3,4], we had to
consider the case  ls = [1,2] ; but this case corresponds to the case
ls = [3,4]  which appears in the calculation for the result [3,4,1,2].
Hence, all outcomes have equal probability.

Let's calculate this from first principles as well, i.e. let  p  be the
probability, that the result is the permutation [1,2,3,4]. This is only
possible if first call to  partition  gives one of the following five
results:

   ls = [], ls = [1], ls = [1,2], ls = [1,2,3], ls = [1,2,3,4]

By mathematical induction, we can assume that smaller inputs like
quickshuffle [1,2,3]  give a uniform distribution. Then, the probability
of the outcome  [1,2,3,4]  in each of the five cases is

    ls = []        => probability  2^(-4) * 1/0! *  p
    ls = [1]       => probability  2^(-4) * 1/1! * 1/3!
    ls = [1,2]     => probability  2^(-4) * 1/2! * 1/2!
    ls = [1,2,3]   => probability  2^(-4) * 1/3! * 1/1!
    ls = [1,2,3,4] => probability  2^(-4) *  p   * 1/0!

and their sum is

    p = 2^(-4) * (1/0!*p + 1/1!*1/3! + 1/2!*1/2! + 1/3!*1/1! + p*1/0!)
      = 2^(-n) * ( sum [1/k!*1/(n-k)! | k<-[0..n]] + 2*(p - 1/n!) )

Expressing the factorials in terms of binomial coefficients and applying
the binomial theorem, we can see that the sum is equal to  2^n / n!  and
we obtain

    p = 1/n! + 2^(-n)*2*(p - 1/n!)

This implies that  p - 1/n! = 0, i.e.  p = 1/n!  as desired.
Furthermore, the calculation does not depend on the fact that we were
considering the trivial permutation [1,2,3,4].

Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com

------------------------------

Message: 6
Date: Sat, 11 Sep 2010 16:09:09 +0200
From: Lorenzo Isella <[email protected]>
Subject: [Haskell-beginners] Arrays in Haskell
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Dear All,
The recent feedback I got on the mailing list led me to think about the 
data structure I need for my computations and array manipulations 
(loosely speaking, let us say that I need indexing and slicing tables of 
data).
Coming from Python, I am a bit confused: let me say that in my Python 
scripts I almost never use lists, but rather NumPy arrays.
In that case, it is an easy choice (almost every decent software for 
numerics/visualization etc... in Python relies explicitly or implicitly 
on NumPy). On top of that, NumPy is fast, has a lot of inbuilt functions 
and interfaces nicely with SciPy, matplotlib, Mayavi2 etc...
It seems to me (please correct me if I am mistaken) that the situation 
in Haskell is a bit more 'open' to choices.
At least I think so when I look at

http://en.wikibooks.org/wiki/Haskell/Hierarchical_libraries/Arrays

I may want to drop lists at some point for performance reasons but also 
because in my work I really have tables of numbers I find convenient to 
think of as arrays/matrices (again, loosely speaking, I mean matrices as 
arrays + linear algebra like taking the inverse, the determinant and so on).
Bottom line, I would need a data type that
(1) is decently fast (OK, there is more than performance to scripting, 
but you see my point)
(2) allows slicing/indexing (e.g. take 3rd row, second column, flatten 
it out, take every element larger than 34 in a row, find its unique 
elements, sort them etc...) without having to re-invent the wheel 
myself. This is more on the manipulation side than the linear algebra. 
As you can see, I would like to be able to find something similar to the 
very useful functions in Data.List for an array.
(3) it would be nice if these data type could have either integers of 
real numbers as entries. If the original data is a made up of integer 
numbers, conversion to real number is always suspicious (I have horrible 
memories (at least in other languages) of numbers that were equal as 
integers and no longer when read as real numbers because of one last 
digit changing...which can give you a headache in some cases.
(4) again it would be nice if I could feed these arrays/vectors to tools 
that take integrals, derivatives, inverse etc...

I am considering for this reason several possibilities

http://www.haskell.org/tutorial/arrays.html
http://users.skynet.be/jyp/html/base/Data-Array.html
http://code.haskell.org/hmatrix/hmatrix.pdf

hmatrix is probably what I am looking for (linear algebra, interface to 
gnuplot for plotting, ODE solver etc...) but there is no possibility of 
using arrays of integers, so I am concerned about using it to read and 
compare data files filled with integers where I check if certain entries 
are equal or not. Also I wonder if I can find any extra documentation 
other than the well written tutorial (which explains a lot, but cannot 
do everything in less than 30 pages and I would have plenty of questions 
about array manipulations there).

Any suggestion/clarification is more than welcome.
Cheers

Lorenzo

------------------------------

Message: 7
Date: Sat, 11 Sep 2010 12:28:21 -0400
From: Henry Olders <[email protected]>
Subject: Re: [Haskell-beginners] Arrays in Haskell
To: Lorenzo Isella <[email protected]>
Cc: "[email protected]" <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii

Hi, Lorenzo!

I am also a Haskell beginner, coming from Python, where I also used numpy for 
speed.

In my python applications, I used lists and list operations (especially list 
comprehensions) which I found an elegant way of doing things. A numpy array was 
simply, for my needs, a list of lists.

More experienced users can correct me if I'm wrong, but there are some 
fundamental differences between Haskell and Python which might influence your 
choices of lists vs arrays. Haskell, being primarily a compiled language, 
should give you a speed advantage over interpreted python when you're doing 
basically the same operations, so you may not need (at least initially) to look 
for ways to improve speed. My approach basically in any language was to first 
get the program to work correctly, and to look for speed optimizations only as 
a secondary step if the program was too slow, and then only for the processes 
which were the speed bottlenecks.

The second issue is that in Haskell, lists are mutable, whereas arrays are 
(mostly) immutable. Generally, operations which need to work on every element 
of a data structure will go faster (and take less memory) if they do not result 
in the creation of a copy of the data structure, as they need to do for 
immutable data.

So my tendency would be to look for ways of doing things in Haskell which make 
use of lists (including lists of lists), and resort to arrays/matrices only for 
operations which cannot be decomposed into list operations.

Henry

On 2010-09-11, at 10:09 , Lorenzo Isella wrote:

> Dear All,
> The recent feedback I got on the mailing list led me to think about the 
> data structure I need for my computations and array manipulations 
> (loosely speaking, let us say that I need indexing and slicing tables of 
> data).
> Coming from Python, I am a bit confused: let me say that in my Python 
> scripts I almost never use lists, but rather NumPy arrays.
> In that case, it is an easy choice (almost every decent software for 
> numerics/visualization etc... in Python relies explicitly or implicitly 
> on NumPy). On top of that, NumPy is fast, has a lot of inbuilt functions 
> and interfaces nicely with SciPy, matplotlib, Mayavi2 etc...
> It seems to me (please correct me if I am mistaken) that the situation 
> in Haskell is a bit more 'open' to choices.
> At least I think so when I look at
> 
> http://en.wikibooks.org/wiki/Haskell/Hierarchical_libraries/Arrays
> 
> I may want to drop lists at some point for performance reasons but also 
> because in my work I really have tables of numbers I find convenient to 
> think of as arrays/matrices (again, loosely speaking, I mean matrices as 
> arrays + linear algebra like taking the inverse, the determinant and so on).
> Bottom line, I would need a data type that
> (1) is decently fast (OK, there is more than performance to scripting, 
> but you see my point)
> (2) allows slicing/indexing (e.g. take 3rd row, second column, flatten 
> it out, take every element larger than 34 in a row, find its unique 
> elements, sort them etc...) without having to re-invent the wheel 
> myself. This is more on the manipulation side than the linear algebra. 
> As you can see, I would like to be able to find something similar to the 
> very useful functions in Data.List for an array.
> (3) it would be nice if these data type could have either integers of 
> real numbers as entries. If the original data is a made up of integer 
> numbers, conversion to real number is always suspicious (I have horrible 
> memories (at least in other languages) of numbers that were equal as 
> integers and no longer when read as real numbers because of one last 
> digit changing...which can give you a headache in some cases.
> (4) again it would be nice if I could feed these arrays/vectors to tools 
> that take integrals, derivatives, inverse etc...
> 
> I am considering for this reason several possibilities
> 
> http://www.haskell.org/tutorial/arrays.html
> http://users.skynet.be/jyp/html/base/Data-Array.html
> http://code.haskell.org/hmatrix/hmatrix.pdf
> 
> hmatrix is probably what I am looking for (linear algebra, interface to 
> gnuplot for plotting, ODE solver etc...) but there is no possibility of 
> using arrays of integers, so I am concerned about using it to read and 
> compare data files filled with integers where I check if certain entries 
> are equal or not. Also I wonder if I can find any extra documentation 
> other than the well written tutorial (which explains a lot, but cannot 
> do everything in less than 30 pages and I would have plenty of questions 
> about array manipulations there).
> 
> Any suggestion/clarification is more than welcome.
> Cheers
> 
> Lorenzo
> 
> 
> 
> _______________________________________________
> Beginners mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/beginners

------------------------------

_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners

End of Beginners Digest, Vol 27, Issue 25
*****************************************

Beginners Digest, Vol 27, Issue 25

Reply via email to