Beginners Digest, Vol 22, Issue 50

beginners-request Fri, 30 Apr 2010 09:33:38 -0700

Send Beginners mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        [email protected]


You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  Re: Iterating through a list of char... (Daniel Fischer)
   2.  Re: Re: Converting an imperative program to haskell
      (Maciej Piechotka)
   3. Re:  Data.Binary.Get for large files (MAN)
   4. Re:  Data.Binary.Get for large files (Daniel Fischer)
   5. Re:  Data.Binary.Get for large files (Daniel Fischer)
   6. Re:  Re: Iterating through a list of char... (Daniel Fischer)


----------------------------------------------------------------------

Message: 1
Date: Fri, 30 Apr 2010 02:32:35 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Re: Iterating through a list of
        char...
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain;  charset="iso-8859-1"

On Friday, 30 April 2010 01:27:54, Daniel Fischer wrote:
> ----------------------------------------------------------------------
> {-# LANGUAGE BangPatterns #-}
>
> import qualified Data.ByteString.Lazy as L
> import qualified Data.ByteString as S
> import Data.ByteString.Unsafe (unsafeAt)

Oops,

import Data.ByteString.Unsafe (unsafeIndex)

>
> escape :: Word8 -> Word8
> escape = (+150)
>
> normal :: Word8 -> Word8
> normal = (+214)
>
> decodeW :: L.ByteString -> [Word8]
> decodeW = dec False . L.toChunks
>     where
>       dec _ [] = []
>       dec esc (str:more) = go esc 0
>         where
>           !len = S.length str
>           {-# INLINE charAt #-}
>           charAt :: Int -> Word8
>           charAt i = unsafeAt str i

          charAt i = unsafeIndex str i

>           go !b !i
>             | i == len  = dec b more
>             | b         = escape (charAt i) : go False (i+1)
>             | otherwise = case charAt i of
>                             61 -> go True (i+1)
>                             c  -> normal c : go False (i+1)
>
> word8ToChar :: Word8 -> Char
> word8ToChar = toEnum . fromIntegral
>
> decodeC :: L.ByteString -> String
> decodeC = map word8ToChar . decodeW
>
> decodeBS :: L.ByteString -> L.ByteString
> decodeBS = L.pack . decodeW
> ----------------------------------------------------------------------
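
For reference, here is a self-contained sketch that folds in the two
corrections from this thread (unsafeIndex instead of unsafeAt, plus the
Data.Word import for Word8). The decoding logic is the same as in the code
above. It has only been checked by hand on tiny inputs, so treat it as a
starting point, not as a tuned implementation.

```haskell
{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Unsafe (unsafeIndex)
import Data.Word (Word8)

-- Byte that follows the escape marker (61, i.e. '=').
escape :: Word8 -> Word8
escape = (+150)

-- Any other byte. Word8 arithmetic wraps around modulo 256.
normal :: Word8 -> Word8
normal = (+214)

decodeW :: L.ByteString -> [Word8]
decodeW = dec False . L.toChunks
  where
    dec _ [] = []
    dec esc (str:more) = go esc 0
      where
        !len = S.length str
        charAt :: Int -> Word8
        charAt i = unsafeIndex str i   -- safe here: go keeps i < len
        -- b records whether the previous byte was the escape marker,
        -- so an escape may straddle a chunk boundary.
        go !b !i
          | i == len  = dec b more
          | b         = escape (charAt i) : go False (i+1)
          | otherwise = case charAt i of
                          61 -> go True (i+1)
                          c  -> normal c : go False (i+1)

word8ToChar :: Word8 -> Char
word8ToChar = toEnum . fromIntegral

decodeC :: L.ByteString -> String
decodeC = map word8ToChar . decodeW
```

In GHCi, decodeC (L.pack [114, 147]) gives "Hi", and
decodeW (L.pack [61, 64]) gives [214].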



------------------------------

Message: 2
Date: Fri, 30 Apr 2010 01:43:47 +0100
From: Maciej Piechotka <[email protected]>
Subject: [Haskell-beginners] Re: Re: Converting an imperative program
        to      haskell
To: [email protected]
Message-ID: <1272588227.2077.63.ca...@localhost>
Content-Type: text/plain; charset="utf-8"

On Thu, 2010-04-29 at 14:49 -0700, Hein Hundal wrote:
> --- On Thu, 4/29/10, Maciej Piechotka <[email protected]> wrote:
> > Hein Hundal wrote:
> > >
> > >    I figured I should try a larger program
> > in Haskell, so I am
> > > converting one of my Mathematica programs, a simulator
> > for the
> > > card game Dominion, over to Haskell.  Most of
> > that is going well
> > > except for one patch of imperative code.  My
> > Haskell version of
> > > this code is ugly.  I was hoping someone could
> > recommend a better
> > > way to do it.  I will paste a simplified version
> > of the code
> > > below.  If necessary, I can provide all the other
> > code used for
> >
> > 1. Use strong typing. Or any typing
> 
> I simplified the code for the post.  In the real version, I use strong 
> typing.  The Card type is enumerated.  I have been using [Card] instead of 
> calling it a Deck.  I could change that.
> 

Deck is not important - it's rather eye candy ;) [Card] is as clear as
Deck.

> > 3. Don't use too many variables. 6-8 is probably the
> > maximum you should deal with (human short-term memory
> > holds about 5-10 items). Split functions into smaller
> > functions (even in where).
> 
> I do have to get the information into the functions, so the only way I can 
> avoid having lots of variables is by introducing new structures.  I can do 
> that.
> 

New structures annotate the types nicely.

data ComplicatedData = ComplicatedData {
    turn :: Int,
    deck :: [Card],
    ...
  }

Record syntax gives you some sugar:

doSomething :: ComplicatedData -> [Card]
doSomething cd = drop (turn cd) (deck cd)
-- takes the deck, drops as many cards as turns have passed,
-- and returns the rest

doSomethingCrazy :: ComplicatedData -> ComplicatedData
doSomethingCrazy cd = cd {deck = drop (turn cd) (deck cd)}
-- creates a new ComplicatedData that is identical to the
-- argument, except that the deck is missing as many cards
-- as the current value of turn


> > 6. While Haskell has a long tradition of short names,
> > it is not always good (see 3). Use them only if you are
> > sure it is clear what they mean:
> 
> In the original version, I had longer variable names where they seemed 
> necessary. 
> 
> 
> The main sources of ugliness are the long lists of variables.  Every time I 
> call doAct or construct a LoopState variable, I am repeating all those 
> variables.  I will try changing the type of doAct to
> 
> doAct :: LoopState -> LoopState
> 
> Cheers,
> Hein
> 

See the record syntax. Or refactor it to use helper functions.

Depending on your purpose and experience, you can also play with point-free
("pointless") style.

Regards

PS. Consider using helper functions. Even if they are longer:

func n | 128 `mod` n == 0 = 3
       | otherwise        = 2

vs.

func n | n `divides` 128 = 3
       | otherwise       = 2

k `divides` n = n `mod` k == 0

In the first example you have to work out what I meant. The second is
self-documenting (n `divides` 128 reads just like "n divides 128", only
with strange apostrophes).

doSomething (State (c:cs) t ph ta ...)
    | c == Ace `of` Hearths = State cs (t+1) (c:ph) ta ...
    | otherwise             = State cs (t+1) ph (c:ta) ...
                                  
vs.

putOnTable :: Card -> State -> State
putOnTable c s = s {table = c:table s}

putIntoPlayerHand :: Card -> State -> State
putIntoPlayerHand c s = s {playerHand = c:playerHand s}

drawCard :: State -> (Card, State)
drawCard s = let (c:cs) = deck s
             in (c, s {deck = cs})

nextTurn :: State -> State
nextTurn s = s {turn = turn s + 1}

doSomething s
  | c == Ace `of` Hearths = nextTurn $ putIntoPlayerHand c s'
  | otherwise             = nextTurn $ putOnTable c s'
  where (c, s') = drawCard s

or

doSomething s = let (c, s') = drawCard s
                    s'' | c == Ace `of` Hearths = putIntoPlayerHand c s'
                        | otherwise             = putOnTable c s'
                in nextTurn s''
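
The snippets above leave out the type definitions, so here is one way they
might be filled in to get something that compiles. Everything not shown
above is an assumption made for illustration: the Rank and Suit types, the
Card constructor (used because `of` is a reserved word and cannot be an
operator), the record name GameState in place of State, and the choice to
leave the state untouched when the deck is empty.

```haskell
-- Hypothetical card types; the real Card type is an enumeration.
data Rank = Ace | King | Queen deriving (Eq, Show)
data Suit = Hearts | Spades deriving (Eq, Show)
data Card = Card Rank Suit deriving (Eq, Show)

data GameState = GameState
  { deck       :: [Card]
  , turn       :: Int
  , playerHand :: [Card]
  , table      :: [Card]
  } deriving (Eq, Show)

putOnTable :: Card -> GameState -> GameState
putOnTable c s = s { table = c : table s }

putIntoPlayerHand :: Card -> GameState -> GameState
putIntoPlayerHand c s = s { playerHand = c : playerHand s }

-- Returns Nothing on an empty deck instead of failing a pattern match.
drawCard :: GameState -> Maybe (Card, GameState)
drawCard s = case deck s of
  []     -> Nothing
  c : cs -> Just (c, s { deck = cs })

nextTurn :: GameState -> GameState
nextTurn s = s { turn = turn s + 1 }

-- Draw one card, route it to the hand or the table, advance the turn.
-- Assumption: with an empty deck the state is returned unchanged.
doSomething :: GameState -> GameState
doSomething s = case drawCard s of
  Nothing -> s
  Just (c, s')
    | c == Card Ace Hearts -> nextTurn (putIntoPlayerHand c s')
    | otherwise            -> nextTurn (putOnTable c s')
```

Starting from a deck of [Card Ace Hearts, Card King Spades], the first call
moves the ace into playerHand and the second puts the king on the table.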

Regards

PS. For 'advanced' users only - and many advanced users dislike this
approach and would recommend against it. Anyway - don't bother with it
until later if you don't understand it.

data GameState = GameState {
    deck :: [Card],
    turn :: Int,
    playerHand :: [Card],
    table :: [Card],
    ...
  }

putOnTable :: Card -> State GameState ()
putOnTable c = modify (\s -> s {table = c:table s})

putIntoPlayerHand :: Card -> State GameState ()
putIntoPlayerHand c = modify (\s -> s {playerHand = c:playerHand s})

drawCard :: State GameState Card
drawCard = do (c:cs) <- gets deck
              modify (\s -> s {deck = cs})
              return c

nextTurn :: State GameState ()
nextTurn = modify (\s -> s {turn = turn s + 1})

doSomething :: State GameState ()
doSomething = do c <- drawCard
                 if c == Ace `of` Hearths
                   then putIntoPlayerHand c
                   else putOnTable c
                 nextTurn



------------------------------

Message: 3
Date: Thu, 29 Apr 2010 21:46:08 -0300
From: MAN <[email protected]>
Subject: Re: [Haskell-beginners] Data.Binary.Get for large files
To: [email protected]
Cc: [email protected]
Message-ID: <1272588368.2929.0.ca...@dy-book>
Content-Type: text/plain; charset="UTF-8"


I can't find the error in your code (assuming there is an error), so I'm
checking the code you didn't write, and the only thing that set off an
alarm was...

getFloat64le :: Get Double
getFloat64le = getFloat (ByteCount 8) $ splitBytes . reverse

splitBytes :: [Word8] -> RawFloat

...that every chunk read in the Get monad is being reversed, so that you
can take one float (and you are taking in over 26 million floats) in
little endian. I really don't know if this hits performance, but I
assume the C equivalent would be reading an array in reverse order.
I am more than willing to believe this is not the cause of such
performance loss, but can't find a reason.


PS1: "(e == True)" == "e"

PS2:
I know it's not important, but I can't help it: that is not an average
you're computing...

On Thu, 29-04-2010 at 23:37 +0100, Philip Scott wrote:
> Hello again folks, 
> 
> Sorry to keep troubling you - I'm very appreciative of the help you've
> given so far. I've got one more for you that has got me totally
> stumped. I'm writing a program which deals with largish-files, the one
> I am using as a test case is not stupidly large at about 200mb. After
> three evenings, I have finally gotten rid of all the stack overflows,
> but I am unfortunately left with something that is rather unfeasibly
> slow. I was hoping someone with some keener skills than I could take a
> look, I've tried to distill it to the simplest case. 
> 
> This program just reads in a file, interpreting each value as a
> double, and does a sort of running average on them. The actual
> function doesn't matter too much, I think it is the reading it in that
> is the problem. Here's the code: 
> 
> import Control.Exception 
> import qualified Data.ByteString.Lazy as BL 
> import Data.Binary.Get 
> import System.IO 
> import Data.Binary.IEEE754 
> 
> myGetter acc = do 
>     e <- isEmpty 
>     if e == True 
>         then 
>             return acc 
>         else do 
>             t <- getFloat64le 
>             myGetter $! ((t+acc)/2) 
> 
> myReader file = do 
>     h <- openBinaryFile file ReadMode 
>     bs <- BL.hGetContents h 
>     return $ runGet (myGetter 0)  bs 
> 
> main = do 
>     d <- myReader "data.bin" 
>     evaluate d 
> 
> This takes about three minutes to run on my (fairly modern) laptop..
> The equivalent C program takes about 5 seconds. 
> 
> I'm sure I am doing something daft, but I can't for the life of me see
> what. Any hints about how to get the profiler to show me useful stuff
> would be much appreciated! 
> 
> All the best, 
> 
> Philip 
> 
> PS: If, instead of computing a single value I try and build a list of
> the values, the program ends up using over 2gb of memory to read a
> 200mb file.. any ideas on that one? 




------------------------------

Message: 4
Date: Fri, 30 Apr 2010 03:12:50 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Data.Binary.Get for large files
To: [email protected], [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain;  charset="utf-8"

On Friday, 30 April 2010 00:37:59, Philip Scott wrote:
> Hello again folks,
>
> Sorry to keep troubling you - I'm very appreciative of the help you've
> given so far. I've got one more for you that has got me totally stumped.
> I'm writing a program which deals with largish-files, the one I am using
> as a test case is not stupidly large at about 200mb. After three
> evenings, I have finally gotten rid of all the stack overflows, but I am
> unfortunately left with something that is rather unfeasibly slow. I was
> hoping someone with some keener skills than I could take a look, I've
> tried to distill it to the simplest case.
>
> This program just reads in a file, interpreting each value as a double,
> and does a sort of running average on them. The actual function doesn't
> matter too much, I think it is the reading it in that is the problem.

Replace getFloat64le with e.g. getWord64le to confirm.
The reading of IEEE754 floating point numbers seems rather complicated.
Maybe doing it differently could speed it up, maybe not.
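
A minimal sketch of that experiment: the same loop shape as myGetter, but
reading raw 64-bit words, so the IEEE754 decoding drops out of the timing.
The names myGetterW and runOn are made up here, and integer halving stands
in for the floating-point update only to keep the loop structure the same;
the result itself is meaningless.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get (Get, getWord64le, isEmpty, runGet)
import Data.Word (Word64)

-- Same control flow as myGetter, with getWord64le instead of getFloat64le.
-- Note "if e" rather than "if e == True".
myGetterW :: Word64 -> Get Word64
myGetterW acc = do
  e <- isEmpty
  if e
    then return acc
    else do
      t <- getWord64le
      -- Word64 addition wraps on overflow; fine for a timing test.
      myGetterW $! ((t + acc) `div` 2)

-- Hypothetical helper: run the loop over an in-memory lazy ByteString.
runOn :: BL.ByteString -> Word64
runOn = runGet (myGetterW 0)
```

If this version runs in a few seconds on the 200MB file, the time is going
into the float decoding and not into Get itself.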

> This takes about three minutes to run on my (fairly modern) laptop.. The
> equivalent C program takes about 5 seconds.

Are you sure that it's really equivalent?

>
> I'm sure I am doing something daft, but I can't for the life of me see
> what. Any hints about how to get the profiler to show me useful stuff
> would be much appreciated!
>
> All the best,
>
> Philip
>
> PS: If, instead of computing a single value I try and build a list of
> the values, the program ends up using over 2gb of memory to read a 200mb
> file.. any ideas on that one?

Hm, 200MB file => ~25 million Doubles, such a list needs at least 400MB.
Still a long way to 2GB. I suspect you construct a list of thunks, not 
Doubles.



------------------------------

Message: 5
Date: Fri, 30 Apr 2010 03:18:05 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Data.Binary.Get for large files
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain;  charset="utf-8"

On Friday, 30 April 2010 02:46:08, MAN wrote:
> I can't find the error in your code (assuming there is an error), so I'm
> checking the code you didn't write, and the only thing that set off an
> alarm was...
>
> getFloat64le :: Get Double
> getFloat64le = getFloat (ByteCount 8) $ splitBytes . reverse
>
> splitBytes :: [Word8] -> RawFloat
>
> ...that every chunk read in the Get monad is being reversed, so that you
> can take one float (and you are taking in over 26 million floats) in
> little endian. I really don't know if this hits performance, but I
> assume the C equivalent would be reading an array in reverse order.
> I am more than willing to believe this is not the cause of such
> performance loss, but can't find a reason.

The reversing doesn't matter much, using getFloat64be takes very nearly the 
same time.

>
>
> PS1: "(e == True)" == "e"

Yes!

>
> PS2:
> I know it's not important, but I can't help it: that is not an average
> you're computing...

Sort of a weighted average, where double #k is weighted 2^(k-1-n).
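
To spell that out: with acc_0 = 0 and acc_k = (t_k + acc_(k-1)) / 2, unrolling
the recurrence over n inputs gives acc_n = sum over k of t_k * 2^(k-1-n), so
the last value has weight 1/2, the one before it 1/4, and so on. A small
sketch that compares the two forms (the names runningHalf and weightedSum
are made up for this illustration):

```haskell
import Data.List (foldl')

-- The update used in myGetter, expressed as a strict left fold.
runningHalf :: [Double] -> Double
runningHalf = foldl' (\acc t -> (t + acc) / 2) 0

-- The closed form: value number k (counting from 1) of n values
-- is weighted by 2^(k-1-n).
weightedSum :: [Double] -> Double
weightedSum xs = sum [ t * 2 ^^ (k - 1 - n) | (k, t) <- zip [1 ..] xs ]
  where
    n = length xs
```

For the input [1, 2, 3, 4] both give 3.0625, which is
1/16 + 2/8 + 3/4 + 4/2.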



------------------------------

Message: 6
Date: Fri, 30 Apr 2010 03:20:39 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Re: Iterating through a list of
        char...
To: "Jean-Nicolas Jolivet" <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain;  charset="iso-8859-1"

On Friday, 30 April 2010 03:13:45, you wrote:
> Sorry to bug you again :\
> I'm getting the following when trying to run it this time...Absolutely
> no idea why!
>
> Not in scope: type constructor or class `Word8'

I forgot

import Data.Word (Word8)

:(

>
> (I replied to you directly since I didn't want to make the already-large
> thread on the mailing list even larger ;)

I do :)

>
> Jean-Nicolas
>
> On 2010-04-29, at 8:29 PM, Daniel Fischer wrote:
> > On Friday, 30 April 2010 02:03:44, Jean-Nicolas Jolivet wrote:
> >> I tried to run your code, however, I'm getting a:
> >> "Module `Data.ByteString.Unsafe' does not export `unsafeAt'"
> >
> > Yes, of course. It's "unsafeIndex" (unsafeAt is for arrays), d'oh.
> >
> >> (Using the latest ghc...
> >> http://www.haskell.org/ghc/docs/latest/html/libraries/bytestring-0.9.
> >>1.6 /Data-ByteString-Unsafe.html )
> >>
> >> Jean-Nicolas Jolivet



------------------------------

_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners


End of Beginners Digest, Vol 22, Issue 50
*****************************************
