Send Beginners mailing list submissions to
[email protected]
To subscribe or unsubscribe via the World Wide Web, visit
http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
[email protected]
You can reach the person managing the list at
[email protected]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."
Today's Topics:
1. Re: Re: Iterating through a list of char... (Stephen Tetley)
2. Re: Re: Iterating through a list of char... (Daniel Fischer)
3. Re: Re: Converting an imperative program to haskell (Hein Hundal)
4. Re: Re: Converting an imperative program to haskell
(Daniel Fischer)
5. Re: [Int] oddness with summing large lists (Philip Scott)
6. Data.Binary.Get for large files (Philip Scott)
7. Re: Re: Iterating through a list of char... (Daniel Fischer)
----------------------------------------------------------------------
Message: 1
Date: Thu, 29 Apr 2010 22:16:47 +0100
From: Stephen Tetley <[email protected]>
Subject: Re: [Haskell-beginners] Re: Iterating through a list of
char...
Cc: [email protected]
Message-ID:
<[email protected]>
Content-Type: text/plain; charset=ISO-8859-1
On 29 April 2010 21:39, Jean-Nicolas Jolivet <[email protected]> wrote:
[Snip]
> Every solution I found was iterating to a list of character one character at
> a time, and outputting a character as a result (whether it is by recursion,
> mapping, etc..), however, what I was trying to explain in my previous post
> was that; when I am processing the escape character, this character should
> not be added to the resulting string... This is where I decided to use the
> Maybe monad..I'm just wondering if there is a better way...!
Hi Jean-Nicolas
The 'hand crafted lexer' style is quite common for this, with
different functions for different states (here body and escape).
I use a Hughes list as an explicit accumulator (a Hughes list is
sometimes called a difference list and familiar for strings as the
ShowS type synonym). This has more efficient right-addition (snoc)
than normal lists [a].
> toString :: ShowS -> String
> toString = ($ [])
> snoc :: ShowS -> Char -> ShowS
> snoc f a = f . (a:)
> emptyH :: ShowS
> emptyH = id
> escape :: String -> ShowS -> ShowS
> escape (_:xs) acc = body xs acc -- drop1, change state
> escape _ acc = acc -- oh dear escape to the end-of-string
> body :: String -> ShowS -> ShowS
> body ('\\':xs) acc = escape xs acc -- change state
> body (x:xs) acc = body xs (acc `snoc` x)
> body _ acc = acc
> run :: String -> String
> run xs = toString (body xs emptyH)
> demo1 = run "\\s1234" -- "1234"
Best wishes
Stephen
------------------------------
Message: 2
Date: Thu, 29 Apr 2010 23:23:52 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Re: Iterating through a list of
char...
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="utf-8"
Am Donnerstag 29 April 2010 22:39:05 schrieb Jean-Nicolas Jolivet:
> Every solution I found was iterating to a list of character one
> character at a time, and outputting a character as a result (whether it
> is by recursion, mapping, etc..),
Well, that's necessary for what you want to do, isn't it?
> however, what I was trying to explain
> in my previous post was that; when I am processing the escape character,
> this character should not be added to the resulting string... This is
> where I decided to use the Maybe monad..I'm just wondering if there is a
> better way...!
Depends on your criteria.
For raw performance, I think the directly coded recursion is a bit better
(btw,
fooGen e esc norm = go False
where
go b (x:xs)
| x == e = go True xs
| b = esc x : go False xs
| otherwise = norm x : go False xs
go _ [] = []
).
For code cleanliness, catMaybes&zipWith and a good directly coded recursion
are about equal.
For leetness, the foldr is beats them, but surely one could do much
'better'.
------------------------------
Message: 3
Date: Thu, 29 Apr 2010 14:49:16 -0700 (PDT)
From: Hein Hundal <[email protected]>
Subject: Re: [Haskell-beginners] Re: Converting an imperative program
to haskell
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii
--- On Thu, 4/29/10, Maciej Piechotka <[email protected]> wrote:
> Hein Hundal wrote:
> >
> > I figured I should try a larger program
> in Haskell, so I am
> > converting one of my Mathematica programs, a simulator
> for the
> > card game Dominion, over to Haskell. Most of
> that is going well
> > except for one patch of imperative code. My
> Haskell version of
> > this code is ugly. I was hoping someone could
> recommend a better
> > way to do it. I will paste a simplified version
> of the code
> > below. If necessary, I can provide all the other
> code used for
>
> 1. Use strong typing. Or any typing
I simplified the code for the post. In the real version, I use strong typing.
The Card type is enumerated. I have been using [Card] instead of calling it a
Deck. I could change that.
> 2. User guards instead of nested if's:
In the "procloop" function, I have if's nested quite deeply. I could easily
use guards. I will try that.
> 3. Don't use too much variables. 6-8 is probably the
> maximum you should
> deal with (human short-term memory holds about 5-10 points
> of entry.
> Split functions into smaller functions (even in where).
I do have to get the information into the functions, so the only way I can
avoid having lots of variables is by introducing new structures. I can do that.
> 4. Discard result you don't need:
>
> isDone (LSt gs strat iA iTh i aA iB iC bDone) =
> bDone
> isDone (LSt _ _ _ _ _ _ _ _ _ bDone) = bDone
Yes, that's much better.
proc gs' strat = let
iA = 1
iTh = 0
i = 0
aA = replicate 20 0
iB = 1
iC = 0
bDone = False
gs = apps ("<<<"++show(gs')++">>>") gs'
lst = LSt gs strat iA iTh i aA iB iC bDone
lst2 = until isDone procloop lst
isDone (LSt _ _ _ _ _ _ _ _ bDone) = bDone
output (LSt gs _ _ _ _ aA iB iC _) = (iB, iC, appd aA gs)
in output lst2
> 5. Don't reuse the same name in the same scope (like iA as
> 1 and iA as parameter to isDone).
I did hesitate to use the same parameter name twice.
> 6. While Haskell have long tradition of having short
> namesit is not
> always good (see 3). Use them only if you are sure it is
> clear what they
> mean:
In the original version, I had longer variable names where they seemed
necessary.
The main sources of ugliness are the long lists of variables. Every time I
call doAct or construct a LoopState variable, I am repeating all those
variables. I will try changing the type of doAct to
doAct :: LoopState -> LoopState
Cheers,
Hein
------------------------------
Message: 4
Date: Fri, 30 Apr 2010 00:16:20 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Re: Converting an imperative program
to haskell
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"
Am Donnerstag 29 April 2010 23:49:16 schrieb Hein Hundal:
> --- On Thu, 4/29/10, Maciej Piechotka <[email protected]> wrote:
> > Hein Hundal wrote:
> > > I figured I should try a larger program
> >
> > in Haskell, so I am
> >
> > > converting one of my Mathematica programs, a simulator
> >
> > for the
> >
> > > card game Dominion, over to Haskell. Most of
> >
> > that is going well
> >
> > > except for one patch of imperative code. My
> >
> > Haskell version of
> >
> > > this code is ugly. I was hoping someone could
> >
> > recommend a better
> >
> > > way to do it. I will paste a simplified version
> >
> > of the code
> >
> > > below. If necessary, I can provide all the other
> >
> > code used for
> >
> > 1. Use strong typing. Or any typing
>
> I simplified the code for the post. In the real version, I use strong
> typing. The Card type is enumerated. I have been using [Card] instead
> of calling it a Deck. I could change that.
>
Whether you use [Card],
type Deck = [Card]
or
newtype Deck = D [Card]
is not important. Using enumerations for suite and value instead of Ints
is.
> > 2. User guards instead of nested if's:
>
> In the "procloop" function, I have if's nested quite deeply.
Nested ifs (beyond, say, three levels) are a pain to decipher. Using guards
(and maybe 'case's) will probably be a great improvement.
> I could easily use guards. I will try that.
>
> > 3. Don't use too much variables. 6-8 is probably the
> > maximum you should
> > deal with (human short-term memory holds about 5-10 points
> > of entry.
> > Split functions into smaller functions (even in where).
>
> I do have to get the information into the functions, so the only way I
> can avoid having lots of variables is by introducing new structures.
> I can do that.
>
As long as they're meaningful. Just smashing a couple of values together
into a structure to reduce the variable count isn't good.
> > 4. Discard result you don't need:
> >
> > isDone (LSt gs strat iA iTh i aA iB iC bDone) =
> > bDone
> > isDone (LSt _ _ _ _ _ _ _ _ _ bDone) = bDone
>
> Yes, that's much better.
Perhaps use named field syntax,
data LoopState
= LSt
{ gameState :: GameState
, strat :: Strategy
, ...
, isDone :: Bool
}
Then in procloop, where you just update one or two fields,
procloop lst@(LSt gs' strat iA iTh i aA iB iC bDone)
| iA<=0 || i>=20 || actcd gs <=0 || iCd == -1
= lst{ idx = idx lst + 1, isDone = True }
| iCd == 1
= lst{ idx = idx lst + 1, iThFld = iThFld lst + 1, isDone = False }
| iThFld lst > 0
= lst{ ... }
| otherwise
= lst{ ... }
where
iCd = stratchoose ...
-- other let-bindings, only those needed will be evaluated
>
> proc gs' strat = let
> iA = 1
> iTh = 0
> i = 0
> aA = replicate 20 0
> iB = 1
> iC = 0
> bDone = False
> gs = apps ("<<<"++show(gs')++">>>") gs'
> lst = LSt gs strat iA iTh i aA iB iC bDone
> lst2 = until isDone procloop lst
> isDone (LSt _ _ _ _ _ _ _ _ bDone) = bDone
> output (LSt gs _ _ _ _ aA iB iC _) = (iB, iC, appd aA gs)
> in output lst2
>
> > 5. Don't reuse the same name in the same scope (like iA as
> > 1 and iA as parameter to isDone).
>
> I did hesitate to use the same parameter name twice.
>
> > 6. While Haskell have long tradition of having short
> > namesit is not
> > always good (see 3). Use them only if you are sure it is
> > clear what they
> > mean:
>
> In the original version, I had longer variable names where they seemed
> necessary.
>
>
> The main sources of ugliness are the long lists of variables. Every
> time I call doAct or construct a LoopState variable, I am repeating all
> those variables. I will try changing the type of doAct to
>
> doAct :: LoopState -> LoopState
That looks promising.
>
> Cheers,
> Hein
------------------------------
Message: 5
Date: Thu, 29 Apr 2010 23:17:05 +0100
From: Philip Scott <[email protected]>
Subject: Re: [Haskell-beginners] [Int] oddness with summing large
lists
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hi ho,
> I can deduce that you have a 32-bit system (as I do).
> Because:
>
> Prelude> 1000000000000 :: Int
> -727379968
>
> So ([1 .. 1000000000000] :: [Int]) == [] and sum [] == 0 isn't surprising
> at all.
>
>
Once again, your explanation makes perfect sense :)
>> And now the really odd part... take away one zero
>>
>> Prelude> let d = [1..100000000000] :: [Int]
>>
> Prelude> 100000000000 :: Int
> 1215752192
>
> , which is a pretty large number. Since you gave the list a name, it stays
> in memory. For each Int in the list, at least 3 machine words are needed
> (one for the Int, pointer from cons-cell to value, pointer to next cons-
> cell; actually, I think the overhead is larger), so if you have less than
> 14GB of RAM, your ghci can't do it.
>
>
Fair enough, I don't think my laptop is quite up to that :)
Thanks for your help,
- Philip
------------------------------
Message: 6
Date: Thu, 29 Apr 2010 23:37:59 +0100
From: Philip Scott <[email protected]>
Subject: [Haskell-beginners] Data.Binary.Get for large files
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"
Hello again folks,
Sorry to keep troubling you - I'm very appreciative of the help you've
given so far. I've got one more for you that has got me totally stumped.
I'm writing a program which deals with largish-files, the one I am using
as a test case is not stupidly large at about 200mb. After three
evenings, I have finally gotten rid of all the stack overflows, but I am
unfortunately left with something that is rather unfeasably slow. I was
hoping someone with some keener skills than I could take a look, I've
tried to distill it to the simplest case.
This program just reads in a file, interpreting each value as a double,
and does a sort of running average on them. The actual function doesn't
matter too much, I think it is the reading it in that is the problem.
Here's the code:
import Control.Exception
import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get
import System.IO
import Data.Binary.IEEE754
myGetter acc = do
e <- isEmpty
if e == True
then
return acc
else do
t <- getFloat64le
myGetter $! ((t+acc)/2)
myReader file = do
h <- openBinaryFile file ReadMode
bs <- BL.hGetContents h
return $ runGet (myGetter 0) bs
main = do
d <- myReader "data.bin"
evaluate d
This takes about three minutes to run on my (fairly modern) laptop.. The
equivilant C program takes about 5 seconds.
I'm sure I am doing something daft, but I can't for the life of me see
what. Any hints about how to get the profiler to show me useful stuff
would be much appreciated!
All the best,
Philip
PS: If, instead of computing a single value I try and build a list of
the values, the program ends up using over 2gb of memory to read a 200mb
file.. any ideas on that one?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
http://www.haskell.org/pipermail/beginners/attachments/20100429/5862ff1a/attachment-0001.html
------------------------------
Message: 7
Date: Fri, 30 Apr 2010 01:27:54 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Re: Iterating through a list of
char...
To: "Jean-Nicolas Jolivet" <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"
Am Donnerstag 29 April 2010 23:13:21 schrieb Jean-Nicolas Jolivet:
> > -- assuming that only characters > 'i' (chr 105) are escaped (or the
> > escape character itself, but that should be dropped regardless).
>
> Sorry, this is my fault for not being clearer about it: I am decoding
> text that is already encoded with a Binary to ASCII encoding... the
> exact encoding process is this:
>
> 1. Fetch a character from the input stream.
> 2. Increment the character's ASCII value by 42, modulo 256
> 3. If the result is a critical character ('\NUL', '\r', '\n' or '='),
> write the escape character ('=') to the output stream and increment
> character's ASCII value by 64, modulo 256. 4. Output the character to
> the output stream.
Okay, so there are no two escape characters in succession.
>
> I am writing a decoder here so obviously I am reversing the process....
> (I remove 42 for regular characters, 106 for special characters...and if
> the result is < 0, I add 256 to it...)
>
> Just adding (yet again) more context information here ;) Still trying to
> fully understand your last suggestions Daniel! :) Thanks again!
>
> Also, I wouldn't want anyone to think I just want someone to write the
> algorithm for me! :) Like I said, I already have a fully working
> algorithm, but being new to Haskell, I'm looking for ways to optimize it
> (or, really just different ways that I could do it!)
>
> Here's how I am doing it right now (I am using ByteString here,
Then the foldr is not the best option.
> but it should be pretty straightforward anyway)...
> I'm warning you, this is pretty ugly :)
>
> -- Decode an encoded ByteString
> -- the zip + tail method removes the first character so I am adding it
> afterward (ugly)
> decodeByteString :: L.ByteString -> L.ByteString
> decodeByteString str = do
> let str1 = mapMaybe decodepair (L.zip str(L.tail str))
> let firstChar = decodechar (ord(L.head str) - 42)
> L.pack (firstChar:str1)
Is it really necessary to pack it? That's a relatively expensive operation,
it may be better to have it return a String if you're not doing anything
with it after decoding except writing it to a file or so.
The zipping is also not the optimal choice, it takes a lot of checking for
the chunk-boundaries (I assume you're using lazy ByteStrings, since you
chose the prefix L), and you construct pairs only to deconstruct them
immediately (the compiler *may* optimise them away, but I'm skeptical).
>
> -- Decode a pair of character, returning either the
> -- decoded character or Nothing
> decodepair :: (Char, Char) -> Maybe Char
> decodepair cs
>
> | snd(cs) == '=' = Nothing
> | fst(cs) == '=' = Just (decodechar(ord(snd cs) - 106))
> | otherwise = Just (decodechar(ord(snd cs) - 42))
>
> -- Reverse the modulo 256...
> decodechar :: Int -> Char
> decodechar i
>
> | i < 0 = chr (i + 256)
> | otherwise = chr i
Since you're doing arithmetic modulo 256, that stuff can be done faster and
simpler with Word8.
----------------------------------------------------------------------
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString as S
import Data.ByteString.Unsafe (unsafeAt)
escape :: Word8 -> Word8
escape = (+150)
normal :: Word8 -> Word8
normal = (+214)
decodeW :: L.ByteString -> [Word8]
decodeW = dec False . L.toChunks
where
dec _ [] = []
dec esc (str:more) = go esc 0
where
!len = S.length str
{-# INLINE charAt #-}
charAt :: Int -> Word8
charAt i = unsafeAt str i
go !b !i
| i == len = dec b more
| b = escape (charAt i) : go False (i+1)
| otherwise = case charAt i of
61 -> go True (i+1)
c -> normal c : go False (i+1)
word8ToChar :: Word8 -> Char
word8ToChar = toEnum . fromIntegral
decodeC :: L.ByteString -> String
decodeC = map word8ToChar . decodeW
decodeBS :: L.ByteString -> L.ByteString
decodeBS = L.pack . decodeW
----------------------------------------------------------------------
>
> Jean-Nicolas Jolivet
>
> On 2010-04-29, at 4:50 PM, Daniel Fischer wrote:
> > Am Donnerstag 29 April 2010 21:37:15 schrieb Jean-Nicolas Jolivet:
> >> First I would like to thank everyone for the very interesting replies
> >> and suggestions I got so far!...
> >>
> >> I tried to implement (and at the very least understand) most of
> >> them!...
> >>
> >> To add to the context here, what I am trying to do is:
> >>
> >> -apply a "transformation" to a character (in my case, subtracting 42
> >> to its ASCII value, which I obtain with chr(ord(c) - 42) -if the
> >> character is preceded by a specific character (that would be, an
> >> escape character, in this case '=') then subtract 106 to its value
> >> instead of 42... -if the character is the escape character itself,
> >> '=', then skip it altogether
> >
> > Ah, that complicates matters a little.
> > - What happens if ord c < 42 (ord c < 106, if c is preceded by the
> > escape character?)
> > - What about escaped escape characters?
> >
> > However,
> >
> > foo xs = catMaybes $ zipWith f (' ':xs) xs
> > where
> > f _ '=' = Nothing
> > f '=' c = Just (chr $ ord c - 106)
> > f _ c = Just (chr $ ord c - 42)
> >
> > is still pretty simple, as is the direct recursion
> >
> > foo = go ' '
> > where
> > go _ ('=' :cs) = go '=' cs
> > go '=' (c:cs) = chr (ord c - 106) : go c cs
> > go _ (c:cs) = chr (ord c - 42) : go c cs
> > go _ _ = []
> >
> > -- assuming that only characters > 'i' (chr 105) are escaped (or the
> > escape character itself, but that should be dropped regardless).
> >
> > fooGen :: Char -> (Char -> Char) -> (Char -> Char) -> String -> String
> > fooGen e esc norm str = catMaybes $ zipWith f (d:str) str
> > where
> > d = if e == maxBound then pred e else succ e
> > f x y
> >
> > | y == e = Nothing
> > | x == e = Just (esc y)
> > | otherwise = Just (norm y)
> >
> > is an easy generalisation.
> >
> >> (keeping in mind that the next character needs to be
> >> escaped)...
> >>
> >> I managed to do it, however I'm not totally satisfied in the way I
> >> did it... the problem was that... as I just explained, in some cases,
> >> the character that is being processed has to be "skipped" (and by
> >> that I mean, not added to the resulting string). This happens when
> >> the processed character IS the escape character...
> >>
> >> What I did was to build a List of Maybe Char.... my function does the
> >> proper operation on the character and returns a "Just Char" when the
> >> character is processed, or Nothing when it is the escaped
> >> character... so basically I would end up with something like: [Just
> >> 'f', Just 'o', Just 'o', Nothing]... I am mapping this using mapMaybe
> >> to end up with a proper String...
> >>
> >> Would there be any more efficient way of doing this?
> >
> > That is already pretty efficient. The direct recursion is probably a
> > bit more efficient, but I don't think the difference will be large.
> >
> >> Considering that
> >> the escape character should NOT be added to the resulting string, is
> >> there any way I can avoid using the Maybe monad?
> >
> > Sure, apart from the direct recursion,
> >
> > fooGen e esc norm str = tail $ foldr f [] (d:str)
> > where
> > d = if e == maxBound then pred e else succ e
> > f x (y:zs)
> >
> > | y == e = x:zs
> > | x == e = x:esc y:zs
> > | otherwise = x:norm y:zs
> >
> > f x [] = [x]
> >
> > catMaybes and zipWith is clearer, though, and I don't think the foldr
> > will perform better.
> >
> >> Once again, thanks everyone for all the suggestions!
> >>
> >> Jean-Nicolas Jolivet
------------------------------
_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners
End of Beginners Digest, Vol 22, Issue 49
*****************************************