Send Beginners mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  How would you improve this program? (Ovidiu Deac)
   2. Re:  How would you improve this program? (Brent Yorgey)
   3.  using record in aeson (Rick Murphy)
   4. Re:  quickCheck generation question (Joe Van Dyk)
   5.  tokenizing a string and parsing the string (kolli kolli)
   6. Re:  tokenizing a string and parsing the string (Stephen Tetley)
   7. Re:  tokenizing a string and parsing the string
      (Erik de Castro Lopo)
   8. Re:  tokenizing a string and parsing the string (Christian Maeder)


----------------------------------------------------------------------

Message: 1
Date: Tue, 11 Oct 2011 14:04:23 +0300
From: Ovidiu Deac <[email protected]>
Subject: Re: [Haskell-beginners] How would you improve this program?
To: Lorenzo Bolla <[email protected]>
Cc: [email protected]
Message-ID:
        <CAKVsE7u6PjfDTPq5TCmbdx+eZtKRW=wwvssw+-1m_6ct_ut...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Your main function calls printTokens, which in turn calls printToken.
Both of the print* functions are IO (), which I see as bad style.

It would be cleaner to build the string to be printed in some nice
pure functions and then print that string directly in main. That way
you isolate the IO actions in the main function only.

I'm thinking something like this:

tokenToString :: Int -> Int -> Token -> String
tokenToString maxLength maxCount (Token w c) =
    ...

tokenListToString :: [Token] -> String
tokenListToString tokens =
    join "\n" result -- from Data.List.Utils
    where
        result = map (tokenToString maxLength maxCount) sortedTokens
        ...

main = do
    words <- getContents
    let output = tokenListToString $ countTokens words
    putStr output

This way your function types actually mean something, instead of just
being IO () functions which could do whatever they like.
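
To make the suggestion concrete, here is a self-contained sketch of
that structure (the Token type, the column widths, and the sorting are
my guesses at the lab's shape, not Lorenzo's actual code):

```haskell
import qualified Data.Map.Strict as M
import Data.List (sortBy)
import Data.Ord (comparing, Down (..))

-- A token is a word paired with its count.
data Token = Token String Int

-- Count occurrences of each word (pure).
countTokens :: String -> [Token]
countTokens = map (uncurry Token) . M.toList
            . M.fromListWith (+) . map (\w -> (w, 1)) . words

-- Render one token, padding the word to the given width (pure).
tokenToString :: Int -> Token -> String
tokenToString width (Token w c) =
    w ++ replicate (width - length w + 1) ' ' ++ show c

-- Render the whole list, most frequent first (pure).
tokenListToString :: [Token] -> String
tokenListToString tokens = unlines (map (tokenToString width) sorted)
  where
    width  = maximum (0 : [length w | Token w _ <- tokens])
    sorted = sortBy (comparing (\(Token _ c) -> Down c)) tokens

-- All the IO stays in main.
main :: IO ()
main = getContents >>= putStr . tokenListToString . countTokens
```

Everything above main is pure and can be tested in ghci without
running the program.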

ovidiu

On Sun, Oct 9, 2011 at 11:11 PM, Lorenzo Bolla <[email protected]> wrote:
> Hi all,
> I'm new to Haskell and I'd like you to take a look at one of my programs and
> tell me how you would improve it (in terms of efficiency, style, and so
> on!).
> The source code is
> here: https://github.com/lbolla/stanford-cs240h/blob/master/lab1/lab1.hs
> The program is an implementation of this
> problem: http://www.scs.stanford.edu/11au-cs240h/labs/lab1.html (basically,
> counting how many times a word appears in a text.)
> (I'm not a Stanford student, so by helping me out you won't help me to cheat
> my exam, don't worry!)
> I've implemented 3 versions of the algorithm:
>
> a Haskell version using the standard "sort": read all the words from stdin,
> sort them and group them.
> a Haskell version using map: read all the words from stdin, stick each word
> in a Data.Map incrementing a counter if the word is already present in the
> map.
> a Python version using defaultdict.
>
> I timed the different versions and the results are
> here: https://github.com/lbolla/stanford-cs240h/blob/master/lab1/times.png.
> The python version is the quickest (I stripped out the fancy formatting
> before benchmarking, so IO is not responsible for the time difference).
> Any comments on the graph, too?
> Thanks a lot!
> L.
> _______________________________________________
> Beginners mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/beginners
>
>



------------------------------

Message: 2
Date: Tue, 11 Oct 2011 11:28:49 -0400
From: Brent Yorgey <[email protected]>
Subject: Re: [Haskell-beginners] How would you improve this program?
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii

On Tue, Oct 11, 2011 at 02:04:23PM +0300, Ovidiu Deac wrote:

>     join "\n" result -- from Data.List.Utils

By the way, 'join "\n" result' is better written 'unlines result' (and
'join' is better written 'intercalate'). (Otherwise I completely agree
with your email.)
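
For reference, the only difference is the trailing newline:

```haskell
import Data.List (intercalate)

-- unlines appends a newline after every element;
-- intercalate "\n" only puts one between elements.
example :: (String, String)
example = ( unlines ["foo", "bar"]           -- "foo\nbar\n"
          , intercalate "\n" ["foo", "bar"]  -- "foo\nbar"
          )
```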

-Brent



------------------------------

Message: 3
Date: Tue, 11 Oct 2011 21:17:55 -0400
From: Rick Murphy <[email protected]>
Subject: [Haskell-beginners] using record in aeson
To: [email protected]
Message-ID: <1318382275.15216.4.camel@metho-laptop>
Content-Type: text/plain; charset="UTF-8"

Hi All:

I've been elaborating on the aeson examples and wondered whether someone
could clarify the syntax for using a record in a pair. My goal is to
substitute a record for the list of pairs created through the data
constructor O [(T.Text, Value)] in MyPair below; the reason is to embed
the semantics of the JSON file in the record. To reproduce, just
uncomment the lines in the source below.

The json file structure is as follows:
{"outer":{"type":"literal","value":"rick"}}

Note that my naive attempt in the commented lines returns the following
message from ghci; 'f0 b0' doesn't give me much to go on.

-- E1.hs:35:41:
--     Couldn't match expected type `MyRecord' with actual type `f0 b0'
--     In the expression: MyRecord <$> o'' .: "type" <*> o'' .: "value"
--     In the first argument of `R', namely
--       `(t, MyRecord <$> o'' .: "type" <*> o'' .: "value")'
--     In the expression: R (t, MyRecord <$> o'' .: "type" <*> o'' .: "value")
-- Failed, modules loaded: none.

{-# LANGUAGE OverloadedStrings #-}

module Main where

import Control.Applicative
import Control.Monad (mzero)

import qualified Data.ByteString as B
import qualified Data.Map as M
import qualified Data.Text as T

import Data.Aeson
import qualified Data.Aeson.Types as J
import Data.Attoparsec

-- data MyRecord = MyRecord {s :: String, u :: String} deriving (Show)

data MyPair = O (T.Text, [(T.Text, Value)])
           -- | R (T.Text, MyRecord) 
              deriving (Show)

data ExifObject = ExifObject [MyPair]
                deriving Show

data Exif       = Exif [ExifObject]
                deriving Show

instance FromJSON ExifObject
  where
    parseJSON (Object o) = ExifObject <$> parseObject o
      where
        parseObject o' = return $ map toMyPair (M.assocs o')

        toMyPair (t, Object o'')= O (t, M.assocs o'')
--      toMyPair (t, Object o'') = R (t, MyRecord <$> o'' .: "type" <*> o'' .: "value")
        toMyPair _              = error "unexpected"

    parseJSON _          = mzero

parseAll :: B.ByteString -> [ExifObject]
parseAll s = case (parse (fromJSON <$> json) s) of
  Done _ (Error err)  -> error err
  Done ss (Success e) -> e:(parseAll ss)
  _                   -> []

main :: IO ()
main = do s <- B.readFile "e1.json"
          let p = Exif $ parseAll s
          print p
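
A guess at what GHC is saying: MyRecord <$> o'' .: "type" <*> o'' .:
"value" has type Parser MyRecord, not MyRecord, so it cannot sit inside
R (t, ...) directly. The same mismatch can be reproduced with Maybe
standing in for aeson's Parser (lookupField is a made-up stand-in for
the .: operator):

```haskell
data MyRecord = MyRecord { s :: String, u :: String }
  deriving (Show, Eq)

-- Hypothetical stand-in for (.:) against the object
-- {"type":"literal","value":"rick"}; Maybe plays the role of Parser.
lookupField :: String -> Maybe String
lookupField "type"  = Just "literal"
lookupField "value" = Just "rick"
lookupField _       = Nothing

-- This has type Maybe MyRecord, NOT MyRecord -- the same shape
-- of mismatch GHC reports as `f0 b0` in the aeson code.
parsed :: Maybe MyRecord
parsed = MyRecord <$> lookupField "type" <*> lookupField "value"
```

If that is the problem, toMyPair would have to return Parser MyPair
(for instance R . (,) t <$> (MyRecord <$> o'' .: "type" <*> o'' .:
"value")) and parseObject would use mapM toMyPair rather than
return . map toMyPair -- an untested sketch, not a verified fix.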

--
Rick




------------------------------

Message: 4
Date: Tue, 11 Oct 2011 20:11:01 -0700
From: Joe Van Dyk <[email protected]>
Subject: Re: [Haskell-beginners] quickCheck generation question
To: Christian Maeder <[email protected]>
Cc: beginners <[email protected]>
Message-ID:
        <CACfv+pLM14MjhhgvyN=BrmR_yYSec0=oUHiDJ7ZTxi=wyny...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

On Mon, Oct 10, 2011 at 6:26 AM, Christian Maeder
<[email protected]> wrote:
> Am 08.10.2011 01:40, schrieb Joe Van Dyk:
>>
>> I'm going through the 99 Haskell problems and am trying to write
>> quickCheck properties for each one.
>>
>> -- 3 (find k'th element of a list)
>> element_at xs x = xs !! x
>> prop_3a xs x = (x < length xs && x >= 0) ==>
>>     element_at xs (x::Int) == (xs !! x::Int)
>
> The definition and test look very similar (basically "=" is replaced by
> "=="). So this seems to test reliability of definitions and Eq instances.
> Such tests should not be necessary. (Testing different implementations for
> equality makes more sense.)

Well, yes, of course.  The point is learning how the tests work, not
the code under test.
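
As a sketch of what the ==> precondition in prop_3a is doing
(restricting the inputs to valid indices so (!!) never goes out of
bounds), the same property can be checked by hand over a small domain,
with no QuickCheck involved:

```haskell
element_at :: [a] -> Int -> a
element_at xs x = xs !! x

-- The body of prop_3a, without QuickCheck's ==> wrapper.
prop_3a :: [Int] -> Int -> Bool
prop_3a xs x = element_at xs x == xs !! x

-- Exhaustive check over a hand-picked domain; the list
-- comprehension generates only the indices ==> would allow.
holds :: Bool
holds = and [ prop_3a xs x
            | xs <- [[], [1], [1,2,3]]
            , x  <- [0 .. length xs - 1] ]
```

QuickCheck does the same thing with random inputs and discards the
cases that fail the precondition.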

Joe



------------------------------

Message: 5
Date: Tue, 11 Oct 2011 23:28:42 -0600
From: kolli kolli <[email protected]>
Subject: [Haskell-beginners] tokenizing a string and parsing the
        string
To: [email protected]
Message-ID:
        <CAE7D9k7tQCYU02xif0RDUO=tgotfd8nfwxyivm0df_radbr...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

Can anyone help how to tokenize a string and parse it?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20111011/9ba00a8f/attachment-0001.htm>

------------------------------

Message: 6
Date: Wed, 12 Oct 2011 06:34:43 +0100
From: Stephen Tetley <[email protected]>
Subject: Re: [Haskell-beginners] tokenizing a string and parsing the
        string
Cc: [email protected]
Message-ID:
        <cab2tprb1ay_qfcayj1kebc8kmionqto8s30zxeovcpviush...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

In combinator parsing with, say, Parsec, you don't tokenize the input
before parsing - this is an instance of so-called "scannerless" parsing
(a slight exaggeration, for the sake of simplicity).

If you needed to tokenize then parse, this is the model followed by
Alex and Happy.

On 12 October 2011 06:28, kolli kolli <[email protected]> wrote:
...
> Can anyone help how to tokenize a string and parse it.



------------------------------

Message: 7
Date: Wed, 12 Oct 2011 19:39:10 +1100
From: Erik de Castro Lopo <[email protected]>
Subject: Re: [Haskell-beginners] tokenizing a string and parsing the
        string
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=US-ASCII

Stephen Tetley wrote:

> In combinator parsing with, say, Parsec, you don't tokenize the input
> before parsing - this is an instance of so-called "scannerless" parsing
> (a slight exaggeration, for the sake of simplicity).
> 
> If you needed to tokenize then parse, this is the model followed by
> Alex and Happy.

It is actually possible to use alex to split the input into tokens and
then use Parsec to parse the stream of tokens. Token parsers tend
to run a bit faster than Char parsers.

Erik
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/



------------------------------

Message: 8
Date: Wed, 12 Oct 2011 11:24:03 +0200
From: Christian Maeder <[email protected]>
Subject: Re: [Haskell-beginners] tokenizing a string and parsing the
        string
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Despite the term "scannerless" parsing, you'll typically still have 
"lexical rules" for the tokens (identifiers, numbers, separators, etc.) 
as well as normal parser/grammar rules.

I recommend using parsec as the scanner, too (avoiding a separate 
tokenizer). I don't think speed matters that much.

The point is that after every token, the spaces or comments before the 
next token starts must be consumed from the input. (I call this 
"skipping"; Daan Leijen has a "lexeme" parser for this in his 
Parsec.Token module.)
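
That "skipping" idea can be sketched with a toy hand-rolled parser
(this is not Parsec, just an illustration of what a lexeme combinator
does):

```haskell
import Data.Char (isDigit, isSpace)

-- A minimal hand-rolled parser type, standing in for Parsec.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

-- Read a run of digits from the front of the input.
number :: Parser Int
number = Parser $ \s -> case span isDigit s of
    ("", _)  -> Nothing
    (ds, s') -> Just (read ds, s')

-- "Skipping": run p, then consume the trailing whitespace,
-- so the next token starts cleanly.
lexeme :: Parser a -> Parser a
lexeme (Parser p) = Parser $ \s -> do
    (a, s') <- p s
    Just (a, dropWhile isSpace s')
```

For example, runParser (lexeme number) "42   7" gives Just (42, "7"):
the digits are consumed and the spaces skipped, leaving the next token
at the front.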

HTH Christian

Am 12.10.2011 10:39, schrieb Erik de Castro Lopo:
> Stephen Tetley wrote:
>
>> In combinator parsing with, say, Parsec, you don't tokenize the input
>> before parsing - this is an instance of so-called "scannerless" parsing
>> (a slight exaggeration, for the sake of simplicity).
>>
>> If you needed to tokenize then parse, this is the model followed by
>> Alex and Happy.
>
> It is actually possible to use alex to split the input into tokens and
> then use Parsec to parse the stream of tokens. Token parsers tend
> to run a bit faster than Char parsers.
>
> Erik



------------------------------

_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners


End of Beginners Digest, Vol 40, Issue 15
*****************************************
