Beginners Digest, Vol 39, Issue 1

beginners-request Thu, 01 Sep 2011 03:00:50 -0700

Send Beginners mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        [email protected]


You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  Help me improve my code (Brent Yorgey)
   2.  Help with enumerator pipeline (Michael Xavier)
   3. Re:  Help with enumerator pipeline (David McBride)
   4. Re:  Help with enumerator pipeline (Felipe Almeida Lessa)
   5. Re:  Help with enumerator pipeline (Michael Xavier)


----------------------------------------------------------------------

Message: 1
Date: Wed, 31 Aug 2011 11:17:41 -0400
From: Brent Yorgey <[email protected]>
Subject: Re: [Haskell-beginners] Help me improve my code
To: Neuman Vong <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=iso-8859-1

On Wed, Aug 31, 2011 at 01:28:36AM -0700, Neuman Vong wrote:
> ... I like that this version is clearer to read, but
> unfortunately I've doubled the number of lines. ...

Of course, "clearer to read" depends on what you are experienced at
reading.  Personally, I find this version less clear to read. =)  But
trying lots of different ways to do something is always good.

> find :: Int -> String -> [Rate] -> Maybe Rate
> find _ _ []               = Nothing
> find index key rates
>     | length rates == 1   = Just (last rates)

In general, try to avoid testing the length of a list (which is
expensive if the list is long), and you should definitely never test
if the length of a list is equal to a small number.  You should also
avoid using 'last' (which can crash if the list is empty).  You can
solve both these problems at once by pattern matching the list:

  find index key [rate] = Just rate

>     | length key <= index = Just (longestPrefix rates)
>     | otherwise           = find (index + 1) key (match index key rates)

My general advice is to avoid writing explicitly recursive functions
as much as possible, instead preferring to write algorithms in terms
of existing recursive combinators like map, foldr, filter... In
general it makes your programs more modular and easier to understand
(once you get used to it).  However, in this particular case I am
having trouble figuring out exactly what 'find' does so I don't have a
good suggestion of a different way to write it.  Generally, the idea
is to think about incrementally transforming a data structure through
a number of intermediate steps until finally reaching the answer,
rather than computing the answer "all in one go".  In strict languages
this sort of repeated transformation of a data structure can be less
efficient, but Haskell's laziness makes it viable.  If you'd like an
example of the sort of thing I'm talking about let me know and I could
make one up.

> 
> match :: Int -> String -> [Rate] -> [Rate]
> match index key rates = do
>     rate <- rates
>     when (index < length (prefix rate))
>          (guard $ (prefix rate !! index) == (key !! index))
>     return rate
> 
> longestPrefix :: [Rate] -> Rate
> longestPrefix = maximumBy (compare `on` (length . prefix))
> 
> toRates :: [String] -> [Rate]
> toRates [] = []
> toRates rates = map toRate rates

toRates = map toRate.  map can correctly handle the empty list case
already.

> 
> toRate :: String -> Rate
> toRate line = let prefix:price:_ = splitOn' ", " line in Rate prefix price
> 
> splitOn' :: String -> String -> [String]
> splitOn' delim str = fmap unpack (splitOn (pack delim) (pack str))

If you are just going to be using strings anyway, it's silly to
convert to Text or Bytestring or whatever it is you are converting to
here just to get access to splitOn.  Instead you could use the splitOn
function from the Data.List.Split module ('cabal install split').

-Brent

> 
> main = getArgs >>= \args ->
>     case args of
>         []    -> getProgName >>= \progName -> error ("Usage: " ++
> progName ++ " <key>")
>         key:_ -> interact ((++ "\n") . show' . (find 0 key) . toRates . lines)
>         where show' Nothing = "Couldn't find a match"
>               show' (Just rate) = show rate
> 
> 
> On Tue, Aug 30, 2011 at 7:48 AM, Brent Yorgey <[email protected]> wrote:
> > On Mon, Aug 29, 2011 at 09:52:28PM -0700, Neuman Vong wrote:
> >> Hi Haskell people,
> >>
> >> I'm pretty new to Haskell still. There were a bunch of things I didn't
> >> know how to do in the following script, I'm hoping some people on this
> >> list can help with. For example, I had trouble returning an ExitCode
> >> and using getProgName without getting a compile-time type error. I
> >> feel like I'm doing something wrong with the Text/[Char] conversions
> >> too. I'd also really appreciate any style tips. Thanks in advance!
> >>
> >
> > Hi Neuman,
> >
> > This looks pretty good. ?I'm not very familiar with the Text library,
> > so perhaps someone else can comment on the conversions between Text
> > and String. ?But I can offer a few comments on style:
> >
> >> {-# LANGUAGE OverloadedStrings #-}
> >> module Main where
> >>
> >> import System (getArgs)
> >> import System.IO (hPutStrLn, stderr)
> >> import Data.Text (pack, splitOn, length, isPrefixOf, Text)
> >> import Prelude hiding (length)
> >>
> >> data Rate = Rate Text Text deriving (Show)
> >
> > If you use record syntax:
> >
> > ?data Rate = Rate { getPrefix :: Text, getPrice :: Text }
> >
> > then you get the selector functions getPrefix and getPrice for free
> > (so you don't have to write them in the 'where' clause below).
> >
> >>
> >> findBestPrefix number rates = foldl1 longestPrefix $ matching rates
> >> ? ?where
> >> ? ? ? ?getLength rate = length $ getPrefix rate
> >> ? ? ? ?getPrefix (Rate prefix _) = prefix
> >> ? ? ? ?getPrice (Rate _ price) = price
> >> ? ? ? ?longestPrefix r1 r2 = if getLength r1 > getLength r2 then r1
> >> else r2
> >> ? ? ? ?matching rates = [ rate | rate <- rates, getPrefix rate
> >> `isPrefixOf` number ]
> >
> > The matching function can also be implemented with a call to 'filter'. The
> > following three definitions are all equivalent, showing a progression
> > of simplification:
> >
> > ?matching rates = filter (\rate -> getPrefix rate `isPrefixOf` number) rates
> >
> > ?matching = filter (\rate -> getPrefix rate `isPrefixOf` number)
> >
> > ?matching = filter ((`isPrefixOf` number) . getPrefix)
> >
> > At this point you could even inline the definition of matching if you
> > wanted. ?You should use whichever of these definitions you find
> > clearest, I just wanted to show what is possible.
> >
> >>
> >> makeRates = map $ \line ->
> >> ? ?let (prefix:rate:_) = splitOn ", " (pack line) in Rate prefix rate
> >>
> >> main = getArgs >>= \args ->
> >> ? ?let findBest number rates = findBestPrefix number $ makeRates rates
> >> ? ?in case args of
> >> ? ? ? ?(arg:_) -> interact $ (++ "\n") . show . findBest (pack arg) . lines
> >> ? ? ? ?_ -> hPutStrLn stderr "Pass in a number as the first argument"
> >
> > One general tip: it helps a lot, especially when learning, to give
> > explicit type signatures to all your top-level functions. ?In fact, I
> > still do this. ?I *first* write down a type signature, and *then*
> > write an implementation. ?This may help you with your issues using
> > getProgName and and ExitCode, although without more information about
> > exactly what you were trying it's hard to know.
> >
> > -Brent
> >
> > _______________________________________________
> > Beginners mailing list
> > [email protected]
> > http://www.haskell.org/mailman/listinfo/beginners
> >
> 
> 
> 
> -- 
> neuman
> 



------------------------------

Message: 2
Date: Wed, 31 Aug 2011 12:40:42 -0700
From: Michael Xavier <[email protected]>
Subject: [Haskell-beginners] Help with enumerator pipeline
To: Haskell Beginners <[email protected]>
Message-ID:
        <CANk=zmgqeekr5vijfo63qx_vh_a3ztw--80yqxc6yyjxmqz...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

I've just recently started learning how to use the enumerator library. I'm
designing a utility that parses a file where each line is a JSON object.
I've designed it to look like:

source enumerator (enumHandle pretty much) ->
chunk by lines (enumeratee) ->
parse a line into an Object (enumeratee) ->
filter objects based on a criteria (enumeratee) ->
limit some keys from each object (enumeratee) ->
encode the object into a lazy bytestring (enumeratee) ->
output the file to stdout (iteratee)

I'm having difficulties with the types, particularly composing the
enumeratees in the middle. Someone in #haskell said that's a good case for
>=> from Control.Monad but that seems to not like it if an enumeratee
changes the type between the input and output.

Here's the relevant types:

type Object = Map Text Value

pipeline :: MonadIO m => (Enumerator ByteString IO ()) -> [Text] -> [Filter]
-> Iteratee a m ()
pipeline s rfs fs = s $$ splitLines    >=>
                         parseLine            >=>
                         (filterObjects fs)   >=>
                         (restrictFields rfs) >=>
                         encoder              $$
                         output
splitLines :: Monad m => Enumeratee ByteString ByteString m b
parseLine :: Monad m => Enumeratee ByteString Object m b
filterObjects :: Monad m => [Filter] -> Enumeratee Object Object m b
restrictObjects :: Monad m => [Text] -> Enumeratee Object Object m b
encoder :: Monad m => Enumeratee Object LBS.ByteString m b
output :: MonadIO m => Iteratee LBS.ByteString m ()

I'm using >=> because I'd prefer to compose them left-right for readability

Here's the error I get

 Couldn't match expected type `ByteString'
                with actual type `M.Map Text Value'
    Expected type: Data.Enumerator.Step ByteString m b1
                   -> Iteratee ByteString m b2
      Actual type: Enumeratee ByteString Object m0 b0
    In the first argument of `(>=>)', namely `parseLine'
    In the second argument of `(>=>)', namely
      `parseLine
     >=> (filterObjects fs) >=> (restrictFields rfs) >=> encoder'

Any ideas?

-- 
Michael Xavier
http://www.michaelxavier.net
LinkedIn <http://www.linkedin.com/pub/michael-xavier/13/b02/a26>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20110831/8b2e61a9/attachment-0001.htm>

------------------------------

Message: 3
Date: Wed, 31 Aug 2011 18:03:19 -0400
From: David McBride <[email protected]>
Subject: Re: [Haskell-beginners] Help with enumerator pipeline
To: Michael Xavier <[email protected]>
Cc: Haskell Beginners <[email protected]>
Message-ID:
        <can+tr41vhehvwm2+_2dcewkrg58q5yljqy8ijvw6d7+gnn3...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

I wanted to solve this but the only way I could get it to compile was:

run_ ((((((enumHandle undefined undefined $= splitLines) $= parseLine)
$= filterObjects undefined) $= restrictObjects undefined) $= encoder)
$$ output)

This looks terrible, and I'm sure there is a better way, but I can't
find it.  If you remove any of the parenthesis it complains.  I would
love to heard the "right" way to do this.


On Wed, Aug 31, 2011 at 3:40 PM, Michael Xavier <[email protected]> wrote:
> I've just recently started learning how to use the enumerator library. I'm
> designing a utility that parses a file where each line is a JSON object.
> I've designed it to look like:
> source enumerator (enumHandle pretty much) ->
> chunk by lines (enumeratee) ->
> parse a line into an Object (enumeratee) ->
> filter objects based on a criteria (enumeratee) ->
> limit some keys from each object (enumeratee) ->
> encode the object into a lazy bytestring (enumeratee) ->
> output the file to stdout (iteratee)
> I'm having difficulties with the types, particularly composing the
> enumeratees in the middle. Someone in #haskell said that's a good case for
>>=> from Control.Monad but that seems to not like it if an enumeratee
> changes the type between the input and output.
> Here's the relevant types:
> type Object = Map Text Value
> pipeline :: MonadIO m => (Enumerator ByteString IO ()) -> [Text] -> [Filter]
> -> Iteratee a m ()
> pipeline s rfs fs = s $$ splitLines ? ?>=>
> ? ? ? ? ? ? ? ? ? ? ? ? ?parseLine ? ? ? ? ? ?>=>
> ? ? ? ? ? ? ? ? ? ? ? ? ?(filterObjects fs) ? >=>
> ? ? ? ? ? ? ? ? ? ? ? ? ?(restrictFields rfs) >=>
> ? ? ? ? ? ? ? ? ? ? ? ? ?encoder ? ? ? ? ? ? ?$$
> ? ? ? ? ? ? ? ? ? ? ? ? ?output
> splitLines :: Monad m => Enumeratee ByteString ByteString m b
> parseLine :: Monad m => Enumeratee ByteString Object m b
> filterObjects :: Monad m => [Filter] -> Enumeratee Object Object m b
> restrictObjects :: Monad m => [Text] -> Enumeratee Object Object m b
> encoder :: Monad m => Enumeratee Object LBS.ByteString m b
> output :: MonadIO m => Iteratee LBS.ByteString m ()
> I'm using >=> because I'd prefer to compose them left-right for readability
> Here's the error I get
>
> ?Couldn't match expected type `ByteString'
> ? ? ? ? ? ? ? ? with actual type `M.Map Text Value'
> ? ? Expected type: Data.Enumerator.Step ByteString m b1
> ? ? ? ? ? ? ? ? ? ?-> Iteratee ByteString m b2
> ? ? ? Actual type: Enumeratee ByteString Object m0 b0
> ? ? In the first argument of `(>=>)', namely `parseLine'
> ? ? In the second argument of `(>=>)', namely
> ? ? ? `parseLine
> ? ? ?>=> (filterObjects fs) >=> (restrictFields rfs) >=> encoder'
> Any ideas?
> --
> Michael Xavier
> http://www.michaelxavier.net
> LinkedIn
>
> _______________________________________________
> Beginners mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/beginners
>
>



------------------------------

Message: 4
Date: Wed, 31 Aug 2011 22:50:45 -0300
From: Felipe Almeida Lessa <[email protected]>
Subject: Re: [Haskell-beginners] Help with enumerator pipeline
To: Michael Xavier <[email protected]>
Cc: Haskell Beginners <[email protected]>
Message-ID:
        <CANd=OGEjVUo5EVEn6vDbOjRLoUOUO=ssvnbk-n7wnchyegg...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

What about

pipeline :: MonadIO m => Enumerator ByteString m () -> [Text] ->
[Filter] -> Iteratee ByteString m ()
pipeline s rfs fs = s $$ splitLines
                      =$ parseLine
                      =$ filterObjects fs
                      =$ restrictObjects rfs
                      =$ encoder
                      =$ output

Cheers,

-- 
Felipe.



------------------------------

Message: 5
Date: Wed, 31 Aug 2011 19:59:08 -0700
From: Michael Xavier <[email protected]>
Subject: Re: [Haskell-beginners] Help with enumerator pipeline
To: Felipe Almeida Lessa <[email protected]>
Cc: Haskell Beginners <[email protected]>
Message-ID:
        <CANk=zmhaxxhbqw5oihgjvzb7ivno8tv26ywhytivq65cqry...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

It compiles! I'm glad I got to preserve the ordering of the enumeratees.

Thanks!

On Wed, Aug 31, 2011 at 6:50 PM, Felipe Almeida Lessa <
[email protected]> wrote:

> What about
>
> pipeline :: MonadIO m => Enumerator ByteString m () -> [Text] ->
> [Filter] -> Iteratee ByteString m ()
> pipeline s rfs fs = s $$ splitLines
>                      =$ parseLine
>                      =$ filterObjects fs
>                       =$ restrictObjects rfs
>                      =$ encoder
>                      =$ output
>
> Cheers,
>
> --
> Felipe.
>



-- 
Michael Xavier
http://www.michaelxavier.net
LinkedIn <http://www.linkedin.com/pub/michael-xavier/13/b02/a26>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20110831/5b7ff304/attachment-0001.htm>

------------------------------

_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners


End of Beginners Digest, Vol 39, Issue 1
****************************************

Beginners Digest, Vol 39, Issue 1

Reply via email to