Beginners Digest, Vol 52, Issue 17

beginners-request Fri, 12 Oct 2012 13:19:16 -0700

Today's Topics:

   1.  parsec problem (Tobias)
   2.  conduit and happstack dependence problem (??????? ???)
   3. Re:  calling inpure functions from pure code (Sean Perry)
   4. Re:  calling inpure functions from pure code (Emmanuel Touzery)
   5. Re:  conduit and happstack dependence problem (Antoine Latter)
   6. Re:  calling inpure functions from pure code (Antoine Latter)
   7. Re:  calling inpure functions from pure code (Sean Perry)

----------------------------------------------------------------------

Message: 1
Date: Fri, 12 Oct 2012 19:20:42 +0200
From: Tobias <[email protected]>
Subject: [Haskell-beginners] parsec problem
To: "[email protected]" <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"

Hello,

I would like to parse the input "word1 word2 word3 ." into 
["word1","word2","word3"] using Parsec.

My code below fails with:

> unexpected "."
> expecting letter or digit

I guess the problem is that the blank before the dot is considered as
belonging to the "sepBy word blank" parsing and therefore a next word is
expected, and it is missing.
I would like the "sepBy word blank" parsing to stop after "word3".
How can I do this?

sentence :: Parser [String]
sentence =  do words <- word `sepBy` blank
               blank
               oneOf ".?!"
               return words

word :: Parser String
word = many1 (letter <|> digit)

blank :: Parser String
blank = string " "

Regards,
Tobias
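
One possible fix, offered as a sketch and not as a reply from the thread:
treat each blank as the terminator of the word before it, using Parsec's
endBy1 combinator in place of sepBy. The loop then stops cleanly when the next
character is not the start of a word. The helper parseSentence is a
hypothetical name added only for trying the parser out.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- A sentence is one or more words, each followed by a blank,
-- and then a closing punctuation mark.
sentence :: Parser [String]
sentence = do
  ws <- word `endBy1` blank   -- each blank belongs to the word before it
  _  <- oneOf ".?!"
  return ws

word :: Parser String
word = many1 (letter <|> digit)

blank :: Parser String
blank = string " "

-- Hypothetical helper (not in the original post) for running the parser.
parseSentence :: String -> Either ParseError [String]
parseSentence = parse sentence "<input>"
```

With this version, parseSentence "word1 word2 word3 ." gives
Right ["word1","word2","word3"]. Input with no blank before the final
punctuation mark is still rejected, which matches the original grammar.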

------------------------------

Message: 2
Date: Sat, 13 Oct 2012 01:52:32 +0700
From: ??????? ??? <[email protected]>
Subject: [Haskell-beginners] conduit and happstack dependence problem
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="us-ascii"

An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20121013/684ffd43/attachment-0001.htm>

------------------------------

Message: 3
Date: Fri, 12 Oct 2012 11:58:20 -0700
From: Sean Perry <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: Haskell Beginer <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=windows-1252

On Oct 12, 2012, at 7:19 AM, Emmanuel Touzery wrote:
>> 
>> Overall, splitting your algorithm into simple steps - steps that would
>> do just a part of work and return incomplete objects - is the way to go.
>>
>
> You have a point about splitting code into smaller functions. I would just
> rather have getDetails called from getProgramme rather than a parent calling
> both separately. And the parent must do the connection by doing the IO if I
> want both pieces to be pure. That is what is bothering me mostly.
> 

Think about this from a testing perspective. How do you verify that your code
which identifies links is working? If the link finding is mixed in with the
link retrieving, you end up having to dummy out the IO. Consider what happens
as the code becomes more complicated and, as Alexander suggests, you later want
to retrieve images too. Now you need to mock out the image retrieval as well.

Perhaps you should think of this as creating a matching DOM-like structure.
First your tree starts out empty. Then you parse the top level and return a new
tree with data and dangling nodes that are links needing to be followed. You
check "have I gone as deep as I would like?". If not, pass the new partial
tree to the retrieval routine and start filling it in. Now you are back at the
depth check. When the retrieval has reached its goal, the tree is returned,
as populated as it can be. Now the rest of your code can use the tree for
whatever it needs.

Remember to always ask "how do I test this?". One of the key reasons to keep 
purity is it makes the testing so much easier. Every small piece can be 
verified.
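
The tree-with-dangling-links idea above could be sketched roughly as follows.
Everything here is an assumption made for illustration: the Tree type, the toy
page format (first line is a title, each later line is a URL), and the names
parsePage, fill and fakeFetch. The point is only that parsing stays pure while
the fetcher is passed in, so a test can supply canned pages with no network
access.

```haskell
import Data.Maybe (fromMaybe)

type URL = String

-- A filled-in node carries a title and children; a Link is a
-- dangling node that still has to be fetched.
data Tree
  = Node String [Tree]
  | Link URL
  deriving (Eq, Show)

-- Pure step (hypothetical page format): the first line is the
-- title, every following line is a URL to follow.
parsePage :: String -> Tree
parsePage body = case lines body of
  []             -> Node "" []
  (title : urls) -> Node title (map Link urls)

-- Effectful step: follow dangling links until the depth budget is
-- used up. The fetcher is a parameter, so tests can pass a fake one.
fill :: Monad m => (URL -> m String) -> Int -> Tree -> m Tree
fill fetch depth tree = case tree of
  Node title kids -> Node title <$> mapM (fill fetch depth) kids
  Link url
    | depth <= 0 -> return tree          -- deep enough: leave it dangling
    | otherwise  -> do
        body <- fetch url
        fill fetch (depth - 1) (parsePage body)

-- Canned pages standing in for the real site in tests.
fakeFetch :: URL -> IO String
fakeFetch url = return (fromMaybe "" (lookup url pages))
  where
    pages = [("/a", "Page A\n/b"), ("/b", "Page B")]
```

In a test, fill fakeFetch 1 (parsePage "Listing\n/a") follows one level of
links and leaves "/b" as a dangling Link, while a depth of 2 resolves it too.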

------------------------------

Message: 4
Date: Fri, 12 Oct 2012 21:44:16 +0200
From: Emmanuel Touzery <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: [email protected]
Message-ID:
        <CAC42RenkmX3jbiO5Jw59=eJv==7pyAJ1BX1j=kf6r2dvnn1...@mail.gmail.com>
Content-Type: text/plain; charset="windows-1252"

On Fri, Oct 12, 2012 at 8:58 PM, Sean Perry <[email protected]> wrote:

> On Oct 12, 2012, at 7:19 AM, Emmanuel Touzery wrote:
> >>
> >> Overall, splitting your algorithm into simple steps - steps that would
> >> do just a part of work and return incomplete objects - is the way to go.
> >>
> >
> > You have a point about splitting code into smaller functions. I would
> > just rather have getDetails called from getProgramme rather than a parent
> > calling both separately. And the parent must do the connection by doing the
> > IO if I want both pieces to be pure. That is what is bothering me mostly.
> >
>
> Think about this from a testing perspective. How do you verify that your
> code which identifies links is working? If the link finding is mixed in
> with the link retrieving, you end up having to dummy out the IO. Consider
> what happens as the code becomes more complicated and, as Alexander
> suggests, you later want to retrieve images too. Now you need to mock out
> the image retrieval as well.
>
> Perhaps you should think of this as creating a matching DOM-like
> structure. First your tree starts out empty. Then you parse the top level
> and return a new tree with data and dangling nodes that are links needing
> to be followed. You check "have I gone as deep as I would like?". If not,
> pass in the new partial tree to the retrieval routine and start filling it
> in. Now you are back to the depth check. When the retrieval has reached its
> goal the tree is returned and it is as populated as it can be. Now the rest
> of your code can use the tree for whatever it needs.
>
> Remember to always ask "how do I test this?". One of the key reasons to
> keep purity is it makes the testing so much easier. Every small piece can
> be verified.
>

Thank you for your opinion, it does bring another set of concerns.

What you suggest is the approach proposed by Daniel Trstenjak in the very
first answer, and it definitely has value, but the question is code
readability. It's a fine balance. That's what I was asking at the
beginning: how hard should we strive for pure code? Here the balance
seems to depend on the person (while I thought it was dogma in the
Haskell community to write as much pure code as possible).

In this case I think purity means more code to be written (re-reading
and re-writing data structures instead of writing them just once) and I'm
not sure it's worth the cost. Daniel Trstenjak's second answer
convinced me, but I'm just starting with Haskell and I guess I'll get a
clearer sense of this with time.

But it's also good to see that there is consensus on how to code this, if
we want to maximize pure code.

Emmanuel

------------------------------

Message: 5
Date: Fri, 12 Oct 2012 14:59:06 -0500
From: Antoine Latter <[email protected]>
Subject: Re: [Haskell-beginners] conduit and happstack dependence
        problem
To: ??????? ??? <[email protected]>
Cc: [email protected], [email protected]
Message-ID:
        <cakjsnqeejjklrkywdf9tafrpnyhze1sss1nuqrm+iqdp6sq...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

CCing happstack mailing list.

On Fri, Oct 12, 2012 at 1:52 PM, ??????? ??? <[email protected]> wrote:
> I'm using conduit to download and parse a file. Conduit depends on
> transformers >= 0.4.
> Now I'm going to use happstack, but happstack-server depends on transformers
> < 0.4.
> Is there a way to use conduit and happstack together?

------------------------------

Message: 6
Date: Fri, 12 Oct 2012 15:06:13 -0500
From: Antoine Latter <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: Emmanuel Touzery <[email protected]>
Cc: [email protected]
Message-ID:
        <CAKjSnQGaBkbQVAQoXs=ahueAn7b=pmiqu1kzlhqceky+ffy...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

On Fri, Oct 12, 2012 at 8:28 AM, Emmanuel Touzery <[email protected]> wrote:
> Hi,
>
>
>> when parsing the string representing a page, you could
>> save all the links you encounter.
>>
>> After the parsing you would load the linked pages and start
>> again parsing.
>>
>> You would redo this until no more links are returned or a
>> maximum depth is reached.
>
>
> Thanks for the tip. That sounds much more reasonable than what I mentioned.
> It seems a bit "spaghetti" to me though in a way (but maybe I just have to
> get used to the Haskell way).
>
> To be more specific about what I want to do: I want to parse TV programs. On
> the first page I have the daily listing for a channel. start/end hour,
> title, category, and link or not.
> To fully parse one TV program I can follow the link if it's present and get
> the extra info which is there (summary, pictures..).
>

If this were me, I would write the following:

data ChannelListing = ChannelListing [BasicProgramInfo]

-- | Summary of a program
data BasicProgramInfo =
  BasicProgramInfo
    { basicStartTime :: ...
    , basicEndTime :: ...
    , basicTitle :: ...
    , basicUrl :: URL
    }

-- | Full details of a program
data ProgramInfo = ...

fetchChannelListing :: ChannelId -> IO ChannelListing

fetchProgramInfo :: BasicProgramInfo -> IO ProgramInfo

And then I would string my program together from these primitives.
That way large portions of the code can be built up from the pure data
types, but the top-level can load them up as needed with impure
functions.
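
As a rough illustration of stringing these primitives together, here is a
self-contained sketch. The concrete field types (plain strings), the
programSummary field, and the names loadChannel, describe, stubListing and
stubInfo are all assumptions made for the example; they are not part of the
outline above. The two fetchers are passed in as arguments so that canned
versions can stand in for real network code.

```haskell
type ChannelId = String
type URL       = String

newtype ChannelListing = ChannelListing [BasicProgramInfo]

-- Summary of a program (field types are placeholders).
data BasicProgramInfo = BasicProgramInfo
  { basicStartTime :: String
  , basicTitle     :: String
  , basicUrl       :: URL
  } deriving (Eq, Show)

-- Full details of a program (hypothetical shape).
data ProgramInfo = ProgramInfo
  { programBasic   :: BasicProgramInfo
  , programSummary :: String
  } deriving (Eq, Show)

-- Pure: works on plain data, so it can be tested without any IO.
describe :: ProgramInfo -> String
describe p = basicStartTime b ++ " " ++ basicTitle b ++ ": " ++ programSummary p
  where
    b = programBasic p

-- Top level: the only function here that sequences the impure fetchers.
loadChannel
  :: (ChannelId -> IO ChannelListing)
  -> (BasicProgramInfo -> IO ProgramInfo)
  -> ChannelId
  -> IO [ProgramInfo]
loadChannel fetchListing fetchInfo channel = do
  ChannelListing basics <- fetchListing channel
  mapM fetchInfo basics

-- Canned stand-ins for fetchChannelListing and fetchProgramInfo.
stubListing :: ChannelId -> IO ChannelListing
stubListing _ = return (ChannelListing
  [ BasicProgramInfo "20:00" "News" "/news"
  , BasicProgramInfo "21:00" "Film" "/film"
  ])

stubInfo :: BasicProgramInfo -> IO ProgramInfo
stubInfo b = return (ProgramInfo b ("summary for " ++ basicUrl b))
```

A real program would pass its actual fetchers to loadChannel; a test can call
loadChannel stubListing stubInfo and check the result with describe.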

This is just my first impression, though.

Antoine

------------------------------

Message: 7
Date: Fri, 12 Oct 2012 13:22:36 -0700
From: Sean Perry <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: Haskell Beginer <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii

On Oct 12, 2012, at 12:44 PM, Emmanuel Touzery wrote:
> 
> But it's also good to see that there is consensus on how to code this, if we 
> want to maximize pure code.
> 

I am still working my way through Haskell as well. I still code more in Python
or C++. In my experience, code that mixes logic and I/O is faster to develop
early on. Then features start coming in, the code size grows and you start to
think about maintaining it. In Python or C++ this is when you would start
thinking about refactoring the I/O out of the code. I like that Haskell gives
me strong nudges in this direction from the beginning.

Typical first draft of a Python function:

def foo(filename):
    # open file
    # read data
    # work on data
    # return result

Typical second draft:

def foo2(handle):
    # read data from handle, now it works with all kinds of I/O sources
    # work on data
    # return result

Common ending point:

def foo3(data):
    # work on data, now we do not care where the data came from
    # return result
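
The same three stages can be written in Haskell, where the types make the
shift visible. This is only a sketch: word counting stands in for "work on
data", and the function names are hypothetical.

```haskell
import Control.Exception (evaluate)
import System.IO (Handle, IOMode (ReadMode), hGetContents, withFile)

-- Ending point: pure, does not care where the data came from.
countWords :: String -> Int
countWords = length . words

-- Second draft: works with any Handle (file, stdin, socket, ...).
-- evaluate forces the count so all input is read before returning.
countWordsHandle :: Handle -> IO Int
countWordsHandle h = hGetContents h >>= evaluate . countWords

-- First draft: tied to a file name.
countWordsFile :: FilePath -> IO Int
countWordsFile path = withFile path ReadMode countWordsHandle
```

Only countWords needs testing for its logic; the two IO wrappers just feed it
data, and the IO in their types marks exactly where the outside world is
involved.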

In programming we often trade one efficiency for another. If the parsing takes
a few more seconds but I can hand the code off to someone else to add the next
feature, then that is a worthwhile trade-off for me. Or, more often, I only
touch the code a few times a year. The smaller and better contained the pieces,
the quicker I can get back to work.

A frequent problem I have as a paid developer is dealing with code bases where
management fears change because testing was not a consideration. So a few
years in, there is a pile of code that was well thought out originally but is
now a jumbled mess, and no one knows what happens when you twiddle just one
part of it. In my opinion every developer needs to internalize this and plan
for the future from the beginning. Now, we know that some things are going to
be 30 lines long and not change much later. In those cases over-engineering is
not worth it, obviously. But avoiding better design out of concern for
performance is a path to failure.

------------------------------

End of Beginners Digest, Vol 52, Issue 17
*****************************************