Beginners Digest, Vol 52, Issue 16

beginners-request Fri, 12 Oct 2012 07:20:32 -0700

Send Beginners mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        [email protected]


You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  calling inpure functions from pure code (Emmanuel Touzery)
   2. Re:  calling inpure functions from pure code (Daniel Trstenjak)
   3. Re:  calling inpure functions from pure code (Alexander Batischev)
   4. Re:  calling inpure functions from pure code (Emmanuel Touzery)
   5. Re:  calling inpure functions from pure code (Emmanuel Touzery)


----------------------------------------------------------------------

Message: 1
Date: Fri, 12 Oct 2012 15:54:23 +0200
From: Emmanuel Touzery <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"

Hello,

Thanks for the tip!

I'm in fact using dom-selector:
http://hackage.haskell.org/package/dom-selector
which is based on xml-conduit and html-conduit. The reason being that it 
offers CSS selectors and is generally much higher-level than what I 
would do with parsec.

So I'm not sure whether what you wrote applies.
Actually your function doing the parsing here is not pure as such, it's 
a do block and ordered. What I have done so far is that dom-selector 
gives me the DOM structure of the page (so that parsing part is done for 
me), and then I give to my function that DOM structure and the 
examination of that DOM structure is completely without a do block, it's 
not ordered, it's pure. In that way my "parsing" (really examination of 
the DOM tree) is completely split off from any IO or other monad.

I think when you are within parsec as you mentioned, you are within the 
parsec monad (bear in mind I don't really understand all of this for 
now), and to do IO you need to go to the IO monad, and for that you use 
liftIO. In that case that's another problem than the one I'm having.

Emmanuel

On 12.10.2012 15:39, David McBride wrote:
> There's a better option in my opinion.  Use the monad transformer 
> capability of the parser you are using (I'm assuming you are using 
> parsec for parsing).
>
> If you check the hackage docs for parsec you'll see that the ParsecT 
> is an instance of MonadIO.  That means at any point during the parsing 
> you can go liftIO $ <any IO action> and use the result in your 
> parsing.  Here's an example of what that might look like.
>
> import Control.Monad.IO.Class
> import Control.Monad (when)
> import Text.Parsec
> import Text.Parsec.Char
>
> parseTvStuff :: (MonadIO m) => ParsecT String u m (Char,Maybe ())
> parseTvStuff = do
>   string "tvshow:"
>   c <- anyChar
>   morestuff <- if c == 'x'
>     then fmap Just $ liftIO $ putStrLn
>       "run an http request, parse the result, and store the result in morestuff as a maybe"
>     else return Nothing
>   return (c,morestuff)
>
> So you will run an http request if you get back something that seems 
> like it could be worth further parsing.  Then you just parse that 
> stuff with a separate parser and store it in your data structure and 
> continue parsing the rest of the first page with the original parser 
> if you wish.
>
> On Fri, Oct 12, 2012 at 9:28 AM, Emmanuel Touzery <[email protected] 
> <mailto:[email protected]>> wrote:
>
>     Hi,
>
>
>         when parsing the string representing a page, you could
>         save all the links you encounter.
>
>         After the parsing you would load the linked pages and start
>         again parsing.
>
>         You would redo this until no more links are returned or a
>         maximum depth is reached.
>
>
>     Thanks for the tip. That sounds much more reasonable than what I
>     mentioned. It seems a bit "spaghetti" to me though in a way (but
>     maybe I just have to get used to the Haskell way).
>
>     To be more specific about what I want to do: I want to parse TV
>     programs. On the first page I have the daily listing for a
>     channel. start/end hour, title, category, and link or not.
>     To fully parse one TV program I can follow the link if it's
>     present and get the extra info which is there (summary, pictures..).
>
>     So the first scheme that comes to mind is a method which takes the
>     DOM tree of the daily page and returns the list of programs for
>     that day.
>
>     Instead, what I must then do, is to return the incomplete
>     programs: the data object would have the link filled in, if it's
>     available, but the summary, picture... would be empty.
>     Then I have a "second pass" in the caller function, where for
>     programs which have a link, I would fetch the extra page, and call
>     a second function, which will fill in the extra data (thankfully
>     if pictures are present I only store their URL so it would stop
>     there, no need for a third pass for pictures).
>
>     It annoys me that the first function returns "incomplete"
>     objects... It somehow feels wrong.
>
>     Now that I mentioned my problem with more details, maybe you can
>     think of a better way of doing that?
>
>     And otherwise I guess this is the policy when writing Haskell
>     code: absolutely avoid spreading impure/IO tainted code, even if
>     it maybe negatively affects the general structure of the program?
>
>     Thanks again for the tip though! That's definitely what I'll do if
>     nothing better is suggested. It is actually probably the best way
>     to do that if you want to separate IO from "pure" code.
>
>     Emmanuel
>
>
>     _______________________________________________
>     Beginners mailing list
>     [email protected] <mailto:[email protected]>
>     http://www.haskell.org/mailman/listinfo/beginners
>
>

------------------------------

Message: 2
Date: Fri, 12 Oct 2012 16:01:56 +0200
From: Daniel Trstenjak <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: [email protected]
Message-ID: <20121012140156.GA23787@machine>
Content-Type: text/plain; charset=us-ascii


Hi Emmanuel,

> Now that I mentioned my problem with more details, maybe you can
> think of a better way of doing that?

In this case I don't think it's worth separating the two.

> And otherwise I guess this is the policy when writing Haskell code:
> absolutely avoid spreading impure/IO tainted code, even if it maybe
> negatively affects the general structure of the program?

There should be a reason for separating pure and impure code. If your
code doesn't get easier to reason about or more reusable, then
there's little reason for separation.

In the end, the separation should result in better programs.


Greetings,
Daniel



------------------------------

Message: 3
Date: Fri, 12 Oct 2012 17:07:18 +0300
From: Alexander Batischev <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: Emmanuel Touzery <[email protected]>
Cc: [email protected]
Message-ID: <20121012140718.GA25842@antaeus>
Content-Type: text/plain; charset="utf-8"

Hi,

On Fri, Oct 12, 2012 at 03:28:39PM +0200, Emmanuel Touzery wrote:
> It annoys me that the first function returns "incomplete" objects...
> It somehow feels wrong.

Maybe you would feel better about it if you put both functions under one
"umbrella" function like this:

> parseProgramme = getDetails . getProgramme
>   where
>   getProgramme = ...
>   getDetails = ...

That way, your "incomplete" objects would never be exposed to the "end user"
(even though it's just you). It also gives you an abstraction that may
pay off in the future when, say, you want to fetch pictures as
well: it would be just a matter of adding one more function under the
"umbrella".

Overall, splitting your algorithm into simple steps (steps that each
do just a part of the work and return incomplete objects) is the way to go.

-- 
Regards,
Alexander Batischev

PGP key 356961A20C8BFD03
Fingerprint: CE6C 4307 9348 58E3 FD94  A00F 3569 61A2 0C8B FD03

------------------------------

Message: 4
Date: Fri, 12 Oct 2012 16:09:00 +0200
From: Emmanuel Touzery <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi,

> Then I have a "second pass" in the caller function, where for programs 
> which have a link, I would fetch the extra page, and call a second 
> function, which will fill in the extra data (thankfully if pictures 
> are present I only store their URL so it would stop there, no need for 
> a third pass for pictures).
>
> It annoys me that the first function returns "incomplete" objects... 
> It somehow feels wrong. 

I just realized I was thinking about it the wrong way. In Haskell 
data is immutable, so the first function wouldn't return 
incomplete "objects" to be completed later: the second function 
will build entirely new values anyway. So I would have two 
data structures, one without the extra data and one with it. Or something 
like that.
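
A minimal sketch of how the two stages could be typed without duplicating 
fields: the complete value wraps the listing instead of copying it. All 
type and field names here (Listing, Details, Program, complete) are 
hypothetical and only illustrate the idea; they are not from any real code 
in this thread.

```haskell
-- Hypothetical types; field names are illustrative assumptions.

-- What the first pass can read from the daily page alone.
data Listing = Listing
  { title :: String
  , link  :: Maybe String   -- URL of the detail page, if there is one
  } deriving (Show, Eq)

-- What the second pass reads from the detail page.
data Details = Details
  { summary  :: String
  , pictures :: [String]    -- picture URLs only, nothing is downloaded
  } deriving (Show, Eq)

-- A complete program is a new value built from a listing plus optional
-- details. The Listing itself is never modified.
data Program = Program
  { listing :: Listing
  , details :: Maybe Details
  } deriving (Show, Eq)

-- Pure second step: pair a listing with whatever was fetched for it.
complete :: Listing -> Maybe Details -> Program
complete = Program
```

With this shape the first function returns plain Listing values, which are 
complete in their own right, and only the caller that has done the fetching 
builds Program values.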

>> And otherwise I guess this is the policy when writing Haskell code:
>> absolutely avoid spreading impure/IO tainted code, even if it maybe
>> negatively affects the general structure of the program?
> There should be a reason for separating pure and impure code. If your
> code doesn't get easier to reason about or more reusable, then
> there's little reason for separation.

Yes. I thought the goal was to strive for as much pure code as possible 
(which is easier to test and so on), but in this case (and obviously 
it's a small program) it doesn't seem tractable. I wonder what the 
pure/impure ratio is in bigger programs.

Thank you...

Emmanuel



------------------------------

Message: 5
Date: Fri, 12 Oct 2012 16:19:57 +0200
From: Emmanuel Touzery <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hello,

> Maybe you would feel better about it if you put both functions under one
> "umbrella" function like this:
>
>> parseProgramme = getDetails . getProgramme
>>    where
>>    getProgramme = ...
>>    getDetails = ...
> That way, your "incomplete" objects would never be exposed to the "end user"
> (even though it's just you). It also gives you an abstraction that may
> pay off in the future when, say, you want to fetch pictures as
> well: it would be just a matter of adding one more function under the
> "umbrella".
>
> Overall, splitting your algorithm into simple steps (steps that each
> do just a part of the work and return incomplete objects) is the way to go.
>

You have a point about splitting code into smaller functions. I would 
just prefer to have getDetails called from getProgramme, rather than a 
parent calling both separately. And the parent must make the connection by 
doing the IO if I want both pieces to be pure. That is mostly what 
bothers me.

Emmanuel



------------------------------

_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners


End of Beginners Digest, Vol 52, Issue 16
*****************************************