commit ghc-tagsoup for openSUSE:Factory

h_root Sat, 26 Mar 2016 07:26:49 -0700

Hello community,

here is the log from the commit of package ghc-tagsoup for openSUSE:Factory 
checked in at 2016-03-26 15:26:13
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/ghc-tagsoup (Old)
 and      /work/SRC/openSUSE:Factory/.ghc-tagsoup.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Package is "ghc-tagsoup"

Changes:
--------
--- /work/SRC/openSUSE:Factory/ghc-tagsoup/ghc-tagsoup.changes  2016-01-22 01:08:22.000000000 +0100
+++ /work/SRC/openSUSE:Factory/.ghc-tagsoup.new/ghc-tagsoup.changes     2016-03-26 15:26:19.000000000 +0100
@@ -1,0 +2,9 @@
+Wed Mar 16 09:27:33 UTC 2016 - mimi...@gmail.com
+
+- update to 0.13.9
+* fix a space leak
+* fix the demo examples
+* make IsString a superclass of StringLike
+* make flattenTree O(n) instead of O(n^2)
+
+-------------------------------------------------------------------

Old:
----
  tagsoup-0.13.8.tar.gz

New:
----
  tagsoup-0.13.9.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ ghc-tagsoup.spec ++++++
--- /var/tmp/diff_new_pack.l42AYE/_old  2016-03-26 15:26:21.000000000 +0100
+++ /var/tmp/diff_new_pack.l42AYE/_new  2016-03-26 15:26:21.000000000 +0100
@@ -19,7 +19,7 @@
 %global pkg_name tagsoup
 
 Name:           ghc-tagsoup
-Version:        0.13.8
+Version:        0.13.9
 Release:        0
 Summary:        Parsing and extracting information from (possibly malformed) HTML/XML documents
 License:        BSD-3-Clause

++++++ tagsoup-0.13.8.tar.gz -> tagsoup-0.13.9.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/CHANGES.txt 
new/tagsoup-0.13.9/CHANGES.txt
--- old/tagsoup-0.13.8/CHANGES.txt      2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/CHANGES.txt      2016-03-15 13:07:28.000000000 +0100
@@ -1,5 +1,10 @@
 Changelog for TagSoup
 
+0.13.9
+    #50, fix a space leak
+    #36, fix the demo examples
+    #35, make IsString a superclass of StringLike
+    #33, make flattenTree O(n) instead of O(n^2)
 0.13.8
     #30, add parse/render functions directly to the Tree module
 0.13.7
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/Main.hs new/tagsoup-0.13.9/Main.hs
--- old/tagsoup-0.13.8/Main.hs  2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/Main.hs  2016-03-15 13:07:28.000000000 +0100
@@ -34,7 +34,7 @@
           ,("bench","Benchmark the parsing",Left time)
           ,("benchfile","Benchmark the parsing of a file",Right timefile)
           ,("validate","Validate a page",Right validate)
-          ,("hitcount","Get the Haskell.org hit count",Left haskellHitCount)
+          ,("lastmodifieddate","Get the wiki.haskell.org last modified date",Left haskellLastModifiedDateTime)
           ,("spj","Simon Peyton Jones' papers",Left spjPapers)
           ,("ndm","Neil Mitchell's papers",Left ndmPapers)
           ,("time","Current time",Left currentTime)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/README.md new/tagsoup-0.13.9/README.md
--- old/tagsoup-0.13.8/README.md        2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/README.md        2016-03-15 13:07:28.000000000 +0100
@@ -4,8 +4,8 @@
 
 The library provides a basic data type for a list of unstructured tags, a 
parser to convert HTML into this tag type, and useful functions and combinators 
for finding and extracting information. This document gives two particular 
examples of scraping information from the web, while a few more may be found in 
the 
[Sample](https://github.com/ndmitchell/tagsoup/blob/master/TagSoup/Sample.hs) 
file from the source repository. The examples we give are:
 
-* Obtaining the Hit Count from Haskell.org
-* Obtaining a list of Simon Peyton-Jones' latest papers
+* Obtaining the last modified date of the Haskell wiki
+* Obtaining a list of Simon Peyton Jones' latest papers
 * A brief overview of some other examples
 
 The intial version of this library was written in Javascript and has been used 
for various commercial projects involving screen scraping. In the examples 
general hints on screen scraping are included, learnt from bitter experience. 
It should be noted that if you depend on data which someone else may change at 
any given time, you may be in for a shock!
@@ -22,28 +22,33 @@
 
 There are two things that may go wrong with these examples:
 
-* _The Websites being scraped may change._ There is nothing I can do about 
this, but if you suspect this is the case let me know, and I'll update the 
examples and tutorials. I have already done so several times, its only a few 
minutes work.
+* _The Websites being scraped may change._ There is nothing I can do about 
this, but if you suspect this is the case let me know, and I'll update the 
examples and tutorials. I have already done so several times, it's only a few 
minutes work.
 * _The `openURL` method may not work._ This happens quite regularly, and 
depending on your server, proxies and direction of the wind, they may not work. 
The solution is to use `wget` to download the page locally, then use `readFile` 
instead. Hopefully a decent Haskell HTTP library will emerge, and that can be 
used instead.
 
 
-## Haskell Hit Count
+## Last modified date of Haskell wiki
 
-Our goal is to develop a program that displays the Haskell.org hit count. This 
example covers all the basics in designing a basic web-scraping application.
+Our goal is to develop a program that displays the date that the wiki at
+[`wiki.haskell.org`](http://wiki.haskell.org/Haskell) was last modified. This
+example covers all the basics in designing a basic web-scraping application.
 
 ### Finding the Page
 
-We first need to find where the information is displayed, and in what format. 
Taking a look at the [front web 
page](http://www.haskell.org/haskellwiki/Haskell), when not logged in, we see:
-
-    <ul id="f-list">
-        <li id="lastmod"> This page was last modified on 9 September 2013, at 
22:38.</li>
-        <li id="viewcount">This page has been accessed 6,985,922 times.</li>
-        <li id="copyright">Recent content is available under <a 
href="/haskellwiki/HaskellWiki:Copyrights" title="HaskellWiki:Copyrights">a 
simple permissive license</a>.</li>
-        <li id="privacy"><a href="/haskellwiki/HaskellWiki:Privacy_policy" 
title="HaskellWiki:Privacy policy">Privacy policy</a></li>
-        <li id="about"><a href="/haskellwiki/HaskellWiki:About" 
title="HaskellWiki:About">About HaskellWiki</a></li>
-        <li id="disclaimer"><a 
href="/haskellwiki/HaskellWiki:General_disclaimer" title="HaskellWiki:General 
disclaimer">Disclaimers</a></li>
-    </ul>
+We first need to find where the information is displayed and in what format.
+Taking a look at the [front web page](http://wiki.haskell.org/Haskell), when
+not logged in, we see:
+
+```html
+<ul id="f-list">
+  <li id="lastmod"> This page was last modified on 9 September 2013, at 
22:38.</li>
+  <li id="copyright">Recent content is available under <a 
href="/HaskellWiki:Copyrights" title="HaskellWiki:Copyrights">a simple 
permissive license</a>.</li>
+  <li id="privacy"><a href="/HaskellWiki:Privacy_policy" 
title="HaskellWiki:Privacy policy">Privacy policy</a></li>
+  <li id="about"><a href="/HaskellWiki:About" title="HaskellWiki:About">About 
HaskellWiki</a></li>
+  <li id="disclaimer"><a href="/HaskellWiki:General_disclaimer" 
title="HaskellWiki:General disclaimer">Disclaimers</a></li>
+</ul>
+```
 
-So we see the hit count is available. This leads us to rule 1:
+So, we see that the last modified date is available. This leads us to rule 1:
 
 **Rule 1:** Scrape from what the page returns, not what a browser renders, or 
what view-source gives.
 
@@ -53,43 +58,88 @@
 
 We can write a simple HTTP downloader with using the [HTTP 
package](http://hackage.haskell.org/package/HTTP):
 
-    import Network.HTTP
-    
-    openURL x = getResponseBody =<< simpleHTTP (getRequest x)
-    
-    main = do src <- openURL "http://www.haskell.org/haskellwiki/Haskell";
-              writeFile "temp.htm" src
+```haskell
+module Main where
+
+import Network.HTTP
+
+openURL :: String -> IO String
+openURL x = getResponseBody =<< simpleHTTP (getRequest x)
+
+main :: IO ()
+main = do
+    src <- openURL "http://wiki.haskell.org/Haskell"
+    writeFile "temp.htm" src
+```
 
 Now open `temp.htm`, find the fragment of HTML containing the hit count, and 
examine it.
 
 #### Using the `tagsoup` Program
 
-Tagsoup installs both as a library and a program. The program contains all the 
examples mentioned on this page, along with a few other useful functions. In 
order to download a URL to a file:
-
-    $ tagsoup grab http://www.haskell.org/haskellwiki/Haskell > temp.htm
+TagSoup installs both as a library and a program. The program contains all the
+examples mentioned on this page, along with a few other useful functions. In
+order to download a URL to a file:
+
+```bash
+$ tagsoup grab http://wiki.haskell.org/Haskell > temp.htm
+```
 
 ### Finding the Information
 
-Now we examine both the fragment that contains our snippet of information, and 
the wider page. What does the fragment has that nothing else has? What 
algorithm would we use to obtain that particular element? How can we still 
return the element as the content changes? What if the design changes? But 
wait, before going any further:
+Now we examine both the fragment that contains our snippet of information, and
+the wider page. What does the fragment have that nothing else has? What
+algorithm would we use to obtain that particular element? How can we still
+return the element as the content changes? What if the design changes? But
+wait, before going any further:
 
 **Rule 2:** Do not be robust to design changes, do not even consider the 
possibility when writing the code.
 
 If the user changes their website, they will do so in unpredictable ways. They 
may move the page, they may put the information somewhere else, they may remove 
the information entirely. If you want something robust talk to the site owner, 
or buy the data from someone. If you try and think about design changes, you 
will complicate your design, and it still won't work. It is better to write an 
extraction method quickly, and happily rewrite it when things change.
 
-So now, lets consider the fragment from above. It is useful to find a tag 
which is unique just above your snippet - something with a nice "id" property, 
or a "class" - something which is unlikely to occur multiple times. In the 
above example, "viewcount" as the id seems perfect.
-
-    haskellHitCount = do
-        src <- openURL "http://haskell.org/haskellwiki/Haskell";
-        let count = fromFooter $ parseTags src
-        putStrLn $ "haskell.org has been hit " ++ count ++ " times"
-        where fromFooter = filter isDigit . innerText . take 2 . dropWhile 
(~/= "<li id=viewcount>")
+So now, let's consider the fragment from above. It is useful to find a tag
+which is unique just above your snippet - something with a nice `id` or `class`
+attribute - something which is unlikely to occur multiple times. In the above
+example, an `id` with value  `lastmod` seems perfect.
+
+```haskell
+module Main where
+
+import Data.Char
+import Network.HTTP
+import Text.HTML.TagSoup
+
+openURL :: String -> IO String
+openURL x = getResponseBody =<< simpleHTTP (getRequest x)
+
+haskellLastModifiedDateTime :: IO ()
+haskellLastModifiedDateTime = do
+    src <- openURL "http://wiki.haskell.org/Haskell"
+    let lastModifiedDateTime = fromFooter $ parseTags src
+    putStrLn $ "wiki.haskell.org was last modified on " ++ lastModifiedDateTime
+    where fromFooter = unwords . drop 6 . words . innerText . take 2 . dropWhile (~/= "<li id=lastmod>")
+
+main :: IO ()
+main = haskellLastModifiedDateTime
+```
 
 Now we start writing the code! The first thing to do is open the required URL, 
then we parse the code into a list of `Tag`s with `parseTags`. The `fromFooter` 
function does the interesting thing, and can be read right to left:
 
-* First we throw away everything (`dropWhile`) until we get to an `li` tag 
containing `id=viewcount`. The `(~==)` operator is different from standard 
equality, allowing additional attributes to be present. We write `"<li 
id=viewcount>"` as syntactic sugar for `TagOpen "li" [("id","viewcount")]`. If 
we just wanted any open tag with the given id we could have written `(~== 
TagOpen "" [("id","viewcount")])` and this would have matched. Any empty 
strings in the second element of the match are considered as wildcards.
-* Next we take two elements, the `<li>` tag and the text node immediately 
following.
-* We call the `innerText` function to get all the text values from inside, 
which will just be the text node following the `viewcount`.
-* We keep only the numbers, getting rid of the surrounding text and the commas.
+* First we throw away everything (`dropWhile`) until we get to an `li` tag
+  containing `id=lastmod`. The `(~==)` and `(~/=)` operators are different from
+standard equality and inequality since they allow additional attributes to be
+present. We write `"<li id=lastmod>"` as syntactic sugar for `TagOpen "li"
+[("id","lastmod")]`. If we just wanted any open tag with the given `id`
+attribute we could have written `(~== TagOpen "" [("id","lastmod")])` and this
+would have matched.  Any empty strings in the second element of the match are
+considered as wildcards.
+* Next we take two elements: the `<li>` tag and the text node immediately
+  following.
+* We call the `innerText` function to get all the text values from inside,
+  which will just be the text node following the `lastmod`.
+* We split the string into a series of words and drop the first six, i.e. the
+  words `This`, `page`, `was`, `last`, `modified` and `on`
+* We reassemble the remaining words into the resulting string `9 September
+  2013, at 22:38.`
 
 This code may seem slightly messy, and indeed it is - often that is the nature 
of extracting information from a tag soup.
 
@@ -104,34 +154,53 @@
 
 First we spot that the page helpfully has named anchors, there is a current 
work anchor, and after that is one for Haskell. We can extract all the 
information between them with a simple `take`/`drop` pair:
 
-    takeWhile (~/= "<a name=haskell>") $
-    drop 5 $ dropWhile (~/= "<a name=current>") tags
+```haskell
+takeWhile (~/= "<a name=haskell>") $
+drop 5 $ dropWhile (~/= "<a name=current>") tags
+```
 
 This code drops until you get to the "current" section, then takes until you 
get to the "haskell" section, ensuring we only look at the important bit of the 
page. Next we want to find all hyperlinks within this section:
 
-    map f $ sections (~== "<A>") $ ...
+```haskell
+map f $ sections (~== "<A>") $ ...
+```
 
 Remember that the function to select all tags with name "A" could have been 
written as `(~== TagOpen "A" [])`, or alternatively `isTagOpenName "A"`. 
Afterwards we map each item with an `f` function. This function needs to take 
the tags starting just after the link, and find the text inside the link.
 
-    f = dequote . unwords . words . fromTagText . head . filter isTagText
+```haskell
+f = dequote . unwords . words . fromTagText . head . filter isTagText
+```
 
 Here the complexity of interfacing to human written markup comes through. Some 
of the links are in italic, some are not - the `filter` drops all those that 
are not, until we find a pure text node. The `unwords . words` deletes all 
multiple spaces, replaces tabs and newlines with spaces and trims the front and 
back - a neat trick when dealing with text which has spacing at the source code 
but not when displayed. The final thing to take account of is that some papers 
are given with quotes around the name, some are not - dequote will remove the 
quotes if they exist.
 
 For completeness, we now present the entire example:
-    
-    spjPapers :: IO ()
-    spjPapers = do
-            tags <- fmap parseTags $ openURL 
"http://research.microsoft.com/en-us/people/simonpj/";
-            let links = map f $ sections (~== "<A>") $
-                        takeWhile (~/= "<a name=haskell>") $
-                        drop 5 $ dropWhile (~/= "<a name=current>") tags
-            putStr $ unlines links
-        where
-            f :: [Tag] -> String
-            f = dequote . unwords . words . fromTagText . head . filter 
isTagText
-    
-            dequote ('\"':xs) | last xs == '\"' = init xs
-            dequote x = x
+
+```haskell
+module Main where
+
+import Network.HTTP
+import Text.HTML.TagSoup
+
+openURL :: String -> IO String
+openURL x = getResponseBody =<< simpleHTTP (getRequest x)
+
+spjPapers :: IO ()
+spjPapers = do
+        tags <- parseTags <$> openURL "http://research.microsoft.com/en-us/people/simonpj/"
+        let links = map f $ sections (~== "<A>") $
+                    takeWhile (~/= "<a name=haskell>") $
+                    drop 5 $ dropWhile (~/= "<a name=current>") tags
+        putStr $ unlines links
+    where
+        f :: [Tag String] -> String
+        f = dequote . unwords . words . fromTagText . head . filter isTagText
+
+        dequote ('\"':xs) | last xs == '\"' = init xs
+        dequote x = x
+
+main :: IO ()
+main = spjPapers
+```
 
 ## Other Examples
 
@@ -139,30 +208,54 @@
 
 ### My Papers
 
-    ndmPapers :: IO ()
-    ndmPapers = do
-            tags <- fmap parseTags $ openURL 
"http://community.haskell.org/~ndm/downloads/";
-            let papers = map f $ sections (~== "<li class=paper>") tags
-            putStr $ unlines papers
-        where
-            f :: [Tag] -> String
-            f xs = fromTagText (xs !! 2)
+```haskell
+module Main where
+
+import Network.HTTP
+import Text.HTML.TagSoup
+
+openURL :: String -> IO String
+openURL x = getResponseBody =<< simpleHTTP (getRequest x)
+
+ndmPapers :: IO ()
+ndmPapers = do
+        tags <- parseTags <$> openURL "http://community.haskell.org/~ndm/downloads/"
+        let papers = map f $ sections (~== "<li class=paper>") tags
+        putStr $ unlines papers
+    where
+        f :: [Tag String] -> String
+        f xs = fromTagText (xs !! 2)
+
+main :: IO ()
+main = ndmPapers
+```
 
 ### UK Time
 
-    currentTime :: IO ()
-    currentTime = do
-        tags <- fmap parseTags $ openURL 
"http://www.timeanddate.com/worldclock/city.html?n=136";
-        let time = fromTagText (dropWhile (~/= "<strong id=ct>") tags !! 1)
-        putStrLn time
+```haskell
+module Main where
+
+import Network.HTTP
+import Text.HTML.TagSoup
+
+openURL :: String -> IO String
+openURL x = getResponseBody =<< simpleHTTP (getRequest x)
+
+currentTime :: IO ()
+currentTime = do
+    tags <- parseTags <$> openURL "http://www.timeanddate.com/worldclock/uk/london"
+    let time = fromTagText (dropWhile (~/= "<span id=ct>") tags !! 1)
+    putStrLn time
+
+main :: IO ()
+main = currentTime
+```
         
-<h2>Related Projects</h2>
+## Related Projects
 
-<ul>
-    <li><a href="http://tagsoup.info/";>TagSoup for Java</a> - an independently 
written malformed HTML parser for Java. Including <a 
href="http://tagsoup.info/#other";>links to other</a> HTML parsers.</li>
-    <li><a href="http://www.fh-wedel.de/~si/HXmlToolbox/";>HXT: Haskell XML 
Toolbox</a> - a more comprehensive XML parser, giving the option of using 
TagSoup as a lexer.</li>
-    <li><a href="http://www.fh-wedel.de/~si/HXmlToolbox/#rel";>Other Related 
Work</a> - as described on the HXT pages.</li>
-    <li><a href="http://therning.org/magnus/archives/367";>Using TagSoup with 
Parsec</a> - a nice combination of Haskell libraries.</li>
-    <li><a 
href="http://hackage.haskell.org/packages/tagsoup-parsec";>tagsoup-parsec</a> - 
a library for easily using TagSoup as a token type in Parsec.</li>
-    <li><a 
href="http://hackage.haskell.org/packages/archive/wraxml/latest/doc/html/Text-XML-WraXML-Tree-TagSoup.html";>WraXML</a>
 - construct a lazy tree from TagSoup lexemes.</li>
-</ul>
+* [TagSoup for Java](http://tagsoup.info/) - an independently written malformed HTML parser for Java. Including [links to other](http://tagsoup.info/#other) HTML parsers.
+* [HXT: Haskell XML Toolbox](http://www.fh-wedel.de/~si/HXmlToolbox/) - a more comprehensive XML parser, giving the option of using TagSoup as a lexer.
+* [Other Related Work](http://www.fh-wedel.de/~si/HXmlToolbox/#rel) - as described on the HXT pages.
+* [Using TagSoup with Parsec](http://therning.org/magnus/archives/367) - a nice combination of Haskell libraries.
+* [tagsoup-parsec](http://hackage.haskell.org/packages/tagsoup-parsec) - a library for easily using TagSoup as a token type in Parsec.
+* 
[WraXML](http://hackage.haskell.org/packages/archive/wraxml/latest/doc/html/Text-XML-WraXML-Tree-TagSoup.html)
 - construct a lazy tree from TagSoup lexemes.
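
The text-extraction step of the README change above can be tried without any network access. The following is a minimal sketch, not part of the package: it applies only the pure string stages of the old and new `fromFooter` pipelines to the footer sentences quoted in the README. The names `oldFooter`, `newFooter`, `extractHitCount` and `extractLastModified` are hypothetical, and the `parseTags`/`innerText` stages are deliberately left out so that the sketch needs nothing beyond `base`.

```haskell
import Data.Char (isDigit)

-- Footer sentences as quoted in the README (old and new versions).
oldFooter, newFooter :: String
oldFooter = "This page has been accessed 6,985,922 times."
newFooter = " This page was last modified on 9 September 2013, at 22:38."

-- Old approach: keep only the digits, discarding words and commas.
extractHitCount :: String -> String
extractHitCount = filter isDigit

-- New approach: split into words, drop the six lead-in words
-- ("This page was last modified on"), then rejoin the remainder.
extractLastModified :: String -> String
extractLastModified = unwords . drop 6 . words
```

Note that `drop 6` depends on the exact wording of the footer, which is in keeping with Rule 2 of the README: the extraction is written quickly and rewritten when the page changes.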
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/Setup.hs new/tagsoup-0.13.9/Setup.hs
--- old/tagsoup-0.13.8/Setup.hs 2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/Setup.hs 2016-03-15 13:07:28.000000000 +0100
@@ -1,3 +1,2 @@
-#! /usr/bin/env runhaskell
 import Distribution.Simple
 main = defaultMain
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/TagSoup/Sample.hs 
new/tagsoup-0.13.9/TagSoup/Sample.hs
--- old/tagsoup-0.13.8/TagSoup/Sample.hs        2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/TagSoup/Sample.hs        2016-03-15 13:07:28.000000000 +0100
@@ -5,7 +5,6 @@
 
 import Control.Exception
 import Control.Monad
-import Data.Char
 import Data.List
 import System.Cmd
 import System.Directory
@@ -47,13 +46,14 @@
 
 
 {-
-<li id="viewcount">This page has been accessed 6,985,922 times.</li>
+<li id="lastmod"> This page was last modified on 9 September 2013, at 
22:38.</li>
 -}
-haskellHitCount = do
-    src <- openItem "http://haskell.org/haskellwiki/Haskell";
-    let count = fromFooter $ parseTags src
-    putStrLn $ "haskell.org has been hit " ++ count ++ " times"
-    where fromFooter = filter isDigit . innerText . take 2 . dropWhile (~/= 
"<li id=viewcount>")
+haskellLastModifiedDateTime :: IO ()
+haskellLastModifiedDateTime = do
+    src <- openItem "http://wiki.haskell.org/Haskell"
+    let lastModifiedDateTime = fromFooter $ parseTags src
+    putStrLn $ "wiki.haskell.org was last modified on " ++ lastModifiedDateTime
+    where fromFooter = unwords . drop 6 . words . innerText . take 2 . dropWhile (~/= "<li id=lastmod>")
 
 
 googleTechNews :: IO ()
@@ -75,7 +75,7 @@
 
 spjPapers :: IO ()
 spjPapers = do
-        tags <- fmap parseTags $ openItem 
"http://research.microsoft.com/en-us/people/simonpj/";
+        tags <- parseTags <$> openItem "http://research.microsoft.com/en-us/people/simonpj/"
         let links = map f $ sections (~== "<A>") $
                     takeWhile (~/= "<a name=haskell>") $
                     drop 5 $ dropWhile (~/= "<a name=current>") tags
@@ -90,7 +90,7 @@
 
 ndmPapers :: IO ()
 ndmPapers = do
-        tags <- fmap parseTags $ openItem 
"http://community.haskell.org/~ndm/downloads/";
+        tags <- parseTags <$> openItem "http://community.haskell.org/~ndm/downloads/"
         let papers = map f $ sections (~== "<li class=paper>") tags
         putStr $ unlines papers
     where
@@ -100,9 +100,9 @@
 
 currentTime :: IO ()
 currentTime = do
-        tags <- fmap parseTags $ openItem 
"http://www.timeanddate.com/worldclock/city.html?n=136";
-        let res = fromTagText (dropWhile (~/= "<strong id=ct>") tags !! 1)
-        putStrLn res
+    tags <- parseTags <$> openItem "http://www.timeanddate.com/worldclock/uk/london"
+    let time = fromTagText (dropWhile (~/= "<span id=ct>") tags !! 1)
+    putStrLn time
 
 
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/Text/HTML/TagSoup/Implementation.hs 
new/tagsoup-0.13.9/Text/HTML/TagSoup/Implementation.hs
--- old/tagsoup-0.13.8/Text/HTML/TagSoup/Implementation.hs      2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/Text/HTML/TagSoup/Implementation.hs      2016-03-15 13:07:28.000000000 +0100
@@ -46,7 +46,7 @@
 
 
 expand :: Position -> String -> S
-expand p text = res
+expand p text = p `seq` res
     where res = S{s = res
                  ,tl = expand (positionChar p (head text)) (tail text)
                  ,hd = if null text then '\0' else head text
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/Text/HTML/TagSoup/Render.hs 
new/tagsoup-0.13.9/Text/HTML/TagSoup/Render.hs
--- old/tagsoup-0.13.8/Text/HTML/TagSoup/Render.hs      2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/Text/HTML/TagSoup/Render.hs      2016-03-15 13:07:28.000000000 +0100
@@ -1,4 +1,4 @@
-{-# LANGUAGE PatternGuards #-}
+{-# LANGUAGE PatternGuards, OverloadedStrings #-}
 {-|
     This module converts a list of 'Tag' back into a string.
 -}
@@ -29,7 +29,6 @@
 escapeHTML :: StringLike str => str -> str
 escapeHTML = fromString . escapeXML . toString
 
-
 -- | The default render options value, described in 'RenderOptions'.
 renderOptions :: StringLike str => RenderOptions str
 renderOptions = RenderOptions escapeHTML (\x -> toString x == "br") (\x -> toString x == "script")
@@ -50,34 +49,32 @@
 renderTagsOptions :: StringLike str => RenderOptions str -> [Tag str] -> str
 renderTagsOptions opts = strConcat . tags
     where
-        s = fromString
-        ss x = [s x]
-    
+        ss x = [x]
+
         tags (TagOpen name atts:TagClose name2:xs)
-            | name == name2 && optMinimize opts name = open name atts (s " /") 
++ tags xs
+            | name == name2 && optMinimize opts name = open name atts " /" ++ tags xs
         tags (TagOpen name atts:xs)
-            | Just ('?',_) <- uncons name = open name atts (s " ?") ++ tags xs
+            | Just ('?',_) <- uncons name = open name atts " ?" ++ tags xs
             | optRawTag opts name =
                 let (a,b) = break (== TagClose name) (TagOpen name atts:xs)
                 in concatMap (\x -> case x of TagText s -> [s]; _ -> tag x) a ++ tags b
         tags (x:xs) = tag x ++ tags xs
         tags [] = []
 
-        tag (TagOpen name atts) = open name atts (s "")
-        tag (TagClose name) = [s "</", name, s ">"]
+        tag (TagOpen name atts) = open name atts ""
+        tag (TagClose name) = ["</", name, ">"]
         tag (TagText text) = [txt text]
         tag (TagComment text) = ss "<!--" ++ com text ++ ss "-->"
         tag _ = ss ""
 
         txt = optEscape opts
-        open name atts shut = [s "<",name] ++ concatMap att atts ++ [shut,s 
">"]
-        att (x,y) | xnull && ynull = [s " \"\""]
-                  | ynull = [s " ", x]
-                  | xnull = [s " \"",txt y,s "\""]
-                  | otherwise = [s " ",x,s "=\"",txt y,s "\""]
-            where (xnull, ynull) = (strNull x, strNull y)
+        open name atts shut = ["<",name] ++ concatMap att atts ++ [shut,">"]
+        att ("","") = [" \"\""]
+        att (x ,"") = [" ", x]
+        att ("", y) = [" \"",txt y,"\""]
+        att (x , y) = [" ",x,"=\"",txt y,"\""]
 
-        com xs | Just ('-',xs) <- uncons xs, Just ('-',xs) <- uncons xs, Just 
('>',xs) <- uncons xs = s "-- >" : com xs
+        com xs | Just ('-',xs) <- uncons xs, Just ('-',xs) <- uncons xs, Just ('>',xs) <- uncons xs = "-- >" : com xs
         com xs = case uncons xs of
             Nothing -> []
             Just (x,xs) -> fromChar x : com xs
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/Text/HTML/TagSoup/Tree.hs 
new/tagsoup-0.13.9/Text/HTML/TagSoup/Tree.hs
--- old/tagsoup-0.13.8/Text/HTML/TagSoup/Tree.hs        2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/Text/HTML/TagSoup/Tree.hs        2016-03-15 13:07:28.000000000 +0100
@@ -14,6 +14,7 @@
 import Text.HTML.TagSoup (parseTags, parseTagsOptions, renderTags, renderTagsOptions, ParseOptions(..), RenderOptions(..))
 import Text.HTML.TagSoup.Type
 import Control.Arrow
+import GHC.Exts (build)
 
 
 data TagTree str = TagBranch str [Attribute str] [TagTree str]
@@ -57,11 +58,15 @@
 parseTreeOptions opts str = tagTree $ parseTagsOptions opts str
 
 flattenTree :: [TagTree str] -> [Tag str]
-flattenTree xs = concatMap f xs
+flattenTree xs = build $ flattenTreeFB xs
+
+flattenTreeFB :: [TagTree str] -> (Tag str -> lst -> lst) -> lst -> lst
+flattenTreeFB xs cons nil = flattenTreeOnto xs nil
     where
-        f (TagBranch name atts inner) =
-            TagOpen name atts : flattenTree inner ++ [TagClose name]
-        f (TagLeaf x) = [x]
+        flattenTreeOnto [] tags = tags
+        flattenTreeOnto (TagBranch name atts inner:trs) tags =
+            TagOpen name atts `cons` flattenTreeOnto inner (TagClose name `cons` flattenTreeOnto trs tags)
+        flattenTreeOnto (TagLeaf x:trs) tags = x `cons` flattenTreeOnto trs tags
 
 renderTree :: StringLike str => [TagTree str] -> str
 renderTree = renderTags . flattenTree
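
To illustrate why the rewritten `flattenTree` above is linear, here is a minimal, self-contained sketch. It is an illustration under stated assumptions, not the library's code: `Tree` and `Tok` are simplified stand-ins for tagsoup's `TagTree` and `Tag`, and plain `(:)` is used in place of the `build`/`cons` plumbing, so every name below is hypothetical.

```haskell
-- Simplified stand-ins for tagsoup's TagTree and Tag (hypothetical names).
data Tree = Branch String [Tree] | Leaf String
    deriving (Eq, Show)

data Tok = Open String | Close String | Text String
    deriving (Eq, Show)

-- Old shape: each branch appends its close tag with (++), which walks the
-- already flattened children again. On deeply nested input this re-walking
-- adds up to quadratic work.
flattenNaive :: [Tree] -> [Tok]
flattenNaive = concatMap f
    where
        f (Branch name inner) = Open name : flattenNaive inner ++ [Close name]
        f (Leaf x) = [Text x]

-- New shape: pass the "rest of the output" along as an extra argument, so
-- every token is consed onto the result exactly once.
flattenOnto :: [Tree] -> [Tok] -> [Tok]
flattenOnto [] rest = rest
flattenOnto (Branch name inner : ts) rest =
    Open name : flattenOnto inner (Close name : flattenOnto ts rest)
flattenOnto (Leaf x : ts) rest = Text x : flattenOnto ts rest

flattenLinear :: [Tree] -> [Tok]
flattenLinear ts = flattenOnto ts []

-- A small example tree, corresponding to <a><b>hi</b>there</a>.
example :: [Tree]
example = [Branch "a" [Branch "b" [Leaf "hi"], Leaf "there"]]
```

Both functions produce the same token list for `example`; they differ only in how much work is repeated when branches are nested deeply.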
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/Text/StringLike.hs 
new/tagsoup-0.13.9/Text/StringLike.hs
--- old/tagsoup-0.13.8/Text/StringLike.hs       2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/Text/StringLike.hs       2016-03-15 13:07:28.000000000 +0100
@@ -5,8 +5,9 @@
 --   This module provides an abstraction for String's as used inside TagSoup. 
It allows
 --   TagSoup to work with String (list of Char), ByteString.Char8, 
ByteString.Lazy.Char8,
 --   Data.Text and Data.Text.Lazy.
-module Text.StringLike where
+module Text.StringLike (StringLike(..), fromString, castString) where
 
+import Data.String
 import Data.Typeable
 
 import qualified Data.ByteString.Char8 as BS
@@ -17,7 +18,7 @@
 
 -- | A class to generalise TagSoup parsing over many types of string-like 
types.
 --   Examples are given for the String type.
-class (Typeable a, Eq a) => StringLike a where
+class (Typeable a, Eq a, IsString a) => StringLike a where
     -- | > empty = ""
     empty :: a
     -- | > cons = (:)
@@ -28,8 +29,6 @@
 
     -- | > toString = id
     toString :: a -> String
-    -- | > fromString = id
-    fromString :: String -> a
     -- | > fromChar = return
     fromChar :: Char -> a
     -- | > strConcat = concat
@@ -49,7 +48,6 @@
     uncons [] = Nothing
     uncons (x:xs) = Just (x, xs)
     toString = id
-    fromString = id
     fromChar = (:[])
     strConcat = concat
     empty = []
@@ -60,7 +58,6 @@
 instance StringLike BS.ByteString where
     uncons = BS.uncons
     toString = BS.unpack
-    fromString = BS.pack
     fromChar = BS.singleton
     strConcat = BS.concat
     empty = BS.empty
@@ -71,7 +68,6 @@
 instance StringLike LBS.ByteString where
     uncons = LBS.uncons
     toString = LBS.unpack
-    fromString = LBS.pack
     fromChar = LBS.singleton
     strConcat = LBS.concat
     empty = LBS.empty
@@ -82,7 +78,6 @@
 instance StringLike T.Text where
     uncons = T.uncons
     toString = T.unpack
-    fromString = T.pack
     fromChar = T.singleton
     strConcat = T.concat
     empty = T.empty
@@ -93,7 +88,6 @@
 instance StringLike LT.Text where
     uncons = LT.uncons
     toString = LT.unpack
-    fromString = LT.pack
     fromChar = LT.singleton
     strConcat = LT.concat
     empty = LT.empty
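
The effect of making `IsString` a superclass can be sketched in isolation. The example below is a hypothetical miniature, not tagsoup's API: `MiniStringLike`, `Wrapped`, `closeTag` and `renderAttr` are invented for illustration, and only the superclass relationship mirrors the change above. With `OverloadedStrings` enabled, code that is generic over the class can use string literals, including in patterns, without an explicit `fromString` wrapper, which is the simplification the `Render.hs` diff relies on.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString(..))

-- Hypothetical miniature of a StringLike-style class. The IsString
-- superclass supplies fromString, so the class need not declare its own.
class (Eq a, IsString a) => MiniStringLike a where
    toStr :: a -> String
    concatStr :: [a] -> a

-- A toy string type, used instead of ByteString or Text to stay within base.
newtype Wrapped = Wrapped String
    deriving (Eq, Show)

instance IsString Wrapped where
    fromString = Wrapped

instance MiniStringLike Wrapped where
    toStr (Wrapped s) = s
    concatStr ws = Wrapped (concatMap toStr ws)

-- Literals such as "</" are typed via the IsString superclass.
closeTag :: MiniStringLike a => a -> a
closeTag name = concatStr ["</", name, ">"]

-- Literal patterns need both Eq and IsString, which the class now implies.
renderAttr :: MiniStringLike a => (a, a) -> a
renderAttr (x, "") = concatStr [" ", x]
renderAttr (x, y) = concatStr [" ", x, "=\"", y, "\""]
```

One consequence visible in the diff is that `fromString` disappears from the class body and from every instance, since each supported string type is expected to provide its own `IsString` instance.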
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/tagsoup-0.13.8/tagsoup.cabal 
new/tagsoup-0.13.9/tagsoup.cabal
--- old/tagsoup-0.13.8/tagsoup.cabal    2016-01-10 22:15:15.000000000 +0100
+++ new/tagsoup-0.13.9/tagsoup.cabal    2016-03-15 13:07:28.000000000 +0100
@@ -1,6 +1,6 @@
 cabal-version:  >= 1.6
 name:           tagsoup
-version:        0.13.8
+version:        0.13.9
 copyright:      Neil Mitchell 2006-2016
 author:         Neil Mitchell <ndmitch...@gmail.com>
 maintainer:     Neil Mitchell <ndmitch...@gmail.com>
@@ -11,7 +11,7 @@
 license-file:   LICENSE
 build-type:     Simple
 synopsis:       Parsing and extracting information from (possibly malformed) 
HTML/XML documents
-tested-with:    GHC==7.10.1, GHC==7.8.4, GHC==7.6.3, GHC==7.4.2, GHC==7.2.2
+tested-with:    GHC==8.0.1, GHC==7.10.3, GHC==7.8.4, GHC==7.6.3, GHC==7.4.2
 description:
     TagSoup is a library for parsing HTML/XML. It supports the HTML 5 
specification,
     and can be used to parse either well-formed XML, or unstructured and 
malformed HTML
