Re: [Haskell-cafe] HXT: how to get sibling element

2012-03-16 Thread Никитин Лев
Thanks to all. I've done it!

===

{-# LANGUAGE TupleSections #-}  -- needed for the ("Title",) section below

import Text.XML.HXT.Core
import Text.XML.HXT.Curl
import Text.XML.HXT.HTTP
import Control.Arrow.ArrowNavigatableTree

pageURL = "http://localhost/test.xml"

main = do
    r <- runX (configSysVars [withCanonicalize no, withValidate no, withTrace 0, withParseHTML no] >>>
               readDocument [withErrors no, withWarnings no, withHTTP []] pageURL >>>
               getChildren >>> isElem >>> hasName "div" >>> (getTitle <+> getSections))

    putStrLn "Articles:"
    putStrLn ""
    mapM_ putStrLn $ map (\i -> (fst i) ++ " is " ++ (snd i) ++ "\n") r
    putStrLn ""

-- The first span holds the title.
getTitle = listA (getChildren >>> isElem >>> hasName "span") >>> arr head >>>
    getChildren >>> getText >>> arr trim >>> arr ("Title",)

-- Every other span is paired with the text node that follows it.
getSections = addNav >>>
    listA (getChildren >>> withoutNav (isElem >>> hasName "span")) >>>
    arr tail >>> unlistA >>>
    ((getChildren >>> remNav >>> getText) &&&
     (listA followingSiblingAxis >>> arr head >>> remNav >>> getText >>> arr (rc . trim)))

ltrim [] = []
ltrim (' ':x) = ltrim x
ltrim ('\n':x) = ltrim x
ltrim ('\r':x) = ltrim x
ltrim ('\t':x) = ltrim x
ltrim x = x

rtrim = reverse . ltrim . reverse

trim = ltrim . rtrim

-- Remove the leading ": " separator from the sibling text.
rc (':':' ':x) = x
rc x = x

===
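The hand-written ltrim, rtrim and trim helpers above could probably be written more compactly with Data.Char.isSpace. This is only a sketch, not part of the posted solution, and it assumes it is acceptable that isSpace matches a few more characters (form feed, vertical tab, Unicode spaces) than the four handled above:

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Strip leading whitespace.
ltrim :: String -> String
ltrim = dropWhile isSpace

-- Strip trailing whitespace.
rtrim :: String -> String
rtrim = dropWhileEnd isSpace

-- Strip whitespace on both sides.
trim :: String -> String
trim = ltrim . rtrim

-- Drop a leading ": " separator, as rc does in the solution above.
rc :: String -> String
rc (':' : ' ' : rest) = rest
rc s = s
```

As in the solution, the intended use is rc . trim: trim first, so that the ": " separator really is at the start of the string when rc looks for it.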

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] HXT: how to get sibling element

2012-03-15 Thread Никитин Лев
Hello, haskellers.

Suppose we have this XML document (maybe a slightly silly one):

<div>
  <span>Some story</span>
  <span>Description</span>: This story about...
  <span>Author</span>: Tom Smith
</div>

In the end I want to get the list [("Title", "Some story"), ("Description", "This
story about..."), ("Author", "Tom Smith")],
or maybe this: Book "Some story" [("description", "This story about..."),
("Author", "Tom Smith")] (data Book = Book String [(String, String)]).

The first span is a special case compared with the others, and I understand how to process it:

===

{-# LANGUAGE TupleSections #-}  -- needed for the ("Title",) section below

import Text.XML.HXT.Core
import Text.XML.HXT.Curl
import Text.XML.HXT.HTTP

pageURL = "http://localhost/test.xml"

main = do
    r <- runX (configSysVars [withCanonicalize no, withValidate no, withTrace 0, withParseHTML no] >>>
               readDocument [withErrors no, withWarnings no, withHTTP []] pageURL >>>
               getChildren >>> isElem >>> hasName "div" >>>
               listA (getChildren >>> hasName "span") >>> getTitle <+> getSections)
    putStrLn "Articles:"
    putStr ""
    mapM_ putStr $ map (\i -> (fst i) ++ ": " ++ (snd i) ++ "| ") r
    putStrLn ""

-- The first span holds the title.
getTitle = arr head >>> getChildren >>> getText >>> arr trim >>> arr ("Title",)

-- Both halves of the pair read the span's own text, so the label is duplicated.
getSections = arr tail >>> unlistA >>> ((getChildren >>> getText >>> arr trim)
    &&& (getChildren >>> getText >>> arr trim))

ltrim [] = []
ltrim (' ':x) = ltrim x
ltrim ('\n':x) = ltrim x
ltrim ('\r':x) = ltrim x
ltrim ('\t':x) = ltrim x
ltrim x = x

rtrim = reverse . ltrim . reverse

trim = ltrim . rtrim

===

And I get the list [("Title", "Some story"), ("Description", "Description"),
("Author", "Author")]

(Maybe there is a better way to get this list?)

But I cannot find a way to get the text that follows a span.

I suppose that I have to use a function from
Data.Tree.NavigatableTree.XPathAxis, but I can't figure out how to do it.

Please help me.
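
The second target shape mentioned above (Book) could be built from the flat list of pairs once that list has been extracted. A minimal sketch, assuming the first pair always carries the title; toBook is a name introduced here purely for illustration:

```haskell
-- The shape proposed in the question: a title plus (label, text) pairs.
data Book = Book String [(String, String)]
  deriving (Show, Eq)

-- Assumes the first pair carries the title, as in the list above.
-- Returns Nothing for an empty list.
toBook :: [(String, String)] -> Maybe Book
toBook ((_, title) : sections) = Just (Book title sections)
toBook []                      = Nothing
```

Returning Maybe keeps the empty-list case explicit instead of failing the way head does.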




Re: [Haskell-cafe] HXT: how to get sibling element

2012-03-15 Thread Wilfried van Asten
I am not really familiar with XML parsing in Haskell, but I am
wondering why, if you have an XML file, you do not simply name each
element after the type of its contents:

<book>
  <title>Some Story</title>
  <description>This story about..</description>
  <author>Tom Smith</author>
</book>

Alternatively, you could use the class or some type attribute to
indicate the type.

Regards,

Wilfried van Asten




[Haskell-cafe] HXT: how to get sibling element

2012-03-15 Thread Никитин Лев
I absolutely agree with you, but unfortunately it is not my XML file.
It is an extract from an HTML page on a public web server, and I cannot change
the format of that page.
Sorry, I should have explained this in my first message.

But then what about getting the sibling text? (Getting a sibling is a separate,
interesting task regardless of my concrete case.)



Re: [Haskell-cafe] HXT: how to get sibling element

2012-03-15 Thread Никитин Лев
Oh, yes!
In this situation, with such a poorly structured source, I can try to use
tagsoup (or I'll take a look at xml-conduit).

Nevertheless, for a better understanding of HXT it would be interesting to solve
this problem in HXT. Or is it impossible?



15.03.2012, 20:08, Asten, W.G.G. van (Wilfried, Student B-TI)
<w.g.g.vanas...@student.utwente.nl>:
> You might want to check out the xml-conduit package. It has preceding
> and following sibling Axis. I am not sure how the package works
> exactly, but it seems to be a good starting point.



Re: [Haskell-cafe] HXT: how to get sibling element

2012-03-15 Thread Wilfried van Asten
ArrowNavigatableTree also provides a following-sibling axis:

http://hackage.haskell.org/packages/archive/hxt/9.2.2/doc/html/Control-Arrow-ArrowNavigatableTree.html

Wilfried
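
For readers without HXT at hand, the idea behind a following-sibling axis can be sketched on a plain list of children. This is a library-free model, not HXT code: the Node type and the names withFollowing and spanWithText are hypothetical, introduced only for illustration.

```haskell
import Data.List (tails)

-- Hypothetical, simplified model of the children of the <div>:
-- either a <span> with its text, or a bare text node.
data Node = Span String | Text String
  deriving (Show, Eq)

-- Each node together with all of its following siblings, in document
-- order (what a following-sibling axis yields, modeled on a list).
withFollowing :: [Node] -> [(Node, [Node])]
withFollowing ns = zip ns (drop 1 (tails ns))

-- Pair every span with the text node that immediately follows it.
-- Spans not directly followed by a text node are skipped.
spanWithText :: [Node] -> [(String, String)]
spanWithText ns =
  [ (label, txt) | (Span label, Text txt : _) <- withFollowing ns ]
```

In a real document parsed without canonicalization, whitespace-only text nodes sit between elements, so the sibling text would still need trimming and separator removal.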
