I believe that the canonical way of working with XML documents in Guile is through the (sxml simple) module (and others): https://www.gnu.org/software/guile/manual/html_node/SXML.html
It provides the xml->sxml procedure, which converts XML strings into a more familiar s-expression-based format.

On Wed, 23 Jan 2019 at 17:41, swedebugia <[email protected]> wrote:

> I just found this LGPL3 parser by Neil Van Dyke (see attachment)
>
> Do we have something similar in guile?
>
> If not is anybody interested in porting it? (I have no idea how much
> work it would be, but Racket seems quite close to guile)
>
> Here is the introduction:
> "The html-parsing library provides a permissive HTML parser. The parser
> is useful for software agent extraction of information from Web pages,
> for programmatically transforming HTML files, and for implementing
> interactive Web browsers. html-parsing emits SXML/xexp, so that
> conventional HTML may be processed with XML tools such as SXPath. Like
> Oleg Kiselyov’s SSAX-based HTML parser, html-parsing provides a
> permissive tokenizer, but html-parsing extends this by attempting to
> recover syntactic structure.
> The html-parsing parsing behavior is permissive in that it accepts
> erroneous HTML, handling several classes of HTML syntax errors
> gracefully, without yielding a parse error. This is crucial for parsing
> arbitrary real-world Web pages, since many pages actually contain syntax
> errors that would defeat a strict or validating parser. html-parsing’s
> handling of errors is intended to generally emulate popular Web
> browsers’ interpretation of the structure of erroneous HTML."
> https://docs.racket-lang.org/html-parsing/index.html
>
> --
> Cheers Swedebugia
>
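
As a footnote to the xml->sxml suggestion at the top of this message, here is a minimal sketch of the conversion. It assumes a Guile version that ships the (sxml simple) module; the sample markup and the name `doc` are made up for illustration and do not come from this thread. The input is deliberately well-formed, since xml->sxml is an XML parser.

```scheme
;; Minimal sketch: parse a well-formed XML string into SXML.
(use-modules (sxml simple))

;; Hypothetical sample input, not taken from the thread.
(define doc
  (xml->sxml "<p>Hello <a href=\"https://example.org\">world</a></p>"))

;; The result is wrapped in a *TOP* node; attributes live under (@ ...).
(write doc)
(newline)
;; prints: (*TOP* (p "Hello " (a (@ (href "https://example.org")) "world")))
```

The resulting tree is an ordinary list, so it can be taken apart with car/cdr and pattern matching, or queried with the SXPath tools mentioned in the quoted introduction.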
