When I use the html library, I find I have to make a few simple changes to html-spec.rkt. It seems that <ins> and <del> are not treated like <b> and <i>: you can see in this example that <b> remains inside the enclosing <p>, while <ins> does not. I also have to allow pcdata as a child of <ol> and <ul>. I don't know whether pcdata is "supposed to" appear there, but it often does in the wild.

Jon



#lang racket

(require (prefix-in h: html)  (prefix-in x: xml))

;; Convert parsed XML content into nested lists for easy printing.
(define (xml->list x)
  (cond
    [(x:pcdata? x) (x:pcdata-string x)]
    [(x:entity? x) (list)]   ; entities become an empty list
    [(x:element? x)
     (list (x:element-name x)
           (map xml->list (x:element-content x)))]
    [(list? x) (map xml->list x)]))

;; <b> case
(printf "~s\n" (xml->list (h:read-html-as-xml (open-input-string "<p>Hello world <b>Testing</b>!</p>"))))
;; <ins> case
(printf "~s\n" (xml->list (h:read-html-as-xml (open-input-string "<p>Hello world <ins>Testing</ins>!</p>"))))
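
A similar check for the list case could be sketched as follows. The input string is a made-up example of text sitting directly inside <ul>, and the names sample and parsed are only for illustration. It uses xml->xexpr from the xml library instead of the helper above, and the point is simply to see whether the stray text stays inside the <ul> element with an unmodified html-spec.rkt.

```racket
#lang racket

(require (prefix-in h: html) (prefix-in x: xml))

;; Hypothetical input: text placed directly inside <ul>, outside any <li>.
(define sample "<ul>stray text<li>item</li></ul>")

;; read-html-as-xml returns a list of xml content structures.
(define parsed (h:read-html-as-xml (open-input-string sample)))

;; Print each top-level piece as an x-expression, to show where
;; "stray text" ends up relative to the ul element.
(for ([c (in-list parsed)])
  (printf "~s\n" (x:xml->xexpr c)))
```

If the text shows up as a sibling of the ul element rather than inside it, that would match the behavior described above for <ins>.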
