Re: [racket-users] html-parsing package 5.0 changes

Neil Van Dyke Tue, 22 May 2018 20:49:00 -0700

FYI, I made another potentially breaking change to `html-parsing`, and so have incremented the major version number, to 6.0.

Sorawee Porncharoenwase found a case in which the parser was doing the wrong thing for a real-world example of contemporary HTML. It turned out to be in some 17-year-old "structure recovery" constraints that thought HTML `p` elements couldn't be children of `blockquote` elements. I decided to change that in `html-parsing` version 6.0, which could easily break some Web scraper code that was based on the previous parse involving `blockquote` and `p`.

More generally, with the `html-parsing` package, I'm trying to minimize the changes (and effort spent) on that package, while still making an effort to respond to problems that people encounter with real-world HTML.

I hope there won't be any further breaking changes. But, having good unit tests for your Web scraper (or other code) will hopefully identify any problems without much pain. Please let me know of any such pains.
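
As a rough illustration of the kind of unit test meant here, the sketch below pins down the parse of a `p` inside a `blockquote` using `rackunit` and `html->xexp`. The sample input is made up, and the expected value is an assumption based on the 6.0 behavior described above (with `p` kept as a child of `blockquote`); adjust it to whatever parse your own scraper relies on.

```racket
#lang racket/base
;; Sketch of a regression test for scraper code that depends on
;; how `p` inside `blockquote` is parsed.
(require rackunit
         html-parsing)   ; provides html->xexp

;; Hypothetical sample input, not taken from the reported case.
(define sample-html "<blockquote><p>Hello</p></blockquote>")

;; Assumed expected shape under version 6.0: `p` remains nested
;; inside `blockquote`. A different result means the parse your
;; scraper depends on has changed.
(check-equal? (html->xexp sample-html)
              '(*TOP* (blockquote (p "Hello"))))
```

A test like this fails loudly after a package upgrade if the parse changes, which is usually easier to diagnose than a scraper that silently returns different data.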


http://www.neilvandyke.org/racket/html-parsing/
https://pkgs.racket-lang.org/package/html-parsing


