FYI, I made another potentially breaking change to `html-parsing`, and so have incremented the major version number, to 6.0.

Sorawee Porncharoenwase found a case in which the parser was doing the wrong thing for a real-world example of contemporary HTML.  It turned out to be in some 17-year-old "structure recovery" constraints that assumed HTML `p` elements couldn't be children of `blockquote` elements.  I changed that in `html-parsing` version 6.0, which could easily break Web scraper code that depends on the previous parse of `blockquote` and `p`.

More generally, with the `html-parsing` package, I'm trying to minimize the changes (and effort spent) on that package, while still making an effort to respond to problems that people encounter with real-world HTML.

I hope there won't be any further breaking changes.  But, having good unit tests for your Web scraper (or other code) will hopefully identify any problems without much pain.  Please let me know of any such pains.
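
For illustration, here is a minimal sketch of the kind of unit test I mean, using `rackunit`.  The sample HTML and the `p-texts` helper are hypothetical stand-ins for your own fixture and scraper code; only `html->xexp` comes from `html-parsing`.  The helper walks the whole parse tree rather than assuming where `p` sits relative to `blockquote`, so the check should not depend on that detail of the parse.

```racket
#lang racket/base
(require html-parsing   ; provides html->xexp
         rackunit)

;; Hypothetical fixture: a tiny page of the shape discussed above.
(define sample-html "<blockquote><p>Quoted text.</p></blockquote>")

;; Hypothetical scraper helper: collect the direct string content of
;; every p element, wherever it appears in the parse tree.
(define (p-texts x)
  (cond [(and (pair? x) (eq? (car x) 'p))
         (list (apply string-append (filter string? (cdr x))))]
        [(pair? x) (apply append (map p-texts (cdr x)))]
        [else '()]))

;; Pin down what the scraper is expected to extract from the fixture.
(check-equal? (p-texts (html->xexp sample-html))
              '("Quoted text."))
```

If a test like this starts failing after an upgrade, printing `(html->xexp sample-html)` under both versions should show what changed in the parse.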

http://www.neilvandyke.org/racket/html-parsing/
https://pkgs.racket-lang.org/package/html-parsing

