Simon, as to your last question, here's a start: https://github.com/tmpvar/jsdom/issues?labels=parsing&page=1&state=open
On Sunday, October 28, 2012 11:48:35 PM UTC-4, Simon wrote:
>
> Domenic,
>
> I'd be curious to know what parsers you are considering and if you have
> some tests / html examples that are tripping up the existing parser..
>
>
> On Friday, October 26, 2012 10:57:06 PM UTC+7, Domenic Denicola wrote:
>>
>> Very nice. As maintainer of jsdom, I've been looking for a replacement
>> default HTML parser that could solve many of the parsing issues we've
>> encountered. I'll put you on the shortlist. Thanks for announcing.
>>
>> On Friday, October 26, 2012 6:07:48 AM UTC-4, Dean Mao wrote:
>>>
>>> Hi All,
>>>
>>> I created a native html parser based on libhubbub, a parser library used
>>> by the netsurf browser project. There were quite a few html pages that
>>> didn't parse correctly on tautologistics's html parser so I thought it
>>> might be easier pulling in a parser from an existing web browser. I
>>> considered using webkit & firefox, but those browsers had too many external
>>> dependencies. The parser can operate in blocking or non-blocking mode, and
>>> streamed (chunked) data. The wonderful jsdom library
>>> uses tautologistics/node-htmlparser by default, but one can choose this
>>> parser as the overriding default. The readme shows an example of how this
>>> is done.
>>>
>>> Github:
>>> https://github.com/deanmao/node-hubbub
>>>
>>> To install:
>>> npm install hubbub
