Simon, as to your last question, here's a start:

https://github.com/tmpvar/jsdom/issues?labels=parsing&page=1&state=open

On Sunday, October 28, 2012 11:48:35 PM UTC-4, Simon wrote:
>
> Domenic,
>
> I'd be curious to know which parsers you are considering, and whether you 
> have any tests or HTML examples that trip up the existing parser.
>
>
> On Friday, October 26, 2012 10:57:06 PM UTC+7, Domenic Denicola wrote:
>>
>> Very nice. As maintainer of jsdom, I've been looking for a replacement 
>> default HTML parser that could solve many of the parsing issues we've 
>> encountered. I'll put you on the shortlist. Thanks for announcing.
>>
>> On Friday, October 26, 2012 6:07:48 AM UTC-4, Dean Mao wrote:
>>>
>>> Hi All,
>>>
>>> I created a native HTML parser based on libhubbub, a parser library used 
>>> by the NetSurf browser project. Quite a few HTML pages didn't parse 
>>> correctly with tautologistics's HTML parser, so I thought it might be 
>>> easier to pull in a parser from an existing web browser. I considered 
>>> using WebKit and Firefox, but those browsers had too many external 
>>> dependencies. The parser can operate in blocking or non-blocking mode and 
>>> accepts streamed (chunked) data. The wonderful jsdom library uses 
>>> tautologistics/node-htmlparser by default, but this parser can be chosen 
>>> to override that default. The readme shows an example of how this is 
>>> done.
>>>
>>> Github:
>>> https://github.com/deanmao/node-hubbub
>>>
>>> To install:
>>> npm install hubbub
>>>
>>>
>>>

-- 
Job Board: http://jobs.nodejs.org/
Posting guidelines: 
https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
You received this message because you are subscribed to the Google
Groups "nodejs" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/nodejs?hl=en?hl=en
