Re: [libxml-devel] Request to add HTML parsing

Aitor Garay-Romero Tue, 18 Jul 2006 10:08:40 -0700

OK, the mailing list software complains that the message is too big, so I am sending the files compressed (please read below).

On 7/18/06, Aitor Garay-Romero <[EMAIL PROTECTED]> wrote:

Hi there!

    I did some work myself to allow libxml-ruby to parse HTML directly from a string. I was thinking of implementing some extra features and then sending the patches to the developers.

    But I'm busy with other things and haven't found a moment to finish it and send it.

    Anyway, please find attached to this message the 3 files I modified. They are based on libxml-ruby-0.3.8. Just diff them against the originals to see what changed.

    I hope it's useful.

    /AITOR

On 7/18/06, Mark Thomas <[EMAIL PROTECTED]> wrote:
I'm switching to Ruby from Perl, and currently I do all my HTML parsing in
Perl's XML::LibXML. Applying XPath to parse HTML is extremely powerful and
fast, fast, fast in libxml.

Can you add that feature to the Ruby one? I think it would be easy to do;
it's just a flag on the parser, which tells libxml to create a DOM from
HTML instead of XML, and all the XML methods then magically work on the
HTML!

So it should be really low hanging fruit. Sweet, delicious fruit.

Please consider it!

Thanks,
- Mark.

_______________________________________________
libxml-devel mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/libxml-devel

parse_html.tar.gz
Description: GNU Zip compressed data

