I am very interested in it too, but have not had time to work on it. I am thinking of trying to target a website, crawl through its pages, transform them into XML (as much as possible...) and deposit the result somewhere.
Thanks for doing this Andy!

----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Cc: "Andy Clark" <[EMAIL PROTECTED]>
Sent: Monday, February 25, 2002 6:47 AM
Subject: Re: NekoHTML Parser License Change

> Andy's NekoHTML parser has worked well for me in a small project where
> I needed to scrape some data from a set of HTML pages. With NekoHTML
> as the front end I was able to use an XSLT stylesheet to extract that
> data directly from the HTML pages.
>
> NekoHTML also allowed me to write a simple HTML transformation that I
> find useful when analyzing HTML page layouts: adding a small colored
> border to each TABLE so that the table boundaries are visible. This
> transformation requires only a few lines of XSLT added to a standard
> "identity" transformation.
>
> I expect that NekoHTML would make it easy to translate HTML code into
> XHTML format. I have encountered a few tag-balancing glitches, where
> NekoHTML struggles to accommodate ill-formed HTML code much as the
> popular browsers do, but overall it has been very solid.
>
> NekoHTML is very easy to use. For the most part it is a transparent
> addition to a standard Xerces/Xalan configuration, and all the usual
> APIs -- including JAXP -- seem to work as expected.
>
> Nice work Andy. Thank you for making NekoHTML available.
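
For anyone who wants to try the table-border idea from the quoted message, here is a rough sketch of what "a few lines of XSLT added to an identity transformation" might look like when driven through JAXP. This is not the original poster's stylesheet. The class name TableBorderDemo, the method addTableBorders, and the border style are my own placeholders. To keep it self-contained it reads markup that is already well-formed, where a real setup would put NekoHTML in front as the parser. Element-name case depends on how the parser is configured, so the template matches both "table" and "TABLE".

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TableBorderDemo {

    // Identity transformation plus one extra template for table elements.
    // The border style is an arbitrary choice for illustration; note that it
    // replaces any style attribute the table already had.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      // Identity rule: copy every attribute and node unchanged.
      + "  <xsl:template match='@*|node()'>"
      + "    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "  </xsl:template>"
      // Override for tables: copy the element, then add a visible border.
      + "  <xsl:template match='table|TABLE'>"
      + "    <xsl:copy>"
      + "      <xsl:apply-templates select='@*'/>"
      + "      <xsl:attribute name='style'>border: 1px solid red</xsl:attribute>"
      + "      <xsl:apply-templates select='node()'/>"
      + "    </xsl:copy>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to well-formed markup and returns the result.
    static String addTableBorders(String wellFormedMarkup) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(wellFormedMarkup)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String page =
            "<html><body><table><tr><td>cell</td></tr></table></body></html>";
        System.out.println(addTableBorders(page));
    }
}
```

With NekoHTML on the classpath, the StreamSource for the page would presumably be swapped for a source built from the parser's output (a DOM tree or SAX events); I have left that part out rather than guess at the exact setup.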
