-----Original Message-----
>From: Eddie Shipman <[EMAIL PROTECTED]>
>Sent: Apr 25, 2007 3:05 PM
>To: Borland's Delphi Discussion List <[email protected]>
>Subject: Re: HTML Browser / Parser
>
>I just had to drop in at this time and insert my comments..
>
>HTML can be and in most cases very misused and not very structured.
>Now, if you can make sure that the HTML is structured correctly, then
>I would think that the MSHTML-DOM would suffice to create your tree.
Absolutely correct, clean well-structured HTML is nearly XML compliant. True XHTML (naturally) will be fully so. If you can guarantee either (e.g. the source is some kind of code generator that you have control over), that greatly simplifies the job.

A major portion of creating a standard HTML parser meant to handle "live" HTML (from the Net) is getting it to respond robustly in the face of all the (charitably) idiosyncratic HTML out there in The Wild.

>However, you may also be able to write a recursive function to work
>on the IHTMLDocument2 Object Model. While I have not attempted it, it
>doesn't look like it would be too difficult to do.

I looked into that a bit when I dabbled in writing my own TWebBrowser-based browser app a couple of years back. Naked COM interfaces, Mmmmm (shudder), not for the faint of heart. ;-)

Here are a few links to some Delphi stuff on it available on the web:

http://www.cryer.co.uk/brian/delphi/twebbrowser/twebbrowser_oleobject.htm
http://delphi.about.com/od/adptips2004/a/bltip1204_3.htm
http://beensoft.blogspot.com/2006_05_01_archive.html
http://www.delphifaq.com/faq/delphi/network/f241.shtml
http://www.delphipages.com/tips/thread.cfm?ID=292

HTH

Stephen Posey
[EMAIL PROTECTED]
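
P.S. A rough sketch of the two points above, for illustration only. It is in Python rather than Delphi so it runs without MSHTML or COM, and none of it is the IHTMLDocument2 API: the names walk, TagCollector, CLEAN and SOUP are made up for this example. It shows that well-formed markup can go straight through a strict XML parser and be walked recursively, while "wild" markup gets rejected by the same parser and needs a tolerant one.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def walk(elem, depth=0, out=None):
    """Recursively collect (depth, tag) pairs for an element and its children."""
    if out is None:
        out = []
    out.append((depth, elem.tag))
    for child in elem:
        walk(child, depth + 1, out)
    return out


class TagCollector(HTMLParser):
    """Tolerant parser: records start tags and never complains about structure."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


# Hypothetical inputs: one well-formed (XHTML-style), one sloppy.
CLEAN = "<html><body><p>Hello <b>world</b></p><br/></body></html>"
SOUP = "<html><body><p>Hello <b>world</p><br></body>"

# Well-formed markup parses as XML, so a plain recursive walk builds the tree.
clean_tree = walk(ET.fromstring(CLEAN))

# The sloppy version is rejected outright by the strict XML parser.
try:
    ET.fromstring(SOUP)
    soup_is_xml = True
except ET.ParseError:
    soup_is_xml = False

# The tolerant parser still reports the tags it sees, but leaves it to you
# to decide how unclosed or mismatched elements should nest.
collector = TagCollector()
collector.feed(SOUP)
collector.close()
soup_tags = collector.tags
```

With MSHTML the browser engine does the tolerant parsing for you, so a recursive walk over the resulting element collections would presumably play the role of walk() here; I have not tried to reproduce that part.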

