WebDataKit (http://www.lotontech.com/wdbc.html) is free to download. It is some kind of SQL for HTML (even across different web sites, with concatenation etc.). Interesting... I need to search specific places within HTML...
Some sites are very well designed and have explicit meta tags... If you are working on an intranet, that's the easiest solution:

<title>TOSHIBA TECRA S2 Pentium M 15.0" nVIDIA GeForce Go 6600 NoteBook - Retail at Newegg.com</title>
<meta name="description" content="Buy TOSHIBA TECRA S2 Pentium M 15.0" nVIDIA GeForce Go 6600 NoteBook - Retail Online" />
<meta name="keywords" content="Buy TOSHIBA TECRA S2 Pentium M 15.0" nVIDIA GeForce Go 6600 NoteBook - Retail Cheap" />

-----Original Message-----
From: Jack Tang [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 18, 2005 10:15 PM
To: [email protected]
Subject: Parse-html should be enhanced!

Hi Nutchers

I think the parse-html parser should be enhanced. In some of my projects (intranet search engines), we only need the content inside specified detectors and want to filter out the junk, say the content between <div class="start-here"> and </div>, or detectors such as XPath expressions.

Any thoughts on this enhancement?

Regards
/Jack
--
Keep Discovering ... ...
http://www.jroller.com/page/jmars
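
To make the two ideas in this thread concrete (reading the explicit <title>/<meta> tags, and keeping only the text between <div class="start-here"> and its matching </div>), here is a minimal sketch using only Python's standard html.parser module. It is not Nutch code and not the parse-html plugin's API; Python is used purely for illustration, and the names RegionExtractor, extract and SAMPLE, as well as the sample page, are hypothetical. It assumes reasonably well-formed markup: an unescaped quote inside an attribute value would cut that attribute short.

```python
from html.parser import HTMLParser


class RegionExtractor(HTMLParser):
    """Collects <title>, named <meta> tags, and the text inside a marked <div>."""

    def __init__(self, marker_class="start-here"):
        super().__init__(convert_charrefs=True)
        self.marker_class = marker_class  # hypothetical marker, as in Jack's example
        self.meta = {}
        self.title_parts = []
        self.region_parts = []
        self._in_title = False
        self._depth = 0  # <div> nesting depth inside the marked region; 0 = outside

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attrs:
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "div":
            if self._depth:
                # Nested <div> inside the region: track it so we find the right </div>.
                self._depth += 1
            elif self.marker_class in (attrs.get("class") or "").split():
                self._depth = 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "div" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._depth:
            self.region_parts.append(data)


def extract(html, marker_class="start-here"):
    """Return (title, meta dict, whitespace-normalized text of the marked region)."""
    parser = RegionExtractor(marker_class)
    parser.feed(html)
    parser.close()
    title = "".join(parser.title_parts).strip()
    text = " ".join("".join(parser.region_parts).split())
    return title, parser.meta, text


# Made-up page for illustration only.
SAMPLE = """<html><head><title>Example page</title>
<meta name="description" content="A short description" />
</head><body>
<div class="nav">junk menu</div>
<div class="start-here">Wanted <div>nested</div> text</div>
<div class="footer">junk footer</div>
</body></html>"""

if __name__ == "__main__":
    title, meta, text = extract(SAMPLE)
    print(title)
    print(meta)
    print(text)
```

Counting <div> depth rather than stopping at the first </div> is what keeps nested blocks inside the marked region from ending it early. An XPath-based detector would need an XML/HTML tree library instead of this streaming approach.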
