If you control the templates that generate the HTML, you can make this very easy on yourself by wrapping the text blocks in some kind of marker that you can find in cfhttp.filecontent later. Maybe wrap them in HTML comments, something like:

<!-- [BEGIN TEXT TO GRAB] -->
This is the text you want indexed
<!-- [END TEXT TO GRAB] -->
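Here is a rough sketch of how the round trip might look, fetch included. The URL and the variable names (pageResult, textBlocks, indexText) are made up for illustration, and the tag-stripping regex at the end is a crude extra that only matters if the marked blocks still contain markup. Treat it as a starting point, not tested production code.

```cfm
<!--- Hypothetical URL; replace with the page you want to index. --->
<cfhttp url="http://www.example.com/article.cfm" method="get" result="pageResult">

<!--- These marker strings are assumptions; they must match your templates exactly. --->
<cfset beginMarker = "<!-- [BEGIN TEXT TO GRAB] -->">
<cfset endMarker = "<!-- [END TEXT TO GRAB] -->">
<cfset html = pageResult.fileContent>
<cfset textBlocks = arrayNew(1)>
<cfset searchFrom = 1>

<cfloop condition="searchFrom lte len(html)">
    <!--- Locate the next begin marker; stop when there are none left. --->
    <cfset beginPos = find(beginMarker, html, searchFrom)>
    <cfif beginPos eq 0><cfbreak></cfif>

    <!--- The wanted text starts right after the begin marker. --->
    <cfset textStart = beginPos + len(beginMarker)>
    <cfset endPos = find(endMarker, html, textStart)>
    <cfif endPos eq 0><cfbreak></cfif>

    <cfif endPos gt textStart>
        <cfset block = mid(html, textStart, endPos - textStart)>
        <!--- Crude cleanup: replace leftover tags with spaces, then collapse whitespace. --->
        <cfset block = reReplace(block, "<[^>]*>", " ", "all")>
        <cfset block = trim(reReplace(block, "\s+", " ", "all"))>
        <cfset arrayAppend(textBlocks, block)>
    </cfif>

    <!--- Continue searching after the end marker just consumed. --->
    <cfset searchFrom = endPos + len(endMarker)>
</cfloop>

<!--- One string, one block per line, ready to store or index. --->
<cfset indexText = arrayToList(textBlocks, chr(10))>
```

Plain find() and mid() are used here instead of one big regular expression because the markers are fixed strings, which keeps the matching predictable when a page has several marked blocks.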
Then, in cfhttp.filecontent, search for the blocks of text between the two comment markers.

-----Original Message-----
From: Anthony Webb [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 09, 2008 1:04 PM
To: CF-Talk
Subject: Extract text from webpage content using cfhttp

I need to index web page contents for doing Verity (or similar) searching. I'd like to insert just the text that a web page returns and not any of the other stuff (like HTML, JS, CSS, images, etc.). I noticed that cfhttp.filecontent returns the entire contents of the page. Does anyone have a good way to get at just the text?

Also, I am storing the results in a MySQL database and was anticipating using the "text" data type. I assume that is the best way to go?

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:308835

