There are some facilities you can use that spring to mind. But first, a comment ...
You are going to have to parse the records to at least some extent (assuming the records contain free text). For instance, it's not enough to look for the string "bum", as you'll censor phrases like "bumper crop". One way to do this is to transform the record so that:

- it contains no punctuation, only spaces between words
- it starts and ends with a space

Once you've done this, you can search for " bum " without false triggers on longer words. Now, I don't know regular expressions well enough to answer the next bit, but I expect that they can help you to at least knock out the punctuation and insert the spaces at each end. (Note that it isn't necessary to convert multiple spaces back to single spaces.)

Now, you need to check all your bad words against the record. At the very least, you can use the VBScript / JScript "find string inside string" built-in facility (InStr in VBScript, indexOf in JScript), which will be pretty quick. But this will only search for one word at a time, so you'd be doing a lot of searches if you have a big list of bad words.

You could always reverse the search: break the record into individual words with a split facility (especially in JScript) and join your list of bad words into one "sentence" (again, the JScript join is great for this). Then you only need to do as many checks as there are words in the record, which is fewer iterations through the loop than going through the bad word list whenever the record is shorter than the list.

But get advice from a regular expressions guru. I never fail to be amazed at just how much you can achieve with them if you know how. For instance, if you transform both your record and your bad words into a string each, with single spaces separating the words (something regular expressions can certainly do), it might be possible to use a regular expression to look for a match.
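To make the above a bit more concrete, here is a rough JScript-style sketch of both approaches. Treat it as an illustration under assumptions rather than a finished filter: the function names (normalize, containsBadWord, containsBadWordReversed) are made up for this example, the bad words are assumed to be single lower-case words, punctuation is replaced by a space rather than deleted, and only the characters a-z and 0-9 count as part of a word.

```javascript
// Illustrative sketch only; all function names here are hypothetical.

// Lower-case the text, turn every run of characters other than a-z / 0-9
// into a single space, and put a space at each end.
// Assumption: records are plain ASCII text.
function normalize(text) {
  return " " + String(text).toLowerCase().replace(/[^a-z0-9]+/g, " ") + " ";
}

// Approach 1: one "find string inside string" search per bad word.
function containsBadWord(record, badWords) {
  var padded = normalize(record);
  for (var i = 0; i < badWords.length; i++) {
    // Searching for " word " avoids matching inside longer words.
    if (padded.indexOf(" " + badWords[i] + " ") !== -1) {
      return true;
    }
  }
  return false;
}

// Approach 2 (the reversed search): join the bad words into one
// space-separated "sentence" and look up each word of the record in it.
function containsBadWordReversed(record, badWords) {
  var badSentence = " " + badWords.join(" ") + " ";
  var words = normalize(record).split(" ");
  for (var i = 0; i < words.length; i++) {
    // split(" ") leaves empty strings where spaces were adjacent; skip them.
    if (words[i] !== "" && badSentence.indexOf(" " + words[i] + " ") !== -1) {
      return true;
    }
  }
  return false;
}
```

With a bad word list of just "bum", both functions should flag "What a bum!" and let "A bumper crop this year" through. In the reversed version the bad word "sentence" only needs to be built once if you are checking many records.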
Dave S

----- Original Message -----
From: Ben Shaffer
To: [email protected]
Sent: Friday, July 08, 2005 4:58 AM
Subject: [ASP] RE: Censoring Records

I have a huge number of records in a database as well as a long list of 'bad' words. Before displaying any record, I want to check the list of words and ensure that none of them are contained within the record. What is the most efficient way of doing this?

Thanks in advance,
Ben

---------------------------------------------------------------------
Home : http://groups.yahoo.com/group/active-server-pages
---------------------------------------------------------------------
Post : [email protected]
Subscribe : [EMAIL PROTECTED]
Unsubscribe: [EMAIL PROTECTED]
---------------------------------------------------------------------
