I am trying to figure out whether Lucene is an appropriate solution for a problem our site faces. Our site allows users to post their opinions on various topics. Because of legislation in various countries, our management would like us to scan each user's post for keywords that would indicate inappropriate content. We are looking for racial slurs, profanity, and attacks against sexual orientation. Each post is generally no more than a few paragraphs.

I would like to analyze each post for these words and expressions before saving it to the DB. I am reading through the Lucene in Action book, and it looks as if I cannot analyze a string without first indexing it. If that is true, will indexing each post be a performance hit to the site?

Can someone shed some light on the best way to tackle this problem with Lucene, or with another API if that makes more sense?

Thanks,
Jeff
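
P.S. To make the requirement concrete, here is a rough sketch of the kind of check I have in mind, written in plain Java with no Lucene involved. The class name PostScanner, the method findFlagged, and the word list are all made up for illustration. It only matches single words case-insensitively and does not handle multi-word expressions or misspellings.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene or any library.
class PostScanner {

    // Returns the blocked words found in the post.
    // Assumes every entry in blockedWords is already lowercase.
    static Set<String> findFlagged(String post, Set<String> blockedWords) {
        Set<String> hits = new LinkedHashSet<>();
        // Lowercase the post, then split on any run of non-letter, non-digit characters.
        String[] tokens = post.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (blockedWords.contains(token)) {
                hits.add(token);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Placeholder entries standing in for the real keyword list.
        Set<String> blocked = Set.of("badword", "slur");
        Set<String> hits = findFlagged("This post contains a BADWORD.", blocked);
        // A non-empty result would mean the post is held back before it reaches the DB.
        System.out.println(hits);
    }
}
```

What I am unsure about is whether something like this is enough, or whether Lucene's analysis (stemming, tokenizing and so on) would catch variants that a simple lookup like this misses.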