Hi Sami,

On 12/9/06 2:27 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> Author: siren
> Date: Sat Dec 9 14:27:07 2006
> New Revision: 485076
>
> URL: http://svn.apache.org/viewvc?view=rev&rev=485076
> Log:
> Optimize SpellCheckedMetadata further by taking into account the fact that it
> is used only for http-headers.
>
> I am starting to believe that spellchecking should just be an utility method
> used by http protocol plugins.

I think that right now I'm -1 for this change. I would make note of all the
comments on NUTCH-139, from which this code was born. In the end, I think what
we all realized was that the spell checking capability is necessary, but not
everywhere, as you point out. However, I don't think it's limited entirely to
HTTP headers (which is what you've currently changed the code to).

I think it should be implemented as a protocol layer service that also provides
spell checking support to other protocol plugins, like protocol-file, etc.,
where field headers run the risk of being misspelled as well. What's to stop
someone from implementing a protocol-file++ that returns different file header
keys than those of protocol-file? Just b/c HTTP is the most pervasively used
plugin right now, I think it's too convenient to assume that only HTTP protocol
field keys may need spell checking services.

Just my 2 cents...

Cheers,
Chris
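
P.S. To make the protocol-layer idea a bit more concrete, here is a minimal sketch of what a protocol-neutral key corrector might look like. `KeySpellChecker` and its methods are hypothetical names, not existing Nutch classes, and the edit-distance matching is an assumption for illustration, not necessarily what SpellCheckedMetadata does. Each protocol plugin would hand in its own set of canonical keys; the key sets below are only examples.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Nutch): corrects misspelled field keys
// against whatever canonical key set a protocol plugin supplies.
public class KeySpellChecker {
    private final Map<String, String> canonicalByNormalized = new HashMap<String, String>();
    private final int maxDistance;

    public KeySpellChecker(int maxDistance, String... canonicalNames) {
        this.maxDistance = maxDistance;
        for (String name : canonicalNames) {
            canonicalByNormalized.put(normalize(name), name);
        }
    }

    // Lower-case the key and drop everything that is not a letter or digit,
    // so "content_type" and "Content-Type" compare equal.
    public static String normalize(String key) {
        StringBuilder sb = new StringBuilder();
        for (char c : key.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                sb.append(Character.toLowerCase(c));
            }
        }
        return sb.toString();
    }

    // Return the canonical key closest to the given one, or the key itself
    // when no canonical key is within maxDistance edits.
    public String correct(String key) {
        String normalized = normalize(key);
        String exact = canonicalByNormalized.get(normalized);
        if (exact != null) {
            return exact;
        }
        String best = key;
        int bestDistance = maxDistance + 1;
        for (Map.Entry<String, String> e : canonicalByNormalized.entrySet()) {
            int d = levenshtein(normalized, e.getKey());
            if (d < bestDistance) {
                bestDistance = d;
                best = e.getValue();
            }
        }
        return best;
    }

    // Classic two-row dynamic-programming Levenshtein distance.
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

With this shape, an HTTP plugin could build `new KeySpellChecker(2, "Content-Type", "Content-Length", "Last-Modified")` while a file plugin passes only the keys it emits. Unknown keys such as `X-Custom` pass through unchanged, so a plugin with its own header vocabulary is not forced into HTTP's.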