On 3/8/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: If you store a hash code of the word rather than the actual word you
: should be able to search for stuff but not be able to actually retrieve

that's a really great solution ... it could even be implemented as a
TokenFilter so none of your client code would ever even need to know that
it was being used (just make sure it comes last, after any stemming or
whatnot)
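
To make the idea concrete, here is a minimal sketch in plain Python rather than Lucene's actual TokenFilter API. The names hash_tokens and analyze are hypothetical, and the lowercase-and-split step is only a stand-in for a real analysis chain. It shows why the hashing step has to come last: index-time and query-time tokens must be normalized identically before they are hashed, or they will never match.

```python
import hashlib

def analyze(text):
    # Hypothetical stand-in for the real analysis chain (tokenizing,
    # lowercasing, stemming, ...). Here: lowercase and split on whitespace.
    return text.lower().split()

def hash_tokens(tokens):
    # Final stage of the chain: replace each token with a hex digest,
    # so only digests (never the words themselves) reach the index.
    for token in tokens:
        yield hashlib.sha256(token.encode("utf-8")).hexdigest()

# Index time and query time run through the same chain.
indexed = list(hash_tokens(analyze("The quick brown fox")))
query = list(hash_tokens(analyze("QUICK")))

# The query digest matches an indexed digest, but the plain word is absent.
found = query[0] in indexed
```

Because the client only ever passes text through the chain, it does not need to know that digests are what end up in the index.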

I don't know... hashing individual words is an extremely weak form of
security that should be breakable without even using a computer... all
the statistical information is still there (somewhat like 'encrypting'
a message as a cryptoquote).
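
A small sketch of the point about statistics, assuming an unsalted SHA-256 per word and a made-up toy corpus (both are illustrative choices, not anything from the thread). Hashing is a one-to-one substitution on words, so the term frequencies are untouched and the most frequent digest still stands for the most frequent word.

```python
import hashlib
from collections import Counter

def h(word):
    # Deterministic per-word hash: the same word always yields the same digest.
    return hashlib.sha256(word.encode("utf-8")).hexdigest()

# Toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
hashed = [h(w) for w in corpus]

plain_counts = Counter(corpus)
hashed_counts = Counter(hashed)

# The frequency profile survives hashing unchanged.
same_profile = sorted(plain_counts.values()) == sorted(hashed_counts.values())

# An attacker who knows that "the" is the commonest English word can guess
# what the commonest digest stands for without inverting the hash.
top_digest = hashed_counts.most_common(1)[0][0]
guess_is_right = top_digest == h("the")
```

With a realistic amount of text, the same ranking trick extends to many more words, which is the cryptoquote analogy.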

Doron's suggestion is preferable: eliminate token position information
from the index entirely.
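
As a rough illustration of why positions matter, the sketch below builds two toy postings structures in plain Python. These are hypothetical simplifications, not Lucene's index format. With positions, the original token sequence can be read straight back out of the index; with frequencies only, all that remains is an unordered bag of terms.

```python
def positional_postings(tokens):
    # term -> list of positions where the term occurs
    postings = {}
    for pos, term in enumerate(tokens):
        postings.setdefault(term, []).append(pos)
    return postings

def frequency_postings(tokens):
    # term -> occurrence count only; no ordering information is kept
    postings = {}
    for term in tokens:
        postings[term] = postings.get(term, 0) + 1
    return postings

def reconstruct(postings):
    # Rebuild the token sequence by sorting (position, term) pairs.
    placed = [(pos, term)
              for term, positions in postings.items()
              for pos in positions]
    return [term for _, term in sorted(placed)]

tokens = "meet me at the old bridge at noon".split()
with_pos = positional_postings(tokens)
without_pos = frequency_postings(tokens)

# Positions allow a full reconstruction of the original text.
recovered = reconstruct(with_pos)
```

The trade-off is that queries which depend on word order, such as phrase queries, need position data and stop working once it is dropped.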

-Mike
