:
: StandardAnalyzer matches 'www.google.com' as a HOST and leaves the whole
: token intact. However, if at the end of a sentence, StandardAnalyzer matches
: 'www.google.com.' as an ACRONYM which creates a token of 'wwwgooglecom'. A
: search for 'www.google.com' will of course not match now.

: Or is this a known and accepted compromise?

StandardAnalyzer is black voodoo that i've never delved into ... but if
you are asking for opinions on how it *should* work, i would think that
"www.google.com." should not be considered an acronym for obvious reasons
-- if acronym is going to be a special token type where periods are
stripped out, then i think it's wise to only match single letters
separated by periods.
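
to make the "single letters" idea concrete, here's a rough sketch in
Python. it is NOT the StandardAnalyzer grammar (i haven't read it) -- the
pattern and the function names are made up purely for illustration:

```python
import re

# Hypothetical pattern, not taken from StandardAnalyzer: an acronym is two
# or more single letters, each followed by a period (e.g. "U.S.A.").
ACRONYM = re.compile(r"(?:[A-Za-z]\.){2,}")

def is_acronym(token: str) -> bool:
    """Return True only if every period-separated piece is one letter."""
    return ACRONYM.fullmatch(token) is not None

def strip_periods(token: str) -> str:
    """Collapse an acronym to its letters, e.g. 'U.S.A.' becomes 'USA'."""
    return token.replace(".", "")
```

with a rule like that, "U.S.A." qualifies and can have its periods
removed, while "www.google.com." never reaches the stripping step because
its pieces are longer than one letter.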

that said, i don't think "www.google.com." should be treated as a HOSTname
either ... because it's not.  DNS hostnames can't end in a "." ...
regardless of how grammarians might tell you to write a sentence, when you
put a period at the end, it stops being a hostname, and becomes a word
with funky punctuation in the middle.
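
as a sketch of that rule (again, a made-up pattern and function name, not
anything out of the Lucene source), a host check that refuses a trailing
period might look like the following. it takes the rule above at face
value, so it deliberately ignores the fully-qualified DNS form that ends in
a root dot:

```python
import re

# Hypothetical rule: a host is two or more labels made of letters, digits
# or hyphens, joined by single dots, with no leading or trailing dot.
HOST = re.compile(r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def token_type(token: str) -> str:
    """Classify a token as 'HOST' or fall back to plain 'WORD'."""
    return "HOST" if HOST.fullmatch(token) else "WORD"
```

here "www.google.com" comes back as a HOST, and "www.google.com." falls
through to WORD instead of being mangled into "wwwgooglecom".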



-Hoss

