StandardAnalyzer matches 'www.google.com' as a HOST and leaves the whole token intact. However, when the hostname comes at the end of a sentence, StandardAnalyzer matches 'www.google.com.' (with the trailing period) as an ACRONYM, which produces the token 'wwwgooglecom'. A search for 'www.google.com' will then no longer match.
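To make the two cases concrete, here is a rough sketch using plain Java regexes. It does not use Lucene at all: the three patterns are only my approximation of the HOST and ACRONYM rules, and the names TokenRuleSketch and firstToken are made up for illustration. It assumes the longest match wins and that dots are stripped from ACRONYM tokens. The `strict` flag swaps in an acronym pattern that allows exactly one letter between periods, to see how that variation would behave.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the tokenizer rules; NOT the real Lucene grammar.
class TokenRuleSketch {
    // Assumed shape of HOST: alphanumeric runs joined by periods.
    static final Pattern HOST =
        Pattern.compile("[A-Za-z0-9]+(\\.[A-Za-z0-9]+)+");
    // Assumed shape of ACRONYM: letter runs, each followed by a period.
    static final Pattern ACRONYM =
        Pattern.compile("[A-Za-z]+\\.([A-Za-z]+\\.)+");
    // Variation: exactly one letter between periods.
    static final Pattern STRICT_ACRONYM =
        Pattern.compile("[A-Za-z]\\.([A-Za-z]\\.)+");

    // Length of the match anchored at the start of s, or 0 if none.
    static int prefixLength(Pattern p, String s) {
        Matcher m = p.matcher(s);
        return m.lookingAt() ? m.end() : 0;
    }

    // Classify the first token of text; the longer match wins.
    static String firstToken(String text, boolean strict) {
        int acronymLen = prefixLength(strict ? STRICT_ACRONYM : ACRONYM, text);
        int hostLen = prefixLength(HOST, text);
        if (acronymLen > 0 && acronymLen >= hostLen) {
            // Assumed post-processing: dots are removed from acronyms.
            return "ACRONYM:" + text.substring(0, acronymLen).replace(".", "");
        }
        if (hostLen > 0) {
            return "HOST:" + text.substring(0, hostLen);
        }
        return "NONE";
    }

    public static void main(String[] args) {
        System.out.println(firstToken("www.google.com", false));  // HOST:www.google.com
        System.out.println(firstToken("www.google.com.", false)); // ACRONYM:wwwgooglecom
        System.out.println(firstToken("www.google.com.", true));  // HOST:www.google.com
        System.out.println(firstToken("U.S.A.", true));           // ACRONYM:USA
    }
}
```

In this sketch the trailing period makes the ACRONYM match one character longer than the HOST match, which is why it wins. With the strict pattern, 'www.google.com.' falls back to HOST while 'U.S.A.' is still treated as an acronym, but anything like 'Ph.D.' would no longer be caught.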
Is this a known and accepted compromise? It seems kind of scary that you lose the ability to find a URL in a search if it comes at the end of a sentence. Would it be too restrictive to only recognize ACRONYMs with a single letter between periods? Other ideas? Checking for HOST before ACRONYM is out, because then we would never get to ACRONYM. - Mark