This thread is more suited for lucene-user. I think that's just a bad word to be indexing... Microsoft... sheesh. OK, it's not. This is a known thing, and it is even mentioned in the FAQ on jGuru. To see how StandardFilter works, it's best to look at the source; it's quite simple. I think I might have mentioned that in the Lucene article on Onjava.com as well... not 100% sure any more :)
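For the wildcard part, here is a minimal sketch in plain Java of what is most likely going on. It deliberately does not use the Lucene API: the class and method names (WildcardCaseSketch, analyze, prefixMatches) are made up for illustration, and it assumes that the analyzer lowercases terms at index time while the text of a wildcard query is compared against the indexed terms exactly as typed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WildcardCaseSketch {

    // Hypothetical stand-in for index-time analysis:
    // split on anything that is not a letter or digit, then lowercase each token.
    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Hypothetical stand-in for a prefix query such as Micro*:
    // the prefix is compared to the indexed terms as typed, with no analysis.
    public static boolean prefixMatches(List<String> indexedTerms, String rawPrefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(rawPrefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("Microsoft");

        System.out.println(indexed);                          // [microsoft]
        System.out.println(prefixMatches(indexed, "Micro"));  // false
        System.out.println(prefixMatches(indexed, "micro"));  // true

        // Workaround under the same assumption: lowercase the wildcard text yourself.
        String userInput = "Micro";
        System.out.println(prefixMatches(indexed, userInput.toLowerCase(Locale.ROOT))); // true
    }
}
```

If that assumption holds for your analyzer, lowercasing the wildcard text before building the query gives the behaviour you expected. It would be the wrong fix for an analyzer that preserves case.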
Otis

--- Lukas Zapletal <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> I have a small problem. Let's have a word 'Microsoft' indexed in Lucene.
> When I query Microsoft, it returns the document, but when I try Micro*
> then nothing is found. After lowercasing the first letter to micro*
> Lucene returns the document.
>
> The same thing is with ?. When I use it, only lower-cased words are
> matched.
>
> Is this a bug or Am I missing something?
>
> ps - where can I find some information how Lucene parse the input when
> using StandardFilter. I mean I don't know what is ignored and what not.
> For example acronyms (U.S.A), dates (2002-11-07 or 1. 1. 2003) etc... I
> cannot find it in the documentation. In the StandardFilter API there is
> nothing, it seems to be generated from JavaCC.
>
> --
> Lukas Zapletal [[EMAIL PROTECTED]]
> http://www.tanecni-olomouc.cz/lzap
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
