Is it possible to configure Lucene so that it doesn't tokenize on embedded dashes, and therefore doesn't treat the "A" as a stop word, since it isn't standing alone? I believe the combination of dash handling and stop words is why the query is causing problems for my user.
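
To make the suspicion concrete, here is a rough sketch in plain Java. It is not Lucene code: the class name, both methods, and the three-word stop list are made up for illustration, and the real analyzer's rules are certainly more involved. It only shows how splitting on dashes and then removing stop words would lose the "A", while splitting on whitespace alone would keep "A-B-C" together.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; this does not call or reproduce Lucene.
class DashTokenSketch {
    // Tiny made-up stop list; an analyzer's real default list would be longer.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("a", "an", "the"));

    // Assumed current behavior: split on every non-alphanumeric character
    // (dashes included), lowercase, then drop stop words.
    static List<String> splitOnPunctuation(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^A-Za-z0-9]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Desired behavior: split on whitespace only, so "A-B-C" stays one token
    // and the "A" inside it is never compared against the stop list.
    static List<String> splitOnWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitOnPunctuation("A-B-C Inc.")); // [b, c, inc]
        System.out.println(splitOnWhitespace("A-B-C Inc."));  // [a-b-c, inc.]
    }
}
```

In the first case the indexed terms no longer contain anything for the "A", which would match what my user is seeing. The second case keeps the dashed name intact, though it also keeps the trailing period on "inc.", so a whitespace-only approach would need some extra cleanup.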

On Fri, Dec 12, 2008 at 1:32 PM, Daniel Naber <[email protected]> wrote:
> On Friday, 12 December 2008, Jenny Brown wrote:
>
>> I'm trying to search for company ABC Inc. in places where it may be
>> mentioned as A-B-C Inc. Lucene is doing something with those dashes,
>> though, that prevents me from getting accurate results.
>
> "A" (even in "A-B-C" I think) is a stopword with StandardAnalyzer's default
> settings, which might cause problems. Please also check out these hints
> from the FAQ:
>
> http://wiki.apache.org/lucene-java/LuceneFAQ#head-3558e5121806fb4fce80fc022d889484a9248b71
>
> Regards
> Daniel
>
> --
> http://www.danielnaber.de
>
