Thanks a lot, I'll try the autoGeneratePhraseQueries property and see how that works.
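For reference, this is roughly what I understand the suggestion to be: a minimal sketch that assumes the attribute goes on the "keyword" field type from my earlier message. It is untested on my side, and I don't know yet whether it changes how the query parser treats the whitespace.

```xml
<!-- Sketch only: the "keyword" field type from earlier in this thread,
     with autoGeneratePhraseQueries switched on (assumed placement). -->
<fieldType name="keyword" class="solr.TextField"
           positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```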
Regarding the reindexing tip: it's a good one, but with my current "on the fly" setup on the servers at work I basically have to build the project with Maven and deploy it to Tomcat, where the index lives, so I have to reindex every time anyway; otherwise the index would be empty. I also usually use the "clean" parameter when testing with DIH, so that shouldn't be a problem.

*Aleksander Akerø*
Systems Consultant
Mobile: 944 89 054
Email: aleksan...@gurusoft.no

*Gurusoft AS*
Phone: 92 44 09 99
Østre Kullerød
www.gurusoft.no


2014-01-29 Alexandre Rafalovitch <arafa...@gmail.com>

> I think the whitespace might also be the issue. The query gets parsed
> by the standard component that splits it on space before passing the
> individual components into the field searches.
>
> Try enabling autoGeneratePhraseQueries on the field (or field type)
> and reindexing. See if that makes a difference.
>
> Regards,
>    Alex.
> Personal website: http://www.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working. (Anonymous - via GTD
> book)
>
>
> On Wed, Jan 29, 2014 at 9:55 PM, Aleksander Akerø
> <aleksan...@gurusoft.no> wrote:
> > Update:
> >
> > I'm guessing that this has nothing to do with the tokenizer. I tried
> > the string field type as well, but still got the same results. So this
> > must have to do with some other Solr config.
> >
> > What confuses me is that when I search for "1005", which is another
> > valid value to search for, it works perfectly, but then again, this
> > query contains no whitespace.
> >
> > Any ideas?
> >
> > 2014-01-29 Aleksander Akerø <aleksan...@gurusoft.no>
> >
> >> Thanks for the quick answer, but it doesn't help if I remove the
> >> lowercase analyzer like so:
> >>
> >>   <fieldType name="keyword" class="solr.TextField" positionIncrementGap="100">
> >>     <analyzer type="index">
> >>       <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>     </analyzer>
> >>     <analyzer type="query">
> >>       <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>     </analyzer>
> >>   </fieldType>
> >>
> >> I still need to add quotes to the search query to get results. And the
> >> weird thing is that if I use the analyzer and put in "FE 009" (again,
> >> without quotes) for both the index and query values, it highlights the
> >> result to show a match, but when I search using the GUI it gives me no
> >> results. The same happens when posting directly to the /select
> >> requestHandler via GET.
> >>
> >> This is what I post using GET:
> >>
> >> http://mysite.com/solr/corename/select?q=number:FE%20009&qf=number
> >>   => this does not work
> >> http://mysite.com/solr/corename/select?q=number:"FE%20009"&qf=number
> >>   => this works
> >>
> >> I'm really starting to wonder if I am doing something terribly wrong
> >> somewhere.
> >>
> >> This is my requestHandler btw, pretty basic:
> >>
> >>   <!-- #### Default handler #### -->
> >>   <requestHandler name="/select" class="solr.SearchHandler">
> >>     <lst name="defaults">
> >>       <str name="echoParams">explicit</str>
> >>       <str name="defType">edismax</str>
> >>       <str name="q.alt">*:*</str>
> >>       <str name="rows">10</str>
> >>       <str name="fl">*,score</str>
> >>       <str name="qf">number</str>
> >>     </lst>
> >>   </requestHandler>
> >>
> >>
> >> 2014-01-29 Aruna Kumar Pamulapati <apamulap...@gmail.com>
> >>
> >>> Hi,
> >>>
> >>> I think the misunderstanding you are having is about the lowercase
> >>> factory:
> >>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LowerCaseTokenizerFactory
> >>>
> >>> You are correct about KeywordTokenizerFactory, but the lowercase
> >>> factory creates tokens by lowercasing all letters and dropping
> >>> non-letters.
> >>>
> >>> The best place to play with and learn these pipelines is the Solr
> >>> admin panel => analysis page.
> >>>
> >>> thanks,
> >>> Arun
> >>>
> >>>
> >>> On Wed, Jan 29, 2014 at 9:05 AM, Aleksander Akerø
> >>> <aleksan...@gurusoft.no> wrote:
> >>>
> >>> > Hi, I'll try properly this time.
> >>> >
> >>> > According to the Solr documentation, the solr.KeywordTokenizerFactory
> >>> > should not do any tokenizing at all. Thus, if I understand this
> >>> > correctly, it should only return exact matches, given that this is
> >>> > the only analyzer defined in the field type.
> >>> > Such as the following config:
> >>> >
> >>> > Fieldtypes:
> >>> >   <fieldType name="keyword" class="solr.TextField" positionIncrementGap="100">
> >>> >     <analyzer type="index">
> >>> >       <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>> >       <filter class="solr.LowerCaseFilterFactory"/>
> >>> >     </analyzer>
> >>> >     <analyzer type="query">
> >>> >       <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>> >       <filter class="solr.LowerCaseFilterFactory"/>
> >>> >     </analyzer>
> >>> >   </fieldType>
> >>> >
> >>> > Fields:
> >>> >   <field name="number" type="keyword" indexed="true" stored="true" required="false" />
> >>> >
> >>> > But it does not seem to work this way for me. In the index I have
> >>> > values like "FE 009", "EE 009", "ED 009" and "FE 009-1" (without the
> >>> > quotes, of course). But when I search for "FE 009" (without quotes),
> >>> > I get no results. It seems that I have to add quotes to the search
> >>> > query in order to retrieve any results, but that won't work for me,
> >>> > as later on I have to expand the index with other fields that need
> >>> > whitespace tokenization and such. Or would that work regardless of
> >>> > quotes? I have come to understand that wrapping the query in quotes
> >>> > forces it to be analyzed as one token, no matter what.
> >>> >
> >>> > If I get this to work I would also like to add the
> >>> > "solr.EdgeNGramFilterFactory" to the index-side analyzer, thus adding
> >>> > trailing wildcard matches. E.g. return "FE 009-1" and "FE 009-2" as
> >>> > well as "FE 009" when searching for "FE 009", but not "EE 009" or
> >>> > "ED 009". Would that be an OK way to do it?
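
For reference, the EdgeNGram idea from my original message at the bottom of this thread might look roughly like the sketch below. This is untested and rests on assumptions: the field type name "keyword_prefix" and the gram sizes are placeholders I made up, and the filter sits only on the index side so that the query value stays a single lowercased token.

```xml
<!-- Hypothetical field type: the name and gram sizes are illustrative only. -->
<fieldType name="keyword_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Keep the whole value as one token, e.g. "FE 009-1" -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index the leading prefixes of that token: "f", "fe", "fe ", "fe 0", ... -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-grams on the query side: the query is matched as typed -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

If this behaves the way I expect, a query token "fe 009" would match the indexed prefixes of "FE 009", "FE 009-1" and "FE 009-2" but not those of "EE 009" or "ED 009". It would still depend on the query reaching the analyzer as one token, which is the whitespace problem discussed above.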