I’m guessing that I’m missing something obvious here - so feel free to ask for more details, and to point out other directions I should be following.

The problem goes as follows: the input in one case might be a phone number (like +49 1234 12345678). Since we’re using edismax, the parts get split on whitespace - which is fine. Bringing the same field (based on TextField) to the party (using qf) doesn’t change a thing, and the phrase query I’d expect from pf is missing (note the empty () at the end of parsedquery):

> responseHeader:
>   params:
>     q: '+49 1234 12345678'
>     defType: edismax
>     qf: person_mobile
>     pf: person_mobile^5
> debug:
>   rawquerystring: '+49 1234 12345678'
>   querystring: '+49 1234 12345678'
>   parsedquery: '(+(+DisjunctionMaxQuery((person_mobile:49)) DisjunctionMaxQuery((person_mobile:1234)) DisjunctionMaxQuery((person_mobile:12345678))) ())/no_coord'
>   parsedquery_toString: '+(+(person_mobile:49) (person_mobile:1234) (person_mobile:12345678)) ()'

But .. as far as I was able to narrow down the culprit, that only happens when I’m using solr.KeywordTokenizerFactory. As soon as I change that to solr.StandardTokenizerFactory, the phrase query appears as expected:

> responseHeader:
>   params:
>     q: '+49 1234 12345678'
>     defType: edismax
>     qf: person_mobile
>     pf: person_mobile^5
> debug:
>   rawquerystring: '+49 1234 12345678'
>   querystring: '+49 1234 12345678'
>   parsedquery: '(+(+DisjunctionMaxQuery((person_mobile:49)) DisjunctionMaxQuery((person_mobile:1234)) DisjunctionMaxQuery((person_mobile:12345678))) DisjunctionMaxQuery(((person_mobile:"49 1234 12345678")^5.0)))/no_coord'
>   parsedquery_toString: '+(+(person_mobile:49) (person_mobile:1234) (person_mobile:12345678)) ((person_mobile:"49 1234 12345678")^5.0)'

Removing the + at the beginning doesn’t make a difference either (just mentioning it since tokee already asked about this on #solr, where I brought up the question earlier).

It’s absolutely possible I’m focusing on a very wrong assumption - but since switching the tokenizer results in such a large change in behaviour, I think something is spooky here.

I’ve read older issues and posts from the list; some of them pointed out that it might be an optimization that edismax brings to the table, but I didn’t find anything specific about that.

Oh, and btw: if that were working, my idea is to drop everything from a given phrase that is not a digit, to match the phone number, like this:

> <fieldType name="phone_number" class="solr.TextField">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="[^\d]" replacement=""/>
>   </analyzer>
> </fieldType>

Any thoughts? Or wild guesses?

Thanks
Stefan
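
PS: to spell out what I’d expect the phone_number analyzer to produce, here is a minimal Python sketch of the same normalization. It is plain `re`, not Solr itself; the helper name `normalize_phone` is made up for illustration, and I’m assuming the filter replaces all matches rather than only the first one.

```python
import re

# Same pattern as in the fieldType: anything that is not a digit.
NON_DIGITS = re.compile(r"[^\d]")


def normalize_phone(raw: str) -> str:
    """Hypothetical helper: treat the whole input as a single token
    (as a keyword tokenizer would) and strip every non-digit from it."""
    return NON_DIGITS.sub("", raw)


if __name__ == "__main__":
    # The example number from this mail, reduced to its digits.
    print(normalize_phone("+49 1234 12345678"))
```

So the whole input would end up as one digits-only term, instead of three separate tokens.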