Hi Chris,

The query parser is probably interpreting the colon as a field separator:
https://solr.apache.org/guide/solr/latest/query-guide/standard-query-parser.html#querying-specific-fields
You can check this with debugQuery. If that is what is happening, you can escape the colon with a backslash, or use a dedicated query parser such as the field query parser:
https://solr.apache.org/guide/solr/latest/query-guide/other-parsers.html#field-query-parser

Also, the Solr admin UI has an Analysis tab, which lets you check for analysis problems separately from query-parser processing.
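To make the two options concrete, here is a rough sketch that builds the request URLs with debugQuery switched on, so the parsed query can be inspected in the response. The host, the core name "mycore" and the field name "tags" are placeholders I made up, and the list of characters to escape is my reading of the special-character list in the reference guide, so adjust as needed.

```python
from urllib.parse import urlencode

# Characters the standard query parser treats as syntax. Assumed list,
# based on the "escaping special characters" part of the Solr guide.
SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(term: str) -> str:
    """Put a backslash in front of every query-syntax character."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in term)


def select_url(base: str, q: str) -> str:
    """Build a /select URL with debugQuery turned on."""
    return base + "/select?" + urlencode({"q": q, "debugQuery": "true"})


BASE = "http://localhost:8983/solr/mycore"  # hypothetical host and core
FIELD = "tags"                              # hypothetical field name

# Option 1: escape the colon so it is not read as field:value syntax.
escaped = select_url(BASE, f"{FIELD}:{escape_term('foo:bar')}")

# Option 2: let the field query parser take the value as-is.
field_qp = select_url(BASE, "{!field f=%s}foo:bar" % FIELD)

print(escaped)
print(field_qp)
```

Opening either URL and looking at the "parsedquery" entry in the debug section of the response should show which field and terms were actually searched.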
On Sat, Oct 1, 2022 at 5:35 PM Christopher Schultz <ch...@christopherschultz.net> wrote:

> All,
>
> I have a multi-valued field of type text_general and a specific document
> contains one field value with text "foo:bar". When searching for either
> "foo" or "bar", I do not get this document in search results.
>
> However, when searching for "foo:bar" or "foo*" or "*bar" I do get the
> document, so it's definitely there and the field value is being searched.
>
> Is a colon (:) not a word-breaking token?
>
> I have another field containing email address and if I search for e.g.
> "gmail.com" (without quotes), I'll get everyone whose email addresses
> end with "gmail.com".
>
> Hmm. I just checked, and if I search for "gmail" (without .com) I don't
> find them. Maybe without whitespace, those characters (:, .) do not
> cause a word-split?
>
> I do have full control over how the indexing takes place, and the
> foo:bar is actually a compound value. So I am able to use "foo bar" or
> "foo: bar" or whatever. Users are much more likely to want to search for
> just "bar" in this case, but also might want to search for "foo:bar"
> specifically (and not get baz:bar in the results, or at least not ranked
> as highly).
>
> What am I missing as far as tokenization, here?
>
> I haven't specified anything special when it comes to tokenization,
> etc.: I'm using a pretty much stock Solr 8.1 install with a core created
> using the default config set.
>
> Thanks,
> -chris
>

--
Sincerely yours
Mikhail Khludnev