Hi Chris,
The query parser is probably interpreting the colon as a field separator:
https://solr.apache.org/guide/solr/latest/query-guide/standard-query-parser.html#querying-specific-fields
You can check this with debugQuery. If that is what is happening, you can
escape the colon with a backslash, or use a dedicated qparser such as the
field query parser:
https://solr.apache.org/guide/solr/latest/query-guide/other-parsers.html#field-query-parser
Also, the Solr admin UI has an Analysis tab, which lets you check for
analysis issues separately from qparser processing.
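
A minimal sketch of the options above, in Python, which only builds the
request parameters and sends nothing. The field name "tags" is a
hypothetical placeholder for your multi-valued field. The analysis
parameters assume the implicit /analysis/field handler is available on your
core, which is the same check the Analysis tab performs.

```python
from urllib.parse import urlencode


def escape_colon(term: str) -> str:
    """Backslash-escape colons so the standard parser treats them as text."""
    return term.replace(":", "\\:")


def escaped_query_params(field: str, value: str) -> dict:
    """Standard query parser, colon escaped, with debug output enabled."""
    return {"q": f"{field}:{escape_colon(value)}", "debugQuery": "true"}


def field_parser_params(field: str, value: str) -> dict:
    """Field query parser: the value is not parsed for query syntax."""
    return {"q": "{!field f=" + field + "}" + value, "debugQuery": "true"}


def analysis_params(field_type: str, value: str) -> dict:
    """Parameters for /analysis/field, to see which tokens get indexed."""
    return {"analysis.fieldtype": field_type, "analysis.fieldvalue": value}


if __name__ == "__main__":
    # "tags" is an assumed field name; replace it with your own.
    print(urlencode(escaped_query_params("tags", "foo:bar")))
    print(urlencode(field_parser_params("tags", "foo:bar")))
    print(urlencode(analysis_params("text_general", "foo:bar")))
```

Append each printed string to your core's /select (first two) or
/analysis/field (last one) URL. In the debugQuery output, compare
"parsedquery" for the escaped and unescaped forms to see whether the colon
was taken as a field separator.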

On Sat, Oct 1, 2022 at 5:35 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:

> All,
>
> I have a multi-valued field of type text_general and a specific document
> contains one field value with text "foo:bar". When searching for either
> "foo" or "bar", I do not get this document in search results.
>
> However, when searching for "foo:bar" or "foo*" or "*bar" I do get the
> document, so it's definitely there and the field value is being searched.
>
> Is a colon (:) not a word-breaking token?
>
> I have another field containing email address and if I search for e.g.
> "gmail.com" (without quotes), I'll get everyone whose email addresses
> end with "gmail.com".
>
> Hmm. I just checked, and if I search for "gmail" (without .com) I don't
> find them. Maybe without whitespace, those characters (:, .) do not
> cause a word-split?
>
> I do have full control over how the indexing takes place, and the
> foo:bar is actually a compound value. So I am able to use "foo bar" or
> "foo: bar" or whatever. Users are much more likely to want to search for
> just "bar" in this case, but also might want to search for "foo:bar"
> specifically (and not get baz:bar in the results, or at least not ranked
> as highly).
>
> What am I missing as far as tokenization, here?
>
> I haven't specified anything special when it comes to tokenization,
> etc.: I'm using a pretty much stock Solr 8.1 install with a core created
> using the default config set.
>
> Thanks,
> -chris
>


-- 
Sincerely yours
Mikhail Khludnev
