The analyzer is configured in schema.xml. Taken literally, though, splitting on a dot is what I would expect from StandardTokenizer.
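
As a rough sketch of what tuning the analyzer in schema.xml could look like: the field type name text_ws_dots and the dynamic field pattern *_fname below are hypothetical and not taken from this thread. A whitespace-based tokenizer only breaks on whitespace, so XYZ.tif would stay a single token (lowercased here by the filter):

```xml
<!-- Hypothetical field type: tokenizes on whitespace only, so dots stay inside tokens -->
<fieldType name="text_ws_dots" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Optional: makes matching case-insensitive (XYZ.tif is indexed as xyz.tif) -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical dynamic field using the type above -->
<dynamicField name="*_fname" type="text_ws_dots" indexed="true" stored="true"/>
```

Changing a field's analyzer requires reindexing the affected documents, and the Analysis screen in the Solr Admin UI shows the tokens each stage of the chain produces for a given input.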

On Tue, May 2, 2023 at 8:48 PM Bill Tantzen <[email protected]> wrote:

> Mikhail,
> Thanks for the quick reply. Here is the parser info:
>
> <str name="QParser">LuceneQParser</str>
>
> ~~Bill
>
> On Tue, May 2, 2023 at 12:43 PM Mikhail Khludnev <[email protected]> wrote:
>
> > Hello Bill,
> > Which analyzer is configured for metadata_txt? Perhaps you need to tune it
> > accordingly.
> >
> > On Tue, May 2, 2023 at 7:40 PM Bill Tantzen <[email protected]>
> > wrote:
> >
> > > In my solr 9.2 schema, I am leveraging the dynamicField
> > >
> > > <dynamicField name="*_txt" type="text_general" indexed="true"
> > > stored="true"/>
> > >
> > > which tokenizes with solr.StandardTokenizerFactory for index and query.
> > >
> > > However, when I query with, for example,
> > > <str name="q">metadata_txt:XYZ.tif</str>
> > >
> > > I see many more hits than I expect. When I add debug=true to the query, I
> > > see:
> > > <str name="rawquerystring">metadata_txt:XYZ.tif</str>
> > > <str name="querystring">metadata_txt:XYZ.tif</str>
> > > <str name="parsedquery">metadata_txt:XYZ metadata_txt:tif</str>
> > >
> > > But I expect that dots not followed by whitespace will be kept as part of
> > > the token, that is, the parsed query should remain "metadata_txt:XYZ.tif"
> > > but solr appears to be splitting into two tokens.
> > >
> > > Can somebody point out what I am misunderstanding?
> > > Thanks,
> > > ~~Bill
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > https://t.me/MUST_SEARCH
> > A caveat: Cyrillic!
> >
>
>
> --
> Human wheels spin round and round
> While the clock keeps the pace... -- John Mellencamp
> ________________________________________________________________
> Bill Tantzen University of Minnesota Libraries
> 612-626-9949 (U of M) 612-325-1777 (cell)
>

--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
