On Jun 23, 2008, at 4:45 PM, Jon Drukman wrote:
Erik Hatcher wrote:
Jon,
You provided a lot of nice details, thanks for helping us help you :)
The one missing piece is the definition of the "text" field type.
In Solr's _example_ schema, "bobby" gets analyzed (stemmed) to
"bobbi"[1]. When you query for bobby*, the query parser does not
run an analyzer on the wildcard query, so it searches literally
for terms that begin with "bobby"[2].
As for "steve", same story, but it analyzes to "steve", which is
found with a "steve*" query.
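
A minimal Python sketch of the mismatch described above, not Solr code. The `analyze` function is a made-up stand-in for the example schema's analyzer chain (it only lowercases and rewrites a trailing "y" to "i", which is far cruder than the real stemmer), and `prefix_search` is a hypothetical helper that compares the prefix literally, the way an unanalyzed wildcard query would.

```python
def analyze(text):
    """Toy index-time analyzer (an assumption, not Solr's real chain):
    lowercase, split on whitespace, rewrite a trailing 'y' to 'i'."""
    terms = []
    for token in text.lower().split():
        if len(token) > 2 and token.endswith("y"):
            token = token[:-1] + "i"
        terms.append(token)
    return terms


def prefix_search(index_terms, prefix):
    """Wildcard-style lookup: the prefix is matched literally against
    the indexed terms, with no analysis applied to it."""
    return sorted(t for t in index_terms if t.startswith(prefix))


# Indexed terms after analysis: {"bobbi", "gaza", "steve"}
indexed = set(analyze("Bobby Gaza")) | set(analyze("Steve"))

print(prefix_search(indexed, "bobby"))  # [] because the index holds "bobbi"
print(prefix_search(indexed, "steve"))  # ['steve']
```

The query side never sees the stemmer, so "bobby" is compared against "bobbi" character by character and fails at the last letter.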
so, what's the solution?
it depends(tm) ;)
if i change the field to string, will it be able to find bobby* ?
No, because the original data is <str name="name">Bobby Gaza</str>, so
Bobby* would match, but not bobby*. The "string" type (in the example
schema, to be clear) does effectively no analysis, leaving the
original string indexed as-is, case and all.
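
To illustrate the case sensitivity, here is a small Python sketch that treats the whole field value as a single unanalyzed term. It only imitates the behavior described above; `string_prefix_match` is a hypothetical helper, not a Solr API.

```python
def string_prefix_match(values, prefix):
    """Match a prefix against whole, unanalyzed field values.
    Comparison is case-sensitive because nothing was lowercased."""
    return [v for v in values if v.startswith(prefix)]


# The value is indexed verbatim as one term, case and all.
stored = ["Bobby Gaza"]

print(string_prefix_match(stored, "Bobby"))  # ['Bobby Gaza']
print(string_prefix_match(stored, "bobby"))  # []
```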
eventually it would be nice to be able to use fuzzy matching, to
find 'jon' from 'john', for example.
You could search for john~ to do that, and bobby~ would match "bobbi".
stemming and wildcard term queries aren't quite compatible, as you've
found, but it does depend on how much of the prefix is provided. bob*
matches "bobbi", for example.
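
The fuzzy and short-prefix cases can be sketched in Python as well. Lucene's fuzzy queries are based on edit distance, but the scoring and similarity threshold it applies are not reproduced here; `edit_distance` is a plain Levenshtein implementation written for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


# One edit apart in both cases, which is why a fuzzy query can bridge them.
print(edit_distance("john", "jon"))     # 1
print(edit_distance("bobby", "bobbi"))  # 1

# A shorter prefix stops before the stemmed ending, so it still matches.
print("bobbi".startswith("bob"))    # True
print("bobbi".startswith("bobby"))  # False
```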
Erik