If you don't use the same tokenizer for indexing and searching, you will run into trouble like this. Mixing exact match (with ") and wildcard (*) is a strange idea. Typographic rules say there is a space after a comma, no? Is your field tokenized?
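
One possible workaround is to clean the query string yourself before handing it to the query parser, so that punctuation is treated the same way your analyzer treats it at index time. Here is a minimal sketch in plain Java (no Lucene calls). It assumes your analyzer keeps only letters and digits, as you describe. The class and method names (QueryNormalizer, normalize) are made up for illustration, and it ignores quoted phrases, field prefixes and boolean operators, so take it as a starting point only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: rewrites a raw user query so that
// punctuation acts as whitespace and every remaining term is required.
class QueryNormalizer {

    static String normalize(String rawQuery) {
        // Keep letters, digits and the wildcard characters * and ?;
        // collapse every other run of characters (commas, spaces, ...) into one space.
        String cleaned = rawQuery.replaceAll("[^\\p{L}\\p{N}*?]+", " ").trim();
        if (cleaned.isEmpty()) {
            return "";
        }

        List<String> clauses = new ArrayList<>();
        for (String term : cleaned.split("\\s+")) {
            // Skip terms made only of wildcards; they carry no searchable text.
            if (term.matches("[*?]+")) {
                continue;
            }
            // Mark each term as required with a leading '+'.
            clauses.add("+" + term);
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        // The query from the original message.
        System.out.println(normalize("smith,ann*"));
    }
}
```

With this, the input smith,ann* becomes +smith +ann*, which is the form you asked for, and that string is what you would then give to the query parser. Terms that start with a wildcard are passed through unchanged, so you may still need to handle those separately.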

Mr. Renaud Waldura wrote:
> My very simple analyzer produces tokens made of digits and/or letters only.
> Anything else is discarded. E.g. the input "smith,anna" gets tokenized as 2
> tokens, first "smith" then "anna".
>
> Say I have indexed documents that contained both "smith,anna" and
> "smith,annanicole". To find them, I enter the query <<smith,ann*>>. The
> stock Lucene 2.0 query parser produces a PrefixQuery for the single token
> "smith,ann". This token doesn't exist in my index, and I don't get a match.
>
> I have found some references to this:
> http://www.nabble.com/Wildcard-query-with-untokenized-punctuation-tf3378386.html
> but I don't understand how I can fix it. Comma-separated terms like this can
> appear in any field; I don't think I can create an untokenized field.
>
> Really what I would like in this case is for the comma to be considered
> whitespace, and the query to be parsed to <<+smith +ann*>>. Any way I can do
> that?
>
> --Renaud