Hello folks,
We are using Lucene.Net to search a Lucene index built from a MySQL
database. The index builds correctly and searching it generally works
well. However, we need to search for characters that Lucene strips out
automatically.
For example, "asdf23(4)" becomes two separate terms "asdf23" and "4".
When searching for "asdf23\(4\)" (backslashes included so that the
brackets remain in the search query), we receive no results. This is
because, when the value is added to the index, Lucene strips out the
brackets and splits the text into individual terms.
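To make the splitting concrete, here is a rough sketch in plain C#. It
does not call Lucene.Net at all: ApproximateTerms is a made-up helper,
and it simply assumes that the analyzer breaks the text on every
character that is not a letter or digit, which matches what we observe
for this example but may not hold for other inputs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenSplitSketch
{
    // Hypothetical helper (not part of Lucene.Net): approximates the observed
    // behaviour by splitting on runs of non-alphanumeric characters.
    public static string[] ApproximateTerms(string text)
    {
        string[] parts = Regex.Split(text, @"[^\p{L}\p{Nd}]+");
        // Regex.Split can yield empty strings at the edges; drop them.
        return Array.FindAll(parts, p => p.Length > 0);
    }

    public static void Main()
    {
        // The brackets act as separators, leaving two terms.
        Console.WriteLine(string.Join(", ", ApproximateTerms("asdf23(4)")));
    }
}
```

Under that assumption the brackets never reach the index, so a query
for the whole string with brackets has nothing to match.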
Is there a way to stop Lucene from splitting that into individual terms?
The code we use to add documents is as follows:
[start code]
// Lucene query-syntax special characters; each gets a backslash prefix.
// "\\" is listed first so backslashes added later are not escaped again.
string[] sReplace = new string[] {"\\", "+", "-", "&&", "||", "!", "(",
    ")", "{", "}", "[", "]", "^", "\"", "~", "*", "?", ":"};
foreach (string sReplaceTerm in sReplace)
{
    sInsert = sInsert.Replace(sReplaceTerm, "\\" + sReplaceTerm);
}
// Store the escaped text and index it as a tokenized field.
doc.Add(new Lucene.Net.Documents.Field(
    dr["FieldName"].ToString(),
    sInsert,
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.TOKENIZED,
    Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS));
[end code]
Thanks in advance,
Trevor Watson