This happens because you are using QueryParser to construct the query; exactly how the keyword is broken into words is determined by the analyzer you selected.

To avoid word breaking and symbol replacement, you could use a different analyzer, but it would be best to construct the query directly using the BooleanQuery, TermQuery and related classes. The latter is preferred because some symbols (for example "+" and "-") are an essential part of the query syntax that QueryParser recognizes. For example, when run through QueryParser, the search [ +red +blue -green ] is identical to the search [ red AND blue NOT green ].

To directly construct a search that does not strip out the "+" symbol, you could do something like this to search for the string "red+green" in a given field:

    Query query = new TermQuery(new Term(searchField, "red+green"));

The [ red AND blue NOT green ] search from above would be constructed like this:

    BooleanQuery query = new BooleanQuery();
    query.Add(new TermQuery(new Term(searchField, "red")), BooleanClause.Occur.MUST);
    query.Add(new TermQuery(new Term(searchField, "blue")), BooleanClause.Occur.MUST);
    query.Add(new TermQuery(new Term(searchField, "green")), BooleanClause.Occur.MUST_NOT);

One other consideration: the analyzer used to add documents to the Lucene index also determines how the original content is broken into searchable terms. A TermQuery is matched against those indexed terms as-is, so the term you search for has to exist in the index in exactly that form. If I recall correctly, the StandardAnalyzer will keep the special symbols that comprise a phone number together as a searchable unit; this may not be true for other analyzers.

There is a very useful tool called Luke that can be used to inspect an index and run trial searches using different analyzers.

Hope this helps.

-- Neal

-----Original Message-----
From: Li Bing [mailto:[email protected]]
Sent: Thursday, August 20, 2009 12:33 AM
To: [email protected]
Subject: Lucene Query Questions

Dear all,

I am using the following code to search indexed data. However, when the searchKeyword contains some special characters, such as "//", ":", "+", "-", ".", and even digits, the query removes some required characters or splits the keyword.
Sometimes this returns no results, although I am sure matching results exist. Is there a way to disable this behavior so that the query does not change my original searchKeyword?

    ......
    IndexSearcher searcher = new IndexSearcher(fsDirectory);
    Analyzer chineseAnalyzer = new ChineseAnalyzer();
    QueryParser queryParser = new QueryParser(searchField, chineseAnalyzer);
    Query query = queryParser.Parse(DBTools.FilterKeyFieldValue(searchKeyword));
    Hits results = searcher.Search(query);
    ......

Thanks so much!

LB
