Hey Floyd,

Have you tried: http://lucene.472066.n3.nabble.com/CJKAnalyzer-and-Synonyms-td2510104.html

If you go the AST route, here is a code snippet for a query parser that replaces every term query with the term plus a prefix query (e.g. "foo" -> "foo foo*"). This sounds approximately like what you need. (I apologize in advance for the formatting, which I'm sure will be lost.)

    class WildcardQueryParser : MultiFieldQueryParser
    {
        /// <summary>
        /// Gets a field (term or phrase) query.
        /// </summary>
        /// <param name="field">Name of the field being queried.</param>
        /// <param name="queryText">Raw text of the term or phrase.</param>
        /// <returns>The rewritten query, or null if the base parser returns null.</returns>
        public override Query GetFieldQuery(string field, string queryText)
        {
            Query origQuery = base.GetFieldQuery(field, queryText);

            // The base query parser might decide that the query is null,
            // e.g. if they search for a word like "and".
            if (origQuery == null)
            {
                return null;
            }

            // Both term and phrase queries call this method, so we need to
            // check that it's a term query we're rewriting, not a phrase query.
            if (origQuery.GetType() != typeof(PhraseQuery))
            {
                BooleanQuery bq = new BooleanQuery(false);
                // Note that the base query parser handles analysis, so we don't need to.
                bq.Add(origQuery, BooleanClause.Occur.SHOULD);
                bq.Add(base.GetPrefixQuery(field, queryText), BooleanClause.Occur.SHOULD);
                return bq;
            }
            else
            {
                return origQuery;
            }
        }
    }

----- Original Message -----
From: Floyd Wu <floyd...@gmail.com>
To: lucene-net-user@lucene.apache.org
Cc:
Sent: Thursday, October 6, 2011 11:08 PM
Subject: [Lucene.Net] How to obtain Query string AST

Hi,

I want to write my own query expander. I may need to obtain the AST (abstract syntax tree) of an already parsed query string, navigate to certain parts of it (words), and build logical phrases from those words by adding to the AST where necessary. Finally, I would transform this AST back into a Lucene query string (or query object) and send it to the Lucene searcher to get results. This cannot be done on the raw string, because the query logic (e.g. AND, OR, parentheses) must not be semantically altered, so the string has to be parsed first.

How can this be done with Lucene.Net, or by combining it with a third-party library? Thanks for any tips.

Floyd

PS: For example, the user enters a query string in the front-end interface such as

    (A OR B) AND (C OR D)

and I want my application to rewrite this query to

    (A OR Y OR B OR T) AND (C OR Z OR D OR F)

where A, B, Y, T, C, Z, D, and F are CJK words (terms) surrounded by double quotes. The reason is that I want to do synonym queries, but Lucene.Net's synonym support seems to have problems in my tests (Solr too), especially when processing CJK.
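
For what it's worth, the rewrite in the PS example can be sketched without touching the query string at all, by walking a tree and swapping each term for an OR of the term and its synonyms. The sketch below is only an illustration under assumptions: Node, TermNode, BoolNode, and SynonymExpander are hypothetical types made up for this example (they are not Lucene.Net classes), and the synonym table is invented. With Lucene.Net the same recursion would presumably run over the clauses of a BooleanQuery instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal AST for illustration; these are NOT Lucene.Net types.
abstract class Node { }

sealed class TermNode : Node
{
    public readonly string Text;
    public TermNode(string text) { Text = text; }
    public override string ToString() { return "\"" + Text + "\""; }
}

sealed class BoolNode : Node
{
    public readonly string Op;            // "AND" or "OR"
    public readonly List<Node> Children;

    public BoolNode(string op, IEnumerable<Node> children)
    {
        Op = op;
        Children = children.ToList();
    }

    public override string ToString()
    {
        return "(" + string.Join(" " + Op + " ", Children.Select(c => c.ToString())) + ")";
    }
}

static class SynonymExpander
{
    // Returns a new tree in which every term that has synonyms becomes
    // an OR of the term and its synonyms. The input tree is not modified.
    public static Node Expand(Node node, IDictionary<string, string[]> synonyms)
    {
        var term = node as TermNode;
        if (term != null)
        {
            string[] alts;
            if (!synonyms.TryGetValue(term.Text, out alts) || alts.Length == 0)
            {
                return term;
            }
            var options = new List<Node> { term };
            options.AddRange(alts.Select(a => (Node)new TermNode(a)));
            return new BoolNode("OR", options);
        }

        var b = (BoolNode)node;
        var rewritten = new List<Node>();
        foreach (var child in b.Children)
        {
            var expanded = Expand(child, synonyms);
            var inner = expanded as BoolNode;
            // Flatten OR inside OR, so (A OR Y) OR B becomes A OR Y OR B.
            if (b.Op == "OR" && inner != null && inner.Op == "OR")
            {
                rewritten.AddRange(inner.Children);
            }
            else
            {
                rewritten.Add(expanded);
            }
        }
        return new BoolNode(b.Op, rewritten);
    }
}

static class Program
{
    static void Main()
    {
        // (A OR B) AND (C OR D), with an invented synonym table.
        Node query = new BoolNode("AND", new Node[]
        {
            new BoolNode("OR", new Node[] { new TermNode("A"), new TermNode("B") }),
            new BoolNode("OR", new Node[] { new TermNode("C"), new TermNode("D") })
        });
        var synonyms = new Dictionary<string, string[]>
        {
            { "A", new[] { "Y" } }, { "B", new[] { "T" } },
            { "C", new[] { "Z" } }, { "D", new[] { "F" } }
        };

        // Prints: (("A" OR "Y" OR "B" OR "T") AND ("C" OR "Z" OR "D" OR "F"))
        Console.WriteLine(SynonymExpander.Expand(query, synonyms));
    }
}
```

Because the expansion happens on the tree, the AND/OR grouping of the original query is preserved by construction, which is the property the string-based approach cannot guarantee.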