Hello everyone,

My name is Da Huang. I am studying for my master's degree in Computer Science at Peking University. I have been using Lucene for about half a year. It is so elegant that I hope to have a chance to contribute some code to it.
Therefore, I have been going through the GSoC 2014 ideas page for Lucene in JIRA for several days. I find "LUCENE-3333: Specialize DisjunctionScorer if all clauses are TermQueries" the most suitable one for me. I have spent some time reading the relevant code and the issue "LUCENE-3328", from which "LUCENE-3333" was spun off. The following questions are still confusing me.

1) I have checked out the code from "http://svn.apache.org/repos/asf/lucene/dev/trunk lucene_trunk", but I couldn't find the code for the fixed issue "LUCENE-3328". It seems that the patch attached to that page is not on trunk. Why?

2) My intuitive idea for solving this issue is to write a class "DisjunctionTermScorer" that handles the case where all clauses are TermQueries, and then decide whether to use DisjunctionTermScorer in the method 'scorer' of class BooleanQuery. Is this idea right?

Those are my questions about "LUCENE-3333".

Besides, I would like to propose the following issue, which is about the QueryParser. When we use QueryParser to parse a query string like "science AND (engineering AND technology)", the generated query is "+science (+engineering +technology)". I think searching would be more efficient if the final query were "+science +engineering +technology". My idea is to flatten cascaded ANDs and cascaded ORs. Do you agree?

I hope I have made my idea clear.

Thanks,
Da Huang

--
黄达 (Da Huang)
Team of Search Engine & Web Mining
School of Electronic Engineering & Computer Science
Peking University, Beijing, 100871, P.R.China
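
P.S. To make the flattening idea concrete, here is a rough sketch in plain Java. It is only a toy model: Node, Term, Clause, Bool, Occur and Flattener are names made up for illustration and are not Lucene classes. It assumes there are only required (AND) and optional (OR) clauses, with no prohibited clauses and no minimum-should-match setting, and it only considers which documents match, not how they are scored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a query tree; NOT Lucene's TermQuery/BooleanQuery.
abstract class Node {}

final class Term extends Node {
    final String text;
    Term(String text) { this.text = text; }
    @Override public String toString() { return text; }
}

// MUST corresponds to a "+" (AND) clause, SHOULD to an optional (OR) clause.
enum Occur { MUST, SHOULD }

final class Clause {
    final Occur occur;
    final Node node;
    Clause(Occur occur, Node node) { this.occur = occur; this.node = node; }
}

final class Bool extends Node {
    final List<Clause> clauses = new ArrayList<>();

    Bool add(Occur occur, Node node) {
        clauses.add(new Clause(occur, node));
        return this;
    }

    // Prints clauses in a QueryParser-like notation, e.g. "+a +b".
    @Override public String toString() {
        StringBuilder sb = new StringBuilder();
        for (Clause c : clauses) {
            if (sb.length() > 0) sb.append(' ');
            if (c.occur == Occur.MUST) sb.append('+');
            if (c.node instanceof Bool) sb.append('(').append(c.node).append(')');
            else sb.append(c.node);
        }
        return sb.toString();
    }
}

final class Flattener {
    // True if the group is non-empty and every clause has the given occur.
    static boolean allOccur(Bool b, Occur occur) {
        if (b.clauses.isEmpty()) return false;
        for (Clause c : b.clauses) {
            if (c.occur != occur) return false;
        }
        return true;
    }

    // Returns a new tree in which a nested group is merged into its parent
    // when the group's clauses all share the occur of the clause holding it
    // (a required all-AND group, or an optional all-OR group).
    static Node flatten(Node n) {
        if (!(n instanceof Bool)) return n;
        Bool out = new Bool();
        for (Clause c : ((Bool) n).clauses) {
            Node child = flatten(c.node); // flatten bottom-up
            if (child instanceof Bool && allOccur((Bool) child, c.occur)) {
                out.clauses.addAll(((Bool) child).clauses);
            } else {
                out.add(c.occur, child);
            }
        }
        return out;
    }
}
```

For the example above, building a required "science" clause plus a required group of "engineering" and "technology" and passing it to Flattener.flatten gives a tree that prints as "+science +engineering +technology". A group whose clauses differ from the clause that holds it, such as an OR group inside an AND, is left nested.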
