First, to understand what your query looks like after analysis, go to admin/analysis.jsp. It shows what happens to your query text as it passes through the analyzer. Then run the query with debugQuery=true. This appends a long explanation to the end of the XML response that describes, in painful detail, exactly how each document was scored.
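
As a rough sketch, a debug request like the one described above could be built as follows. The host, port and handler path are assumptions for illustration only and will differ per installation; build_debug_url is a made-up helper name.

```python
from urllib.parse import urlencode


def build_debug_url(base_url, query, rows=10):
    """Return a select URL that also asks for the scoring explanation."""
    params = {
        "q": query,
        "rows": rows,
        "debugQuery": "true",  # appends the per-document score explanation
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host and handler path; adjust to your own setup.
url = build_debug_url("http://localhost:8983/solr/select", "PrnP mad cow disease")
print(url)
```

Open the resulting URL in a browser and look at the debug section at the end of the response.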

After all that, you might still have a problem with terms like PrnP getting chopped up in weird ways during analysis. I don't know how people handle this in chemistry/bio search.

Lance

Ahmet Arslan wrote:
Example of Question:
- What is the role of PrnP in mad cow disease?
First, do not submit the question directly as a query. Formulate the query manually:
remove 'what', 'is', 'the', 'of', '?', etc.

For example, I would convert this question into:

"mad cow"^5 "cow disease"^3 "mad cow disease"^15 "role PrnP"~5^2 "role mad cow disease"~45 mad^0.1 role^0.5 cow disease PrnP^10
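
A minimal sketch of automating the first part of that rewrite: strip the question words, then add a boosted phrase for each pair of adjacent remaining terms. The stop list, the single boost value and the function names are assumptions made for illustration. This is not the hand-tuned query above, and it will produce some pairs (such as "PrnP mad") that a person would drop.

```python
import re

# Small illustrative stop list; an assumption, not a standard list.
STOP_WORDS = {"what", "is", "the", "of", "in", "a", "an"}


def content_terms(question):
    """Split on non-alphanumerics (dropping '?') and remove stop words."""
    tokens = re.findall(r"[A-Za-z0-9]+", question)
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def build_query(question, phrase_boost=3):
    """Boosted phrases for adjacent term pairs, followed by the bare terms."""
    terms = content_terms(question)
    phrases = ['"%s %s"^%d' % (a, b, phrase_boost)
               for a, b in zip(terms, terms[1:])]
    return " ".join(phrases + terms)


query = build_query("What is the role of PrnP in mad cow disease?")
print(query)
```

The generated string is only a starting point; the individual boosts and slop values still need tuning by hand, as in the example query.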

I am running against 11,638 documents and the result is 10,410
docs for this question (very low precision)
Use OR as the default operator, and collect and evaluate only the top 1000 documents.
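
One way to evaluate only the top of the ranking is precision at a cutoff. The sketch below assumes you already have a ranked list of document ids and a set of ids judged relevant; the ids shown are made up, and the function divides by the number of documents actually examined rather than by k when fewer than k were returned.

```python
def precision_at_k(ranked_ids, relevant_ids, k=1000):
    """Fraction of the top-k ranked documents that are judged relevant."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    hits = sum(1 for doc_id in top if doc_id in relevant_ids)
    return hits / len(top)


# Made-up ids for illustration only.
ranked = ["d3", "d7", "d1", "d9"]
relevant = {"d1", "d3"}
print(precision_at_k(ranked, relevant, k=2))
```

With OR as the default operator the total hit count will be large, so the number to watch is precision within the cutoff, not the total number of matches.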

Instead of the Porter stemmer, you can try KStem:
http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi

Try the different length normalization described here. Their Lucene query
example (SpanNear) may also give you ideas.
http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


