On Nov 5, 2009, at 4:31 PM, Jason Eacott wrote:
Hi All, I hope someone can offer some advice.
I want to extend lucene to search in a particular way(if it cant
already):
I want to index docs, each with file containing several terms
something like:
doc1=>myfield:a
doc2=>myfield:a,b
doc3=>myfield:a,b,c
doc4=>myfield:a,b,c,d
so far nothing new.
I want to query for matching docs such that a query something like
myfield:(a or b) should only return docs if the doc itself is FULLY
matched.
ie, for the query myfield:(a or b) , only doc1 and doc2 should match.
So the rules are its only a match if the termcount for each doc is
<=the termcount of the query(for that field) AND ALL the terms in
the doc were matched
a few more examples just to clarify:
myfield:(a or b or d) would match doc4
myfield:(a or b or c or d) would match ALL the docs here (this one
works anyway but only because it uses all the terms that exist)
myfield:(a) would match doc1
order is not important (but might be a nice have)
can anyone tell me if its possible to make lucene do this, and
perhaps offer a starting point?
Would overriding http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/DefaultSimilarity.html#coord%28int,%20int%29
help?
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org