Mike Polzin wrote:
I am working on building a web search engine and I would like to build a results page similar to what Google does. The functionality I am looking to include is what I refer to as "rolling up" sites, meaning that even if a particular site (defined by its base URL) has many relevant hits on various pages for the search's keywords, that site is only shown once in the results listing, with a link to the most relevant hit on that site. What I do not want is to have one site dominate a search results page.
Does it make sense to just do the search, get the hits list and then
programmatically remove the results which, although they meet the search
criteria, are not as relevant? Is there a way to do this through queries?
Thanks in advance!
Mike
Actually, why don't you use a second index for the search results? For
example, you run the search and get all the results, then index those
documents into an in-memory index (for example one built on
RAMDirectory) along with the matching terms. Then query that in-memory
index once with the search terms and retrieve the matching documents.
In other words, you filter the documents based on maximum term
similarity. Does that make sense to you?
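
For comparison, the plain post-processing approach from the question
(keep only the most relevant hit per base URL) can be sketched roughly
as below. This is only an illustration and does not use the Lucene API:
SiteRollup, Hit, siteOf and rollUp are hypothetical names, and treating
the URL's host as the "base URL" is an assumption. In practice you would
fill the list from whatever your search returns.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: rolls up hits so each site appears once.
final class SiteRollup {

    /** Hypothetical stand-in for a search hit: a page URL and its relevance score. */
    static final class Hit {
        final String url;
        final float score;

        Hit(String url, float score) {
            this.url = url;
            this.score = score;
        }
    }

    /** Assumption: the "base URL" is the host name; falls back to the raw URL if unparsable. */
    static String siteOf(String url) {
        try {
            String host = URI.create(url).getHost();
            return host != null ? host.toLowerCase(Locale.ROOT) : url;
        } catch (IllegalArgumentException e) {
            return url;
        }
    }

    /** Keeps the highest-scoring hit per site, then orders the survivors by descending score. */
    static List<Hit> rollUp(List<Hit> hits) {
        Map<String, Hit> bestPerSite = new LinkedHashMap<>();
        for (Hit hit : hits) {
            // On a collision, keep whichever hit has the higher score.
            bestPerSite.merge(siteOf(hit.url), hit,
                    (old, cur) -> cur.score > old.score ? cur : old);
        }
        List<Hit> result = new ArrayList<>(bestPerSite.values());
        result.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return result;
    }
}
```

One thing to watch with this approach: if you also page the results, you
need to fetch more raw hits than the page size, since several of them may
collapse into a single site.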