I am working on building a web search engine and I would like to build a results page similar to what Google does. The functionality I am looking to include is what I refer to as "rolling up" sites: even if a particular site (defined by its base URL) has many relevant hits on various pages for the search keywords, that site is shown only once in the results listing, with a link to the most relevant hit on that site. What I do not want is to have one site dominate a search results page.
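To make "rolling up" concrete, here is a rough sketch of the post-processing version in Python. It is only an illustration under some assumptions: the hits arrive as (url, score) pairs, a higher score means more relevant, and the "base URL" is taken to be the host part of the URL. The name roll_up and the sample data are made up for this example.

```python
from urllib.parse import urlsplit


def roll_up(hits):
    """Keep only the highest-scoring hit per site.

    hits: iterable of (url, score) pairs (assumed shape, higher score = better).
    Returns the surviving hits sorted by score, best first.
    """
    best = {}
    for url, score in hits:
        # Treat the host name as the site's "base URL" (an assumption).
        site = urlsplit(url).netloc.lower()
        if site not in best or score > best[site][1]:
            best[site] = (url, score)
    return sorted(best.values(), key=lambda hit: hit[1], reverse=True)


# Hypothetical sample data: two hits on one site, one on another.
sample_hits = [
    ("http://a.com/x", 0.90),
    ("http://a.com/y", 0.95),
    ("http://b.org/", 0.80),
]
rolled = roll_up(sample_hits)
```

With the sample above, a.com appears once (its /y page) followed by b.org. One thing this sketch does not handle is paging: if a full page of results is needed after rolling up, more hits than one page's worth would have to be fetched first.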
Does it make sense to just do the search, get the hit list, and then programmatically remove the results which, although they meet the search criteria, are not as relevant as the best hit from the same site? Or is there a way to do this through queries? Thanks in advance! Mike