I'm not sure whether this should be a Nutch question or a Solr question. I have a large index of the WWW that is growing rapidly every day. Some queries against the Solr index return result sets with page after page of results from the same site/hostname.
I have set the host and domain fields as stored and indexed. I'm trying to figure out how to limit the number of results returned per hostname in a Solr query.

Solr 7.5, Nutch 1.5
The site is at izabee.com

-----
Bee Keeper at IZaBEE.com
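
For reference, one approach that looks like it might fit is Solr's result grouping (the `group`, `group.field` and `group.limit` request parameters), which caps how many documents come back per distinct field value. I have not confirmed it behaves well on an index of this size, so treat the following as a rough sketch only. The core name `nutch`, the local URL, the sample query and the helper name `build_grouped_query` are all placeholders I made up for illustration; only the `host` field comes from my schema.

```python
from urllib.parse import urlencode


def build_grouped_query(base_url, query, host_field="host", per_host=2, rows=10):
    """Build a Solr /select URL that asks for at most `per_host` docs per host.

    Hypothetical helper: it only assembles the request URL, it does not
    contact Solr. With grouping enabled, `rows` counts groups (hosts)
    rather than individual documents.
    """
    params = {
        "q": query,
        "group": "true",            # turn on result grouping
        "group.field": host_field,  # group documents by hostname
        "group.limit": per_host,    # max documents returned per group
        "rows": rows,               # number of groups to return
    }
    return f"{base_url}/select?{urlencode(params)}"


# Assumed local Solr instance with a core named "nutch" (placeholder).
url = build_grouped_query("http://localhost:8983/solr/nutch", "honey bees")
print(url)
```

As far as I can tell, adding `group.main=true` would flatten the grouped response back into an ordinary result list. The collapsing query parser (`fq={!collapse field=host}`, optionally with `expand=true`) seems to be the other documented route, though it collapses each host down to a single top document unless the expand component is used.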