Hello Alexis, see inline.
Regards,
Markus
-----Original message-----
> From: IZaBEE_Keeper <[email protected]>
> Sent: Wednesday 20th March 2019 1:28
> To: [email protected]
> Subject: RE: Limiting Results From Single Domain
>
> Markus Jelsma-2 wrote
> > Hello Alexis,
> >
> > This is definitely a question for Solr. Regardless of that, your choice is
> > between Solr's Result Grouping component and the FieldCollapsing filter
> > query parser.
> >
> > Regards,
> > Markus
>
> Thank you..
>
> I kinda figured that I'd need to figure out how to use the FieldCollapsing
> query parser & figure out how to make it work on a per hostname basis from
> the hostname field.. I'm not too sure on how to write the function for it
> but I should be able to figure it out..
fq={!collapse field=host}
Keep in mind that for this to work, all documents with the same host must be
indexed into the same shard.
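A minimal sketch of how the filter above and the shard requirement could look
from a client, in Python. The base URL, the collection name "nutch" and both
function names are hypothetical. The routing part assumes a SolrCloud
collection using the compositeId router, where an id of the form
"prefix!rest" is routed by its prefix; nothing in this thread says how the
index is actually sharded, so treat it as one possible approach.

```python
from urllib.parse import urlencode, urlsplit


def collapse_query_url(base_url, collection, user_query, host_field="host", rows=10):
    """Build a /select URL that keeps one document per value of host_field."""
    params = {
        "q": user_query,
        # The collapse filter from this thread, applied as a filter query.
        "fq": "{!collapse field=%s}" % host_field,
        "rows": rows,
    }
    return "%s/%s/select?%s" % (base_url.rstrip("/"), collection, urlencode(params))


def routed_doc_id(url):
    """Prefix the document id with its host so equal hosts share a route key.

    Assumption: the collection uses the compositeId router. URLs that
    themselves contain '!' may need extra care; that case is not handled here.
    """
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError("no hostname in URL: %r" % url)
    return "%s!%s" % (host, url)


if __name__ == "__main__":
    print(collapse_query_url("http://localhost:8983/solr", "nutch", "bees"))
    print(routed_doc_id("http://www.example.com/a"))
```

With ids built this way, every page of one host carries the same prefix, which
is what the collapse filter needs to see all of that host's documents on a
single shard.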
> I'm hopeful though that nutch might solve some of this for me as it indexes
> another billion pages.. It seems to be less frequent with more pages added
> to the index from multiple domains..
Nutch, out of the box, can't solve this for you unless you crawl or index
less, or get rid of a decent number of duplicates, which are usually present
when you crawl a few billion pages.
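To illustrate what getting rid of duplicates means here, a small Python sketch
that groups pages whose normalized text is identical. This is not Nutch's own
deduplication job, just the general idea; the function names and the
normalization (collapse whitespace, lowercase) are assumptions made for the
example.

```python
import hashlib
from collections import defaultdict


def content_digest(text):
    """Digest of the text after collapsing whitespace and lowercasing."""
    normalized = " ".join(text.split()).lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def find_exact_duplicates(pages):
    """pages maps URL -> extracted text; return groups of URLs with equal digests."""
    by_digest = defaultdict(list)
    for url, text in pages.items():
        by_digest[content_digest(text)].append(url)
    # Only groups with more than one URL are duplicates of each other.
    return [urls for urls in by_digest.values() if len(urls) > 1]
```

Keeping one URL per group before indexing leaves fewer near-identical results
for the collapse filter to deal with.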
>
> Thanks again.. :)
>
> -----
> Bee Keeper at IZaBEE.com
> --
> Sent from: http://lucene.472066.n3.nabble.com/Nutch-User-f603147.html
>