Are you saying that your sites have the form siteA.mydomain.com,
siteB.mydomain.com, siteC.mydomain.com?

Alex.

 

 

 

-----Original Message-----
From: mbehlok <[email protected]>
To: user <[email protected]>
Sent: Wed, Feb 13, 2013 11:05 am
Subject: Nutch identifier while indexing.


Hello, I am indexing 3 sites:

SiteA
SiteB
SiteC

I want to index these sites in such a way that, when searching them in Solr,
I can query each site separately. So one could say... that's easy, just
filter them by host... WRONG. The sites are hosted on the same host but have
different starting points. That is, starting the crawl from different root
URLs (SiteA, SiteB, SiteC) produces different results. My idea is to somehow
define an identifier field in schema.xml, so that Solr receives the root URL
that produced each crawl. Any ideas on how to implement this? Any
variations?
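
To make the idea concrete, here is a minimal sketch of the tagging step only.
It assumes that every page of a site lives under that site's root URL path,
which may not hold for every crawl. The root URLs, the function name
site_id_for and the field name site_id are all hypothetical; none of them
come from Nutch or Solr.

```python
from urllib.parse import urlsplit

# Hypothetical seed roots; the real root URLs are not given in this thread.
SEED_ROOTS = {
    "SiteA": "http://example.com/siteA/",
    "SiteB": "http://example.com/siteB/",
    "SiteC": "http://example.com/siteC/",
}


def site_id_for(url, seed_roots=SEED_ROOTS):
    """Return the label of the longest seed root that prefixes `url`, or None."""
    parts = urlsplit(url)
    # Compare on host + path only, so scheme and query string do not matter.
    target = parts.netloc.lower() + parts.path
    best_label, best_len = None, -1
    for label, root in seed_roots.items():
        root_parts = urlsplit(root)
        prefix = root_parts.netloc.lower() + root_parts.path
        # Prefer the most specific (longest) matching root.
        if target.startswith(prefix) and len(prefix) > best_len:
            best_label, best_len = label, len(prefix)
    return best_label
```

If the returned label were stored in a string field such as site_id, a search
could then be restricted to one site with a Solr filter query like
fq=site_id:SiteA. If pages from different roots overlap on the same paths,
prefix matching is not enough and the identifier would have to be carried
along from the seed URL during the crawl instead.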

Mitch 
 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Nutch-identifier-while-indexing-tp4040285.html
Sent from the Nutch - User mailing list archive at Nabble.com.

 
