Yes, you can. As Ken replied in your Solr thread, you must create custom parse
and indexing filters. The parse filter extracts the information (price,
resolution and so on) from each page and stores it with the parsed document;
the indexing filter then passes that new information on to the Solr index as
fields you can facet on.
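
To make the division of labour concrete, here is a minimal sketch of the two
steps in plain Java. It deliberately does not use the Nutch plugin interfaces;
the class name, the method names, the "price" field and the
<span class="price"> markup are all assumptions made up for illustration. In a
real plugin the same logic would live in your parse filter and indexing filter
implementations.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: mimics the parse-filter / indexing-filter split
// without depending on any Nutch classes.
public class PriceFilterSketch {

    // Assumed markup: <span class="price">$149.99</span>. Real sites differ,
    // so each merchant will likely need its own extraction rule.
    private static final Pattern PRICE = Pattern.compile(
        "<span class=\"price\">\\s*\\$?([0-9]+(?:\\.[0-9]{1,2})?)\\s*</span>");

    // Parse step: extract the price from the raw HTML and keep it as
    // metadata alongside the parsed document.
    static Map<String, String> parse(String html) {
        Map<String, String> parseMeta = new HashMap<>();
        Matcher m = PRICE.matcher(html);
        if (m.find()) {
            parseMeta.put("price", m.group(1));
        }
        return parseMeta;
    }

    // Index step: copy the metadata into the fields that would be sent to
    // the search index. The price is stored as a number, not a string.
    static Map<String, Object> index(String url, Map<String, String> parseMeta) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("url", url);
        String price = parseMeta.get("price");
        if (price != null) {
            doc.put("price", Double.parseDouble(price));
        }
        return doc;
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Digital camera</h1>"
            + "<span class=\"price\">$149.99</span></body></html>";
        Map<String, String> meta = parse(html);
        Map<String, Object> doc = index("http://example.com/camera", meta);
        System.out.println(doc);
    }
}
```

Storing the price as a number rather than a string is the design choice that
matters here: ranges such as $100-200 can only be computed on the Solr side if
the field is numeric.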


On Monday 12 September 2011 12:55:49 dpt9876 wrote:
> Hi, the friendly guys at the Solr user group pointed me here.
> 
> I am wondering if Nutch/Solr will do the following for a project I am
> working on.
> I want to create a search engine with facets for potentially hundreds of
> websites.
> Similar to say crawling amazon + buy.com + ebay and someone can search
> these 3 sites from my 1 website.
> (I realise there are better ways of doing the above example, its for
> illustrative purposes).
> Eventually I would build that search crawl to index say 200 or 1000
> merchants.
> Someone would come to my site and search for "digital camera".
> 
> They would get results from all 3 indexes and hopefully dynamic facets eg
> Price $100-200
> Price 200-300
> Resolution 1mp-2mp
> 
> etc etc
> 
> Can this be done on the fly?
> 
> I ask this because I am currently developing webscrapers to crawl these
> websites, dump that data into a db, then was thinking of tacking on a solr
> server to crawl my db.
> 
> Problem with that approach is that crawling the worlds ecommerce sites will
> take forever, when it seems solr might do that for me? (I have read about
> multiple indexes etc).
> 
> Many thanks
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Will-Solr-Nutch-crawl-multi-websites-ak
> a-a-mini-google-with-faceted-search-tp3329346p3329346.html Sent from the
> Nutch - User mailing list archive at Nabble.com.

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
