On Jul 19, 2012, at 2:46am, albsmith wrote:

> I don't think it is possible to instruct a search engine to focus mainly on
> a particular keyword. Search engines commonly follow the robots exclusion
> protocol (REP): robots.txt is a text file webmasters create to instruct
> robots (typically search engine robots) on how to crawl & index pages on
> their website. From this I gather that it only controls which pages are
> crawled and indexed, not which keywords are targeted. If you have any ideas
> about this, kindly share them with me.
> http://www.prodigyapex.com/

I _think_ what you're asking about is how to do a focused crawl, where you want 
the crawler to (mostly) fetch pages that contain target keywords.

If so, then see http://www.scaleunlimited.com/about/focused-crawler/ for some 
ideas on how to do this.

It's possible to do the same thing in Nutch, using plug-in page scorers, but I 
haven't looked at that code in a while.
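
To make the focused crawl idea a bit more concrete, here is a rough Python
sketch of a best-first crawl frontier. It is only an illustration, not the
Scale Unlimited or Nutch implementation. The names keyword_score,
focused_crawl, and the fetch callable are all hypothetical, and the scoring
(fraction of words on the page that match a target keyword) is just one
simple choice among many.

```python
import heapq
import re


def keyword_score(text, keywords):
    """Fraction of words in `text` that are target keywords (0.0 to 1.0)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    targets = {k.lower() for k in keywords}
    hits = sum(1 for w in words if w in targets)
    return hits / len(words)


def focused_crawl(seeds, fetch, keywords, max_pages=100):
    """Fetch pages best-first and return a list of (url, score) in fetch order.

    `fetch(url)` is a hypothetical callable supplied by the caller. It is
    assumed to return (text, outlinks), or None if the page is unavailable.
    """
    seen = set(seeds)
    # heapq is a min-heap, so priorities are stored negated.
    # Seeds get the highest possible priority.
    frontier = [(-1.0, url) for url in seen]
    heapq.heapify(frontier)
    results = []
    while frontier and len(results) < max_pages:
        _, url = heapq.heappop(frontier)
        page = fetch(url)
        if page is None:
            continue
        text, outlinks = page
        score = keyword_score(text, keywords)
        results.append((url, score))
        # Outlinks inherit the score of the page that links to them, so
        # links found on relevant pages are fetched before the rest.
        for link in outlinks:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return results
```

In practice fetch would download and parse a page (and honor robots.txt),
and you would likely stop following links from pages whose score falls
below some threshold, which is what keeps the crawl "mostly" on topic.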

-- Ken

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr



