On Jul 19, 2012, at 2:46am, albsmith wrote:

> I don't think it is possible to instruct the search engine to focus mainly
> on a particular keyword. Search engines commonly follow the robots
> exclusion protocol (REP): robots.txt is a text file webmasters create to
> instruct robots (typically search engine robots) on how to crawl & index
> pages on their website. From this I gather that it's only possible to
> control the indexing of pages, not of a particular keyword. If you have
> any idea about this, kindly share it with me.
> http://www.prodigyapex.com/

I _think_ what you're asking about is how to do a focused crawl, where you want the crawler to (mostly) fetch pages that contain target keywords.

If so, then see http://www.scaleunlimited.com/about/focused-crawler/ for some ideas on how to do this.

It's possible to do the same thing in Nutch, using plug-in page scorers, but I haven't looked at that code in a while.

-- Ken

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr
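
To make the focused-crawl idea concrete, here is a minimal Python sketch. It is not the code behind the link above and not Nutch's scoring plug-in API. It only illustrates one common heuristic: keep the frontier in a priority queue and give each outlink the keyword score of the page it was found on. The PAGES dictionary, the example.com URLs, and the score() and focused_crawl() names are all hypothetical, and fetching is faked with an in-memory lookup so the example runs without a network.

```python
import heapq
import re

# Hypothetical in-memory "web": URL -> (page text, outlinks).
# This stands in for real HTTP fetching and link extraction.
PAGES = {
    "http://example.com/seed": ("hadoop news and pets",
                                ["http://example.com/a", "http://example.com/b"]),
    "http://example.com/a": ("hadoop solr tutorial", ["http://example.com/a1"]),
    "http://example.com/b": ("cat pictures", ["http://example.com/b1"]),
    "http://example.com/a1": ("solr hadoop tips", []),
    "http://example.com/b1": ("more cats", []),
}


def score(text, keywords):
    """Return the fraction of target keywords that occur in the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(1 for k in keywords if k in words) / len(keywords)


def focused_crawl(seeds, keywords, fetch, max_pages=10):
    """Fetch up to max_pages pages, preferring links found on on-topic pages."""
    frontier = []   # heap of (-priority, insertion order, url)
    counter = 0     # tie-breaker so equal priorities pop in insertion order
    for url in seeds:
        heapq.heappush(frontier, (-1.0, counter, url))
        counter += 1
    seen = set(seeds)
    results = []
    while frontier and len(results) < max_pages:
        _, _, url = heapq.heappop(frontier)
        page = fetch(url)
        if page is None:
            continue  # unknown or unfetchable URL
        text, links = page
        page_score = score(text, keywords)
        results.append((url, page_score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # Assumption: an outlink inherits its parent page's score.
                heapq.heappush(frontier, (-page_score, counter, link))
                counter += 1
    return results


if __name__ == "__main__":
    crawled = focused_crawl(["http://example.com/seed"], ["hadoop", "solr"],
                            PAGES.get, max_pages=3)
    for url, s in crawled:
        print(f"{s:.2f}  {url}")
```

With a budget of three pages, the crawl follows the on-topic branch (seed, a, a1) and never spends a fetch on the off-topic pages. A real crawler would also need politeness delays, robots.txt handling, and a better relevance model than plain keyword matching.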

