> Hi,
>  I am new to Nutch search; I have been working with it for the past
> month. Can anyone tell me what is meant by vertical search, and
> suggest how I can do it?

Vertical search [http://en.wikipedia.org/wiki/Vertical_search] is basically
"categorized" search. You search within "verticals", e.g. car sales, jobs,
vacation rentals etc. The best vertical search engine, in my opinion, is
http://www.vast.com

How to do it? It is not easy! One way is to use the API offered by
vast.com (http://www.vast.com/info/stealThisSite). The general idea is that
vast performs the crawling and classification and you get their results via
the API. For example, http://www.rentalio.com/ is a site that uses data from
vast.com (via the API) and shows search results without having to crawl and
categorize the Internet itself.

To make your own vertical search engine, you have to build a "categorizer"
that will recognise the content of each crawled page and extract data from
it. There are many ways to build a categorizer, from "rule based" ones, where
you have to write special rules for every site you crawl, to fully automated
ones based on the Bayes (or some similar) algorithm.
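
To give a feel for the automated approach, here is a minimal sketch of a
naive Bayes text categorizer in Python. It is not Nutch code and not how
vast.com does it; the class name, the category labels and the training
sentences are all made up for illustration. A real categorizer would be
trained on many crawled pages per vertical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # pages seen per category
        self.word_counts = defaultdict(Counter)  # word counts per category
        self.vocabulary = set()                  # every word seen in training

    def train(self, text, category):
        self.doc_counts[category] += 1
        for word in tokenize(text):
            self.word_counts[category][word] += 1
            self.vocabulary.add(word)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, float("-inf")
        for category, n_docs in self.doc_counts.items():
            # log P(category) + sum of log P(word | category)
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[category].values())
            denom = total_words + len(self.vocabulary)
            for word in words:
                count = self.word_counts[category][word]
                score += math.log((count + 1) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Made-up training pages, purely for illustration.
categorizer = NaiveBayesCategorizer()
categorizer.train("used sedan low mileage diesel engine for sale", "cars")
categorizer.train("convertible car automatic transmission price", "cars")
categorizer.train("software engineer position salary full time", "jobs")
categorizer.train("hiring java developer apply with resume", "jobs")
categorizer.train("beach house weekly rental sleeps six", "rentals")
categorizer.train("mountain cabin vacation rental with fireplace", "rentals")

print(categorizer.classify("diesel car for sale"))   # cars
print(categorizer.classify("java engineer salary"))  # jobs
```

The scores are summed as logarithms so that long pages do not underflow to
zero, and the add-one smoothing keeps a single unseen word from ruling out a
category. Classification only tells you which vertical a page belongs to;
extracting the fields (price, location etc.) is still a separate step.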

Links that might help:
http://en.wikipedia.org/wiki/Vertical_search [wiki entry on vertical
search]
http://www.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html [naïve
Bayes implementation - easy way to classify text]
http://www.vast.com [vertical search engine with free API]
http://en.wikipedia.org/wiki/Bayes%27s_theorem [wiki entry on Bayes'
theorem]
http://ai.ijs.si/Mezi/pedagosko/markuslang_seminar.doc [explanation of a
naïve Bayes implementation]


Hope this helps
Bogdan Kecman


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
