Hi, I am new to Nutch. As part of learning it, I have tried some basic things such as intranet crawling, internet crawling, and the plugin example. Our main objective is opinion crawling: we need to crawl only HTML pages that contain opinions, i.e. user reviews of products, items, movies, etc. So my question is: during fetching itself, can I determine whether an HTML page contains user opinions? If the page contains opinions, parse it; if not, discard it.
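To make the keep-or-discard decision concrete, here is a minimal sketch of the kind of check meant above. It is plain Python and does not use any Nutch API; the cue words, the `min_hits` threshold, and the function names are all assumptions made up for illustration, and a real system would probably need a trained classifier instead of a word list. Note that the check needs the page content, so it can only run once the page has been downloaded.

```python
import re
from html.parser import HTMLParser

# Hypothetical cue words (an assumption, not taken from Nutch or any library).
OPINION_CUES = {
    "review", "reviews", "rating", "rated", "recommend",
    "opinion", "stars", "disappointed", "loved", "hated",
}


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def opinion_score(html):
    """Count how many words of the visible text are opinion cue words."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z]+", " ".join(parser.chunks).lower())
    return sum(1 for w in words if w in OPINION_CUES)


def looks_like_opinion_page(html, min_hits=3):
    """Keep the page only if it has at least min_hits cue words (assumed threshold)."""
    return opinion_score(html) >= min_hits
```

For example, `looks_like_opinion_page(page_html)` would return `True` for a page whose visible text mentions reviews, ratings, and recommendations several times, and `False` for a plain contact page. Inside Nutch, a check like this would have to live in a plugin written in Java.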
This is our approach as of now. Please share your comments and suggestions. Thanks in advance. Best regards, Naresh