Chetan,

Try adding parse-rss to plugin.includes in nutch-site.xml. Here's mine:

<property>
  <name>plugin.includes</name>
  <value>protocol-httpclient|urlfilter-regex|parse-(text|html|msexcel|msword|mspowerpoint|pdf|zip|swf|rss)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  <description></description>
</property>

Ed.

> Date: Sat, 27 Sep 2008 01:30:43 -0700
> From: [EMAIL PROTECTED]
> To: [email protected]
> Subject: crawl xml url using nutch-0.9
>
> Hi All,
>
> I have tried to crawl xml url (http://sports.yahoo.com/nfl/rss.xml) using
> depth 2.
>
> But it will crawl only root url.
>
> Please help me how to crawl root url as well as all sub url of root url.
>
> Thanks in advance.
>
> Regards,
> Chetan Patel
> --
> View this message in context:
> http://www.nabble.com/crawl-xml-url-using-nutch-0.9-tp19700770p19700770.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
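
P.S. A quick way to sanity-check a plugin.includes value: as far as I know, Nutch treats it as a regular expression that must match a plugin's whole id, so a rough sketch like the one below shows which ids a given value would enable. It is written in Python rather than Java for brevity, the value is copied from the property above, and is_included is a made-up helper name, not part of Nutch.

```python
import re

# Value of plugin.includes, copied from the nutch-site.xml property above.
PLUGIN_INCLUDES = (
    "protocol-httpclient|urlfilter-regex|"
    "parse-(text|html|msexcel|msword|mspowerpoint|pdf|zip|swf|rss)|"
    "index-basic|query-(basic|site|url)|summary-basic|scoring-opic|"
    "urlnormalizer-(pass|regex|basic)"
)


def is_included(plugin_id, includes=PLUGIN_INCLUDES):
    """Hypothetical helper: True if the entire plugin id matches the regex.

    Assumption: the whole id has to match, not just a prefix of it.
    """
    return re.fullmatch(includes, plugin_id) is not None


if __name__ == "__main__":
    # Print each sample plugin id with whether the value would enable it.
    for plugin_id in ("parse-rss", "parse-html", "parse-js", "protocol-http"):
        print(plugin_id, is_included(plugin_id))
```

If parse-rss does not come back as included for your own value, the RSS parser is not being loaded, which would explain why the links inside the feed are never followed.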
