Is it possible to crawl a sitemap.xml file with Nutch 1.x?

-----Original Message-----
From: Markus Jelsma [mailto:[email protected]]
Sent: Thursday, December 18, 2014 3:09 PM
To: [email protected]
Subject: RE: Nutch 1.9 error

Hi - the sitemap command is not part of Nutch 1.x, nor does it have a HostDB. I suspect you are using Nutch 2.x commands.

-----Original message-----
> From: Richardson, Jacquelyn F. <[email protected]>
> Sent: Thursday 18th December 2014 20:30
> To: [email protected]
> Subject: Nutch 1.9 error
>
> I am using Nutch 1.9. I am trying to crawl our sitemap.xml file.
>
> When I submit the following command:
> bin/nutch sitemap crawl -hostdb hostdb -threads 2 to nutch I receive
> the following error:
> Error: Could not find or load main class sitemap
>
> Any help you can give will be greatly appreciated.
>
> Jackie Richardson

