No, I am wrong. Nutch 1.x has a patch for sitemap processing; please see: https://issues.apache.org/jira/browse/NUTCH-1465

-----Original message-----
> From:Markus Jelsma <[email protected]>
> Sent: Friday 19th December 2014 12:17
> To: [email protected]
> Subject: RE: Nutch 1.9 error
>
> No, unfortunately not.
>
> -----Original message-----
> > From:Richardson, Jacquelyn F. <[email protected]>
> > Sent: Friday 19th December 2014 5:16
> > To: [email protected]
> > Subject: RE: Nutch 1.9 error
> >
> > Is it possible to crawl sitemap.xml file with Nutch 1.x?
> >
> > -----Original Message-----
> > From: Markus Jelsma [mailto:[email protected]]
> > Sent: Thursday, December 18, 2014 3:09 PM
> > To: [email protected]
> > Subject: RE: Nutch 1.9 error
> >
> > Hi - the sitemap command is not part of Nutch 1.x, nor does it have a
> > HostDB. I suspect you are using Nutch 2.x commands.
> >
> > -----Original message-----
> > > From:Richardson, Jacquelyn F. <[email protected]>
> > > Sent: Thursday 18th December 2014 20:30
> > > To: [email protected]
> > > Subject: Nutch 1.9 error
> > >
> > > I am using Nutch 1.9. I am trying to crawl our sitemap.xml file.
> > >
> > > When I submit the following command:
> > > bin/nutch sitemap crawl -hostdb hostdb -threads 2 to nutch I receive
> > > the following error:
> > > Error: Could not find or load main class sitemap
> > >
> > > Any help you can give will be greatly appreciated.
> > >
> > > Jackie Richardson

