No, I am wrong. Nutch 1.x has a patch for sitemap processing; please see:
https://issues.apache.org/jira/browse/NUTCH-1465

 
 
-----Original message-----
> From:Markus Jelsma <[email protected]>
> Sent: Friday 19th December 2014 12:17
> To: [email protected]
> Subject: RE: Nutch 1.9 error
> 
> No, unfortunately not. 
>  
>  
> -----Original message-----
> > From:Richardson, Jacquelyn F. <[email protected]>
> > Sent: Friday 19th December 2014 5:16
> > To: [email protected]
> > Subject: RE: Nutch 1.9 error
> > 
> > Is it possible to crawl a sitemap.xml file with Nutch 1.x?
> > 
> > -----Original Message-----
> > From: Markus Jelsma [mailto:[email protected]] 
> > Sent: Thursday, December 18, 2014 3:09 PM
> > To: [email protected]
> > Subject: RE: Nutch 1.9 error
> > 
> > Hi - the sitemap command is not part of Nutch 1.x, nor does Nutch 1.x
> > have a HostDB. I suspect you are using Nutch 2.x commands.
> >  
> > -----Original message-----
> > > From:Richardson, Jacquelyn F. <[email protected]>
> > > Sent: Thursday 18th December 2014 20:30
> > > To: [email protected]
> > > Subject: Nutch 1.9 error
> > > 
> > > I am using Nutch 1.9.  I am trying to crawl our sitemap.xml file.
> > > 
> > > When I submit the following command:
> > > bin/nutch sitemap crawl -hostdb hostdb -threads 2
> > > to Nutch, I receive the following error:
> > > Error: Could not find or load main class sitemap
> > > 
> > > Any help you can give will be greatly appreciated.
> > > 
> > > Jackie Richardson
> > > 
> > > 
> > > 
> > 
> 
