Hi Sami,

Thanks for the info.
Is there any other way to share this?

Thanks,
Sergio.

----- Original Message ----
From: Sami Siren <[EMAIL PROTECTED]>
To: [email protected]
Sent: Friday, 19 October, 2007 6:59:57 PM
Subject: Re: Indexer does not update the Lucene "TITLE" field

Sergio Morales wrote:
> Hi,
>
> I have upgraded from NUTCH 9.0 to nutch-2007-09-30_04-01-28.tar.gz.
>
> It seems the indexer is unable to update the field "TITLE" of the Lucene
> index when processing specific html documents.
>
> Please find below a brief summary of this issue:
>
> 1.- Extracted this new version in a separate directory and copied across the
> following configuration files:
> - {nutch_home_9.0}/bin/url folder, containing the urls
> - {nutch_home_9.0}/conf/nutch-site.xml
> - {nutch_home_9.0}/conf/crawl-urlfilter.txt
>
> 2.- To reproduce the issue, you would need to copy the attached html document
> to your webserver/filesystem.

There was not any html document attached. This is because mailing list software removes them.

--
Sami Siren
