Sergio Morales wrote:
> Hi,
>
> I have upgraded from NUTCH 9.0 to nutch-2007-09-30_04-01-28.tar.gz.
>
> It seems the indexer is unable to update the field "TITLE" of the Lucene
> index when processing specific html documents.
>
>
> Please find below a brief summary of this issue:
>
> 1.- Extracted this new version into a separate directory and copied across the
> following configuration files:
> - {nutch_home_9.0}/bin/url folder, containing the urls
> - {nutch_home_9.0}/conf/nutch-site.xml
> - {nutch_home_9.0}/conf/crawl-urlfilter.txt
>
> 2.- To reproduce the issue, you would need to copy the attached html document
> to your webserver/filesystem.
No HTML document was attached. The mailing list software strips
attachments.
--
Sami Siren