bug-ger!
Hi,
To get regex-normalize.xml to work, I have to put:
<property>
  <name>urlnormalizer.class</name>
  <value>org.apache.nutch.net.RegexUrlNormalizer</value>
  <description>Name of the class used to normalize URLs.</description>
</property>
<property>
  <name>urlnormalizer.regex.file</name>
  <value>regex-normalize.xml</value>
  <description>Name of the config file used by the RegexUrlNormalizer class.</description>
</property>
in nutch-site.xml.
However, nutch-default.xml sets the normalizer class to BasicUrlNormalizer:
<property>
  <name>urlnormalizer.class</name>
  <value>org.apache.nutch.net.BasicUrlNormalizer</value>
  <description>Name of the class used to normalize URLs.</description>
</property>
So regex-normalize.xml is ignored by default. Is this a bug or a feature? =)