bug-ger!

Hi,

To get regex-normalize.xml to work, I have to put:

<property>
  <name>urlnormalizer.class</name>
  <value>org.apache.nutch.net.RegexUrlNormalizer</value>
  <description>Name of the class used to normalize URLs.</description>
</property>

<property>
  <name>urlnormalizer.regex.file</name>
  <value>regex-normalize.xml</value>
  <description>Name of the config file used by the RegexUrlNormalizer class.</description>
</property>

in nutch-site.xml
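
For context, the file named by urlnormalizer.regex.file holds pattern/substitution rules. A minimal sketch of such a file; the jsessionid rule is only an illustrative assumption, not taken from my actual setup:

<?xml version="1.0"?>
<regex-normalize>
  <!-- illustrative rule (assumption): strip a ;jsessionid=... path parameter -->
  <regex>
    <pattern>;jsessionid=[0-9A-Za-z]+</pattern>
    <substitution></substitution>
  </regex>
</regex-normalize>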

In nutch-default.xml, the default is set to:

<property>
  <name>urlnormalizer.class</name>
  <value>org.apache.nutch.net.BasicUrlNormalizer</value>
  <description>Name of the class used to normalize URLs.</description>
</property>

Is this a bug or a feature? =)

