To get regex-normalize.xml to work, I must put:
in nutch-site.xml.
In nutch-default.xml, the following is set:
Is this a bug or a feature? =)
nutch-site.xml overrides the properties defined in nutch-default.xml. So:
* If you remove the urlnormalizer.class property from nutch-default.xml, it must
still use the one
Hi Jérôme,
I think I expressed it wrong. The question was whether it is a feature or a bug
that regex-normalize.xml is used only after these changes.
Regards
Michael
Jérôme Charron wrote:
To get regex-normalize.xml to work, I must put:
in nutch-site.xml.
In nutch-default.xml, the following is set:
I think I expressed it wrong. The question was whether it is a feature or a bug
that regex-normalize.xml is used only after these changes.
The regex-normalize.xml file is used only after you specify that you want to use
the RegexUrlNormalizer implementation. So it is used only if you specify