Thought I just missed something. Okay, I just added a few patterns as well as a commandline-checker. See
http://issues.apache.org/jira/browse/NUTCH-279 for the patch.

Regards,
 Stefan

TDLN wrote:
> Sorry, I was a bit too fast there, the answer applies to the
> RegexURLFilter not the RegexUrlNormalizer. I don't think there is a
> similar facility for the RegexUrlNormalizer, but let me know if you
> find it :)
>
> Rgrds, Thomas
>
> On 5/22/06, TDLN <[EMAIL PROTECTED]> wrote:
>> Hi Stefan
>>
>> try running bin/nutch org.apache.nutch.net.URLFilterChecker
>>
>> Rgrds, Thomas
>>
>> On 5/22/06, Stefan Neufeind <[EMAIL PROTECTED]> wrote:
>> > Hi,
>> >
>> > is there a way to debug rules for RegexUrlNormalizer, e.g. test the
>> > substitution from commandline?
>> >
>> > bin/nutch org.apache.nutch.net.RegexUrlNormalizer
>> >
>> > does print out the rules it uses. But afaik there is no such thing
>> > possible as
>> >
>> > echo "http://www.example.com" | bin/nutch org.apache.nutch.net.RegexUrlNormalizer
>> >
>> > is there? So how do you debug rules when writing new ones and testing
>> > them against a set of URLs that should match / should not match?

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
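
For anyone who wants to try substitution rules against a list of URLs without a Nutch checkout, here is a minimal standalone sketch. It is not the NUTCH-279 patch and does not use any Nutch classes. It assumes rules are plain java.util.regex pattern/substitution pairs applied in order; the class name NormalizerRuleCheck and the two example rules are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: reads URLs from stdin and prints
// each one next to the result of applying a list of regex substitutions.
public class NormalizerRuleCheck {

    // One pattern/substitution pair.
    static final class Rule {
        final Pattern pattern;
        final String substitution;

        Rule(String regex, String substitution) {
            this.pattern = Pattern.compile(regex);
            this.substitution = substitution;
        }
    }

    // Apply every rule in order (assumption: order matters, and each rule
    // sees the output of the previous one).
    static String normalize(String url, List<Rule> rules) {
        String result = url;
        for (Rule rule : rules) {
            result = rule.pattern.matcher(result).replaceAll(rule.substitution);
        }
        return result;
    }

    // Example rules, invented for this sketch. Replace with the rules under test.
    static List<Rule> exampleRules() {
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule(";jsessionid=[0-9A-Za-z]+", "")); // drop a session id
        rules.add(new Rule("/index\\.html$", "/"));          // trim a trailing index page
        return rules;
    }

    public static void main(String[] args) throws IOException {
        List<Rule> rules = exampleRules();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                String url = line.trim();
                if (url.isEmpty()) {
                    continue; // skip blank lines
                }
                System.out.println(url + " -> " + normalize(url, rules));
            }
        }
    }
}
```

Compile it with javac and pipe in a file with one URL per line (java NormalizerRuleCheck < urls.txt). URLs that should not match come out unchanged, so a quick diff of both columns shows which rules fired.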
