All.

The problem is actually a bit different. I was in a hurry when I posted the previous message, apologies.
I added both urlfilter-regex.jar and nutch-0.7.1.jar to my classpath.

When I run

    java org.apache.nutch.net.RegexURLFilter

I get:

    051005 221040 parsing jar:file:/C:/Personal/vvdb/Nutch/nutch-0.7.1/nutch-0.7.1.jar!/nutch-default.xml
    051005 221040 parsing jar:file:/C:/Personal/vvdb/Nutch/nutch-0.7.1/nutch-0.7.1.jar!/nutch-site.xml
    051005 221040 Plugins: directory not found: plugins
    Exception in thread "main" java.lang.ExceptionInInitializerError
    Caused by: java.lang.NullPointerException
            at org.apache.nutch.net.RegexURLFilter.<clinit>(RegexURLFilter.java:64)

When I run

    nutch org.apache.nutch.net.RegexURLFilter

I get:

    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/nutch/net/RegexURLFilter

I know I am missing something obvious, but your help is really appreciated.

Kind regards,
Thomas Delnoij

On 10/5/05, Thomas Delnoij <[EMAIL PROTECTED]> wrote:
> All.
>
> I want to run the RegexURLFilter's main() method to test my
> regex-urlfilter.txt.
>
> I set up NUTCH_HOME and NUTCH_CONF_DIR, so I think my environment is
> configured correctly.
>
> When I run nutch org.apache.nutch.net.RegexURLFilter I get
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/nutch/net/RegexURLFilter.
>
> Assuming this was a classpath issue, I added
> NUTCH_HOME/plugins/urlfilter-regex/urlfilter-regex.jar to my classpath.
>
> This did not solve the problem, as I am still getting the
> NoClassDefFoundError.
>
> So my first question is how to set up my environment correctly for
> testing the regex-urlfilter.
>
> Secondly, I want to tune my regex-urlfilter for maximum relevancy of
> the crawl result. By now, I have around 50 entries. My second question
> is whether I can expect any performance impact.
>
> Your help is greatly appreciated.
>
> Kind regards,
> Thomas Delnoij.
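
Until the classpath problem is sorted out, the rules themselves can be checked outside Nutch with plain java.util.regex. The following is only a minimal sketch and does not use Nutch's own RegexURLFilter. It assumes the usual regex-urlfilter.txt conventions: one rule per line, a leading '+' to accept or '-' to reject, '#' for comments, the first matching rule decides, and a URL that matches no rule is rejected. The class name UrlRuleCheck, the example.org rules, and the test URLs are all made up for illustration. The timing loop gives a rough feel for the cost of scanning a rule list per URL; no particular figure is implied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Standalone rule checker; hypothetical, not part of Nutch. */
public class UrlRuleCheck {

    /** One rule: accept ('+') or reject ('-') plus a compiled pattern. */
    static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    /** Parses rule lines, skipping blank lines and '#' comments. */
    static List<Rule> parse(List<String> lines) {
        List<Rule> rules = new ArrayList<Rule>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            char sign = trimmed.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + line);
            }
            // Patterns are compiled once here, not once per URL.
            rules.add(new Rule(sign == '+', Pattern.compile(trimmed.substring(1))));
        }
        return rules;
    }

    /** Assumed semantics: first matching rule decides; no match means reject. */
    static boolean accepts(List<Rule> rules, String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Illustrative rules only; substitute the lines of your own file.
        List<Rule> rules = parse(Arrays.asList(
                "# skip images",
                "-\\.(gif|jpg|png)$",
                "+^http://([a-z0-9]*\\.)*example\\.org/"));

        String[] urls = {
            "http://www.example.org/index.html",
            "http://www.example.org/logo.gif",
            "http://other.net/"
        };
        for (String url : urls) {
            System.out.println((accepts(rules, url) ? "+" : "-") + url);
        }

        // Rough timing of repeated checks against the whole rule list.
        int rounds = 100000;
        long start = System.nanoTime();
        int accepted = 0;
        for (int i = 0; i < rounds; i++) {
            if (accepts(rules, urls[i % urls.length])) {
                accepted++;
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        System.out.println(rounds + " checks, " + accepted + " accepted, " + elapsedMs + " ms");
    }
}
```

Because the patterns are compiled once and each URL is scanned against the list in order, the per-URL cost in this sketch grows roughly linearly with the number of rules evaluated before the first match, so putting the most frequently matching rules first keeps the scan short.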
