All.

The problem is actually a bit different. I was in a hurry when I posted
the previous message; apologies.

I added both urlfilter-regex.jar and nutch-0.7.1.jar to my classpath.

When I run java org.apache.nutch.net.RegexURLFilter, I get:

051005 221040 parsing jar:file:/C:/Personal/vvdb/Nutch/nutch-0.7.1/nutch-0.7.1.jar!/nutch-default.xml
051005 221040 parsing jar:file:/C:/Personal/vvdb/Nutch/nutch-0.7.1/nutch-0.7.1.jar!/nutch-site.xml
051005 221040 Plugins: directory not found: plugins
Exception in thread "main" java.lang.ExceptionInInitializerError
Caused by: java.lang.NullPointerException
at org.apache.nutch.net.RegexURLFilter.<clinit>(RegexURLFilter.java:64)

When I run nutch org.apache.nutch.net.RegexURLFilter, I get:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/nutch/net/RegexURLFilter
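
The two failures above look different in kind: in the first run the class is found but its static initializer (<clinit>) fails, while in the second the class is not found at all. To tell these cases apart, something like the following might help. It is only a rough sketch: ClasspathCheck and describe() are names made up for illustration and are not part of Nutch. It asks the JVM to load a class without running its static initializer and reports which jar or directory it came from.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Nutch): reports whether a class is
// visible on the current classpath and where it was loaded from.
public class ClasspathCheck {

    static String describe(String className) {
        try {
            // initialize=false loads the class but does NOT run its static
            // initializer, so a failing <clinit> cannot mask the lookup result.
            Class<?> c = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader have no code source.
            String origin = (src == null) ? "the JDK itself"
                    : String.valueOf(src.getLocation());
            return "found: " + className + " from " + origin;
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing: " + className + " (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // Print the classpath the JVM actually received.
        System.out.println("java.class.path = "
                + System.getProperty("java.class.path"));
        String name = (args.length > 0) ? args[0]
                : "org.apache.nutch.net.RegexURLFilter";
        System.out.println(describe(name));
    }
}
```

Running this with the same classpath as the failing java command should show whether RegexURLFilter resolves and from which jar. If it reports "found", the remaining problem is presumably in the static initializer rather than in the classpath.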

I am probably missing something obvious, so any help is much appreciated.

Kind regards, Thomas Delnoij


On 10/5/05, Thomas Delnoij <[EMAIL PROTECTED]> wrote:
>
> I was a bit in a hurry when I posted this message, apologies.
>
> The problem is actually a bit different.
>
> I added both urlfilter-regex.jar and nutch-0.7.1.jar to my classpath.
>
> When I run java org.apache.nutch.net.RegexURLFilter,
>
> On 10/5/05, Thomas Delnoij < [EMAIL PROTECTED]> wrote:
> >
> > All.
> >
> > I want to run the RegexURLFilter's main() method for testing the
> > regex-urlfilter.txt.
> >
> > I set up NUTCH_HOME and NUTCH_CONF_DIR so I think I set up my
> > environment correctly.
> >
> > When I run nutch org.apache.nutch.net.RegexURLFilter I get Exception in
> > thread "main" java.lang.NoClassDefFoundError:
> > org/apache/nutch/net/RegexURLFilter.
> >
> > Assuming this was a classpath issue, I added
> > NUTCH_HOME/plugins/urlfilter-regex/urlfilter-regex.jar to my classpath.
> >
> > This did not solve the problem, as I am still getting the
> > NoClassDefFoundError.
> >
> > So my first question is how to set up my environment correctly for
> > testing the regex-urlfilter.
> >
> > Secondly, I want to tune my regex-urlfilter for maximum relevancy of the
> > crawl result. I now have around 50 entries. My second question is whether
> > I should expect any performance impact.
> >
> > Your help is greatly appreciated.
> >
> > Kind regards, Thomas Delnoij.
> >
