Hi,

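You might want to package your parser as a plugin so that it ships with all
the jars it needs. The NoClassDefFoundError for org/jsoup/Jsoup means the
jsoup jar is not visible to the class loader at parse time. This post

http://blog.knoldus.com/2012/03/14/intercepting-nutch-crawl-flow-with-a-scala-plugin/

should be quite close to what you need.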
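
If it helps while you are repackaging, here is a minimal sketch for checking
whether a class is visible to a given class loader. The class name
ClasspathCheck and the method isLoadable are made up for illustration and are
not part of Nutch or jsoup. You could call the helper from your parser code
and pass the class loader that loaded your plugin class.

```java
public class ClasspathCheck {

    /** Returns true if the named class can be located through the given loader. */
    public static boolean isLoadable(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its dependencies) is not visible to this loader
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class from the stack trace; pass another name as args[0]
        String name = args.length > 0 ? args[0] : "org.jsoup.Jsoup";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println(name + " loadable: " + isLoadable(name, loader));
    }
}
```

If this reports false for org.jsoup.Jsoup inside the parser, the jar is most
likely still missing from what the plugin loads at runtime.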

Regards | Vikas

On Mon, May 28, 2012 at 5:24 PM, blunderboy <[email protected]> wrote:

> I am receiving the following error :
>
>
> java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError:
> org/jsoup/Jsoup
>        at
> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
>        at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>        at org.apache.nutch.parse.ParseUtil.runParser(ParseUtil.java:158)
>        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:87)
>        at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:89)
>        at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:43)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>        at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: java.lang.NoClassDefFoundError: org/jsoup/Jsoup
>        at
> org.apache.nutch.parse.html.JsoupParser.fetchNodes(JsoupParser.java:53)
>        at
> org.apache.nutch.parse.html.HtmlParser.getParse(HtmlParser.java:120)
>        at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:35)
>        at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:24)
>        at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>        at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.ClassNotFoundException: org.jsoup.Jsoup
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>
> I printed the stack trace in the runParser method of ParseUtil.java for
> debugging purposes only.
> Any help will be appreciated.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Add-Third-party-dependency-to-your-nutch-plugin-tp3986387p3986388.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
