Hi,

You might want to package your parser as a plugin so that it has all the jars that it needs. See this post:
http://blog.knoldus.com/2012/03/14/intercepting-nutch-crawl-flow-with-a-scala-plugin/
It should be quite close to what you need.

Regards,
Vikas

On Mon, May 28, 2012 at 5:24 PM, blunderboy <[email protected]> wrote:
> I am receiving the following error:
>
> java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: org/jsoup/Jsoup
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>     at org.apache.nutch.parse.ParseUtil.runParser(ParseUtil.java:158)
>     at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:87)
>     at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:89)
>     at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:43)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: java.lang.NoClassDefFoundError: org/jsoup/Jsoup
>     at org.apache.nutch.parse.html.JsoupParser.fetchNodes(JsoupParser.java:53)
>     at org.apache.nutch.parse.html.HtmlParser.getParse(HtmlParser.java:120)
>     at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:35)
>     at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:24)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.ClassNotFoundException: org.jsoup.Jsoup
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>
> I have printed the stack trace in the runParser method of ParseUtil.java for
> debugging purposes only.
> Any help will be appreciated.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Add-Third-party-dependency-to-your-nutch-plugin-tp3986387p3986388.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
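
The `ClassNotFoundException` in the trace above means the class loader running the parser cannot see `org.jsoup.Jsoup`. As a rough way to confirm that from inside your own code, something like the following sketch may help. The class name `ClassVisibilityCheck` and the helper `isVisible` are made up for illustration and are not part of Nutch or jsoup; only standard JDK calls are used.

```java
// Hypothetical diagnostic helper (not part of Nutch): reports whether a
// class can be found by a given class loader, without initializing it.
public class ClassVisibilityCheck {

    // Returns true if 'className' can be loaded through 'loader'.
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on this loader's classpath, or present but not linkable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the context class loader is the one of interest here;
        // inside a plugin you may want to pass getClass().getClassLoader() instead.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println("org.jsoup.Jsoup visible: "
                + isVisible("org.jsoup.Jsoup", loader));
    }
}
```

If this reports `false` when called from the parser, the jsoup jar is most likely not among the jars shipped with the plugin that contains the parser class.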

