On Wednesday 08 February 2012 18:27:32 Ken Krugler wrote:
> On Feb 8, 2012, at 5:28am, Markus Jelsma wrote:
> > On Wednesday 08 February 2012 14:22:36 Julien Nioche wrote:
> >> Sorry, I don't understand what your issue is. We have a dependency on
> >> tika-parsers and the actual parser implementations (listed in tika
> >> parsers' POM) are pulled transitively just like any other dependency
> >> managed by Ivy. They end up being copied in 
> >> runtime/local/plugins/parse-tika/ or put in the job in runtime/deploy/
> > 
> > My problem is that I am working on some code for Tika-parsers
> > 1.1-SNAPSHOT that I need to use in Nutch. However, when I build
> > tika-parsers and put it in Nutch's lib directory I still seem to be
> > missing dependencies. Then trouble begins:
> 
> I don't know anything about how Nutch handles jars in its lib directory,
> but this sounds like you have a "raw" jar (tika-parsers) without its
> pom.xml.
> 
> So then Ivy (or Maven) doesn't know about the transitive dependencies on
> other jars, which are needed to implement the actual parsing support.

You're right, that's exactly what happened. However, I wasn't completely aware
of it. Thanks.
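
For anyone hitting the same trace: a minimal sketch of a check that reports
which parser classes cannot be loaded and initialized on the current
classpath. It is only an illustration, not how Nutch or Tika do it. The class
name ClassLoadCheck and the method findUnloadable are hypothetical, and the
class names to test are whatever you pass on the command line (for example
org.apache.tika.parser.dwg.DWGParser from the trace below).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Nutch or Tika.
class ClassLoadCheck {

    /** Returns the names of the classes that cannot be loaded and initialized. */
    static List<String> findUnloadable(List<String> classNames) {
        List<String> failed = new ArrayList<>();
        ClassLoader loader = ClassLoadCheck.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=true runs static initializers, which is where a
                // missing transitive dependency usually shows up.
                Class.forName(name, true, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                // LinkageError covers NoClassDefFoundError and
                // ExceptionInInitializerError.
                failed.add(name);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        for (String name : findUnloadable(Arrays.asList(args))) {
            System.out.println("cannot load: " + name);
        }
    }
}
```

Run it with the same classpath as the failing job and pass the parser class
names as arguments. Any name it prints points at a jar that is missing or
incomplete.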

> 
> -- Ken
> 
> > Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.tika.parser.dwg.DWGParser
> >        at java.lang.Class.forName0(Native Method)
> >        at java.lang.Class.forName(Class.java:247)
> >        at sun.misc.Service$LazyIterator.next(Service.java:271)
> >        at org.apache.nutch.parse.tika.TikaConfig.<init>(TikaConfig.java:149)
> >        at org.apache.nutch.parse.tika.TikaConfig.getDefaultConfig(TikaConfig.java:211)
> >        at org.apache.nutch.parse.tika.TikaParser.setConf(TikaParser.java:254)
> >        at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:162)
> >        at org.apache.nutch.parse.ParserFactory.getParsers(ParserFactory.java:132)
> >        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:71)
> >        at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:101)
> >        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >        at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:138)
> > 
> > Nick told me to remove DWG from the org.apache.tika.parsers.Parsers
> > config file, which I did. But then other dependency issues come and go.
> > The more parsers I remove from the config file the better it goes, but
> > then Tika won't build anymore because of failing tests.
> > 
> > I asked this on the Nutch list because I wasn't sure anymore how Nutch
> > deals with its own deps, which you explained well.
> > 
> > I'll give up for now :)
> > 
> >> On 8 February 2012 13:03, Markus Jelsma <[email protected]> wrote:
> >>> Yes, it looks like it! It should also be upgraded to Tika 1.0. But
> >>> that's something else.
> >>> 
> >>> dependencies, dependencies, dependencies.... :(
> >>> 
> >>> On Wednesday 08 February 2012 14:04:26 Julien Nioche wrote:
> >>>> The dependencies for the plugins are defined locally as shown in the
> >>>> URL below, where you can see the ref to tika-parsers for parse-tika.
> >>>> Is that clearer for you, Markus?
> >>>> 
> >>>> On 8 February 2012 12:58, Lewis John Mcgibbney <[email protected]> wrote:
> >>>>> Hi Markus,
> >>>>> 
> >>>>> For starters
> >>>>> http://svn.apache.org/viewvc/nutch/trunk/src/plugin/parse-tika/ivy.xml?view=markup
> >>>>> 
> >>>>> Can we pick our way through this?
> >>>>> 
> >>>>> Thanks
> >>>>> 
> >>>>> 
> >>>>> On Wed, Feb 8, 2012 at 12:50 PM, Markus Jelsma <[email protected]> wrote:
> >>>>>> Hi,
> >>>>>> 
> >>>>>> Can anyone shed light on this? We don't have any parsers in our libs
> >>>>>> dir and we don't have the tika-parsers jar, only the tika-core jar.
> >>>>>> Where are the parsers and how does this all work?
> >>>>>> 
> >>>>>> I've posted a question (same subject) on the Tika list and Nick tells
> >>>>>> me there must be parsers somewhere. Well, I have no idea how we do it
> >>>>>> in Nutch, do you?
> >>>>>> 
> >>>>>> Thanks
> >>>>> 
> >>>>> --
> >>>>> *Lewis*
> >>> 
> >>> --
> >>> Markus Jelsma - CTO - Openindex
> 
> --------------------------
> Ken Krugler
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Mahout & Solr

-- 
Markus Jelsma - CTO - Openindex
