On Feb 8, 2012, at 5:28am, Markus Jelsma wrote:

> 
> 
> On Wednesday 08 February 2012 14:22:36 Julien Nioche wrote:
>> sorry don't understand what your issue is. We have a dependency on
>> tika-parsers and the actual parser implementations (listed in tika parsers'
>> POM) are pulled transitively just like any other dependency managed by Ivy.
>> They end up being copied in  runtime/local/plugins/parse-tika/ or put in
>> the job in runtime/deploy/
> 
> My problem is that I am working on some code for Tika-parsers 1.1-SNAPSHOT 
> that I need to use in Nutch. However, when I build tika-parsers and put it in 
> Nutch's lib directory, I still seem to be missing dependencies. Then the 
> trouble begins:

I don't know anything about how Nutch handles jars in its lib directory, but 
this sounds like you have a "raw" jar (tika-parsers) without its pom.xml.

So then Ivy (or Maven) doesn't know about the transitive dependencies on other 
jars, which are needed to implement the actual parsing support.
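
One way to see which transitive jars are missing is to try loading the parser 
classes directly and print whatever fails. The following is only a minimal 
sketch: the class name ClassProbe and the method findUnloadable are made up for 
this example and are not part of Nutch or Tika. The only name taken from the 
trace below is org.apache.tika.parser.dwg.DWGParser. Run it with the same 
classpath that Nutch uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Nutch or Tika.
public class ClassProbe {

    /** Returns the given class names that cannot be loaded and initialized. */
    public static List<String> findUnloadable(String... classNames) {
        List<String> failed = new ArrayList<>();
        for (String name : classNames) {
            try {
                // Loads the class and runs its static initializers.
                Class.forName(name);
            } catch (ClassNotFoundException | LinkageError e) {
                // NoClassDefFoundError and ExceptionInInitializerError are
                // both LinkageErrors; the message or cause usually names the
                // class that is actually missing from the classpath.
                System.err.println(name + " -> " + e);
                if (e.getCause() != null) {
                    System.err.println("  caused by: " + e.getCause());
                }
                failed.add(name);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Default to the parser from the stack trace when no names are given.
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.tika.parser.dwg.DWGParser"};
        System.out.println("Could not load: " + findUnloadable(names));
    }
}
```

A "Could not initialize class" message generally means an earlier attempt to 
initialize that class already failed, so the first error printed for a class is 
usually the one that points at the missing jar.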

-- Ken

> 
> Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.tika.parser.dwg.DWGParser
>        at java.lang.Class.forName0(Native Method)
>        at java.lang.Class.forName(Class.java:247)
>        at sun.misc.Service$LazyIterator.next(Service.java:271)
>        at org.apache.nutch.parse.tika.TikaConfig.<init>(TikaConfig.java:149)
>        at org.apache.nutch.parse.tika.TikaConfig.getDefaultConfig(TikaConfig.java:211)
>        at org.apache.nutch.parse.tika.TikaParser.setConf(TikaParser.java:254)
>        at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:162)
>        at org.apache.nutch.parse.ParserFactory.getParsers(ParserFactory.java:132)
>        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:71)
>        at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:101)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>        at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:138)
> 
> Nick told me to remove DWG from the org.apache.tika.parsers.Parsers config 
> file, which I did. But then other dependency issues come and go. The more 
> parsers I remove from the config file the better it goes, but then Tika won't 
> build anymore because of failing tests.
> 
> I asked this on the Nutch list because I wasn't sure anymore how Nutch deals 
> with its own deps, which you explained well.
> 
> I'll give up for now :)
> 
> 
> 
>> 
>> On 8 February 2012 13:03, Markus Jelsma <markus.jel...@openindex.io> wrote:
>>> Yes, it looks like it! It should also be upgraded to Tika 1.0. But that's
>>> something else.
>>> 
>>> dependencies, dependencies, dependencies.... :(
>>> 
>>> On Wednesday 08 February 2012 14:04:26 Julien Nioche wrote:
>>>> The dependencies for the plugins are defined locally as shown in the
>>>> URL below, where you can see the ref to tika-parsers for parse-tika.
>>>> Is that more clear for you Markus?
>>>> 
>>>> On 8 February 2012 12:58, Lewis John Mcgibbney
>>>> <lewis.mcgibb...@gmail.com> wrote:
>>>>> Hi Markus,
>>>>> 
>>>>> For starters
>>>>> http://svn.apache.org/viewvc/nutch/trunk/src/plugin/parse-tika/ivy.xml?view=markup
>>>>> 
>>>>> Can we pick our way through this?
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> 
>>>>> On Wed, Feb 8, 2012 at 12:50 PM, Markus Jelsma
>>>>> <markus.jel...@openindex.io> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> Can anyone shed light on this? We don't have any parsers in our libs dir
>>>>>> and we don't have tika-parsers jar, only the tika-core jar. Where are
>>>>>> the parsers and how does this all work?
>>>>>> 
>>>>>> I've posted a question (same subject) on the Tika list and Nick tells me
>>>>>> there must be parsers somewhere. Well, i have no idea how we do it in
>>>>>> Nutch, do you?
>>>>>> 
>>>>>> Thanks
>>>>> 
>>>>> --
>>>>> *Lewis*
>>> 
>>> --
>>> Markus Jelsma - CTO - Openindex
> 
> -- 
> Markus Jelsma - CTO - Openindex

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr



