[ 
https://issues.apache.org/jira/browse/TIKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623835#comment-14623835
 ] 

Bob Paulin commented on TIKA-1507:
----------------------------------

I unzipped the tika-parser jar and noticed that the 
org.apache.tika.parser.external package is not listed in the Import-Package 
entry of its MANIFEST.MF.  This means that OSGi will not make the 
ExternalParser class visible to the tika-parser bundle's classloader, which 
leads to the NoClassDefFoundError.  The package is exported by the tika-core 
project's MANIFEST.MF, so I can't think of a reason why the 
maven-bundle-plugin would not pick it up as an import for tika-parser.  I've 
filed a bug against the maven-bundle-plugin project to see if they have any 
thoughts: 
https://issues.apache.org/jira/browse/FELIX-4958
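
For anyone who wants to repeat the manifest check without unzipping the jar by 
hand, here is a minimal sketch in plain Java.  The class and method names 
(ImportPackageCheck, importedPackages, importsPackage) are made up for 
illustration, and the header parsing is simplified: it splits clauses on 
commas outside quotes and treats every ';'-separated part without an '=' as a 
package name, which should cover typical bnd-generated manifests but is not a 
full OSGi header parser.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of Tika or Felix.
final class ImportPackageCheck {

    /** Extracts package names from an Import-Package (or Export-Package) header value. */
    public static List<String> importedPackages(String header) {
        List<String> packages = new ArrayList<>();
        if (header == null) {
            return packages;
        }
        StringBuilder clause = new StringBuilder();
        boolean inQuotes = false;
        for (char c : header.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;
            }
            // Commas inside quotes belong to version ranges such as "[1.6,2)".
            if (c == ',' && !inQuotes) {
                addClause(clause.toString(), packages);
                clause.setLength(0);
            } else {
                clause.append(c);
            }
        }
        addClause(clause.toString(), packages);
        return packages;
    }

    /** Returns true if the header names exactly the given package. */
    public static boolean importsPackage(String header, String pkg) {
        return importedPackages(header).contains(pkg);
    }

    private static void addClause(String clause, List<String> packages) {
        for (String part : clause.split(";")) {
            String p = part.trim();
            // Parts containing '=' are attributes (version=...) or directives (resolution:=...).
            if (!p.isEmpty() && !p.contains("=")) {
                packages.add(p);
            }
        }
    }

    // Usage: java ImportPackageCheck <path-to-jar> <package-name>
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ImportPackageCheck <jar> <package>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            Manifest mf = jar.getManifest();
            String header = mf == null
                    ? null
                    : mf.getMainAttributes().getValue("Import-Package");
            System.out.println(args[1] + " imported: " + importsPackage(header, args[1]));
        }
    }
}
```

Run against the tika-parser jar with org.apache.tika.parser.external as the 
second argument, this reports whether that package appears in Import-Package.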

One possible workaround is an explicit Export-Package statement in the pom of 
the parser project.  The plugin automatically adds exported packages to the 
imports, so I found that adding the attached patch allows the proper 
classloading to take place.  However, the patch exports every package under 
org.apache.tika.parser.*, which re-exports the tika-core classes from the 
tika-parser bundle.  Other bundles that use org.apache.tika.parser.* could 
then import these packages from tika-parser instead of tika-core.  They are 
the same classes, so it's harmless but a bit odd. 

I've attached a new patch with this update to the pom.
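
The patch itself is not reproduced in this message.  As a rough sketch only, 
an explicit export of this kind would normally go in the maven-bundle-plugin 
<instructions> block; the fragment below is an assumption about the general 
shape, not the attached patch, and the real pom may carry other instructions 
that are omitted here.

```xml
<!-- Hypothetical fragment for the parser project's pom.xml; NOT the attached patch. -->
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <configuration>
    <instructions>
      <!-- Explicit export of the parser packages, as described in the comment above. -->
      <Export-Package>org.apache.tika.parser.*</Export-Package>
    </instructions>
  </configuration>
</plugin>
```

After rebuilding, the generated MANIFEST.MF can be inspected again to confirm 
that org.apache.tika.parser.external now shows up under Import-Package.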

> Under OSGi, ForkParser fails to send core parser classes like ExternalParser
> ----------------------------------------------------------------------------
>
>                 Key: TIKA-1507
>                 URL: https://issues.apache.org/jira/browse/TIKA-1507
>             Project: Tika
>          Issue Type: Bug
>          Components: packaging, parser
>    Affects Versions: 1.6, 1.7
>            Reporter: Nick Burch
>
> Under OSGi, if you try to use ForkParser with the Tesseract OCR parser, it 
> will fail with:
> java.lang.NoClassDefFoundError: org/apache/tika/parser/external/ExternalParser
>       at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
>       at org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:91)
>       at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
>       at org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
>       at org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:622)
>       at org.apache.tika.fork.ForkServer.call(ForkServer.java:144)
>       at org.apache.tika.fork.ForkServer.processRequests(ForkServer.java:124)
>       at org.apache.tika.fork.ForkServer.main(ForkServer.java:69)
> Caused by: java.lang.ClassNotFoundException: Unable to find class org.apache.tika.parser.external.ExternalParser
>       at org.apache.tika.fork.ClassLoaderProxy.findClass(ClassLoaderProxy.java:117)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
>       ... 13 more
> ExternalParser lives in the Tika Core jar, not the Tika Parsers one. This all 
> works fine outside of OSGi, so it looks like something about the OSGi 
> bundling is causing the fork parser to fail to send the parser-related 
> classes from Tika Core over to the forked JVM.



