[ https://issues.apache.org/jira/browse/TIKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623835#comment-14623835 ]

Bob Paulin commented on TIKA-1507:
----------------------------------
I unzipped the tika-parser jar and noticed that the
org.apache.tika.parser.external package is not listed in the Import-Package
entry of its MANIFEST.MF. This means that OSGi will not make the
ExternalParser class visible to the tika-parser classloader, which leads to
the NoClassDefFoundError. The package is exported by the tika-core project's
MANIFEST.MF, so I can't think of a reason why the maven-bundle-plugin would
not pick it up as an import for tika-parser. I've filed a bug against the
maven-bundle-plugin project to see if they have any thoughts:
https://issues.apache.org/jira/browse/FELIX-4958
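
The manifest check described above can also be done without unzipping the jar
by hand. What follows is only a minimal sketch and not part of the attached
patch: the class name ImportPackageCheck and its helper methods are made up
for illustration, and the header parsing is simplified (it splits clauses on
commas outside double quotes and ignores anything containing "=").

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper: reports whether a bundle's Import-Package header
// names a given package.
public class ImportPackageCheck {

    // Split an OSGi header into clauses on commas that are not inside
    // double quotes, so version ranges such as "[1.6,2)" stay intact.
    static List<String> splitClauses(String header) {
        List<String> clauses = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : header.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;
            }
            if (c == ',' && !inQuotes) {
                clauses.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            clauses.add(current.toString().trim());
        }
        return clauses;
    }

    // True if the header value lists packageName exactly. Attributes and
    // directives (parts containing '=') are skipped.
    static boolean headerNamesPackage(String header, String packageName) {
        if (header == null) {
            return false;
        }
        for (String clause : splitClauses(header)) {
            for (String part : clause.split(";")) {
                String name = part.trim();
                if (!name.contains("=") && name.equals(packageName)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ImportPackageCheck <bundle.jar> <package>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest(); // null if the jar has none
            String header = manifest == null
                    ? null
                    : manifest.getMainAttributes().getValue("Import-Package");
            boolean imported = headerNamesPackage(header, args[1]);
            System.out.println(args[1] + (imported ? " is imported" : " is NOT imported"));
        }
    }
}
```

Run against the tika-parser jar with org.apache.tika.parser.external as the
second argument, this should report whether the bundle declares the import.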

One possible workaround is an explicit Export-Package statement in the pom of
the parser project. The plugin automatically adds exported packages as
imports, and I found that the attached patch allows the proper classloading
to take place. However, I have set it to export everything under
org.apache.tika.parser.*, which re-exports the classes from the tika-core
bundle under the tika-parser bundle. This could cause other bundles that use
org.apache.tika.parser.* to import these packages from tika-parser instead
of tika-core. They are the same classes, so it's harmless but a bit odd.

I've attached a new patch with this update to the pom.

> Under OSGi, ForkParser fails to send core parser classes like ExternalParser
> ----------------------------------------------------------------------------
>
> Key: TIKA-1507
> URL: https://issues.apache.org/jira/browse/TIKA-1507
> Project: Tika
> Issue Type: Bug
> Components: packaging, parser
> Affects Versions: 1.6, 1.7
> Reporter: Nick Burch
>
> Under OSGi, if you try to use ForkParser with the Tesseract OCR parser, it
> will fail with:
> java.lang.NoClassDefFoundError: org/apache/tika/parser/external/ExternalParser
> at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
> at org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:91)
> at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
> at org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
> at org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:622)
> at org.apache.tika.fork.ForkServer.call(ForkServer.java:144)
> at org.apache.tika.fork.ForkServer.processRequests(ForkServer.java:124)
> at org.apache.tika.fork.ForkServer.main(ForkServer.java:69)
> Caused by: java.lang.ClassNotFoundException: Unable to find class org.apache.tika.parser.external.ExternalParser
> at org.apache.tika.fork.ClassLoaderProxy.findClass(ClassLoaderProxy.java:117)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
> ... 13 more
> ExternalParser lives in the Tika Core jar, not the Tika Parsers one. This all
> works fine outside of OSGi, so it looks like something about the OSGi
> bundling is causing the fork parser to fail to send the parser-related
> classes from Tika Core over to the forked JVM.