Greg Holmberg created UIMA-2472:
-----------------------------------
Summary: TikaAnnotator can't find XML parser when used in a PEAR
file with Java 1.5 or later
Key: UIMA-2472
URL: https://issues.apache.org/jira/browse/UIMA-2472
Project: UIMA
Issue Type: Bug
Components: addons
Affects Versions: 2.3.1Addons
Environment: Java 1.5 and later
Reporter: Greg Holmberg
Priority: Critical
When TikaAnnotator is packaged in a PEAR file, calling
UIMAFramework.produceAnalysisEngine() fails at the point where Tika asks the
system for an XML parser. The exception is:
javax.xml.parsers.FactoryConfigurationError: Provider for
javax.xml.parsers.DocumentBuilderFactory cannot be found
This happens because the XML parser is now built into Java, but the UIMA
classloader (used with PEAR files) finds the XML parser classes in xml-apis.jar
first, and those are older and incompatible with the current XML interfaces.
xml-apis.jar ends up in the PEAR because it is one of the transitive Maven
dependencies of Tika 0.7.
See this issue for more information:
https://issues.apache.org/jira/browse/TIKA-412
This was fixed in Tika 0.8.
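
To confirm which copy of the XML classes a given classloader actually picks up,
a small diagnostic along the following lines may help. This is only a sketch:
the class name XmlParserDiagnostics and the helper originOf are hypothetical
and not part of UIMA or Tika, and to be meaningful the check would have to run
under the PEAR classloader (for example from inside an annotator).

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical diagnostic helper (not part of UIMA or Tika).
public class XmlParserDiagnostics {

    // Returns the jar or directory a class was loaded from, or a marker
    // string when the class has no code source (typical for JDK classes
    // loaded by the bootstrap classloader).
    public static String originOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "bootstrap (JDK)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Where does the JAXP API class itself come from?
        System.out.println("DocumentBuilderFactory API loaded from: "
                + originOf(DocumentBuilderFactory.class));

        // Which provider does the lookup select, and where does it live?
        // In the failing setup this call is expected to throw
        // FactoryConfigurationError instead of returning.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println("Provider class: " + factory.getClass().getName());
        System.out.println("Provider loaded from: "
                + originOf(factory.getClass()));
    }
}
```

If the first line printed points at xml-apis.jar rather than the JDK, the
classloader is resolving the API classes from the PEAR and not from Java
itself, which matches the situation described above.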
A workaround for UIMA users who want to use TikaAnnotator in PEAR files with
Java 1.6 is to keep xml-apis out of the PEAR file by excluding it from the
TikaAnnotator dependency in the Maven POM:
  <dependency>
    <groupId>org.apache.uima</groupId>
    <artifactId>TikaAnnotator</artifactId>
    <exclusions>
      <exclusion>
        <groupId>xml-apis</groupId>
        <artifactId>xml-apis</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
However, a better fix would be to update the version of Tika used in
TikaAnnotator to 0.8 or later, which includes the fix for TIKA-412.