[ 
https://issues.apache.org/jira/browse/CONNECTORS-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148866#comment-16148866
 ] 

Karl Wright commented on CONNECTORS-1450:
-----------------------------------------

The classes it can't find are present in poi-ooxml-schemas-3.15.jar, which is 
distributed in connector-common-lib:

{code}
...
com/
com/microsoft/
com/microsoft/schemas/
com/microsoft/schemas/office/
com/microsoft/schemas/office/excel/
com/microsoft/schemas/office/excel/impl/
com/microsoft/schemas/office/office/
com/microsoft/schemas/office/office/impl/
com/microsoft/schemas/office/visio/
com/microsoft/schemas/office/visio/x2012/
com/microsoft/schemas/office/visio/x2012/main/
com/microsoft/schemas/office/visio/x2012/main/impl/
com/microsoft/schemas/office/x2006/
com/microsoft/schemas/office/x2006/digsig/
com/microsoft/schemas/office/x2006/digsig/impl/
com/microsoft/schemas/office/x2006/encryption/
com/microsoft/schemas/office/x2006/encryption/impl/
com/microsoft/schemas/office/x2006/keyEncryptor/
com/microsoft/schemas/office/x2006/keyEncryptor/certificate/
com/microsoft/schemas/office/x2006/keyEncryptor/certificate/impl/
com/microsoft/schemas/office/x2006/keyEncryptor/password/
com/microsoft/schemas/office/x2006/keyEncryptor/password/impl/
com/microsoft/schemas/vml/
com/microsoft/schemas/vml/impl/
...
{code}
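
One way to confirm that a specific class is packaged in the jar is to look up its entry by name. The following is a minimal sketch and not part of ManifoldCF: the names `JarClassCheck` and `containsClass` are hypothetical, and the caller supplies the jar path and the class name from the stack trace.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarClassCheck {
    /** Returns true if the jar has an entry for the given fully qualified class name. */
    static boolean containsClass(String jarPath, String className) throws IOException {
        // A class a.b.C is stored in a jar as the entry a/b/C.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: JarClassCheck <path-to-jar> <fully.qualified.ClassName>");
            return;
        }
        System.out.println(containsClass(args[0], args[1]));
    }
}
```

A `true` result only shows that the class file is in the jar. It does not show that the class loader in use at parse time can see that jar.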

This jar is a required dependency of poi-ooxml.jar:

{code}
[INFO]       +- org.apache.poi:poi-ooxml:jar:3.9:test
[INFO]       |  +- org.apache.poi:poi-ooxml-schemas:jar:3.9:test
[INFO]       |  |  \- org.apache.xmlbeans:xmlbeans:jar:2.3.0:test
[INFO]       |  \- dom4j:dom4j:jar:1.6.1:test
{code}

Since the classes are present but apparently cannot be found, I have to 
assume that the ooxml jar loads classes in a non-standard way that is 
incompatible with ManifoldCF's class loader setup.
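
The general mechanism behind this kind of failure can be sketched as follows. This is an illustration of standard Java class loader delegation only, not ManifoldCF's actual loader code, and it does not establish what POI or XMLBeans really do. The class `LoaderVisibility` and the file name are hypothetical, and a resource lookup stands in for a class lookup because both follow the same parent-first delegation in `URLClassLoader`.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class LoaderVisibility {
    /** Returns {visibleFromParent, visibleFromChild} for a resource that exists only on the child's path. */
    static boolean[] probe() throws IOException {
        // Stand-in for a lib directory that only the child loader knows about
        Path childDir = Files.createTempDirectory("child-lib");
        String name = "only-in-child-loader-sketch.txt";
        Files.write(childDir.resolve(name), new byte[] {1});

        ClassLoader parent = ClassLoader.getSystemClassLoader();
        try (URLClassLoader child =
                 new URLClassLoader(new URL[] {childDir.toUri().toURL()}, parent)) {
            // The child delegates to the parent first, then searches its own URLs.
            // The parent never consults the child.
            boolean fromParent = parent.getResource(name) != null;
            boolean fromChild = child.getResource(name) != null;
            return new boolean[] {fromParent, fromChild};
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = probe();
        System.out.println("visible from parent: " + result[0]);
        System.out.println("visible from child:  " + result[1]);
    }
}
```

If library code resolves classes through a loader that sits above the one holding the jar, the lookup fails even though the jar is present, which is consistent with the symptom described here.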

The solution has to be to move these jars (both poi-ooxml-schemas and xmlbeans) 
"up a level" to the core classpath.  The workaround is to use the Tika external 
service instead.

This is a significant enough problem that I think we should consider a point 
release to address it.


> Class not found stack trace coming from Tika parsing when visio file found
> --------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1450
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1450
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Tika extractor
>    Affects Versions: ManifoldCF 2.8
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>
> The Tika Extractor runs into problems with Visio files.  A stack trace shows 
> that the issue is a class that cannot be loaded, which is apparently a 
> dependency of Apache POI.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
