[
https://issues.apache.org/jira/browse/TIKA-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134638#comment-17134638
]
Tilman Hausherr commented on TIKA-3114:
---------------------------------------
[~dbalasub] Your stack trace does not contain any frames from Tika; the last
one shown is
"com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse". This must
have been called from Tika somewhere, but the calling frames are missing.
Alternatively, open your file with tika-app and see what happens, with
something like "java -jar tika-app.jar yourfile.pdf".
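
For illustration only, here is a minimal sketch showing that the JDK's built-in XML parser raises this kind of SAXParseException on its own, which is why the frames above DocumentBuilderImpl.parse are needed to see who called it. The class name UnclosedDivDemo, the helper parseAndDescribe, and the sample input are hypothetical and are not taken from Tika or from the reporter's code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

public class UnclosedDivDemo {
    // Hypothetical input: a <div> that is never closed before </html>.
    static final String BROKEN_XML = "<html>\n<div>\n</html>";

    // Parses the given XML text with the JDK's default DocumentBuilder.
    // Returns a description of the parse error, or null if parsing succeeded.
    static String parseAndDescribe(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            // The exception carries the position that also appears in the log.
            return "line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseAndDescribe(BROKEN_XML));
    }
}
```

Calling e.printStackTrace() in the catch block instead would print every frame, including the caller of DocumentBuilder.parse, which is the part missing from the report.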
> Error reading transcript from document
> --------------------------------------
>
> Key: TIKA-3114
> URL: https://issues.apache.org/jira/browse/TIKA-3114
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.18
> Reporter: Dushyanth Balasubramanian
> Priority: Major
>
> [Fatal Error] :1547:3: The element type "div" must be terminated by the matching end-tag "</div>".
> [Fatal Error] :1547:3: The element type "div" must be terminated by the matching end-tag "</div>".
> org.xml.sax.SAXParseException; lineNumber: 1547; columnNumber: 3; The element type "div" must be terminated by the matching end-tag "</div>".
>     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
>     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)