[
https://issues.apache.org/jira/browse/TIKA-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730208#comment-17730208
]
Ravi Ranjan Jha commented on TIKA-4062:
---------------------------------------
Hi [~tallison], thanks for looking into this issue so promptly.
While the above change may help in some cases, it may not really allow custom
error handling, because SAXParserImpl sets the provided DefaultHandler (here
OfflineContentHandler) as the ErrorHandler, as shown in the code below. This
means none of the methods in ContentHandlerDecorator would throw a SAXException
the way the code expects!
{noformat}
public void parse(InputSource is, DefaultHandler dh)
        throws SAXException, IOException {
    if (is == null) {
        throw new IllegalArgumentException();
    }
    if (dh != null) {
        xmlReader.setContentHandler(dh);
        xmlReader.setEntityResolver(dh);
        xmlReader.setErrorHandler(dh);
        xmlReader.setDTDHandler(dh);
        xmlReader.setDocumentHandler(null);
    }
    xmlReader.parse(is);
}{noformat}
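To make the routing concrete, here is a minimal standalone sketch using only the JDK's SAX API (no Tika classes). The class name ErrorRoutingDemo and the sample document are hypothetical and only for illustration. It assumes the JDK's built-in parser with DTD validation switched on, so that recoverable error() callbacks are actually produced. The handler passed to SAXParser.parse() is the one that receives them, and a plain DefaultHandler silently ignores them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Tika.
class ErrorRoutingDemo {

    // Well-formed but invalid: <child/> is not allowed by the inline DTD.
    static final String INVALID_XML =
            "<!DOCTYPE root [<!ELEMENT root EMPTY>]><root><child/></root>";

    // Parses the XML with a validating SAXParser, passing the handler to parse().
    static void parse(String xml, DefaultHandler handler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // needed so validity errors reach error()
        SAXParser parser = factory.newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
    }

    // Returns the messages delivered to the handler's error() callback.
    static List<String> collectErrors(String xml) throws Exception {
        List<String> errors = new ArrayList<>();
        parse(xml, new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // DefaultHandler.error() is a no-op, so this call returns normally.
        parse(INVALID_XML, new DefaultHandler());
        System.out.println("plain DefaultHandler: parse completed");
        // The same document does report errors to a handler that overrides error().
        System.out.println("errors reported: " + !collectErrors(INVALID_XML).isEmpty());
    }
}
```

The point of the sketch is only that the error callbacks go to whatever object is handed to parse(); a handler wrapped inside it never sees them unless the outer one forwards them.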
So we would probably need to add a few more ErrorHandler methods to
ContentHandlerDecorator:
{noformat}
@Override
public void warning(SAXParseException exception)
        throws SAXException {
    if (handler instanceof ContentHandlerDecorator) {
        ((ContentHandlerDecorator) handler).warning(exception);
    } else {
        throw exception;
    }
}

@Override
public void error(SAXParseException exception)
        throws SAXException {
    if (handler instanceof ContentHandlerDecorator) {
        ((ContentHandlerDecorator) handler).error(exception);
    } else {
        throw exception;
    }
}

@Override
public void fatalError(SAXParseException exception)
        throws SAXException {
    if (handler instanceof ContentHandlerDecorator) {
        ((ContentHandlerDecorator) handler).fatalError(exception);
    } else {
        throw exception;
    }
}{noformat}
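For discussion, a self-contained variant of the same idea that can be compiled without Tika is sketched below. ForwardingErrorDecorator is a hypothetical stand-in, not Tika's ContentHandlerDecorator, and it deliberately swaps the instanceof ContentHandlerDecorator check for a check against org.xml.sax.ErrorHandler, on the assumption that any wrapped handler able to receive error callbacks should get them. It only covers the error path; content events are not forwarded here.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a decorator; NOT Tika's ContentHandlerDecorator.
class ForwardingErrorDecorator extends DefaultHandler {

    // The wrapped handler, named as in the snippet above.
    private final ContentHandler handler;

    ForwardingErrorDecorator(ContentHandler handler) {
        this.handler = handler;
    }

    @Override
    public void warning(SAXParseException exception) throws SAXException {
        if (handler instanceof ErrorHandler) {
            ((ErrorHandler) handler).warning(exception); // let the wrapped handler decide
        } else {
            throw exception; // same fallback as the proposal above
        }
    }

    @Override
    public void error(SAXParseException exception) throws SAXException {
        if (handler instanceof ErrorHandler) {
            ((ErrorHandler) handler).error(exception);
        } else {
            throw exception;
        }
    }

    @Override
    public void fatalError(SAXParseException exception) throws SAXException {
        if (handler instanceof ErrorHandler) {
            ((ErrorHandler) handler).fatalError(exception);
        } else {
            throw exception;
        }
    }
}
```

With something along these lines, a caller could wrap a DefaultHandler subclass that overrides error() and have its override invoked even though the parser only knows about the outer decorator. Whether warnings should really be rethrown in the fallback branch is an open question.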
Please share your thoughts.
> OfflineContentHandler/ContentHandlerDecorator does not provide option for
> custom error handling
> -----------------------------------------------------------------------------------------------
>
> Key: TIKA-4062
> URL: https://issues.apache.org/jira/browse/TIKA-4062
> Project: Tika
> Issue Type: Bug
> Components: tika-core
> Affects Versions: 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0, 2.8.0
> Reporter: Ravi Ranjan Jha
> Priority: Critical
>
> OfflineContentHandler/ContentHandlerDecorator does not provide option for
> custom error handling
> Prior to the change that passes an OfflineContentHandler to the SAX parser in
> XMLReaderUtils.parseSAX, one could pass a custom ContentHandlerDecorator to
> handle exceptions or override the error/warning methods. That is no longer
> possible, because the default implementation of handleException in
> OfflineContentHandler's parent, ContentHandlerDecorator, just rethrows the
> exception, as shown below:
>
> protected void handleException(SAXException exception) throws SAXException {
>     throw exception;
> }
>
> which could probably be (at minimum):
> public void handleException(SAXException exception) throws SAXException {
>     handler.handleException(exception);
> }
>
> This is breaking our app's behavior. Please treat it as a priority.