Lewis John McGibbney created ANY23-318:
------------------------------------------
Summary: ExtractionException handling in BaseRDFExtractor.java kills entire extraction
Key: ANY23-318
URL: https://issues.apache.org/jira/browse/ANY23-318
Project: Apache Any23
Issue Type: Bug
Components: core, extractors
Affects Versions: 2.1
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Blocker
Fix For: 2.2
Right now the following catch block in BaseRDFExtractor.java aborts the entire
extraction as soon as an RDFParseException is thrown. I propose that we merely
log the error and continue with the extraction.
{code}
         } catch (RDFParseException ex) {
-            throw new ExtractionException("Error while parsing RDF document.", ex, extractionResult);
+            // Pass the exception as the last argument so its stack trace is logged.
+            LOG.error("Error while parsing RDF document.", ex);
         }
     }
{code}
The parsing strictness is inherited from the underlying Semargl parsers, which
expect syntactically perfect input data. In the wild, unfortunately, that is
not realistic.
The solution is for us to log the exception and any related issues, then carry
on with the extraction.
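To illustrate the proposal, here is a minimal, self-contained sketch of the log-and-continue pattern. It is not Any23 code: LenientExtractionSketch, ParseException, Result and parseStrict are hypothetical stand-ins, and java.util.logging is used only to keep the example dependency-free. The sketch treats each input fragment as the unit that may fail; in Any23 the unit would presumably be one extractor's pass over a document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration of the proposal in this issue; not Any23 code.
class LenientExtractionSketch {

    private static final Logger LOG =
            Logger.getLogger(LenientExtractionSketch.class.getName());

    /** Stand-in for a checked parse failure such as RDFParseException. */
    static class ParseException extends Exception {
        ParseException(String message) {
            super(message);
        }
    }

    /** Collects parsed statements and non-fatal issues for one run. */
    static class Result {
        final List<String> triples = new ArrayList<>();
        final List<String> issues = new ArrayList<>();
    }

    /** Toy strict parser: rejects any statement lacking a terminating " ." */
    static String parseStrict(String statement) throws ParseException {
        String trimmed = statement.trim();
        if (!trimmed.endsWith(" .")) {
            throw new ParseException("Missing terminating '.': " + trimmed);
        }
        return trimmed.substring(0, trimmed.length() - 2);
    }

    /** Parses every fragment; a failure is logged and recorded, not rethrown. */
    static Result extract(List<String> fragments) {
        Result result = new Result();
        for (String fragment : fragments) {
            try {
                result.triples.add(parseStrict(fragment));
            } catch (ParseException ex) {
                // Log with the exception attached, remember the issue, carry on.
                LOG.log(Level.SEVERE, "Error while parsing RDF document.", ex);
                result.issues.add(ex.getMessage());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = extract(Arrays.asList(
                "<a> <b> <c> .",
                "<a> <b> <broken",
                "<d> <e> <f> ."));
        System.out.println(r.triples.size() + " triples, " + r.issues.size() + " issues");
    }
}
```

Recording the failure in the result as well as logging it means callers can still see that the input was malformed, even though the extraction itself no longer aborts.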
Patch coming up.