Lewis John McGibbney created ANY23-314:
------------------------------------------
Summary: Service fails to return extraction in case of extraction
error
Key: ANY23-314
URL: https://issues.apache.org/jira/browse/ANY23-314
Project: Apache Any23
Issue Type: Bug
Components: service
Affects Versions: 2.1
Environment: Any23 2.2-SNAPSHOT
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 2.2
Attachments: extraction.json, output.log
See the following command-line extraction:
{code}
lmcgibbn@LMC-056430 /usr/local/any23(master) $ ./cli/target/appassembler/bin/any23 rover -l output.log -o extraction.json https://www.jobcluster.de
------------------------------------------------------------------------
Apache Any23 :: rover
------------------------------------------------------------------------
0 [main] WARN org.apache.tika.parser.image.ImageParser - JBIG2ImageReader not loaded. jbig2 files will be ignored
128 [main] INFO org.apache.any23.rdf.PopularPrefixes - Loading prefixes from /org/apache/any23/prefixes/prefixes.properties
1388 [main] WARN org.apache.commons.httpclient.HttpMethodBase - Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
4790 [main] INFO org.apache.any23.extractor.SingleDocumentExtraction - Processing https://www.jobcluster.de/
[Fatal Error] :12:46: The entity name must immediately follow the '&' in the entity reference.
------------------------------------------------------------------------
Apache Any23 FAILURE
Execution terminated with errors: Error while parsing RDF document.
Total time: 5s
Finished at: Tue Dec 12 08:01:14 PST 2017
Final Memory: 31M/184M
------------------------------------------------------------------------
{code}
This produces the attached extraction result (extraction.json) and the
associated log (output.log).
If I run the same extraction using the service at
[any23.org|http://any23.org/any23/?format=json&uri=https%3A%2F%2Fwww.jobcluster.de%2F&validation-mode=none],
the (partial) extraction result should be returned regardless of whether the
entire extraction succeeded.
The service servlet seems to return the extraction exception as opposed to
the preferred (partial) extraction result. This issue will fix that.
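To illustrate the intended behaviour, here is a minimal sketch of the idea, not the actual servlet code. Every name in it (PartialExtractionSketch, Extraction, Result, extract) is hypothetical and is not part of the Any23 API. It assumes the extraction writes its output to a stream as it goes, so whatever was written before the failure can still be returned alongside the error message.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these names are Any23 classes.
final class PartialExtractionSketch {

    /** Stand-in for an extraction run that streams output and may fail midway. */
    interface Extraction {
        void run(OutputStream out) throws Exception;
    }

    /** What the service would send back: a body plus an optional error message. */
    static final class Result {
        final String body;
        final String error; // null when the extraction completed

        Result(String body, String error) {
            this.body = body;
            this.error = error;
        }
    }

    /**
     * Runs the extraction into a buffer. On failure, keeps whatever was
     * already written instead of replacing it with the exception.
     */
    static Result extract(Extraction extraction) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        String error = null;
        try {
            extraction.run(buffer);
        } catch (Exception e) {
            // Record the failure, but do not discard the buffered output.
            error = e.getMessage();
        }
        return new Result(new String(buffer.toByteArray(), StandardCharsets.UTF_8), error);
    }

    public static void main(String[] args) {
        // Simulated extraction: emits one statement, then fails while parsing.
        Result r = extract(out -> {
            out.write("<s> <p> <o> .\n".getBytes(StandardCharsets.UTF_8));
            throw new IOException("Error while parsing RDF document.");
        });
        System.out.print(r.body);
        System.out.println("error: " + r.error);
    }
}
```

In a real servlet the buffered body would be written to the response and the error reported separately (for example in a header or a trailing report); which of those the fix should use is left open here.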