Hans Brende created ANY23-415:
---------------------------------
Summary: NTriplesExtractor tries all text/plain files, causing
numerous fatal issues
Key: ANY23-415
URL: https://issues.apache.org/jira/browse/ANY23-415
Project: Apache Any23
Issue Type: Bug
Components: extractors
Affects Versions: 2.3
Reporter: Hans Brende
Fix For: 2.3
Because the NTriplesExtractorFactory declares "text/plain" as a supported
content type, every plain text file is processed by the NTriplesExtractor,
which in turn causes huge numbers of unnecessary fatal issues to be sent to
the extraction report.
In my crawls, this mostly happens with the "humans.txt" files encountered.
While this isn't a hugely serious bug, it is quite irritating because it
clutters up my logs.
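
To illustrate the kind of check that could keep ordinary text files away from
the extractor, here is a minimal sketch of a content sniffer. It is not Any23
code and not a proposed patch: the class name NTriplesSniffer and the method
looksLikeNTriples are hypothetical, and the regular expression is only a rough
approximation of an N-Triples statement, not the full grammar. It assumes the
document content is already available as a String.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Any23: guesses whether plain text is N-Triples.
public final class NTriplesSniffer {

    // Rough shape of one statement: subject (IRI or blank node), predicate (IRI),
    // some object, then a terminating dot. This is an approximation only.
    private static final Pattern STATEMENT = Pattern.compile(
            "^(<[^>\\s]*>|_:\\S+)\\s+<[^>\\s]*>\\s+\\S.*\\.\\s*$");

    private NTriplesSniffer() {
    }

    // Returns true if the first line that is neither blank nor a '#' comment
    // looks like an N-Triples statement; returns false otherwise.
    public static boolean looksLikeNTriples(String content) {
        for (String line : content.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and N-Triples comments
            }
            return STATEMENT.matcher(trimmed).matches();
        }
        return false; // nothing but blanks and comments
    }
}
```

With a check along these lines, a typical "humans.txt" file (which usually
starts with something like "/* TEAM */") would be rejected before parsing,
while a file whose first statement is a well-formed triple would still be
accepted.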