[
https://issues.apache.org/jira/browse/ANY23-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280638#comment-13280638
]
Peter Ansell commented on ANY23-98:
-----------------------------------
I started on a patch for this issue in my GitHub fork, but I could not get it
working completely for N3, and the approach looked hacky and error-prone. I
ended up switching to checking whether the format is valid using
Rio.getParser(RDFFormat) and then attempting to parse, as is done now for the
Turtle parser. I still try to extract a clean, small section from the top of
the document first, but I had to include some new definitions. [1]
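To illustrate the idea of taking a clean, small top section before handing it
to a parser, here is a minimal sketch using only the Java standard library. It
is not the code in [1]: the class name HeaderCommentStripper and the method
cleanHead are hypothetical, and it assumes comments are whole lines starting
with '#', as in Turtle, N3 and N-Quads. XML-style comments in RSS are not
handled here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Any23 or Tika.
public class HeaderCommentStripper {

    /**
     * Returns up to maxLines lines from the start of text, skipping blank
     * lines and lines whose first non-whitespace character is '#'.
     * Assumption: '#' line comments only (Turtle/N3/N-Quads style).
     */
    public static String cleanHead(String text, int maxLines) {
        StringBuilder head = new StringBuilder();
        int kept = 0;
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while (kept < maxLines && (line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                    continue; // skip header comments and blank lines
                }
                head.append(line).append('\n');
                kept++;
            }
        } catch (IOException e) {
            // Not expected for an in-memory StringReader.
            throw new UncheckedIOException(e);
        }
        return head.toString();
    }

    public static void main(String[] args) {
        String sample = "# header comment\n\n"
                + "<http://ex/s> <http://ex/p> <http://ex/o> .\n";
        System.out.print(cleanHead(sample, 5));
    }
}
```

The resulting string could then be passed to whichever parser the format
check selects; if parsing that small section fails, the candidate format is
rejected.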
I was going to raise an issue for it, but I did not yet have a solution that
avoided changing virtually everything about how TikaMIMETypeDetector works.
Also, I was able to update to Tika 1.1 successfully. I patched the
mimetypes.xml file from Tika 1.1 to include the RDF definitions that it did not
already contain. You can see the patched version at [2].
[1]
https://github.com/ansell/any23/blob/master/mime/src/main/java/org/apache/any23/mime/TikaMIMETypeDetector.java
[2]
https://github.com/ansell/any23/blob/master/mime/src/main/resources/org/apache/any23/mime/mimetypes.xml
> TikaMIMEtypeDetector doesn't recognize certain file formats when they contain
> header comments
> ---------------------------------------------------------------------------------------------
>
> Key: ANY23-98
> URL: https://issues.apache.org/jira/browse/ANY23-98
> Project: Apache Any23
> Issue Type: Improvement
> Affects Versions: 0.7.0
> Reporter: Michele Mostarda
> Fix For: 0.8.0
>
>
> Adding header comments to NQ, N3 and RSS files prevents the
> TikaMIMEtypeDetector from working properly.
> See #ANY23-97 for further details.