[ https://issues.apache.org/jira/browse/TIKA-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364064#comment-14364064 ]
Nick Burch commented on TIKA-1114:
----------------------------------
The file(1) sgml magic file seems to contain only entries for specific
SGML-based formats, such as SVG, XML sitemap, OSM, GnuCash, etc.
Is there really such a thing as a generic SGML file, though? Aren't most (or
all?) SGML-based files actually instances of a specific SGML application,
i.e. a subtype built on the SGML structure?
> sgml mime type is not detected when passed in as byte stream
> ------------------------------------------------------------
>
> Key: TIKA-1114
> URL: https://issues.apache.org/jira/browse/TIKA-1114
> Project: Tika
> Issue Type: Bug
> Components: mime
> Reporter: Vikas Garg
>
> When passing SGML files as a TikaInputStream (created from a byte[]) to
> Detector.detect(), it returns text/plain as the media type and not
> application/sgml or text/sgml. But when I provide the file name in the
> metadata, it gives me the correct MIME type, i.e., text/sgml.
> Is it because Tika is missing a designated parser for SGML files, or am I
> missing something? I am on Tika 1.3.
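
The behaviour described in the report can be illustrated with a minimal,
stdlib-only sketch. It is not Tika's API or implementation: the class
SgmlDetectionSketch, its methods, the file name report.sgml and the sample
document are all hypothetical. It assumes that no content (magic) rule exists
for generic SGML, and that only a file name rule for *.sgml / *.sgm maps to
text/sgml, which would explain why a bare byte stream comes back as text/plain.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical stand-in for a MIME detector; NOT Tika's API or implementation.
class SgmlDetectionSketch {

    // Content-only step. Assumption: there is no magic rule for generic SGML,
    // so anything that looks like text falls back to text/plain.
    static String detectFromBytes(byte[] prefix) {
        for (byte b : prefix) {
            // Simplified heuristic: a NUL byte is treated as a sign of binary data.
            if (b == 0) {
                return "application/octet-stream";
            }
        }
        return "text/plain";
    }

    // Name-hint step. Assumption: glob-style rules *.sgml and *.sgm map to text/sgml.
    static String detectFromName(String resourceName) {
        if (resourceName == null) {
            return null;
        }
        String lower = resourceName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".sgml") || lower.endsWith(".sgm")) {
            return "text/sgml";
        }
        return null;
    }

    // Simplification: a name match always wins over the content-based guess.
    static String detect(byte[] prefix, String resourceName) {
        String byName = detectFromName(resourceName);
        return byName != null ? byName : detectFromBytes(prefix);
    }

    public static void main(String[] args) {
        byte[] doc = "<!DOCTYPE book SYSTEM \"book.dtd\">\n<book></book>\n"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(detect(doc, null));          // prints text/plain
        System.out.println(detect(doc, "report.sgml")); // prints text/sgml
    }
}
```

In this sketch the bytes are identical in both calls; only the name hint changes
the result, which mirrors the difference the reporter sees between detecting
from a byte[] alone and detecting with the file name set in the metadata.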
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)