[
https://issues.apache.org/jira/browse/TIKA-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347494#comment-17347494
]
Nick Burch edited comment on TIKA-3409 at 5/19/21, 11:34 AM:
-------------------------------------------------------------
As well as the primary type that Tika detects, also check the aliases and the
parent type (if defined). You may well find a text type among those:
e.g. application/xml has a parent of text/plain
e.g. application/javascript has an alias of text/javascript
You can get the aliases and the parent type via MediaTypeRegistry:
https://tika.apache.org/1.26/api/org/apache/tika/mime/MediaTypeRegistry.html
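
A minimal sketch of that check, to make the lookup order concrete. It does not call Tika: the two small tables are hypothetical stand-ins that hold only the two examples from this comment, and the class and method names (TextTypeCheck, isTextLike) are invented for illustration. With Tika on the classpath, the tables would presumably be replaced by lookups on MediaTypeRegistry (its getAliases and getSupertype methods).

```java
import java.util.List;
import java.util.Map;

// Stand-in for the data Tika keeps in MediaTypeRegistry. The entries are
// only the two examples from the comment, not Tika's real registry.
class TextTypeCheck {
    // Hypothetical parent table: type -> parent type.
    static final Map<String, String> PARENTS = Map.of(
            "application/xml", "text/plain");

    // Hypothetical alias table: type -> alternative names for the same type.
    static final Map<String, List<String>> ALIASES = Map.of(
            "application/javascript", List.of("text/javascript"));

    // True if the detected type, one of its aliases, or any ancestor
    // (or an ancestor's alias) is a text/* type.
    static boolean isTextLike(String detected) {
        // Walk from the detected type up through its chain of parents.
        for (String type = detected; type != null; type = PARENTS.get(type)) {
            if (type.startsWith("text/")) {
                return true;
            }
            for (String alias : ALIASES.getOrDefault(type, List.of())) {
                if (alias.startsWith("text/")) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isTextLike("application/xml"));        // true (parent is text/plain)
        System.out.println(isTextLike("application/javascript")); // true (alias is text/javascript)
        System.out.println(isTextLike("application/pdf"));        // false (not in either table)
    }
}
```

A type that is not text-like by this check is not necessarily binary; the check only says that no text type was found among the primary type, its aliases and its parents.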
> provide isBinary/isText method
> ------------------------------
>
> Key: TIKA-3409
> URL: https://issues.apache.org/jira/browse/TIKA-3409
> Project: Tika
> Issue Type: New Feature
> Reporter: Caleb Cushing
> Priority: Major
>
> Since Tika can detect what kind of file something is, it could also know
> whether that file type is binary or not. I'd love to have a method such as
> `MimeType::isBinary`, so that I could know whether I can try "parsing"
> the file.
> Related: https://stackoverflow.com/q/620993/206466