[
https://issues.apache.org/jira/browse/TIKA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jukka Zitting resolved TIKA-1034.
---------------------------------
Resolution: Won't Fix
See the {{Detector}} javadocs. You can pass {{null}} as the {{InputStream}} in
such cases. If you do pass a non-{{null}} stream, it needs to support the
mark/reset feature (you'll need to wrap the stream in {{TikaInputStream}} or
{{BufferedInputStream}} if necessary).
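
As a rough illustration of the wrapping advice above, the following sketch uses only {{java.io}} classes. The class name {{MarkResetSketch}} and the helpers {{readOnce}}, {{ensureMarkSupported}} and {{peek}} are hypothetical names made up for this example; they are not part of Tika. It assumes the caller only needs to look at a few leading bytes and then rewind.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MarkResetSketch {

    /** Hypothetical helper: simulates a read-once stream without mark/reset support. */
    static InputStream readOnce(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override public boolean markSupported() { return false; }
            @Override public void mark(int readlimit) { /* ignored, like a read-once source */ }
            @Override public void reset() throws IOException {
                throw new IOException("mark/reset not supported");
            }
        };
    }

    /** Returns the stream unchanged if it already supports mark/reset, otherwise wraps it. */
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /** Reads up to n leading bytes, then rewinds so later reads start at the beginning. */
    static byte[] peek(InputStream in, int n) throws IOException {
        in.mark(n);
        byte[] buf = new byte[n];
        int total = 0;
        while (total < n) {
            int r = in.read(buf, total, n - total);
            if (r < 0) {
                break; // end of stream reached before n bytes
            }
            total += r;
        }
        in.reset();
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "%PDF-1.4 rest of document".getBytes(StandardCharsets.US_ASCII);
        InputStream in = ensureMarkSupported(readOnce(data));
        String header = new String(peek(in, 5), StandardCharsets.US_ASCII);
        System.out.println(header);          // prints "%PDF-"
        System.out.println((char) in.read()); // prints "%", the stream was rewound
    }
}
```

A stream prepared this way can be handed to a detector that relies on mark/reset; the original read-once stream is consumed only through the buffer.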
The reason why we only check the type hint from the input metadata after trying
the other detection methods is that often such type hints (for example coming
from a remote web server) are not very accurate. Thus we only use them if a
more specific type can't automatically be detected.
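
The ordering described above can be modelled with a toy example. This is a simplified sketch, not the actual {{MimeTypes}} code: the class {{HintOrderSketch}}, the {{PARENT}} map and the small type hierarchy in it are assumptions invented for illustration. The hint replaces the detected type only when it is a specialization of that type.

```java
import java.util.HashMap;
import java.util.Map;

public class HintOrderSketch {

    // Hypothetical toy hierarchy (child type -> parent type), not Tika's real registry.
    static final Map<String, String> PARENT = new HashMap<>();
    static {
        PARENT.put("application/xhtml+xml", "application/xml");
        PARENT.put("application/xml", "text/plain");
        PARENT.put("text/plain", "application/octet-stream");
    }

    /** True if candidate is a strict descendant of base in the toy hierarchy. */
    static boolean isSpecializationOf(String candidate, String base) {
        String parent = PARENT.get(candidate);
        while (parent != null) {
            if (parent.equals(base)) {
                return true;
            }
            parent = PARENT.get(parent);
        }
        return false;
    }

    /** Uses the hint only if it refines the type already detected by other means. */
    static String resolve(String detected, String hint) {
        if (hint != null && isSpecializationOf(hint, detected)) {
            return hint;
        }
        return detected;
    }

    public static void main(String[] args) {
        // The hint refines the detected type, so the hint is used.
        System.out.println(resolve("application/xml", "application/xhtml+xml"));
        // The hint contradicts the detected type, so it is ignored.
        System.out.println(resolve("application/pdf", "text/plain"));
    }
}
```

In this model an inaccurate hint, for example {{text/plain}} sent by a web server for a PDF, cannot override a type that was detected from the content itself.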
Resolving as Won't Fix.
> MimeTypes seems to be doing unnecessary work in the detect method
> -----------------------------------------------------------------
>
> Key: TIKA-1034
> URL: https://issues.apache.org/jira/browse/TIKA-1034
> Project: Tika
> Issue Type: Improvement
> Components: mime
> Affects Versions: 1.2
> Reporter: Bice Dibley
>
> The final section of MimeTypes.detect always sets the type from the metadata
> hint, if one is provided, but only after trying two other resolution
> strategies. Would it be possible to move the following to the top of the
> detect method:
> {code}
> // Get type based on metadata hint (if available)
> String typeName = metadata.get(Metadata.CONTENT_TYPE);
> if (typeName != null) {
>     try {
>         MediaType hint = forName(typeName).getType();
>         if (registry.isSpecializationOf(hint, type)) {
>             type = hint;
>         }
>     } catch (MimeTypeException e) {
>         // Malformed type name, ignore
>     }
> }
> {code}
> and if the type is successfully set, return at that point rather than
> continuing with the other resolution strategies?
> The reason I ask is that MimeTypes.detect is causing my stream to be closed:
> the stream I'm using is read-once and so doesn't support mark/reset. However,
> I am passing the content type of the file in as part of the metadata, so the
> detect method shouldn't need to read from the stream.