[ 
https://issues.apache.org/jira/browse/TIKA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jukka Zitting resolved TIKA-1034.
---------------------------------

    Resolution: Won't Fix

See the {{Detector}} javadocs. You can pass {{null}} as the {{InputStream}} in 
such cases. If you do pass a non-{{null}} stream, it needs to support the 
mark/reset feature (you'll need to wrap the stream in {{TikaInputStream}} or 
{{BufferedInputStream}} if necessary).
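
As a minimal, JDK-only sketch of what "support the mark/reset feature" means in practice, the class below wraps a stream in {{BufferedInputStream}} only when needed, peeks at the leading bytes, and rewinds. The names {{MarkResetSketch}}, {{readOnce}}, {{ensureMarkSupported}} and {{peek}} are hypothetical and are not part of Tika; the read-once stream is simulated with a {{FilterInputStream}} that disables mark/reset.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class MarkResetSketch {

    /** Simulates a read-once stream by hiding the mark/reset support of the wrapped stream. */
    static InputStream readOnce(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override public boolean markSupported() { return false; }
            @Override public void mark(int readlimit) { /* no-op: marks are not supported */ }
            @Override public void reset() throws IOException {
                throw new IOException("mark/reset not supported");
            }
        };
    }

    /** Returns the stream unchanged if it supports mark/reset, otherwise a buffered wrapper. */
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /** Reads up to n leading bytes and then rewinds, so the caller still sees the full stream. */
    static byte[] peek(InputStream in, int n) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        try {
            in.mark(n);
            try {
                byte[] buf = new byte[n];
                int total = 0;
                while (total < n) {
                    int read = in.read(buf, total, n - total);
                    if (read < 0) {
                        break; // end of stream reached before n bytes
                    }
                    total += read;
                }
                return Arrays.copyOf(buf, total);
            } finally {
                in.reset(); // rewind to the marked position
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller holding a read-once stream would pass {{ensureMarkSupported(stream)}} on to the detector and keep using the returned wrapper afterwards, since any bytes already buffered live in the wrapper and not in the original stream.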

The reason why we only check the type hint from the input metadata after trying 
the other detection methods is that often such type hints (for example coming 
from a remote web server) are not very accurate. Thus we only use them if a 
more specific type can't automatically be detected.
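
To make the ordering concrete, here is a toy model of that rule: content-based detection runs first, and the metadata hint replaces its result only when the hint is a specialization of it. This is an illustrative sketch, not the {{MimeTypes}} code. The class {{HintOrderSketch}}, its two-entry type hierarchy and the {{resolve}} method are assumptions made up for the example; only the idea of an {{isSpecializationOf}} check is taken from the snippet quoted below.

```java
import java.util.HashMap;
import java.util.Map;

class HintOrderSketch {

    static final String OCTET_STREAM = "application/octet-stream";

    // Hypothetical, tiny hierarchy (child -> parent). A real registry is far larger.
    static final Map<String, String> PARENT = new HashMap<>();
    static {
        PARENT.put("application/rss+xml", "application/xml");
        PARENT.put("application/xml", "text/plain");
    }

    /** True if a is strictly more specific than b in the toy hierarchy. */
    static boolean isSpecializationOf(String a, String b) {
        if (a.equals(b)) {
            return false;
        }
        if (OCTET_STREAM.equals(b)) {
            return true; // every other type refines the generic fallback
        }
        for (String t = PARENT.get(a); t != null; t = PARENT.get(t)) {
            if (t.equals(b)) {
                return true;
            }
        }
        return false;
    }

    /**
     * detected: result of content/name based detection, or null if nothing matched.
     * hint: content type supplied in the metadata, or null.
     */
    static String resolve(String detected, String hint) {
        String type = detected != null ? detected : OCTET_STREAM;
        if (hint != null && isSpecializationOf(hint, type)) {
            type = hint; // the hint only ever narrows the detected type
        }
        return type;
    }
}
```

In this model a hint of {{application/rss+xml}} refines a detected {{application/xml}}, a hint of {{text/html}} does not override a detected {{application/pdf}}, and any hint is accepted when detection falls back to {{application/octet-stream}}.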

Resolving as Won't Fix.
                
> MimeTypes seems to be doing unnecessary work in the detect method
> -----------------------------------------------------------------
>
>                 Key: TIKA-1034
>                 URL: https://issues.apache.org/jira/browse/TIKA-1034
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>    Affects Versions: 1.2
>            Reporter: Bice Dibley
>
> The final section of MimeTypes.detect is always used to set the type if 
> provided in the metadata, but does this after using two other resolution 
> strategies. Would it be possible to move the following to the top of the 
> detect method
> {code}
> // Get type based on metadata hint (if available)
> String typeName = metadata.get(Metadata.CONTENT_TYPE);
> if (typeName != null) {
>     try {
>         MediaType hint = forName(typeName).getType();
>         if (registry.isSpecializationOf(hint, type)) {
>             type = hint;
>         }
>     } catch (MimeTypeException e) {
>         // Malformed type name, ignore
>     }
> }
> {code}
> and if the type is successfully set, return at that point rather than 
> continuing with the other resolution strategies?
> The reason I ask is that I'm experiencing a problem with MimeTypes.detect 
> causing the stream to be closed because the type of the stream being used is 
> read-once and so doesn't support mark/reset. However, I am passing the 
> content type of the file in as part of the metadata, so the detect method 
> shouldn't need to read from the stream.

