Oliver Kopp created TIKA-1120:
---------------------------------
Summary: Enable direct use of org.apache.tika.mime.MediaType.detect(...)
Key: TIKA-1120
URL: https://issues.apache.org/jira/browse/TIKA-1120
Project: Tika
Issue Type: Wish
Affects Versions: 1.3
Reporter: Oliver Kopp
Priority: Minor
When using MIME type detection, the classes allow the following usage:
    try (InputStream is = theInputStream;
         BufferedInputStream bis = new BufferedInputStream(is);) {
        MimeTypes mt = new MimeTypes();
        Metadata md = new Metadata();
        md.add(Metadata.RESOURCE_NAME_KEY, theFileName);
        MediaType mediaType = mt.detect(bis, null);
        return mediaType.toString();
    }
Debugging this shows that the MimeTypes class instantiates its internal
patterns with an empty MediaTypeRegistry. Therefore, getDefaultMimeTypes() is
never called, and thus tika-mimetypes.xml is never read.
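To illustrate the effect independently of Tika, here is a minimal sketch of a magic-byte detector whose pattern registry starts out empty. It is not Tika's implementation: the class ToyMagicDetector, its register() and detect() methods, and the "application/octet-stream" fallback are all hypothetical names and assumptions chosen for illustration. It only shows why a detector that never loads its type definitions cannot recognize anything.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a magic-byte detector; NOT Tika's implementation. */
final class ToyMagicDetector {
    // Assumed generic fallback returned when no pattern matches.
    static final String OCTET_STREAM = "application/octet-stream";

    // Registered patterns: media type name -> leading bytes that identify it.
    private final Map<String, byte[]> magics = new LinkedHashMap<>();

    /** Registers a magic prefix (plays the role of loading a definitions file). */
    ToyMagicDetector register(String mediaType, byte[] prefix) {
        magics.put(mediaType, prefix.clone());
        return this;
    }

    /** Returns the first registered type whose prefix matches, else the fallback. */
    String detect(byte[] head) {
        for (Map.Entry<String, byte[]> entry : magics.entrySet()) {
            byte[] prefix = entry.getValue();
            if (head.length >= prefix.length
                    && Arrays.equals(Arrays.copyOf(head, prefix.length), prefix)) {
                return entry.getKey();
            }
        }
        return OCTET_STREAM;
    }

    public static void main(String[] args) {
        byte[] pdf = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);

        // Nothing registered: every input ends up as the generic fallback.
        ToyMagicDetector empty = new ToyMagicDetector();

        // One pattern registered: the same input is now recognized.
        ToyMagicDetector loaded = new ToyMagicDetector()
                .register("application/pdf", "%PDF-".getBytes(StandardCharsets.US_ASCII));

        System.out.println(empty.detect(pdf));   // application/octet-stream
        System.out.println(loaded.detect(pdf));  // application/pdf
    }
}
```

In this toy model the caller has to populate the registry explicitly. The working example further down avoids the problem because the detector is obtained from an AutoDetectParser rather than from a directly instantiated MimeTypes.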
Is it possible to enable direct usage of MediaType.detect()? For example, by
adding a new constructor that allows the MediaTypeRegistry to be set?
If not, the code comments (or the documentation at
https://tika.apache.org/0.10/detection.html) should point out that MimeTypes()
should not be instantiated directly for MIME type detection, and that the
detectors should be used instead. Possibly, a minimal example should be added
to make the usage clear.
The following example works here:
    try (InputStream is = theInputStream;
         BufferedInputStream bis = new BufferedInputStream(is);) {
        AutoDetectParser parser = new AutoDetectParser();
        Detector detector = parser.getDetector();
        Metadata md = new Metadata();
        md.add(Metadata.RESOURCE_NAME_KEY, theFileName);
        MediaType mediaType = detector.detect(bis, md);
        return mediaType.toString();
    }