I'm planning to use Tika as part of a process for cataloging data on a share drive. Based on the website and tika-mimetypes.xml, the type detection looks pretty comprehensive. However, while browsing tika-mimetypes.xml, I noticed that about half of the mime-types listed have no associated glob, root-XML, or magic elements. Without any of these match criteria, can Tika ever actually detect a file of one of these types?
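One rough way to check which entries lack match criteria is sketched below. It assumes each type is a mime-type element whose glob, magic, and root-XML criteria appear as direct children; the inline sample and the example type names are made up for illustration and are not copied from the real file.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the style of tika-mimetypes.xml (hypothetical entries).
SAMPLE = """<mime-info>
  <mime-type type="application/pdf">
    <magic priority="50">
      <match value="%PDF-" type="string" offset="0"/>
    </magic>
    <glob pattern="*.pdf"/>
  </mime-type>
  <mime-type type="application/example-no-criteria"/>
  <mime-type type="application/example-xml">
    <root-XML localName="example"/>
  </mime-type>
  <mime-type type="application/example-alias-only">
    <alias type="application/x-example"/>
  </mime-type>
</mime-info>"""

# Child elements that give a detector something to match on.
MATCH_TAGS = {"glob", "magic", "root-XML"}


def local_name(tag):
    """Drop a '{namespace}' prefix, if the file declares one."""
    return tag.rsplit("}", 1)[-1]


def types_without_criteria(root):
    """Return the type names of mime-type elements with no glob, magic, or root-XML child."""
    missing = []
    for element in root.iter():
        if local_name(element.tag) != "mime-type":
            continue
        child_tags = {local_name(child.tag) for child in element}
        if not (child_tags & MATCH_TAGS):
            missing.append(element.get("type"))
    return missing


root = ET.fromstring(SAMPLE)
undetectable = types_without_criteria(root)
print(undetectable)
```

To run it against a real copy of the file, replace the ET.fromstring(SAMPLE) line with ET.parse(path).getroot(), where path points at wherever tika-mimetypes.xml lives locally.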
I browsed the detector source. It looks like it tries to match against magic, then root-XML, then names/globs/patterns. If a mime-type doesn't have any of these, can Tika do anything with it? If not, why is it listed in the tika-mimetypes.xml file?

Thank you,
Doug
