Hi,
New Tika user here. We are really impressed by the Tika toolkit and are planning 
to use it as a Docker service in our case management solution, which is used by 
the public sector in the Nordics for many different use cases.

As for content detection, Tika today returns the MIME type in the content-type 
field. What we would need is a MIME type to file extension lookup, and it 
seems logical for Tika to return this as well.

After some research, I found some quite extensive lists mapping MIME types to 
file extensions, based on the official IANA list and on the Apache and nginx 
servers.
Examples:
https://www.iana.org/assignments/media-types/media-types.xhtml

https://svn.apache.org/repos/asf/httpd/httpd/trunk/docs/conf/mime.types
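
To illustrate the kind of lookup we have in mind, here is a minimal sketch in 
Python. It assumes the mime.types line format (a MIME type followed by 
whitespace-separated extensions, with "#" starting a comment). The sample 
entries and the function names parse_mime_types and preferred_extension are 
made up for illustration; this is not Tika's API.

```python
# Hypothetical excerpt in the mime.types format:
#   <mime-type> <ext> [<ext> ...]
# Lines starting with "#" are comments. Entries are illustrative only.
SAMPLE_MIME_TYPES = """\
# MIME type            Extensions
application/pdf        pdf
image/jpeg             jpeg jpg jpe
text/plain             txt
# application/example
"""


def parse_mime_types(text):
    """Return a dict mapping each MIME type to its list of extensions."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        mime_type, *extensions = line.split()
        if extensions:
            mapping.setdefault(mime_type.lower(), []).extend(extensions)
    return mapping


def preferred_extension(mapping, mime_type):
    """Return the first listed extension for a MIME type, or None if unknown."""
    # Ignore parameters such as "; charset=UTF-8".
    base = mime_type.split(";", 1)[0].strip().lower()
    extensions = mapping.get(base)
    return extensions[0] if extensions else None


if __name__ == "__main__":
    table = parse_mime_types(SAMPLE_MIME_TYPES)
    print(preferred_extension(table, "application/pdf; charset=UTF-8"))
```

Treating the first listed extension as the preferred one is an assumption of 
this sketch; a real mapping might need an explicit preference per type.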

Is this an add-on that could be considered as part of a standard Tika setup? 
Has this need been discussed before?


Regards,
Willy T. Koch
Norway
