Ah, that’s good news, will look into that!

So far I’ve only been using the official 2.2.1-full Tika Docker image with the
default config; the only change was adding a few more Tesseract languages for OCR.

Kind regards

Willy T. Koch
[email protected]
Mob: +47 480 321 77


On Thu, 10 Feb 2022, at 22:40, Nick Burch wrote:
> On Thu, 10 Feb 2022, Willy T. Koch wrote:
> > As for content detection, today the content-type field with mime type is 
> > returned. What we would need is a mime-type to file extension lookup and 
> > it seems logical that this was also returned by Tika.
> 
> How are you calling Tika? We already have APIs for this. Just ask the 
> MimeTypes class (available via TikaConfig.getMimeRepository) about a type, 
> and it'll return the details including the preferred extension and other 
> possible well-known extensions
> 
> Nick
> 
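
For reference, the lookup Nick describes can be sketched in Java roughly as
follows. This is only a minimal sketch, assuming tika-core is on the classpath
and the default TikaConfig is acceptable; the class name ExtensionLookup and
the method preferredExtension are made up for illustration and are not part of
Tika. It applies when Tika is called as a library, not through the Docker image.

```java
import org.apache.tika.config.TikaConfig;
import org.apache.tika.mime.MediaType;
import org.apache.tika.mime.MimeType;
import org.apache.tika.mime.MimeTypeException;
import org.apache.tika.mime.MimeTypes;

// Hypothetical helper, not part of Tika itself.
public class ExtensionLookup {

    // Returns the preferred extension (with leading dot) for a content type,
    // or an empty string if Tika knows no extension for it.
    static String preferredExtension(String contentType) throws MimeTypeException {
        // Drop parameters such as "; charset=UTF-8" before the lookup.
        MediaType parsed = MediaType.parse(contentType);
        if (parsed == null) {
            return "";
        }
        String baseType = parsed.getBaseType().toString();

        MimeTypes repository = TikaConfig.getDefaultConfig().getMimeRepository();
        MimeType type = repository.forName(baseType);
        return type.getExtension();
    }

    public static void main(String[] args) throws MimeTypeException {
        System.out.println(preferredExtension("application/pdf"));

        // All well-known extensions for a type are available as a list.
        MimeTypes repository = TikaConfig.getDefaultConfig().getMimeRepository();
        System.out.println(repository.forName("image/jpeg").getExtensions());
    }
}
```

With the stock mime type definitions, the first call should print ".pdf".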
