Can you add the Content-Disposition header to pass the filename as a
hint? See: https://cwiki.apache.org/confluence/display/TIKA/TikaServer
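
For illustration, here is a rough sketch of what I mean, in Python rather than C# just to keep it short. It assumes tika-server is on its default localhost:9998 and that /unpack/all takes a PUT with the raw file bytes; the helper names (filename_hint_headers, unpack_all) are made up for this example, and I haven't run it against your file. Note that it sends no Content-Type at all, only the filename hint.

```python
import http.client
import os


def filename_hint_headers(filename: str) -> dict:
    """Build only the filename hint header; no Content-Type is set."""
    return {"Content-Disposition": f'attachment; filename="{filename}"'}


def unpack_all(path: str, host: str = "localhost", port: int = 9998):
    """PUT the file to /unpack/all and return (status, body bytes).

    Host, port, and the PUT method are assumptions about a default
    tika-server setup; adjust them to match your deployment.
    """
    with open(path, "rb") as f:
        body = f.read()
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request(
            "PUT",
            "/unpack/all",
            body=body,
            headers=filename_hint_headers(os.path.basename(path)),
        )
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

In your C# client the equivalent would be to set the same Content-Disposition header on the request that carries the file.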

It still feels wrong to me that Tika isn't correctly identifying pptx
without the filename hint.  I'll take a look tomorrow.

‪On Sun, Feb 5, 2023 at 10:27 AM ‫שי ברק‬‎ <[email protected]> wrote:‬
>
> I have a PowerPoint document that I pass to the /unpack/all endpoint via Postman.
> I’ve noticed that Postman automatically adds a Content-Type header to the
> request, which helps Tika detect the type of the document, and the
> response I get back is a proper one (including the text, metadata and
> the images within the presentation).
> However, when I do the same from my C# project, without the
> Content-Type header, I get a different response from Tika,
> which is:
> _rels, docProps and ppt folders.
> I also get an empty text file back.
> While inspecting the metadata file, it seems the wrong parsers have been used.
> How can I fix this?
> Note that I can’t add the Content-Type header in my code.
>
>
