If /rmeta/text is not returning text extracted from embedded files, that’s a
bug.

I don’t think /rmeta/all is a thing.
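
To check whether embedded text is coming back, something like the following
might help. It is only a minimal sketch, not the official client: it assumes
tika-server is listening on its default port 9998, that the JSON entries carry
the X-TIKA:content and X-TIKA:embedded_resource_path keys, and the function
names are made up for illustration.

```python
import json
import urllib.request

# Assumption: tika-server is running locally on its default port.
TIKA_URL = "http://localhost:9998/rmeta/text"


def build_request(data: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request carrying the raw file bytes and asking for JSON."""
    return urllib.request.Request(
        url, data=data, method="PUT", headers={"Accept": "application/json"}
    )


def contents_by_path(rmeta_json: str) -> dict:
    """Map each document's embedded path to its extracted text.

    /rmeta returns a JSON list with the container document first and one
    entry per embedded document after it. The container has no embedded
    path, so it is keyed "/" here.
    """
    docs = json.loads(rmeta_json)
    return {
        d.get("X-TIKA:embedded_resource_path", "/"): d.get("X-TIKA:content", "")
        for d in docs
    }


def extract(path: str) -> dict:
    """Send a local file (path is a placeholder) to the server and parse the reply."""
    with open(path, "rb") as f:
        req = build_request(f.read())
    with urllib.request.urlopen(req) as resp:
        return contents_by_path(resp.read().decode("utf-8"))
```

If the dictionary returned by extract() only ever has the "/" key for a file
that you know contains attachments, that would be worth reporting.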

On Thu, Mar 21, 2024 at 5:21 PM Zig Zag <ziganda...@gmail.com> wrote:

> Thanks Josh, that's correct. rmeta/text lets you control this, but
> it only returns one level of text (not documents embedded within others).
> When you use the recursive interface rmeta/all, it always returns content as
> HTML, and similarly unpack/all returns metadata as CSV.
>
> On Thu, Mar 21, 2024 at 1:40 PM Josh Burchard <burch...@pnp-hcl.com>
> wrote:
>
>> Samuel - Well, I use Tika server and I get my data back in JSON format
>> because I use the /rmeta/text endpoint and send the HTTP header
>> Accept:application/json.  If you were to send Accept:text/plain would that
>> work for you? I've only done that in the context of the /tika endpoint and
>> that was long ago.  Not sure how to do anything similar in the app because
>> I never use that.  By the way, in the context of using the server I find
>> this table very helpful:
>>
>> https://cwiki.apache.org/confluence/display/TIKA/TikaServerEndpointsCompared
>>
>> From:        "Zig Zag" <ziganda...@gmail.com>
>> To:        user@tika.apache.org
>> Date:        03/21/2024 03:49 PM
>> Subject:        Re: Meta output format of tika server /unpack/all
>> ------------------------------
>>
>> Similarly, is it possible to have /rmeta/all format content/text as text
>> instead of HTML?
>>
>> On Thu, Mar 21, 2024 at 9:50 AM Zig Zag <ziganda...@gmail.com> wrote:
>> Hi All,
>>
>> Is there a way to get the __META__ output of /unpack/all as JSON rather
>> than CSV?
>>
>> Thank you,
>> Samuel
>>
>>
