I'm working on something similar for tika-app:
https://issues.apache.org/jira/browse/TIKA-4515
If that pans out, it shouldn't be too hard to do something like that
and then zip up the output and return it.

On Fri, Sep 26, 2025 at 9:40 AM Tim Allison <[email protected]> wrote:
>
> You can request an account on our JIRA and open a ticket there. I
> don't think I'll have time to work on this any time soon. :(
>
> On Tue, Sep 9, 2025 at 12:30 PM Zig Zag <[email protected]> wrote:
> >
> > Is there a way I can log a request for this capability?
> >
> > Thank you,
> > Samuel
> >
> > On Mon, Jun 17, 2024 at 1:19 PM Tim Allison <[email protected]> wrote:
> >>
> >> I regret that those endpoints do not have a reliable way to link them.
> >>
> >> I recently integrated something that does work, but it requires the 
> >> tika-pipes framework, which you can use via tika-server.
> >>
> >> It will output .json files and a subdirectory of binary files, and there 
> >> is a key in the json file that points to the binary file.  It is not well 
> >> documented, but I can make time to document that if you'd be interested.
> >>
> >> If I had more time, I'd try to integrate this into an /unpack/v2 or (what 
> >> was it?) /runpack or similar so that you could use the legacy tika-server 
> >> pattern: send bytes, get back a zip. I don't think I'll have time soon to 
> >> implement this.
> >>
> >> On Fri, Jun 14, 2024 at 7:00 PM Zig Zag <[email protected]> wrote:
> >>>
> >>> Hi All,
> >>>
> >>> We have a use case that needs the metadata, text, and bytes of the
> >>> children of a file. We are using the /rmeta and /unpack APIs on
> >>> TikaServer to solve for this. Unfortunately, there is no good way to
> >>> correlate the two: the file names and ids generated by the two APIs are
> >>> in some cases inconsistent, so we cannot match up metadata and bytes.
> >>> Is there a reliable way to do this?
> >>>
> >>> Thank you,
> >>> Samuel
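
For readers who land on this thread: the tika-pipes output described above
(.json files, a subdirectory of binary files, and a key in the json that
points to the binary file) could be consumed with something like the sketch
below. This is only an illustration under stated assumptions. The thread does
not name the key, so "embedded-bytes-path" is a hypothetical placeholder, and
the function name, the one-array-per-file layout, and the relative-path
convention are likewise assumptions to check against your actual output.

```python
import json
from pathlib import Path

# Hypothetical key name: the thread does not say which key holds the pointer
# to the binary file, so replace this with the key found in your own output.
BYTES_KEY = "embedded-bytes-path"


def pair_metadata_with_bytes(output_dir, bytes_key=BYTES_KEY):
    """Return (metadata, bytes_path) pairs for records that name a binary file."""
    root = Path(output_dir)
    pairs = []
    for json_file in sorted(root.rglob("*.json")):
        with json_file.open(encoding="utf-8") as fh:
            records = json.load(fh)
        # Assumption: each file holds an array of metadata objects;
        # a single top-level object is tolerated as well.
        if isinstance(records, dict):
            records = [records]
        for record in records:
            rel = record.get(bytes_key)
            if rel is None:
                continue  # no extracted bytes for this record
            # Assumption: the pointer is relative to the json file's directory.
            pairs.append((record, json_file.parent / rel))
    return pairs
```

Because each pair comes from a single record, the metadata and the bytes are
linked by construction, with no need to match file names across two separate
responses.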
