I'm working on something similar for tika-app:
https://issues.apache.org/jira/browse/TIKA-4515

If that pans out, it shouldn't be too hard to do something similar for
tika-server and then zip up the output and return it.

On Fri, Sep 26, 2025 at 9:40 AM Tim Allison <[email protected]> wrote:
>
> You can request an account on our JIRA and open a ticket there. I
> don't think I'll have time to work on this any time soon. :(
>
> On Tue, Sep 9, 2025 at 12:30 PM Zig Zag <[email protected]> wrote:
> >
> > Is there a way I can log a request for this capability?
> >
> > Thank you,
> > Samuel
> >
> > On Mon, Jun 17, 2024 at 1:19 PM Tim Allison <[email protected]> wrote:
> >>
> >> I regret that those endpoints do not have a reliable way to link them.
> >>
> >> I recently integrated something that does work, but it requires the
> >> tika-pipes framework, which you can use via tika-server.
> >>
> >> It will output .json files and a subdirectory of binary files, and there
> >> is a key in the json file that points to the binary file. It is not well
> >> documented, but I can make time to document that if you'd be interested.
> >>
> >> If I had more time, I'd try to integrate this into an /unpack/v2 or (what
> >> was it?) /runpack or similar so that you could use the legacy tika-server
> >> pattern: send bytes, get back a zip. I don't think I'll have time soon to
> >> implement this.
> >>
> >> On Fri, Jun 14, 2024 at 7:00 PM Zig Zag <[email protected]> wrote:
> >>>
> >>> Hi All,
> >>>
> >>> We have a use case that needs the meta, text, and bytes of the children
> >>> of a file. We are using the /rmeta and /unpack APIs on TikaServer to
> >>> solve for this. Unfortunately, there is not a great way to correlate the
> >>> two: the file names and ids generated by the two APIs are in some cases
> >>> not consistent, so we cannot match up meta and bytes. Is there a
> >>> reliable way to do this?
> >>>
> >>> Thank you,
> >>> Samuel
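
For anyone who wants to see the mismatch described in the original question on their own files, here is a rough sketch, not a fix. It assumes a tika-server listening on localhost:9998, that /rmeta returns a JSON list in which embedded entries carry an "X-TIKA:embedded_resource_path" key, and that /unpack returns a zip. The helper names (put_file, rmeta_embedded_paths, unpack_names, naive_match) are made up for this example. Matching by base name is exactly the unreliable approach discussed in this thread, so treat the output as a diagnostic only.

```python
import io
import json
import posixpath
import urllib.request
import zipfile


def put_file(url, data, accept=None):
    """PUT raw bytes to a tika-server endpoint and return the response body."""
    headers = {"Accept": accept} if accept else {}
    req = urllib.request.Request(url, data=data, method="PUT", headers=headers)
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def rmeta_embedded_paths(rmeta_json):
    """Return one embedded-resource path per /rmeta entry.

    Entries without the key (typically the container document) yield "".
    The key name is an assumption about the /rmeta output.
    """
    entries = json.loads(rmeta_json)
    return [e.get("X-TIKA:embedded_resource_path", "") for e in entries]


def unpack_names(zip_bytes):
    """Return the entry names of the zip returned by /unpack."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.namelist()


def naive_match(paths, names):
    """Map each non-empty rmeta path to a zip entry with the same base name.

    Unmatched paths map to None. This is a best-effort heuristic and will
    miss entries whenever the two endpoints name a child differently.
    """
    by_base = {posixpath.basename(n): n for n in names}
    return {p: by_base.get(posixpath.basename(p)) for p in paths if p}


def report(file_path, base_url="http://localhost:9998"):
    """Send one file to both endpoints and print which children line up."""
    with open(file_path, "rb") as fh:
        data = fh.read()
    rmeta = put_file(base_url + "/rmeta", data, accept="application/json")
    unpacked = put_file(base_url + "/unpack", data)
    matches = naive_match(rmeta_embedded_paths(rmeta), unpack_names(unpacked))
    for path, name in matches.items():
        print(path, "->", name if name is not None else "NO MATCH")
```

Calling report("some-container.docx") against a running server prints one line per embedded child. Any "NO MATCH" lines are the cases where names from the two endpoints disagree.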
