I have thousands of files to run through Tika. Unfortunately, they are not all
available at once but are "created" one by one. My tests have shown that the
creator process produces files faster than Tika can parse them, so I am
wondering how best to combine the creator and the parser to speed things up.
Btw., the creator is completely separate; otherwise I would call the parser
directly from it, but that is not possible.
To achieve some kind of parallelism I thought of two options:
1) Spawn a small Java process for each file, which parses just that file
2) Send each file to the Tika JAX-RS server
But since the creator is so fast, either option would fire off multiple calls
to Tika per second. On the other hand, I don't want to wait for the creator to
finish, because it runs for hours and I could already start parsing in the
meantime.
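What I roughly have in mind is a bounded queue between the two sides: a watcher
hands each new file path to a fixed number of parser threads and blocks when the
queue is full, so the number of concurrent Tika calls stays capped no matter how
fast the creator is. Below is a minimal sketch of that idea, not a tested
solution. The class name BoundedParsePipeline, its methods, and the parser
function are all hypothetical placeholders; the parser function is where the
actual Tika call (in-process, or an HTTP request to the server) would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical sketch: a bounded queue feeding a fixed pool of parser threads.
public class BoundedParsePipeline {
    // Unique sentinel object (compared by identity) that tells a worker to stop.
    private static final String POISON = new String("POISON");

    private final BlockingQueue<String> queue;
    private final ExecutorService workers;
    private final int workerCount;
    private final ConcurrentLinkedQueue<String> results = new ConcurrentLinkedQueue<>();

    // parser is a placeholder for the real work, e.g. a call into Tika.
    public BoundedParsePipeline(int workerCount, int capacity,
                                Function<String, String> parser) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.workerCount = workerCount;
        this.workers = Executors.newFixedThreadPool(workerCount);
        for (int i = 0; i < workerCount; i++) {
            workers.submit(() -> {
                try {
                    while (true) {
                        String path = queue.take();   // waits for the next file
                        if (path == POISON) {
                            return;                   // no more files will arrive
                        }
                        try {
                            results.add(parser.apply(path));
                        } catch (RuntimeException e) {
                            // One bad file should not kill the worker thread.
                            System.err.println("Failed on " + path + ": " + e);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
    }

    // Called whenever the creator has produced a file.
    // Blocks while the queue is full, which throttles the calls to the parser.
    public void submit(String path) throws InterruptedException {
        queue.put(path);
    }

    // Called once the creator is done: stops the workers and returns the results.
    public List<String> finish() throws InterruptedException {
        for (int i = 0; i < workerCount; i++) {
            queue.put(POISON);                        // one sentinel per worker
        }
        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.HOURS);
        return new ArrayList<>(results);
    }
}
```

With something like new BoundedParsePipeline(4, 100, path -> ...), at most four
files would be parsed at the same time and at most 100 would be waiting. Since
the creator is a separate process, submit() would have to be driven by whatever
notices the new files (for example a directory watcher), and that part is not
shown here.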
Any ideas?