I have thousands of files to run through Tika. Unfortunately, they are not 
all available at once but are "created" one by one. My tests have shown that the 
creator process is faster than Tika, so I am wondering how best to combine the 
creator and parser processes to speed things up.
Btw., the creator is completely separate; otherwise I would call the parser 
directly from it. That is not possible here.
To achieve some kind of parallelism I thought of two options:
1) Spawn a small Java program for each new file, which parses that file
2) Send each file to the Tika JAX-RS server
But since the creator is so fast, either option would fire off multiple calls 
to Tika per second. On the other hand, I don't want to wait for the creator to 
finish, because it runs for hours and I could already be parsing in the meantime.
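To make the first option more concrete, here is a minimal sketch of what I have 
in mind: a fixed number of workers behind a bounded queue, so that a fast creator 
cannot pile up unlimited parse requests. The class name BoundedParsePool and the 
parser function are placeholders of mine, not Tika API. The real Tika call 
(for example the Tika facade's parseToString(File)) would go where the function 
is applied, and I am assuming something else (such as a directory watcher) notices 
new files and calls submit().

```java
import java.nio.file.Path;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical helper: limits how many files are parsed or waiting at once.
final class BoundedParsePool {
    private final ThreadPoolExecutor pool;
    // Placeholder for the real parser call; must not return null.
    private final Function<Path, String> parser;
    private final ConcurrentMap<Path, String> results = new ConcurrentHashMap<>();

    BoundedParsePool(int workers, int queueSize, Function<Path, String> parser) {
        this.parser = parser;
        // When the bounded queue is full, CallerRunsPolicy makes the submitting
        // thread parse the file itself, which slows the submitter down.
        this.pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Called once per newly created file.
    void submit(Path file) {
        pool.execute(() -> results.put(file, parser.apply(file)));
    }

    // Called after the creator has finished; waits for the remaining work.
    ConcurrentMap<Path, String> finish() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return results;
    }
}
```

The worker count and queue size would have to be tuned. This sketch also keeps 
all results in memory and does no error handling, which would not do for 
thousands of real files.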
Any ideas?
