[tesseract-ocr] Multiple images : API or .bat ?

Pierre Anquetil Mon, 24 Apr 2017 03:34:02 -0700

Hello,

I am currently using Tesseract 3.02 (Windows) to recognize a large number of 
pages (a few million).


At the moment I run several Tesseract processes in parallel from Python, by 
launching multiple .bat files that each loop over a directory.

What the .bat files actually do is call the tesseract command once for 
each file:

For %%A in (%_SourcePath%) Do %_Tesseract% %%A %_OutputPath%%%~nA

Intuitively, I think of the cost of each call as Init_cost + Process_cost, 
so that the total cost for N files is N * (Init_cost + Process_cost).
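
To make that cost model concrete, here is a small Python sketch. The function 
names and the timings in the usage note are made up for illustration; they are 
not measurements of Tesseract.

```python
def total_cost_per_call(n_files, init_cost, process_cost):
    """One tesseract process per file: Init is paid on every call."""
    return n_files * (init_cost + process_cost)


def total_cost_single_init(n_files, init_cost, process_cost):
    """One long-lived engine: Init is paid once, then only Process."""
    return init_cost + n_files * process_cost


def init_overhead_saved(n_files, init_cost, process_cost):
    """Difference between the two models, i.e. (N - 1) * Init_cost."""
    return (total_cost_per_call(n_files, init_cost, process_cost)
            - total_cost_single_init(n_files, init_cost, process_cost))
```

With hypothetical values of 200 ms for Init and 1000 ms for Process, 1000 files 
would cost 1,200,000 ms in the first model and 1,000,200 ms in the second, so 
the gain only matters if Init_cost is not small compared to Process_cost.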

Would the API be faster for this task? (I think so, because Init would run 
only once, when the new instance is created, so that the total cost becomes 
Init_cost + N * Process_cost.)
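
A minimal sketch of the "Init only one time" idea, assuming the tesserocr 
Python binding (not mentioned above, and as far as I know it needs a newer 
Tesseract than 3.02). The helper names ocr_files and run_with_tesserocr are 
hypothetical; PyTessBaseAPI, SetImageFile and GetUTF8Text are the calls from 
tesserocr.

```python
import os


def ocr_files(api, image_paths, output_dir):
    """OCR every image with one already-initialised engine.

    `api` is any object exposing SetImageFile(path) and GetUTF8Text(),
    for example a tesserocr.PyTessBaseAPI instance (an assumption here).
    """
    written = []
    for path in image_paths:
        api.SetImageFile(path)      # no engine start-up in the loop
        text = api.GetUTF8Text()    # only Process_cost is paid per file
        stem = os.path.splitext(os.path.basename(path))[0]
        out_path = os.path.join(output_dir, stem + ".txt")
        with open(out_path, "w", encoding="utf-8") as handle:
            handle.write(text)
        written.append(out_path)
    return written


def run_with_tesserocr(image_paths, output_dir):
    # Imported lazily so this module loads even without tesserocr installed.
    from tesserocr import PyTessBaseAPI

    with PyTessBaseAPI() as api:    # Init_cost is paid once, here
        return ocr_files(api, image_paths, output_dir)
```

Several such workers could still run in parallel, one engine instance per 
process, in the same way the .bat files are launched in parallel now.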


Many thanks in advance for your useful work :-)


