On Friday, February 19, 2016 at 3:00:42 PM UTC-5, viraf wrote:
>
> Tom, I created a multi-page TIFF as per earlier recommendation on this 
> thread (avoid multiple inits).  Running it on Linux from the command line 
> provided me with a reference by which to compute PPM that I could target 
> with Tess4J.  I had hoped to get 10+ PPM / core and shift focus to 
> accuracy.  I am at about 6 PPM and it is unclear where / how to improve 
> performance (speed).  
>

I take it the question about the representativeness of that size file was 
too sensitive/boring/trivial/... to answer. 

Given the issues with multi-page TIFFs, one experiment worth running is to 
try a list of single-page TIFFs instead of one ridiculously large file.

$ cat > filelist.txt
page0001.tif
page0002.tif
...
page1800.tif

$ tesseract filelist.txt out
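
Typing 1800 file names by hand is no fun, so here is a minimal sketch of 
generating the list instead. The function name make_filelist is made up for 
illustration, and it assumes the pages are already split into zero-padded 
page*.tif files (as in the listing above) so that glob order matches page 
order.

```shell
#!/bin/sh
# Hypothetical helper: write one TIFF path per line, in page order.
# Assumes zero-padded names (page0001.tif, ...) so the sorted glob
# expansion is also the page order.
make_filelist() {
    dir=$1     # directory holding the single-page TIFFs
    list=$2    # list file to write
    set -- "$dir"/page*.tif
    # If nothing matched, the pattern is left unexpanded; bail out.
    [ -e "$1" ] || { echo "no page*.tif files in $dir" >&2; return 1; }
    printf '%s\n' "$@" > "$list"
}
```

Something like "make_filelist . filelist.txt" would then produce the list, 
which can be handed to tesseract as above.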

Tom
