Hi,

The file format does not make much difference, provided the image quality 
is good.

There are a number of threads on this list that discuss useful ImageMagick 
scripts for improving OCR accuracy.

You cannot give tesseract more/less time to increase accuracy, but you can 
have some of its job done by other programs.

Giving tesseract more resources will make it faster, not more accurate.

You can use TesseractExtractResult() to see where it is going wrong.
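
If you would rather not go through the C++ API, another way to see where 
tesseract is unsure is to ask for hOCR output (for example 
"tesseract input.png output hocr") and read the per-word x_wconf values. 
The sketch below is only an illustration of that idea: the function name 
low_confidence_words and the threshold of 80 are my own assumptions, and 
the regular expression assumes the attribute order tesseract normally 
writes (class before title).

```python
import re
from html import unescape

# Matches one hOCR word span; captures the x_wconf value and the inner markup.
# Assumes class comes before title inside the tag.
WORD_RE = re.compile(
    r"<span[^>]*class=['\"]ocrx_word['\"][^>]*"
    r"title=['\"][^'\"]*x_wconf\s+(\d+)[^'\"]*['\"][^>]*>(.*?)</span>",
    re.S,
)
TAG_RE = re.compile(r"<[^>]+>")  # strips nested tags such as <strong> or <em>


def low_confidence_words(hocr, threshold=80):
    """Return (word, confidence) pairs whose confidence is below threshold.

    Both the function name and the default threshold are hypothetical.
    """
    flagged = []
    for conf, inner in WORD_RE.findall(hocr):
        word = unescape(TAG_RE.sub("", inner)).strip()
        if int(conf) < threshold:
            flagged.append((word, int(conf)))
    return flagged


sample = (
    "<span class='ocrx_word' id='word_1_1' "
    "title='bbox 36 92 96 116; x_wconf 93'>Hello</span>\n"
    "<span class='ocrx_word' id='word_1_2' "
    "title='bbox 109 92 190 116; x_wconf 41'><strong>w0rld</strong></span>"
)
print(low_confidence_words(sample))
```

The flagged words are the ones worth checking by hand first.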

My advice on improving accuracy:
Find which characters are common problems and try to improve them through 
simple image processing (erode and dilate can make a big difference).
If the font is unusual, you can get more accurate results by training 
tesseract on it (this can be time-consuming).
Implement a few OCR-oriented ImageMagick scripts.
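
To make the erode/dilate suggestion concrete, here is a minimal 
pure-Python sketch of the two operations on a binary image with a 3x3 
square neighbourhood. It assumes 1 means ink and 0 means background, and 
it treats pixels outside the image as background; the helper names are 
mine. In practice you would use the equivalent routine from an image 
library rather than this loop.

```python
def _morph(img, combine):
    """Apply combine() to every pixel's 3x3 neighbourhood.

    img is a list of rows of 0/1 values (1 = ink). Pixels that fall
    outside the image are treated as background (0).
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neigh = [
                img[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0
                for ny in (y - 1, y, y + 1)
                for nx in (x - 1, x, x + 1)
            ]
            row.append(combine(neigh))
        out.append(row)
    return out


def dilate(img):
    """Thicken strokes: a pixel becomes ink if any neighbour is ink."""
    return _morph(img, max)


def erode(img):
    """Thin strokes: a pixel stays ink only if all neighbours are ink."""
    return _morph(img, min)


# A single ink pixel in the middle of a 5x5 image.
dot = [[0] * 5 for _ in range(5)]
dot[2][2] = 1

thick = dilate(dot)   # grows into a 3x3 block
back = erode(thick)   # shrinks back to the single pixel
for r in thick:
    print("".join("#" if v else "." for v in r))
```

Dilating thin or broken strokes can reconnect characters, and eroding 
heavy strokes can separate characters that touch, which is why these two 
operations often help with specific problem characters.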

I hope this helps.

On Sunday, October 4, 2015 at 1:35:51 AM UTC+8, [email protected] 
wrote:
>
> Let's assume we have 600dpi pictures of typed text, without much noise, 
> border or too much rotation. The results are good, but not great. So what 
> else could we do?
>
> I would like to...
>
>
>    1. ... know which file format is best. jpg? png? tiff? I didn't find 
>    documentation for that.
>    2. ... give tesseract more time for better results. Is there an option 
>    to do that? (I didn't find one)
>    3. ... give tesseract more computation power for better results. (I 
>    didn't find an option)
>    4. ... see where tesseract is not sure about things, so I can correct 
>    them.
>
> It would be great if someone could provide me some documentation (or even 
> opinion) about these questions.
>
