ScanBizCards actually uses Tesseract 3.01. I believe the fears
expressed by many on this forum about using "non official" versions of
Tesseract are misplaced. We switched from 2.04 to 3.00 as soon as 3.00
was made available and only benefited from it. We then switched to
3.01 quickly and again saw significant improvements, with only rare
cases where Tesseract 3.01 did worse than Tesseract 3.00.

Tesseract's image processing is acceptable in the case of images with
a fairly uniform background and text with high contrast relative to
the background, such as this example:
http://www.scanbizcards.com/benefitquest.jpg

Since we very often get images with shadows, bad lighting and
backgrounds with strong colors, ScanBizCards applies its own image
processing and calls Tesseract only with a black and white image. This
is what we produce for the above example (in this case the results are
identical, and good, without our preprocessing too):
http://www.scanbizcards.com/benefitquest-bw.jpg
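
For illustration only, and not ScanBizCards' actual code: one common
way to get a black and white image that survives shadows and uneven
lighting is a local-mean (adaptive) threshold, where each pixel is
compared with the average brightness of its neighborhood instead of
one global cutoff. The sketch below is plain Python on a grayscale
image stored as a list of rows. The function name `binarize` and the
`window` and `bias` parameters are assumptions made up for this
example.

```python
def binarize(gray, window=15, bias=0.85):
    """Local-mean threshold (illustrative sketch).

    gray: list of rows of ints in 0..255.
    Returns rows of 0 (black, likely text) or 255 (white, background).
    """
    h = len(gray)
    w = len(gray[0]) if h else 0

    # Integral image with an extra zero row and column, so the sum of
    # any rectangle can be read with four lookups.
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    half = window // 2
    out = []
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        row = []
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            area = (y1 - y0) * (x1 - x0)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Black only if clearly darker than the local mean
            # (pixel < bias * total / area, written without division).
            row.append(0 if gray[y][x] * area < total * bias else 255)
        out.append(row)
    return out
```

Because the comparison is local, a dark letter in a shadowed corner is
still darker than its own surroundings, while the shadowed background
itself stays white. A larger `window` tolerates thicker strokes, and a
lower `bias` makes the threshold stricter.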

Regarding performance (on iPhone 3GS):
- we spend 3 seconds on image processing for a 1,024 x 768 image
- OCR then takes 14 seconds, but that is not just Tesseract; it
  includes time spent in our own code

If you'd like, I can get you performance numbers for Tesseract alone,
and also for the iPhone 4. Just let me know.

Regarding layout analysis: it's available and it works. I don't know
if there is an API that returns the coordinates of words. We take the
sequence of boxes for each letter, then decide ourselves where there
should be a space (we don't trust Tesseract's space decisions much) or
a newline, and that gives us the coordinates of words. From my limited
comparison, Tesseract 3.01 layout analysis is about the same as
Tesseract 3.0.
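
A minimal sketch of that idea, not our production code: given letter
boxes in reading order, start a new word when the horizontal gap to
the previous letter is large compared with a typical letter width, and
start a new line when a box no longer overlaps the current word
vertically (or jumps back to the left). The tuple layout, the name
`group_words`, the `gap_ratio` value and the assumption that y grows
downward (top < bottom) are all hypothetical choices for this example.

```python
from statistics import median


def group_words(boxes, gap_ratio=0.4):
    """Group letter boxes into words and lines (illustrative sketch).

    boxes: list of (char, left, top, right, bottom) in reading order,
           with y growing downward so that top < bottom.
    Returns a list of lines; each line is a list of
    (text, left, top, right, bottom) word boxes.
    """
    if not boxes:
        return []

    # Use the median letter width as the scale for "a big gap".
    typical_width = median(right - left for _, left, _, right, _ in boxes)

    lines = [[]]
    word = None  # current word as [text, left, top, right, bottom]
    for ch, left, top, right, bottom in boxes:
        if word is not None:
            overlap = min(bottom, word[4]) - max(top, word[2])
            # New line: no vertical overlap, or the box jumps back left.
            new_line = overlap <= 0 or left < word[1]
            if new_line or left - word[3] > gap_ratio * typical_width:
                lines[-1].append(tuple(word))
                word = None
                if new_line:
                    lines.append([])
        if word is None:
            word = [ch, left, top, right, bottom]
        else:
            # Extend the word text and grow its bounding box.
            word = [word[0] + ch, min(word[1], left), min(word[2], top),
                    max(word[3], right), max(word[4], bottom)]

    lines[-1].append(tuple(word))
    return lines
```

Each returned word carries the union of its letter boxes, which is the
word position the original question asks about. The gap threshold
would need tuning per font and image size.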

Patrick

On Jul 20, 11:20 am, Cyril <[email protected]> wrote:
> Hi,
>
> I have some basic questions before starting a project of OCR
> recognition for the iPhone.
>
> I have seen the steps to cross-compile tesseract for iOS but have some
> questions on tesseract roadmap itself:
> 1/ should I start on tesseract 2.4 or 3.0? From my understanding 3.0
> is not yet stable but has a major refactoring ongoing plus several
> features (including document layout analysis)? The current 3.0
> "release" is quite far from the head of the trunk, which does not seem
> to compile on iOS, so I am wondering if there is any new release
> (3.01?) planned soon and compatible with iOS?
> 2/ is the accuracy and speed of the 3.0 release better or at least
> similar to the 2.4 release?
> 3/ is the document layout analysis already stable? A particular need I
> have is to be able to get the position of a particular recognized word
> in the document? Is this possible with tesseract?
> 4/ what are the typical preprocessing steps involved in OCR (b&w,
> threshold etc.)? Are they already performed by tesseract or do I need
> to perform them myself? If yes with which library is it usually done?
> Leptonica or OpenCV?
>
> I am also interested if you could give me pointers to code samples
> that demonstrate the API usage or tutorials on OCR concepts or on the
> APIs of tesseract. Any pointer to the state-of-the-art of OCR,
> including papers on useful preprocessing techniques impacting
> performance is also welcomed.
>
> I have seen that ScanBizCard is using tesseract 3.0. Do you have other
> examples of iPhone applications using Tesseract or concurrent
> solutions (commercial or open-source)?
>
> Thanks in advance for all your answers,
>
> Cyril
