Re: Using Tesseract from a C++ application.

MARTIN Pierre Sat, 10 Apr 2010 10:26:39 -0700

Hello Remi,

> I have a rather different point of view on what a universal C++ OCR
> engine should be.
I agree, but only for the core (which should only accept "internal" types that
we create, so that it is cross-platform).


> Python is great on "Intel" or more precisely on Google Apps Engine.
> Do you know about Windows Mobile 4, 5 or 6? Do you know about Nokia?
> Or AS/400? Beagleboard (which I don't know)?
> These kinds of platforms don't like STL, Boost, Qt, and I'm sure an
> OCR doesn't need these kinds of libraries. What for? For string classes,
> vectors, lists, maps...
Wrong, Qt runs everywhere. And, as I said in a previous mail, it is only there
to make porting easy; we can get rid of it once we have stable code.

> An OCR doesn't need very advanced string classes, vectors, lists and maps;
> we can find more portable implementations.
It does, as soon as we want the API to communicate easily with any other
language / app. Re-creating our own strings, vectors and types isn't a good
idea, since it would be the same pitfall Tess already fell into: forcing the
developer who "simply" wants to use a single API to include a huge load of
various headers.

> My point of view is super simple code that only does OCR. No I/O,
> no image decompression, nothing other than OCR.
> I never use smart pointers and never get a memory leak.
> Most platforms already have image loading libraries (jpeggroup,
> libtiff, gdiplus, ...)
That's not the point! Let's say all platforms support all file types, even
now; it doesn't mean that the API is the same on each of those platforms.
We don't want to maintain wrappers for each; that's the job of Nokia (formerly
Trolltech) with Qt, for example. Likewise, Windows, Linux and Mac all have
thread support, but each exposes it through a different API.

> Imagine the OCR API begins with OCR_
> int main(int argc, char **argv)
> {
>   // Parse arguments
>   OCR_Params oparams; // Contains language def, OCR params
>   // Or from a stream, a db, ... developer specific
>   LoadParametersFromFile(&oparams);
> 
>   OCR_Image image; // A simple 24 bit image format
>   // Decode JPEG, TIFF with platform library
>   LoadImageFromFile(&image);
> 
>   OCR_Result result;
>   if (OCR_DoOcr(&oparams, &image, &result)) {
>     // Use the result
>     for (int i = 0; i < result.WordCount; i++) {
>       OCR_Word word;
>       result.GetWord(i, &word);
>     }
>   }
>   OCR_CleanResult(&result); // Perhaps needed
> }
This is horrible, and so close to what Tess already is. Sequential / linear /
procedural programming is old now. It should be object-oriented, in namespaces
(let's say a Tess or OCR namespace, whatever).
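
To make the "objects in a namespace" idea concrete, here is a minimal sketch of
what such an interface could look like. Everything in it (the ocr namespace,
Image, Word, Result, Engine, recognize) is a hypothetical name chosen for
illustration, not an existing Tesseract API, and the recognition itself is a
placeholder.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace ocr {  // hypothetical namespace, not the real Tesseract API

// Plain 24-bit RGB image owned by the core; no platform types involved.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<unsigned char> rgb;  // width * height * 3 bytes
};

struct Word {
    std::string text;
    float confidence = 0.0f;
};

// Owns its words, so no separate "clean result" call is needed.
class Result {
public:
    void add(Word w) { words_.push_back(std::move(w)); }
    std::size_t wordCount() const { return words_.size(); }
    const Word& word(std::size_t i) const { return words_.at(i); }

private:
    std::vector<Word> words_;
};

// Parameters live in the object instead of in globals or statics.
class Engine {
public:
    explicit Engine(std::string language) : language_(std::move(language)) {}
    const std::string& language() const { return language_; }

    // Placeholder only: a real engine would analyse the pixels. Here an
    // empty image yields no words and any other image yields one dummy word.
    Result recognize(const Image& image) const {
        Result result;
        if (!image.rgb.empty()) {
            Word w;
            w.text = "placeholder";
            result.add(w);
        }
        return result;
    }

private:
    std::string language_;
};

}  // namespace ocr
```

A caller would construct an ocr::Engine once, hand it ocr::Image values, and
iterate over the returned ocr::Result with wordCount() and word(i).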

> No new, no delete, no memory leak
Wrong. None of my programs leak, and all of them use "new" / "delete".
Allocation / deallocation is not a synonym for leaking. The code you're showing
is not efficient; for example, all your objects are on the stack. We require
some data to be on the heap, and that's what "new" does (and I really don't
want to enter a technical discussion here). All your code actually works by
copy-constructing your return values, even if you don't see it, and that is a
big memory / processor overhead in such a program (OCR).
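
As a small illustration of the copy overhead in question, the sketch below
counts copy constructions of a buffer-holding type. CountedImage, sizeByValue
and sizeByReference are made-up names for this example; whether a given call
copies depends on how its parameters and return values are declared, so this
only shows the mechanism, not a measurement of any real API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical image type that counts how often it is copy-constructed.
struct CountedImage {
    static int copies;
    std::vector<unsigned char> pixels;

    explicit CountedImage(std::size_t bytes) : pixels(bytes) {}

    // Copying duplicates the whole pixel buffer, which is the costly part.
    CountedImage(const CountedImage& other) : pixels(other.pixels) { ++copies; }
};

int CountedImage::copies = 0;

// Taking the image by value copy-constructs it from the caller's object.
std::size_t sizeByValue(CountedImage image) { return image.pixels.size(); }

// Taking it by const reference only reads the caller's object, no copy.
std::size_t sizeByReference(const CountedImage& image) {
    return image.pixels.size();
}
```

Calling sizeByReference leaves CountedImage::copies untouched, while calling
sizeByValue with an existing object increments it once per call.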

> Very easy to call from Java, Python
It's not bound to the "style" we use, but rather to the wrappers we provide.
And the team I'm trying to gather for now is C++ (core) oriented.

> OCR_DoOCR is basic C++ code that compiles everywhere. #ifdef isn't
> needed in source code
Sure, and it's a horrible, ugly, inflexible function, which will require
globals, statics, etc. Also, making it compile everywhere has nothing to do
with the fact that it's C++ code.

> This is exactly my point of view of what a portable OCR should be.
Agreed. Now if, for example, you think your "LoadImageFromFile" works, tell me
how. Do we want to maintain such functions for all platforms, given that it's
already the job of Qt / GTK / ImageMagick / whatever Boost-like library?

But the problem is a bit more complex. Our core OCR could be plain C++,
accepting types we have control over (OCR_Image in your example). But shipping
a bare "core" out of the box is not an option, so there needs to be an
"in-between" layer which does the platform -> our type conversion. That's
obviously what is being developed in Tesseract (Leptonica), but it's horribly
handled, since Tess accepts a PIX* as input data, which is a Leptonica type.
Leptonica should be used in the in-between layer, not in the core. If we can
reach this point, the rest will be pretty easy, and I'm sure choosing Qt,
Boost, Leptonica, GTK, or whatever will then just be a matter of taste.
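
A rough sketch of that split, under stated assumptions: core::Image stands for
a type the core controls, toolkit::Bitmap is an invented stand-in for whatever
a platform library hands back (here assumed to be 32-bit BGRA rows with a
stride), and bridge::toCoreImage is the in-between layer. None of these names
come from Tesseract, Leptonica or Qt.

```cpp
#include <cstddef>
#include <vector>

namespace core {
// The only image type the core would accept: tightly packed 24-bit RGB.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<unsigned char> rgb;
};
}  // namespace core

namespace toolkit {
// Hypothetical platform bitmap: 32-bit BGRA pixels, rows padded to a stride.
struct Bitmap {
    int width;
    int height;
    int strideBytes;
    const unsigned char* data;
};
}  // namespace toolkit

namespace bridge {
// In-between layer: converts the platform type into the core type, so the
// core never has to include a toolkit header.
core::Image toCoreImage(const toolkit::Bitmap& src) {
    core::Image out;
    out.width = src.width;
    out.height = src.height;
    out.rgb.reserve(static_cast<std::size_t>(src.width) * src.height * 3);
    for (int y = 0; y < src.height; ++y) {
        const unsigned char* row =
            src.data + static_cast<std::size_t>(y) * src.strideBytes;
        for (int x = 0; x < src.width; ++x) {
            const unsigned char* px = row + x * 4;
            out.rgb.push_back(px[2]);  // red
            out.rgb.push_back(px[1]);  // green
            out.rgb.push_back(px[0]);  // blue (alpha in px[3] is dropped)
        }
    }
    return out;
}
}  // namespace bridge
```

Swapping the toolkit then means writing another small function in the bridge
layer, while the core type and anything built on it stay unchanged.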

As you probably know, an OCR core involves a lot of different things, which may
include, for example, thread support. At that point, a platform-abstraction
framework may become a necessity for the core as well.

A): Qt-ish library = Platform (And in some cases hardware) abstraction.
B): OCR = Something which needs to run on any platform.
C): A+B.
D): B+C = surrounding layer + core, for example to make a standalone program
like the "tesseract" binary.

Cheers,
Pierre.


