Re: Using Tesseract from a C++ application.

rthomas Sat, 10 Apr 2010 23:40:17 -0700


On Apr 10, 7:22 pm, MARTIN Pierre <[email protected]> wrote:
> Hello Remi,
>
> > I have a rather different point of view of what a universal C++
> > OCR engine should be.
>
> I agree, but only for the core (which should only accept "internal" types we
> create, so that it's cross-platform).
>
> > Python is great on "Intel" or more precisely on Google App Engine.
> > Do you know about Windows Mobile 4, 5 or 6? Do you know about Nokia?
> > Or AS/400? BeagleBoard (which I don't know)?
> > These kinds of platforms don't like STL, Boost or Qt, and I'm sure an
> > OCR doesn't need these kinds of libraries. What for? For string classes,
> > vectors, lists, maps...
>
> Wrong, Qt is running everywhere. And that's, as i said in a previous mail, 
> only for being able to easily port, we can get rid of it once we have stable 
> code.
>
> > An OCR doesn't need very advanced string classes, vectors, lists and maps;
> > we can find more portable implementations.
>
> It does, as soon as we want the API to communicate easily with any other
> language / app. Re-creating our own strings, vectors and types isn't a good
> idea, since it would be the same pitfall Tess already fell into: forcing the
> developer who "simply" wants to use a single API to include a huge load of
> various headers.
>
> > My point of view is very simple code that only does OCR. No I/O,
> > no image decompression, nothing other than OCR.
> > I never use smart pointers and never get a memory leak.
> > Most platforms already have image loading libraries (jpeggroup,
> > libtiff, gdiplus, ...)
>
> That's not the point! Even if all platforms supported all file types right
> now, that wouldn't mean the API is the same on each of those platforms. We
> don't want to maintain wrappers for each; that's the job of Nokia (formerly
> Trolltech) with Qt, for example. Windows, Linux and Mac all have thread
> support, but each exposes it through a different API.
>
>
>
> > Imagine OCR API begin with OCR_
> > int main(int argc, char **argv)
> > {
> >   // Parse arguments
> >   OCR_Params oparams; // Contains language def, OCR params
> >   // Or from a stream, a db, ... developer specific
> >   LoadParametersFromFile(&oparams);
>
> >   OCR_Image image; // A simple 24 bit image format
> >   // Decode JPEG, TIFF with platform library
> >   LoadImageFromFile(&image);
>
> >   OCR_Result result;
> >   if (OCR_DoOcr(&oparams, &image, &result)) {
> >     // Use the result
> >     for (int i=0; i<result.WordCount; i++)  {
> >         OCR_Word word;
> >         result.GetWord(i, &word);
> >     }
> >   }
> >   OCR_CleanResult(&result); // Perhaps needed
> > }
>
> This is horrible, and so close to what Tess already is. Sequential / linear /
> procedural programming is old now. It should be object-oriented, in
> namespaces (let's say a Tess or OCR namespace, whatever).
>
> > No new, no delete, no memory leak
>
> Wrong. None of my programs leak, and all of them use "new" / "delete". Alloc /
> dealloc is not a synonym for leak. The code you're showing is not efficient;
> for example, all your objects are on the stack. We require some data to be on
> the heap, and that's what "new" does (and I really don't want to enter a
> technical discussion here). All your code actually works by copy-constructing
> your return values, even if you don't see it, and that is a big memory /
> processor overhead in such a program (OCR).
>
> > Very easy to call from Java, Python
>
> It's not bound to the "style" we use, more to the wrappers we provide. And 
> the team i'm trying to gather for now is C++ (Core) oriented.
>
> > OCR_DoOCR is basic C++ code that compiles everywhere. #ifdef isn't
> > needed in the source code
>
> Sure, and it's a horrible, ugly, inflexible function, which will require
> globals, statics, etc. Also, making it compile everywhere has nothing to
> do with the fact that it's C++ code.
>
> > This is exactly my point of view of what a portable OCR should be.
>
> Agreed. Now if, for example, you think your "LoadImageFromFile" works, tell me
> how. Do we want to maintain such functions for all platforms, given that this
> is already the job of Qt / GTK / ImageMagick / whatever boost-like library?
>
> But the problem is a bit more complex. Our core OCR could be plain C++,
> accepting only types we have control over (OCR_Image in your example). But
> shipping a bare "core" out of the box is not an option, so there needs to be
> an "in-between" layer which does the platform -> our type conversion. That's
> obviously what is in development with Tesseract (Leptonica), but it's
> horribly handled, since Tess accepts a PIX* as input data, which is a
> Leptonica type. Leptonica should be used in the in-between layer, not in the
> core. If we can reach this point, the rest will be pretty easy, and I'm sure
> choosing Qt, boost, leptonica, GTK, or whatever will then just be a matter
> of taste.
>
> As you probably know, an OCR core involves a lot of different things, which
> may include, for example, thread support. At that point, a
> platform-abstraction framework may also become a need for the core.
>
> A): Qt-ish library = Platform (And in some cases hardware) abstraction.
> B): OCR = Something which needs to run on any platform.
> C): A+B.
> D): B+C = surrounding layer + core, for example to make a standalone program
> like the "tesseract" binary.
>


Pierre,

Qt is a great library, but it's not a good strategic choice:
- You need a license for commercial products (and the price is high)
- It doesn't target all C++ platforms
- You can't guarantee it will still exist and be maintained in 20 years
- Some other technical issues, listed below

Let me be less technical in explaining my point of view:
- If you want to do OCR, this means you have images, and you certainly
have the library to open them. Why include JPEG, TIFF or any other
decoding functions? I already have this, and it's simple and fast to
convert my image into a raw 24-bit image array.
- Some platforms don't like STL, some others don't like Boost. Because we
only need a very basic wstring (Unicode), vector and map, with no I/O and
no threads, we can afford to avoid STL and Boost and be truly compatible
with all platforms.
- The future is multicore processors; how do we take advantage of them?
With an OCR it's simple: open 8 images and run 8 OCRs at a time.
Splitting the OCR process for one image would be too complex and useless,
because on today's processors OCR is fast enough for one image.
- I don't want to reinvent an OCR; I want to reverse engineer
Tesseract and rewrite only the OCR part.
- A "pure" C++ library can very easily be transposed to Java, C# or
a language of the same family. This is important for the future: the next
Windows Mobile 7 machines can't run C++ code, Windows Azure prefers managed
code, Android... I think the future is managed OSes.
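
To make the raw-image and multicore points concrete, here is a minimal sketch under stated assumptions. None of these names come from Tesseract: RawImage24, FromGray8, FakeRecognize and RecognizeAll are hypothetical, and FakeRecognize only counts dark pixels as a stand-in for a real OCR call. std::vector and std::thread are used on the caller side purely to keep the example short; a core aimed at platforms without the STL would swap in its own buffer type.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical plain image type: tightly packed 24-bit RGB, row-major.
struct RawImage24 {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // width * height * 3 bytes
};

// Hypothetical conversion from an 8-bit grayscale buffer, standing in for
// whatever the platform's own image library hands back after decoding.
RawImage24 FromGray8(const std::uint8_t* gray, int width, int height) {
    RawImage24 img;
    img.width = width;
    img.height = height;
    const std::size_t count = static_cast<std::size_t>(width) * height;
    img.pixels.resize(count * 3);
    for (std::size_t i = 0; i < count; ++i) {
        // Replicate the gray value into the R, G and B channels.
        img.pixels[3 * i] = gray[i];
        img.pixels[3 * i + 1] = gray[i];
        img.pixels[3 * i + 2] = gray[i];
    }
    return img;
}

// Placeholder for the OCR core: counts dark pixels instead of recognizing
// text. It touches no shared state, so concurrent calls are safe.
std::size_t FakeRecognize(const RawImage24& img) {
    std::size_t dark = 0;
    for (std::size_t i = 0; i + 2 < img.pixels.size(); i += 3) {
        const int sum = img.pixels[i] + img.pixels[i + 1] + img.pixels[i + 2];
        if (sum < 3 * 128) ++dark;
    }
    return dark;
}

// One thread per image; each thread writes only its own result slot,
// so no locking is needed.
std::vector<std::size_t> RecognizeAll(const std::vector<RawImage24>& images) {
    std::vector<std::size_t> results(images.size(), 0);
    std::vector<std::thread> workers;
    workers.reserve(images.size());
    for (std::size_t i = 0; i < images.size(); ++i) {
        workers.emplace_back([&images, &results, i] {
            results[i] = FakeRecognize(images[i]);
        });
    }
    for (std::thread& t : workers) t.join();
    return results;
}
```

The parallelism lives entirely in the calling application: the recognition function itself stays single-threaded and free of platform threading APIs, which is the split argued for above.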

Remi

