Hello Remi,
> I share a rather other point of view of what should be an universal C++
> OCR engine.
I agree, but only for the core (which should only accept "internal" types we
create, so that it stays cross-platform).
> Python is great on "Intel" or more precisely on Google Apps Engine.
> Do you known about Windows Mobile 4, 5 or 6? Do you know about Nokia?
> Or AS/400 ? Beagleboard (that I don't know) ?
> These kind of platforms doesn't like STL, BOOST, Qt and I'm sure an
> OCR doesn't need these kind of library. What for ? For string class,
> vectors, list, map...
Wrong, Qt runs everywhere. And, as I said in a previous mail, it's only there
to make porting easy; we can get rid of it once we have stable code.
> An OCR doesn't need very advanced string class, vectors, list and map
> we can find more portable implementation.
It does, as soon as we want the API to communicate easily with any other
language / app. Re-creating our own strings, vectors and types isn't a good
idea, since it is the same pitfall Tess already fell into: forcing the
developer who "simply" wants to use a single API to include a huge load of
various headers.
> My point of view is a super very simple code, that only do OCR. No I/
> O, no image decompression, nothing else than OCR.
> I never use smart pointers and never get a memory leak.
> Most platform already have image loading libraries (jpeggroup,
> libtiff, gdiplus, ...)
That's not the point! Let's say all platforms support all file types, even
now: that doesn't mean the API is the same on each of those platforms. We
don't want to maintain wrappers for each one; that's the job of Nokia (formerly
Trolltech) with Qt, for example. Likewise, Windows, Linux and Mac all have
thread support, but each exposes it through a different API.
> Imagine OCR API begin with OCR_
> int main(int argc, char **argv)
> {
> // Parse arguments
> OCR_Params oparams; // Contain language def, OCR params
> LoadParametersFromFile(&oparams); // Or from a stream, a db, ... developer specific
>
> OCR_Image image; // A simple 24 bit image format
> LoadImageFromFile(&image); // Decode JPEG, TIFF with platform library
>
> OCR_Result result;
> if (OCR_DoOcr(&oparams, &image, &result)) {
> // Use the result
> for (int i=0; i<result.WordCount; i++) {
> OCR_Word word;
> result.GetWord(i, &word);
> }
> }
> OCR_CleanResult(&result); // Perhaps needed
> }
This is horrible, and very close to what Tess already is. Sequential / linear /
procedural programming is old now. It should be object-oriented, in namespaces
(let's say a Tess or OCR namespace, whatever).
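To make the contrast concrete, here is a rough sketch of the same call sequence
written as an object-oriented, namespaced API. Every name in it (the "ocr"
namespace, Image, Word, Result, Engine, recognize) is hypothetical and only
illustrates the shape; the recognition itself is a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

namespace ocr {  // hypothetical namespace; "tess" would do just as well

// Core-owned image type: 24-bit RGB, row-major, no platform types involved.
struct Image {
    int width;
    int height;
    std::vector<std::uint8_t> rgb;
};

struct Word {
    std::string text;
};

// Owns its words; nothing to clean up by hand.
class Result {
public:
    void addWord(Word w) { words_.push_back(std::move(w)); }
    std::size_t wordCount() const { return words_.size(); }
    const Word& word(std::size_t i) const {
        assert(i < words_.size());
        return words_[i];
    }

private:
    std::vector<Word> words_;
};

// Parameters live in the engine object, not in globals or statics.
class Engine {
public:
    explicit Engine(std::string language) : language_(std::move(language)) {}

    const std::string& language() const { return language_; }

    // Placeholder only: a real engine would analyse the pixels here.
    Result recognize(const Image& image) const {
        Result result;
        if (image.width > 0 && image.height > 0) {
            result.addWord(Word{"placeholder"});
        }
        return result;
    }

private:
    std::string language_;
};

}  // namespace ocr
```

A caller would build an ocr::Engine once, then call recognize() on as many
images as needed; two engines with different languages can coexist because the
state is per object.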
> No new, no delete, no memory leak
Wrong. None of my programs leak, and all of them use "new" / "delete". Alloc /
dealloc is not a synonym for leak. The code you're showing is not efficient:
for example, all your objects are on the stack. We need some data to be on the
heap, and that's what "new" does (and I really don't want to get into a
technical discussion here). All your code actually works by copy-constructing
your return values, even if you don't see it, and that is a big memory /
processor overhead in such a program (OCR).
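A minimal sketch of what "new / delete without leaks" can look like: the
allocation is tied to an object's lifetime, and the buffer is passed by
reference so the pixel data is never copied. PixelBuffer, liveCount and
countNonZero are made-up names for illustration, not part of any existing
library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical buffer: allocates on the heap in the constructor and
// releases in the destructor, so callers never write "delete" themselves.
class PixelBuffer {
public:
    explicit PixelBuffer(std::size_t size)
        : size_(size), data_(new std::uint8_t[size]()) {
        ++liveCount();
    }
    ~PixelBuffer() {
        delete[] data_;
        --liveCount();
    }

    // Copying is disabled so an accidental deep copy cannot happen silently.
    PixelBuffer(const PixelBuffer&) = delete;
    PixelBuffer& operator=(const PixelBuffer&) = delete;

    std::size_t size() const { return size_; }

    std::uint8_t& at(std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    const std::uint8_t& at(std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    // Number of buffers currently alive; only here to make the point testable.
    static int& liveCount() {
        static int count = 0;
        return count;
    }

private:
    std::size_t size_;
    std::uint8_t* data_;
};

// Takes the buffer by const reference: no pixel data is copied.
std::size_t countNonZero(const PixelBuffer& buffer) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < buffer.size(); ++i) {
        if (buffer.at(i) != 0) ++n;
    }
    return n;
}
```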
> Very easy to call from Java, Phyton
It's not bound to the "style" we use, but rather to the wrappers we provide.
And the team I'm trying to gather for now is C++ (core) oriented.
> OCR_DoOCR is basic C++ code that compile everywhere. #ifdef isn't
> needed in source code
Sure, and it's a horrible, ugly, inflexible function, which will require
globals, statics, etc. Also, making it compile everywhere has nothing to do
with the fact that it's C++ code.
> This is exactly my point point of view of what a portable OCR should be.
Agreed. Now if, for example, you think your "LoadImageFromFile" works, tell me
how. Do we want to maintain such functions for all platforms, given that this
is already the job of Qt / GTK / ImageMagick / whatever boost-like library?
But the problem is a bit more complex. Our core OCR could be plain C++,
accepting only types we have control over (OCR_Image in your example). But
shipping a bare "core" out of the box is not an option, so there needs to be an
"in-between" layer which does the platform -> our type conversion. That's
obviously what is being developed in Tesseract (with Leptonica), but it's
horribly handled, since Tess accepts a PIX* as input data, which is a Leptonica
type. Leptonica should be used in the in-between layer, not in the core. If we
can reach this point, the rest will be pretty easy, and I'm sure choosing Qt,
boost, Leptonica, GTK, or whatever will then just be a matter of taste.
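As an illustration of that in-between layer, here is a small sketch of a
"platform type -> our type" conversion. The core type ocr::Image and the
PlatformBitmap stand-in (32-bit BGRA rows with padding, roughly what a GUI
toolkit might hand back) are both assumptions made up for this example; a real
adapter would wrap a QImage, a Leptonica PIX, or whatever the chosen library
provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace ocr {
// Core-owned image type (hypothetical): tightly packed 24-bit RGB.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> rgb;
};
}  // namespace ocr

namespace adapter {

// Stand-in for a toolkit bitmap (hypothetical): 32-bit BGRA, each row
// occupying strideBytes bytes, which may include padding.
struct PlatformBitmap {
    int width;
    int height;
    int strideBytes;
    std::vector<std::uint8_t> bgra;
};

// The only place that knows about the platform layout. The core never
// sees PlatformBitmap, only ocr::Image.
ocr::Image toCoreImage(const PlatformBitmap& src) {
    assert(src.width >= 0 && src.height >= 0);
    assert(src.strideBytes >= src.width * 4);
    assert(src.bgra.size() >=
           static_cast<std::size_t>(src.strideBytes) * src.height);

    ocr::Image out;
    out.width = src.width;
    out.height = src.height;
    out.rgb.reserve(static_cast<std::size_t>(src.width) * src.height * 3);

    for (int y = 0; y < src.height; ++y) {
        const std::uint8_t* row =
            src.bgra.data() + static_cast<std::size_t>(y) * src.strideBytes;
        for (int x = 0; x < src.width; ++x) {
            const std::uint8_t* px = row + x * 4;
            out.rgb.push_back(px[2]);  // R
            out.rgb.push_back(px[1]);  // G
            out.rgb.push_back(px[0]);  // B (alpha in px[3] is dropped)
        }
    }
    return out;
}

}  // namespace adapter
```

Swapping Qt for Leptonica or GTK would then mean writing another adapter of
this kind, without touching the core.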
As you probably know, an OCR core involves a lot of different things, which may
include, for example, thread support. At that point, a platform-abstraction
framework may become a necessity for the core as well.
A): Qt-ish library = Platform (And in some cases hardware) abstraction.
B): OCR = Something which needs to run on any platform.
C): A+B.
D): B+C = surrounding layer + core, for example to make a standalone program
like the "tesseract" binary.
Cheers,
Pierre.