On Apr 10, 7:22 pm, MARTIN Pierre <[email protected]> wrote:
> Hello Remi,
>
> > I have a rather different view of what a universal C++ OCR engine
> > should be.
>
> I agree, but only for the core (which should only accept "internal" types we
> create, so it's cross-platform).
>
> > Python is great on "Intel" or more precisely on Google App Engine.
> > Do you know about Windows Mobile 4, 5 or 6? Do you know about Nokia?
> > Or AS/400? Beagleboard (that I don't know)?
> > These kinds of platforms don't like STL, BOOST, Qt and I'm sure an
> > OCR doesn't need these kinds of libraries. What for? For string class,
> > vectors, list, map...
>
> Wrong, Qt is running everywhere. And that's, as I said in a previous mail,
> only for being able to port easily; we can get rid of it once we have stable
> code.
>
> > An OCR doesn't need very advanced string class, vectors, list and map;
> > we can find more portable implementations.
>
> It does, as soon as we want the API to communicate easily with any other
> language / app. Re-creating our own strings, vectors and types isn't a good
> idea, since it will be the same pitfall Tess already fell into: forcing the
> developer who "simply" wants to use a single API to include a huge load of
> various headers.
>
> > My point of view is a super very simple code, that only does OCR. No
> > I/O, no image decompression, nothing else than OCR.
> > I never use smart pointers and never get a memory leak.
> > Most platforms already have image loading libraries (jpeggroup,
> > libtiff, gdiplus, ...)
>
> That's not the point! Let's say all platforms have support for all file
> types; even then, it doesn't mean that the API is the same on each of those
> platforms. We don't want to maintain wrappers for each, it's the job of Nokia
> (formerly Trolltech) with Qt, for example. Windows, Linux, Mac have thread
> support. But they are all encapsulated in different APIs, for example.
>
> > Imagine the OCR API begins with OCR_:
> >
> > int main(int argc, char **argv)
> > {
> >   // Parse arguments
> >   OCR_Params oparams; // Contains language def, OCR params
> >   LoadParametersFromFile(&oparams); // Or from a stream, a db, ... (developer specific)
> >
> >   OCR_Image image; // A simple 24-bit image format
> >   LoadImageFromFile(&image); // Decode JPEG, TIFF with platform library
> >
> >   OCR_Result result;
> >   if (OCR_DoOcr(&oparams, &image, &result)) {
> >     // Use the result
> >     for (int i = 0; i < result.WordCount; i++) {
> >       OCR_Word word;
> >       result.GetWord(i, &word);
> >     }
> >   }
> >   OCR_CleanResult(&result); // Perhaps needed
> > }
>
> This is horrible, and so close to what Tess already is. Sequential / linear /
> procedural programming is old now. It should be object-oriented, in namespaces
> (then let's say Tess or OCR namespace, whatever).
>
> > No new, no delete, no memory leak
>
> Wrong. None of my programs leak, all of them use "new" / "delete". Alloc /
> dealloc is not a synonym of leak. The code you're showing is not efficient, and
> for example all your objects are on the stack. We require some data to be on
> the heap, and that's what "new" does (and I really don't want to enter a
> technical discussion here). All your code actually works by copy-constructing
> your return values, even if you don't see it, and that is a big memory /
> processor overhead in such a program (OCR).
>
> > Very easy to call from Java, Python
>
> It's not bound to the "style" we use, more to the wrappers we provide. And
> the team I'm trying to gather for now is C++ (core) oriented.
>
> > OCR_DoOCR is basic C++ code that compiles everywhere. #ifdef isn't
> > needed in source code
>
> Sure, and it's a horrible, ugly, inflexible function, which will require
> globals, statics, etc. Also, making it compile everywhere has nothing to
> do with the fact it's C++ code.
>
> > This is exactly my point of view of what a portable OCR should be.
>
> Agreed.
> Now if for example you think your "LoadImageFromFile" works, tell me
> how. Do we want to maintain such functions for all platforms, given that it's
> already the job of Qt / GTK / ImageMagick / whatever boost-like library?
>
> But the problem is a bit more complex. Our core OCR could be plain C++,
> accepting types we have control over (OCR_Image in your example). But shipping
> a "core" out of the box is not an option, so there needs to be an "in-between"
> layer which does the platform -> our type conversion. That's what is obviously
> in development with Tesseract (Leptonica) but it's horribly handled,
> since Tess accepts a PIX* as input data, which is a Leptonica type. Leptonica
> should be used in the in-between layer, not in the core. If we can reach this
> point, the rest will be pretty easy, and I'm sure choosing Qt, boost,
> Leptonica, GTK, or whatever, will then just be a matter of taste.
>
> As you probably know, an OCR core involves a lot of different things; this
> may include, for example, thread support. At this point, using a
> platform-abstraction framework may also become a need for the core.
>
> A): Qt-ish library = Platform (and in some cases hardware) abstraction.
> B): OCR = Something which needs to run on any platform.
> C): A+B.
> D): B+C = surrounding layer + core, for example to make a standalone program
> like the "tesseract" binary.
>
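To make the layering in the quoted message concrete (a plain C++ core that only sees its own image type, plus an in-between layer that converts platform data into that type), here is a minimal sketch. Everything in it is hypothetical: `ocr::Image`, `ocr::is_well_formed`, `adapter::PlatformBitmap` and `adapter::to_core_image` are illustration names, not Tesseract, Leptonica or Qt APIs, and the "core" function is only a placeholder check rather than real recognition. `std::vector` is used for brevity; a core that avoids STL would hold a raw buffer instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace ocr {

// Core-owned image type (hypothetical): tightly packed 24-bit RGB, row-major.
// It plays the role of "OCR_Image" from the quoted example.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> rgb;  // exactly width * height * 3 bytes
};

// Stand-in for a core entry point: it only ever receives ocr::Image,
// never a platform or third-party type. Here it just validates the buffer.
inline bool is_well_formed(const Image& img) {
    return img.width > 0 && img.height > 0 &&
           img.rgb.size() ==
               static_cast<std::size_t>(img.width) * img.height * 3;
}

}  // namespace ocr

namespace adapter {

// Hypothetical platform bitmap: 32-bit BGRA pixels with padded rows.
// A real in-between layer would wrap a QImage, a Leptonica PIX, etc.
struct PlatformBitmap {
    int width;
    int height;
    int stride;                // bytes per row, may include padding
    const std::uint8_t* bgra;  // not owned
};

// The in-between layer: platform type -> core type.
inline ocr::Image to_core_image(const PlatformBitmap& src) {
    ocr::Image out;
    out.width = src.width;
    out.height = src.height;
    out.rgb.reserve(static_cast<std::size_t>(src.width) * src.height * 3);
    for (int y = 0; y < src.height; ++y) {
        const std::uint8_t* row =
            src.bgra + static_cast<std::size_t>(y) * src.stride;
        for (int x = 0; x < src.width; ++x) {
            const std::uint8_t* px = row + x * 4;
            out.rgb.push_back(px[2]);  // R
            out.rgb.push_back(px[1]);  // G
            out.rgb.push_back(px[0]);  // B (alpha is dropped)
        }
    }
    return out;
}

}  // namespace adapter
```

With this split, swapping Qt for GTK or Leptonica only touches the `adapter` namespace; the `ocr` namespace compiles without any platform header.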
Pierre,

Qt is a great library, but it's not a good strategic choice:
- You need a license for commercial products (and the price is high)
- It doesn't target all C++ platforms
- You can't guarantee it will still exist and be maintained in 20 years
- Some other technical issues, listed below

Let me be less technical to explain my point of view:
- If you want to do OCR, it means you have images, and you certainly have
  the lib to open them. Why include JPEG, TIFF or any other decoding
  functions? I already have this, and it's simple and fast to convert my
  image into a RAW 24-bit image array.
- Some platforms don't like STL, some others BOOST. Because we only need a
  very basic wstring (Unicode), vector and map, with no I/O and no threads,
  we can provide our own to avoid STL and BOOST and really be compatible
  with all platforms.
- The future is multicore processors; how do we take advantage of them?
  With an OCR it's simple: open 8 images and run 8 OCRs at a time. Splitting
  the OCR process for one image would be too complex and useless, because on
  today's processors OCR is fast enough for one image.
- I don't want to reinvent an OCR; I want to reverse engineer Tesseract and
  rewrite only the OCR part.
- A "pure" C++ library can very easily be transposed to Java, C# or a
  language of the same family. That's important for the future: the next
  Windows Mobile 7 machines can't run C++ code. Windows Azure prefers
  managed code. Android... I think the future is managed OSes.

Remi
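As an illustration of the point about offering very basic containers instead of depending on STL or BOOST, a growable array can be written with nothing but the C allocation functions. This is only a sketch under stated assumptions: `OCR_Array` and its methods are hypothetical names (not Tesseract code), it is restricted to plain-old-data element types such as `wchar_t` or pixel bytes, and it sticks to pre-C++11 constructs on the assumption that the target compilers are old.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical minimal growable array for plain-old-data elements only
// (elements are moved with realloc, so no constructors/destructors run).
template <typename T>
class OCR_Array {
public:
    OCR_Array() : data_(0), size_(0), capacity_(0) {}
    ~OCR_Array() { std::free(data_); }

    // Appends one element; returns false if memory could not be obtained,
    // in which case the existing contents are left untouched.
    bool Append(const T& value) {
        if (size_ == capacity_) {
            int new_capacity = (capacity_ == 0) ? 8 : capacity_ * 2;
            void* p = std::realloc(data_, sizeof(T) * new_capacity);
            if (p == 0) return false;
            data_ = static_cast<T*>(p);
            capacity_ = new_capacity;
        }
        data_[size_++] = value;
        return true;
    }

    int Size() const { return size_; }

    // No bounds checking: the caller must pass 0 <= i < Size().
    const T& At(int i) const { return data_[i]; }

private:
    // Copying is disabled the pre-C++11 way (declared, never defined).
    OCR_Array(const OCR_Array&);
    OCR_Array& operator=(const OCR_Array&);

    T* data_;
    int size_;
    int capacity_;
};
```

The trade-off is the one raised earlier in the thread: the library stays free of external headers, but every caller then has to convert between this type and whatever string or vector type its own platform uses.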

