I am curious to know how long it would take "to reinvent an OCR", as well as "to reverse engineer Tesseract and rewrite *only the OCR* part". I presume that reverse engineering Tesseract and rewriting *only the OCR* part would take less time than reinventing a new OCR.

With regards,
-sriranga (77 yrs old)

On Sun, Apr 11, 2010 at 12:10 PM, rthomas <[email protected]> wrote:

> On Apr 10, 7:22 pm, MARTIN Pierre <[email protected]> wrote:
> > Hello Remi,
> >
> > > I share a rather other point of view of what should be an universal
> > > C++ OCR engine.
> >
> > i agree, but only for the core (Which should only accept "internal"
> > types we create, so it's cross-platform).
> >
> > > Python is great on "Intel" or more precisely on Google Apps Engine.
> > > Do you known about Windows Mobile 4, 5 or 6? Do you know about Nokia?
> > > Or AS/400 ? Beagleboard (that I don't know) ?
> > > These kind of platforms doesn't like STL, BOOST, Qt and I'm sure an
> > > OCR doesn't need these kind of library. What for ? For string class,
> > > vectors, list, map...
> >
> > Wrong, Qt is running everywhere. And that's, as i said in a previous
> > mail, only for being able to easily port, we can get rid of it once we
> > have stable code.
> >
> > > An OCR doesn't need very advanced string class, vectors, list and map
> > > we can find more portable implementation.
> >
> > It does, as soon as we want the API to be "communicant" easily with any
> > other language / APP. Re-creating our strings, vectors and type isnt a
> > good idea, since it will be the same pitfall Tess already felt into:
> > forcing the developper who "simply" wants to use a single API to include
> > a huge load of various headers.
> >
> > > My point of view is a super very simple code, that only do OCR. No
> > > I/O, no image decompression, nothing else than OCR.
> > > I never use smart pointers and never get a memory leak.
> > > Most platform already have image loading libraries (jpeggroup,
> > > libtiff, gdiplus, ...)
> >
> > That's not the point! Let's say all platforms have support for all file
> > types, even now, it doesn't mean that the API is the same on each of
> > those platforms. We don't want to maintain wrappers for each, it's the
> > job of Nokia (Former Trolltech) with Qt, for example. Windows, Linux,
> > Mac have thread support. But they all are encapsulated in different
> > APIs, for example.
> >
> > > Imagine OCR API begin with OCR_
> > > int main(int argc, char **argv)
> > > {
> > >   // Parse aguments
> > >   OCR_Params oparams; // Contain language def, OCR params
> > >   LoadParametersFromFile(&oparams); // Or from a stream, a db, ... developer specific
> > >
> > >   OCR_Image image; // A simple 24 bit image format
> > >   LoadImageFromFile(&image); // Decode JPEG, TIFF with platform library
> > >
> > >   OCR_Result result;
> > >   if (OCR_DoOcr(&oparams, &image, &result) {
> > >     // Use the result
> > >     for (int i=0; i<result.WordCount; i++) {
> > >       OCR_Word word;
> > >       result.GetWord(i, &word);
> > >     }
> > >   }
> > >   OCR_CleanResult(&result); // Perhaps needed
> > > }
> >
> > This is horrible, and so close to what Tess already is. Sequencial /
> > linear / process programming is old now. It should be object, in
> > namespaces (Then let's say Tess or OCR namespace, whatever).
> >
> > > No new, no delete, no memory leak
> >
> > Wrong. None of my programs leak, all of them use "new" / "delete". Alloc
> > / dealloc is not synonym of leak. The code you're showing is not
> > efficient, and for example all your objects are on stack. We require
> > some data to be in heap, and that's what "new" does (And i really don't
> > want to enter technical discussion here). All your code actually works
> > with copy constructing your return values, even if you don't see it, and
> > that is a big memory / processor overhead on such program (OCR).
> >
> > > Very easy to call from Java, Phyton
> >
> > It's not bound to the "style" we use, more to the wrappers we provide.
> > And the team i'm trying to gather for now is C++ (Core) oriented.
> >
> > > OCR_DoOCR is basic C++ code that compile everywhere. #ifdef isn't
> > > needed in source code
> >
> > Sure, and it's an horrible, ugly, not flexible function, which will
> > require globals, static, etc. Also making it compile from everywhere has
> > nothing to do with the fact it's C++ code.
> >
> > > This is exactly my point point of view of what a portable OCR should
> > > be.
> >
> > Agreed. Now if for example you think your "LoadImageFromFile" works,
> > tell me how. Do we want to maintain such functions for all platforms,
> > given that it's already Qt / GTK / ImageMagik / whatever boost-like
> > library's work?
> >
> > But the problem is a bit more complex. Our core OCR could be plain C++,
> > and accepting types we have control on (OCR_Image in your example). But
> > shipping a "core" out of the box is not an option, so there needs to be
> > a "in-between" layer wich does the platform -> our type conversion.
> > That's what is obvioulsy being in developpement with Tesseract
> > (Leptonica) but it's horribly handled, since Tess accepts a PIX* as
> > input data, which is a Leptonica type. Leptonica should be used in the
> > in-between layer, not in the core. If we can reach this point, the rest
> > will be pretty easy, and i'm sure choosing Qt, boost, leptonica, GTK, or
> > whatever, then will just be a matter of taste.
> >
> > As you probably know, an OCR core involves a lot of different things,
> > this may include for example thread support. At this point, using a
> > platform-abstraction framework may become also a need for the core.
> >
> > A): Qt-ish library = Platform (And in some cases hardware) abstraction.
> > B): OCR = Something which needs to run on any platform.
> > C): A+B.
> > D): B+C = surounding layer + core, for example to make a standalone
> > program like "tesseract" binary.
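
To make the layering described above concrete (a core that only accepts types it controls, plus an in-between layer that converts platform types), here is a minimal C++ sketch. Every name in it is hypothetical and not taken from Tesseract, Leptonica, or Qt: `ocr::Image`, `ocr::recognize`, `adapter::PlatformBitmap`, and `adapter::toCoreImage` are made up for illustration, and `recognize` is only a placeholder that reports the image size instead of doing recognition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

namespace ocr {

// Hypothetical core-owned image type: raw 24-bit RGB, no platform types.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> rgb;  // width * height * 3 bytes, row-major
};

struct Word {
    std::string text;
};

// Hypothetical core entry point. A real engine would do recognition here;
// this placeholder only reports the image size so the sketch can be run.
inline std::vector<Word> recognize(const Image& image) {
    assert(image.rgb.size() ==
           static_cast<std::size_t>(image.width) * image.height * 3);
    return { Word{ std::to_string(image.width) + "x" +
                   std::to_string(image.height) } };
}

}  // namespace ocr

namespace adapter {

// Stand-in for a platform bitmap (the role a QImage or a Leptonica PIX
// would play); assumed here to be 32-bit BGRA, purely for illustration.
struct PlatformBitmap {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> bgra;  // width * height * 4 bytes
};

// The in-between layer: converts the platform type into the core's type,
// so the core never sees a platform header.
inline ocr::Image toCoreImage(const PlatformBitmap& src) {
    ocr::Image out;
    out.width = src.width;
    out.height = src.height;
    out.rgb.reserve(static_cast<std::size_t>(src.width) * src.height * 3);
    for (std::size_t i = 0; i + 3 < src.bgra.size(); i += 4) {
        out.rgb.push_back(src.bgra[i + 2]);  // R
        out.rgb.push_back(src.bgra[i + 1]);  // G
        out.rgb.push_back(src.bgra[i]);      // B (alpha is dropped)
    }
    return out;
}

}  // namespace adapter
```

A caller would write something like `ocr::recognize(adapter::toCoreImage(bitmap))`. Only the `adapter` namespace would need to change when moving between Qt, Leptonica, GTK, or anything else, which is the "matter of taste" point made above.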
>
> Pierre,
>
> Qt is a great library but it's not a good strategic choice
> - You need a license for commercial products (and the price is high)
> - It doesn't target all C++ platforms
> - You can't guarantee it'll exist and being maintained in 20 years
> - Some other technical issue list below
>
> Let's be less technical to expose my point of view:
> - If you want to do OCR this mean you have images, and you certainly
>   have the lib to open them. Why include JPEG, TIFF or anything else
>   functions? I already have this and it's so simple and fast to convert
>   my image into a RAW 24 bits image array.
> - Some platform doesn't like STL, some other BOOST, because we need
>   very basic wstring (unicode), vector, map, no I/O, no thread we can
>   offer to avoid STL, BOOST and really being compatible with all
>   platforms
> - Future is multicore processors, how take advantage of them? With an
>   OCR it's simple, open 8 image and do 8 OCR at a time. Splitting OCR
>   process for one image would be too complex and useless because on
>   today processors OCR is fast enough for one image.
> - I don't want to reinvent an OCR, I want to reverse engineer
>   Tesseract and rewrite only the OCR part.
> - A "pure" C++ library can very easily being transpose in Java, C# or
>   same family language. It's important for future : next Windows Mobile
>   7 machine can't run C++ code. Windows Azure prefer manage code.
>   Android... I think future is managed OS.
>
> Remi

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.
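
On the multicore point in the quoted message (run several images at once rather than splitting the work for one image), a rough C++ sketch could look like the following. It assumes a C++11 compiler with `std::async`, which is my choice for illustration and not something proposed in the thread; `recognizePage` and `recognizeAll` are hypothetical names, and `recognizePage` is a placeholder that does no real OCR.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for a single-image OCR call. A real engine would
// load and recognize the image; this placeholder just echoes the path.
inline std::string recognizePage(const std::string& imagePath) {
    return "text of " + imagePath;
}

// Starts one asynchronous task per image and collects the results in the
// same order as the input paths. Each image is processed independently,
// so no single OCR run has to be split across threads.
inline std::vector<std::string> recognizeAll(
        const std::vector<std::string>& paths) {
    std::vector<std::future<std::string>> jobs;
    jobs.reserve(paths.size());
    for (const std::string& path : paths) {
        jobs.push_back(std::async(std::launch::async, recognizePage, path));
    }

    std::vector<std::string> results;
    results.reserve(jobs.size());
    for (auto& job : jobs) {
        results.push_back(job.get());  // blocks until that image is done
    }
    assert(results.size() == paths.size());
    return results;
}
```

This launches as many tasks as there are images; a real program would probably cap the number of concurrent tasks at the number of cores, and some toolchains need a threading flag such as `-pthread` when linking.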

