Hi,

What is OCR? You have a bilevel (two-color, 1-bit) image and you try to get text from it.
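
To make the black-and-white input concrete, here is a rough sketch of how a 24-bit image could be reduced to one by simple thresholding. It is only an illustration, not tesseract code: the function name `threshold_rgb24`, the R,G,B byte order, the fixed default threshold of 128 and the one-byte-per-pixel output are all my assumptions. Note that it works on a memory buffer only, with no file I/O and no image library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of tesseract: converts a packed 24-bit RGB
// buffer to a two-level image, one byte per pixel (1 = dark/ink, 0 = light).
// `stride` is the number of bytes per row, which may include padding.
// Assumes R,G,B byte order; swap p[0] and p[2] for BGR buffers.
std::vector<std::uint8_t> threshold_rgb24(const std::uint8_t* rgb,
                                          std::size_t width,
                                          std::size_t height,
                                          std::size_t stride,
                                          std::uint8_t threshold = 128) {
    assert(stride >= width * 3);  // debug-only sanity check on the row size
    std::vector<std::uint8_t> out(width * height);
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* row = rgb + y * stride;
        for (std::size_t x = 0; x < width; ++x) {
            const std::uint8_t* p = row + x * 3;
            // Integer approximation of luma: (77 R + 150 G + 29 B) / 256.
            unsigned luma = (77u * p[0] + 150u * p[1] + 29u * p[2]) >> 8;
            out[y * width + x] = luma < threshold ? 1 : 0;
        }
    }
    return out;
}
```

The caller decodes the file with whatever image library he prefers and passes in the raw pixels. A fixed threshold is the simplest possible choice; a real scan would probably need a threshold derived from the image itself.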
From my point of view, an OCR engine doesn't need an image library or an image processing library. Keep the code simple and let each developer bundle it with the image library he likes or wants to use: open source (libtiff), included with the OS (gdiplus.dll), or commercial (LeadTools, Accusoft...). The only image processing worth including is thresholding from a 24-bit image.

Today tesseract has 3 big problems:
- memory leaks.
- overly complex code.
- it is process oriented: it's not designed to be used as a lib (exit(), file I/O...).

What we need, I'm sure, is a complete rewrite. Reduce the 222 cpp files to fewer than 20. A C++ lib should be OS independent, simply because it doesn't need any OS-specific API (no I/O).

I think the correct direction is to:
1) reverse engineer the code and document it
2) do a complete rewrite from the documentation

When you have a good OCR lib, then you can bundle it for "public" usage.

I've spent a lot of time in the tesseract source code and I don't want to spend more time in it. I'm ready to help with a complete rewrite.

Remi
Tessnet2 author
C++ dev since 1989
Windows platform expert (C++/C#)
Image processing expert
Freelance since 2001

