> Here are some thoughts, and I would like to get input from the
> developer/user community on this issue:
>
> For leptonica:
>
>    - Some features will depend on it. To get best performance you will need
>    it.
>    - It could allow simplification of the code, and elimination of the old
>    IMAGE class.
>    - It will allow reading of many more image formats, which a lot of users
>    have requested.
>    - It might be easier if the default windows project files assume that you
>    have leptonica. That would make it easier to build with it, and it would
>    only be a case of downloading it.
>
> Against making tesseract dependent on leptonica:
>
>    - It will require several additional components: leptonica, libtiff,
>    libjpg, libpng, which would bloat the executable, and many (windows) users
>    have refused to even download libtiff.
>    - Installation and build support will become much more effort. (Mostly
>    for windows) If somebody could write a windows installer for it (open
>    source of course), then that would simplify installation a lot for the
>    windows user-only community.

Have you considered going the other way and simplifying the image
input as far as possible? By this I mean dropping the TIFF input and
replacing it with a trivial encoding such as the Netpbm formats
(http://netpbm.sourceforge.net/), which are extremely portable.

The advantage is simplicity: a reader or writer for the black and
white (PBM) format takes 5-10 lines of code to write from scratch in
just about any language, so anybody could interface with tesseract easily.
There would be no build dependencies, and no special handling of many
different file formats in the tesseract code, so you can concentrate
on the OCR.
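
To give an idea of how little code is involved, here is a rough sketch
of a reader and writer for the plain-text (P1) variant of PBM, in Python.
The function names write_pbm and read_pbm are made up for this example
and have nothing to do with the tesseract code. It assumes an image is a
list of rows of 0/1 values (1 = black), and it does not try to honour
the spec's recommendation to keep lines under 70 characters.

```python
def write_pbm(pixels):
    """Serialize rows of 0/1 pixels (1 = black) as plain (P1) PBM text."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    lines = ["P1", f"{width} {height}"]
    lines += [" ".join(str(bit) for bit in row) for row in pixels]
    return "\n".join(lines) + "\n"


def read_pbm(text):
    """Parse plain (P1) PBM text into rows of 0/1 pixels."""
    # Drop '#' comments, then split everything on whitespace.
    tokens = " ".join(
        line.split("#", 1)[0] for line in text.splitlines()
    ).split()
    if not tokens or tokens[0] != "P1":
        raise ValueError("not a plain PBM (P1) file")
    width, height = int(tokens[1]), int(tokens[2])
    # Pixels may be written without separating whitespace,
    # so flatten the remaining tokens into single digits.
    bits = [int(ch) for ch in "".join(tokens[3:])]
    if len(bits) != width * height:
        raise ValueError("pixel count does not match header")
    return [bits[r * width:(r + 1) * width] for r in range(height)]
```

Round-tripping a small image through write_pbm and read_pbm should give
back the same rows. The binary (P4) variant packs eight pixels per byte
and needs a few more lines, but not many.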

The disadvantage is that scanners probably don't write the PBM format
natively, so users would likely need to convert their images at some
point. Also, PBM files tend to get big, but that's easy to fix with
compression. Many free compression libraries provide wrappers around
the file handling routines, which make reading and writing compressed
files transparent.
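
As an illustration of what I mean by transparent, here is a small sketch
using Python's standard gzip module. The helper name open_maybe_gzip and
the convention of deciding by the ".gz" suffix are my own assumptions
for this example; a C or C++ program could do much the same thing with
zlib's gzopen/gzread.

```python
import gzip


def open_maybe_gzip(path, mode="rt"):
    """Open a text file, compressing or decompressing if it ends in .gz.

    The returned object behaves like an ordinary text file, so the
    code that reads or writes the image does not need to care.
    """
    if str(path).endswith(".gz"):
        return gzip.open(path, mode, encoding="ascii")
    return open(path, mode, encoding="ascii")
```

The calling code stays the same either way: open "page.pbm.gz" with
mode "wt" and write the PBM text to it, then open it again with the
default mode and read the text back.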
