Yes, but that doesn't stop people requesting more image formats, and
leptonica is a very handy way of providing them, since we are moving towards
doing more image processing with it.

Ray.

On Thu, Jul 9, 2009 at 10:10 PM, Arno Teigseth <[email protected]> wrote:

> Maybe a stupid question,
>
> but can't you use imagemagick to prepare the images for tesseract?
>
> This is how I do it on my computer.
>
> convert -monitor -black-threshold 75% "$i" "$i.png";
> # Create black/white, threshold adjustable
>
> convert -monitor -colors 2 -depth 8 -blur 0 "$i.png" "$i.tif";
> # Create tiff image, two color, 8bpp, remove noise (blur)
>
> rm "$i.png";
> # clean up the mess
>
> tesseract "$i.tif" "$i.ocrin" -l $TESSLANGUAGE;
> # ocr the file
>
>
> OK it's not pretty and definitely not fast, but it works for nearly any
> kind of image. PDFs should be converted with "-density 300" option,
> though.
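
As a rough illustration of the steps quoted above, including the "-density 300" remark for PDFs, the commands could be wrapped in a small shell function. This is only a sketch under assumptions that are not in the original message: the function names `density_opts` and `prep_and_ocr` are made up, PDFs are detected by file suffix, the input is a single page, `-monitor` is dropped, and the language falls back to `eng` when `TESSLANGUAGE` is unset.

```shell
#!/bin/sh
# Hypothetical helper: print extra convert options for a given input file.
# Assumption: PDFs are recognised purely by their .pdf/.PDF suffix.
density_opts() {
    case "$1" in
        *.pdf|*.PDF) printf '%s\n' '-density 300' ;;
        *) ;;
    esac
}

# Hypothetical wrapper around the three quoted steps for one input file.
# Assumption: single-page input; multi-page PDFs would need extra handling.
prep_and_ocr() {
    in=$1
    lang=${TESSLANGUAGE:-eng}   # assumed default, not from the original mail

    # -density goes before the input so it applies when the PDF is rasterised.
    # The substitution is left unquoted on purpose so it splits into two words.
    convert $(density_opts "$in") -black-threshold 75% "$in" "$in.png" || return 1

    # Two colours, 8 bits per pixel, same options as in the quoted command.
    convert -colors 2 -depth 8 -blur 0 "$in.png" "$in.tif" || return 1

    rm -f "$in.png"             # remove the intermediate file
    tesseract "$in.tif" "$in.ocrin" -l "$lang"
}
```

It might be called as `for i in *.jpg *.pdf; do prep_and_ocr "$i"; done`, which leaves a `.tif` and the OCR output next to each input.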
>
> my 2c's
>
> best
> Arno
>
> On Thu, 2009-07-09 at 19:10 -0700, Ray Smith wrote:
> > This is a plea for help!
> >
> >
> > Anyone interested in seeing 3.00 this side of August?
> >
> >
> > Here is the status:
> >
> >
> > Linux:
> > Preliminary alpha release compiles and runs. It is slower than 2.04,
> > due to the new page layout analysis, but the benefits are supposed to
> > outweigh that:
> > Page layout analysis.
> > *Lots* of languages.
> > more...
> > In theory the linux version should compile and link happily with
> > leptonica, given the right combination of apt-gets. Not tested yet, as
> > I have been bogged down with windows:
> >
> >
> > Windows:
> > Preliminary alpha release also compiles and runs *without leptonica
> > only*.
> > DLL is broken due to API change.
> >
> >
> > I only have very little time left before I will be away for a while,
> > but I was hoping to post a pre-alpha version to svn for people to try.
> >
> >
> > The problem is that there is no chance of getting the windows version
> > to work with leptonica any time soon, and without it the flagship page
> > layout analysis won't work properly.
> >
> >
> > Here is the problem:
> > Leptonica depends on the following lower-level libraries:
> > libjpeg,
> > libpng,
> > libtiff,
> > zlib.
> >
> >
> > DLLs for these are all available for windows, but they are all
> > compiled to use msvcrt.dll.
> > Tesseract and Leptonica will not work unless they use the same crt
> > (C-runtime) as the libraries, and VC++2008, which everyone wants to
> > use, will not (without jumping through more hoops than I can ask an
> > average tesseract user to do) build anything to use msvcrt.dll. You
> > must use either a statically linked crt, or use msvcr90.dll, a newer
> > version that contains .net stuff that tesseract doesn't care about.
> >
> >
> > What I need are statically linked versions of the 4 libraries above
> > compiled to use a statically linked crt (/MT option) and possibly
> > their dependencies.
> > Alternatively, libraries built for the new msvcr90.dll (/MD) would do,
> > but that would mean everybody has to have the VC++2008 redistributables.
> > It might help dll users though, when it is eventually working again.
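
One way to see which C runtime a prebuilt DLL depends on is to list its imports. The sketch below assumes GNU binutils `objdump` is available (for example from MinGW, or on a Linux machine) and that its `-p` output on a PE file lists imports as `DLL Name: ...` lines; the function names `filter_crt` and `crt_imports` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical filter: read "objdump -p" output on stdin and print the
# C-runtime DLL names found in it, lower-cased, one per line.
# Assumption: imports appear as lines of the form "DLL Name: something.dll".
filter_crt() {
    awk '/DLL Name:/ { print tolower($NF) }' | grep -E '^msvcr.*\.dll$' | sort -u
}

# Hypothetical wrapper: crt_imports FILE.dll
crt_imports() {
    objdump -p "$1" | filter_crt
}
```

With this, `crt_imports some.dll` printing `msvcrt.dll` would point to the older runtime, `msvcr90.dll` to a VC++2008 /MD build, and no output would be consistent with a statically linked crt (/MT).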
> >
> >
> > This is not an easy task, as most of the sources for these libraries
> > don't have vcproj/sln projects with which to build them.
> > If anyone is sufficiently expert with VC++2008 and building other
> > people's code, and understands what I am talking about, I would be
> > grateful for the help.
> > The other viable alternative would be to compile leptonica without
> > image i/o at all, and leave tesseract still unable to read anything
> > other than uncompressed tiff.
> >
> >
> > Ray.
> >
> >
> > PS A good place to get all these libraries
> > is: http://gnuwin32.sourceforge.net/packages/*.htm, where * is tiff,
> > jpeg, libpng, or zlib.
> >
> > On Tue, May 12, 2009 at 5:49 AM, javolo <[email protected]>
> > wrote:
> >
> >         Ditto!  I'm working on a pretty cool OCR application, and I'd
> >         happily help testing for access to the 3.0 beta or release
> >         candidate. I can test on Ubuntu and Windows XP.
> >
> >         Thanks...
> >
> >
> >         On May 4, 3:07 pm, "Rob H." <[email protected]> wrote:
> >         > But seriously... I'm writing a fairly interesting application
> >         > using Tesseract for my client: Gulfstream Aerospace.
> >         > I have no problem testing 3.0, especially if I can get some
> >         > performance gains.
> >
