Yes, but that doesn't stop people requesting more image formats, and leptonica is a very handy way of providing them, since we are moving towards doing more image processing with it.

Ray.

On Thu, Jul 9, 2009 at 10:10 PM, Arno Teigseth <[email protected]> wrote:
> Maybe a stupid question,
>
> but can't you use imagemagick to prepare the images for tesseract?
>
> This is how I do it on my computer.
>
> convert -monitor -black-threshold 75% "$i" "$i.png";
> # Create black/white, threshold adjustable
>
> convert -monitor -colors 2 -depth 8 -blur 0 "$i.png" "$i.tif";
> # Create tiff image, two color, 8bpp, remove noise (blur)
>
> rm "$i.png";
> # clean up the mess
>
> tesseract "$i.tif" "$i.ocrin" -l $TESSLANGUAGE;
> # ocr the file
>
> OK it's not pretty and definitely not fast, but it works for nearly any
> kind of image. PDFs should be converted with "-density 300" option,
> though.
>
> my 2c's
>
> best
> Arno
>
> On Thu, 2009-07-09 at 19:10 -0700, Ray Smith wrote:
> > This is a plea for help!
> >
> > Anyone interested in seeing 3.00 this side of August?
> >
> > Here is the status:
> >
> > Linux:
> > Preliminary alpha release compiles and runs. It is slower than 2.04,
> > due to the new page layout analysis, but the benefits are supposed to
> > outweigh that:
> > Page layout analysis.
> > *Lots* of languages.
> > more...
> > In theory the linux version should compile and link happily with
> > leptonica, given the right combination of apt-gets. Not tested yet, as
> > I have been bogged down with windows:
> >
> > Windows:
> > Preliminary alpha release also compiles and runs *without leptonica
> > only*.
> > DLL is broken due to API change.
> >
> > I only have very little time left before I will be away for a while,
> > but I was hoping to post a pre-alpha version to svn for people to try.
> >
> > The problem is that there is no chance of getting the windows version
> > to work with leptonica any time soon, and without it the flagship page
> > layout analysis won't work properly.
> >
> > Here is the problem:
> > Leptonica depends on the following lower-level libraries:
> > libjpeg,
> > libpng,
> > libtiff,
> > zlib.
> >
> > DLLs for these are all available for windows, but they are all
> > compiled to use msvcrt.dll.
> > Tesseract and Leptonica will not work unless they use the same crt
> > (C-runtime) as the libraries, and VC++2008, which everyone wants to
> > use, will not (without jumping through more hoops than I can ask an
> > average tesseract user to do) build anything to use msvcrt.dll. You
> > must use either a statically linked crt, or use msvcr90.dll, a newer
> > version that contains .net stuff that tesseract doesn't care about.
> >
> > What I need are statically linked versions of the 4 libraries above
> > compiled to use a statically linked crt (/MT option) and possibly
> > their dependencies.
> > Alternatively, libraries built for the new msvcr90.dll (/MD) would do,
> > but that would mean everybody has to have the VC++2008 redistributables.
> > It might help dll users though, when it is eventually working again.
> >
> > This is not an easy task, as most of the sources for these libraries
> > don't have vcproj/sln projects with which to build them.
> > If anyone is sufficiently expert with VC++2008 and building other
> > people's code, and understands what I am talking about, I would be
> > grateful for the help.
> > The other viable alternative would be to compile leptonica without
> > image i/o at all, and leave tesseract still unable to read anything
> > other than uncompressed tiff.
> >
> > Ray.
> >
> > PS A good place to get all these libraries
> > is: http://gnuwin32.sourceforge.net/packages/*.htm, where * is tiff,
> > jpeg, libpng, or zlib.
> >
> > On Tue, May 12, 2009 at 5:49 AM, javolo <[email protected]> wrote:
> >
> > Ditto! I'm working on a pretty cool OCR application, and I'd happily
> > help testing for access to the 3.0 beta or release candidate.
> > I can test on Ubuntu and Windows XP.
> >
> > Thanks...
> >
> > On May 4, 3:07 pm, "Rob H." <[email protected]> wrote:
> > > But seriously... I'm writing a fairly interesting application using
> > > Tesseract for my client: Gulfstream Aerospace.
> > > I have no problem testing 3.0, especially if I can get some
> > > performance gains.
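
For anyone who wants to try the ImageMagick route quoted above, here is a rough sketch of those same four steps wrapped in a shell function. The commands and options are copied from Arno's message; the function names `run` and `ocr_one`, the `DRY_RUN` switch, and the default of `eng` for `TESSLANGUAGE` are assumptions added for illustration and are not part of the original script.

```shell
#!/bin/sh
# Sketch of the quoted ImageMagick + tesseract steps, one input file at a time.
# Assumptions (not in the original mail): run, ocr_one, DRY_RUN, default "eng".

TESSLANGUAGE="${TESSLANGUAGE:-eng}"

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# ocr_one FILE: threshold, reduce to two colours, clean up, then OCR.
# Each step runs only if the previous one succeeded.
ocr_one() {
    i="$1"
    # Create black/white, threshold adjustable
    run convert -monitor -black-threshold 75% "$i" "$i.png" &&
    # Create tiff image, two color, 8bpp, remove noise (blur)
    run convert -monitor -colors 2 -depth 8 -blur 0 "$i.png" "$i.tif" &&
    # Clean up the intermediate PNG
    run rm "$i.png" &&
    # OCR the file
    run tesseract "$i.tif" "$i.ocrin" -l "$TESSLANGUAGE"
}
```

Typical use would be something like `for f in *.jpg; do ocr_one "$f"; done`. Setting `DRY_RUN=1` first prints the commands instead of running them, which is a cheap way to check the file names before touching any images.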

