Re: Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-04 Thread Greg Dunkel
I just scanned approximately 200 pages in Ubuntu 12.10 with no problems, using the 3.02 package from the repository. I had to use convert to improve the tiffs from my scanner, but I got very good results, with a very low error rate. Did nothing special. /greg On Sun, Feb 3, 2013 at 4:08 PM,

Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-03 Thread Michael Lissner
I have Ubuntu 12.04, which has tesseract 3.02 and leptonica version 1.69. I've installed these, and also installed libtiff4 using apt-get. When I try to process a document, I get: ↪ sudo tesseract united_states_v._ups_customhouse_brokerage_inc.tif

Re: Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-03 Thread zdenko podobny
Can you send an example of your tif file? Zdenko On Sun, Feb 3, 2013 at 10:08 PM, Michael Lissner mliss...@michaeljaylissner.com wrote: I have Ubuntu 12.04, which has tesseract 3.02 and leptonica version 1.69. I've installed these, and also installed libtiff4 using apt-get. When I try to

Re: Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-03 Thread Mike Lissner
It's about 300MB, unfortunately, but I generate it programmatically using imagemagick in a way that's worked in the past, so I don't think the tiff file itself is the issue. If you're willing to download this monster, I'll post it to dropbox. I'd love the help, but I don't think it's the right

Re: Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-03 Thread zdenko podobny
Are you able to generate just one page or a small example? Or can you provide the steps you use to create it (so I can create it)? Tiff can be tricky. E.g. libtiff-4 does not work for me... Zdenko On Sun, Feb 3, 2013 at 10:29 PM, Mike Lissner mliss...@michaeljaylissner.com wrote: It's about 300MB,

Re: Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-03 Thread Mike Lissner
Sure, that's a good idea. Here's the original PDF: http://courtlistener.com/pdf/2008/05/28/united_states_v._ups_customhouse_brokerage_inc..pdf If you download that, then run: convert -depth 4 -density 300 united_states_v._ups_customhouse_brokerage_inc..pdf

Re: Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-03 Thread zdenko podobny
BTW: spp means samples per pixel [1]. Are you able to instruct imagick to use 1, 3 or 4? I also found a report on stackoverflow [2] which mentions that imagick sets spp to 2, which should be invalid for tiff... [1] http://tpgit.github.com/Leptonica/tiffio_8c_source.html [2]
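One way to check the samples-per-pixel value Zdenko refers to is to read the SamplesPerPixel tag (277) straight out of the TIFF's first image directory. The following is a minimal sketch in Python, not anything posted in the thread: the function names `samples_per_pixel` and `is_accepted_spp` are hypothetical, it handles only classic TIFF (not BigTIFF), it looks only at the first page, and the set of accepted values (1, 3, 4) is taken from Zdenko's message rather than verified against leptonica.

```python
import struct

SAMPLES_PER_PIXEL_TAG = 277  # TIFF 6.0 tag number for SamplesPerPixel


def samples_per_pixel(data: bytes, default: int = 1) -> int:
    """Return SamplesPerPixel from the first IFD of a classic TIFF.

    `data` is the raw file content. If the tag is absent, the TIFF 6.0
    default of 1 is returned.
    """
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"  # little-endian
    elif byte_order == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")

    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (magic number is not 42)")

    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, _field_type, _count = struct.unpack(
            endian + "HHI", data[start:start + 8]
        )
        if tag == SAMPLES_PER_PIXEL_TAG:
            # A single SHORT is stored in the first 2 bytes of the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return default


def is_accepted_spp(spp: int) -> bool:
    """Assumption from the thread: only 1, 3 or 4 samples per pixel are usable."""
    return spp in (1, 3, 4)
```

To use it on a file, read the bytes with `open(path, "rb").read()` and pass them in; a result of 2 would match the ImageMagick behaviour described in the Stack Overflow report.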

Re: Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-03 Thread Mike Lissner
OK, we're getting somewhere! I figured out that the Ubuntu repo just doesn't work properly with tiffs, and recompiled and installed tesseract and leptonica. So now when I run tesseract -v, I get: ↪ tesseract -v tesseract 3.02.02 leptonica-1.69 libjpeg 8b : libpng 1.2.46 : libtiff 3.9.5 :
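The `tesseract -v` output quoted above can be checked programmatically to see which image libraries a build reports. This is a small sketch in Python under stated assumptions: `library_versions` is a hypothetical helper, the pattern is based only on the one line of output shown in this message, and the idea that a missing `libtiff` entry means the build lacks TIFF support is an assumption, not something confirmed in the thread.

```python
import re

# Library names seen in the version line quoted in the thread.
_NAMES = "tesseract|leptonica|libjpeg|libpng|libtiff"
_PATTERN = re.compile(r"\b(" + _NAMES + r")[ -]([0-9][0-9A-Za-z.]*)")


def library_versions(version_output: str) -> dict:
    """Map library name to version string from `tesseract -v`-style text."""
    return {name: version for name, version in _PATTERN.findall(version_output)}
```

The text could be captured with `subprocess.run(["tesseract", "-v"], capture_output=True, text=True)`; both `stdout` and `stderr` are worth checking, since which stream the version goes to may differ between builds.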

Re: Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-03 Thread Mike Lissner
Looks like I'm all set. I had to remove -flatten from the command above, and all is working now. Thanks so much for the help. On Sun, Feb 3, 2013 at 2:18 PM, Mike Lissner mliss...@michaeljaylissner.com wrote: OK, we're getting somewhere! I figured out that the Ubuntu repo just doesn't

Re: Tiff support for tesseract 3.02 on Ubuntu 12.04

2013-02-03 Thread TP
On Sun, Feb 3, 2013 at 1:08 PM, Michael Lissner mliss...@michaeljaylissner.com wrote: I have Ubuntu 12.04, which has tesseract 3.02 and leptonica version 1.69. I've installed these, and also installed libtiff4 using apt-get. libtiff4 is also known as bigtiff. [1] lists important backward