Hi All,
I am trying to remove the Line Finding, Baseline Fitting and Skew Detection
modules from tesseract, as my input image always has straight-line words. So
I would like to know: is there a way to remove these modules by configuration?
Also, if not by config, can I remove this code by merely code
Hi All,
I am trying to remove the Line Finding, Baseline Fitting and Skew Detection
modules from the tesseract code, as my input image will always have
straight-line words (de-skewed).
I would like to know whether there is any configuration method to remove
these modules. Also, if not by configuration, is it
You don't need to edit it. Just run the command as on the wiki. It is faster
than editing the tr file...
Zdenko
On Sun, Feb 3, 2013 at 12:21 AM, Carlos Antunes cf.antu...@gmail.com wrote:
Zdenko,
Shall I edit it and remove it before going further?
Thanks.
On Saturday, February 2, 2013 1:53:33 PM
I have Ubuntu 12.04, which has tesseract 3.02 and leptonica version 1.69.
I've installed these, and also installed libtiff4 using apt-get.
When I try to process a document, I get:
↪ sudo tesseract united_states_v._ups_customhouse_brokerage_inc.tif
Can you send an example of your tif file?
Zdenko
On Sun, Feb 3, 2013 at 10:08 PM, Michael Lissner
mliss...@michaeljaylissner.com wrote:
I have Ubuntu 12.04, which has tesseract 3.02 and leptonica version 1.69.
I've installed these, and also installed libtiff4 using apt-get.
When I try to
It's about 300MB, unfortunately, but I generate it programmatically using
imagemagick in a way that's worked in the past, so I don't think the tiff
file itself is the issue.
If you're willing to download this monster, I'll post it to dropbox. I'd
love the help, but I don't think it's the right
Are you able to generate just one page or a small example? Or can you provide
the steps you use to create it (so I can create it)?
Tiff can be tricky. E.g. libtiff-4 does not work for me...
Zdenko
On Sun, Feb 3, 2013 at 10:29 PM, Mike Lissner
mliss...@michaeljaylissner.com wrote:
It's about 300MB,
Sure, that's a good idea.
Here's the original PDF:
http://courtlistener.com/pdf/2008/05/28/united_states_v._ups_customhouse_brokerage_inc..pdf
If you download that, then run:
convert -depth 4 -density 300
united_states_v._ups_customhouse_brokerage_inc..pdf
BTW: spp means Samples-per-pixel[1]. Are you able to instruct ImageMagick to
use 1, 3 or 4?
And I found a report on stackoverflow[2]; it mentions that ImageMagick used
to set spp to 2, which should be invalid for tiff...
[1] http://tpgit.github.com/Leptonica/tiffio_8c_source.html
[2]
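
To check which spp value a generated file actually carries, one option is to
read the SamplesPerPixel tag straight from the TIFF header. The following is
a minimal sketch, not part of Leptonica or tesseract: it assumes a classic
(non-BigTIFF) file, looks only at the first image directory, and the function
names are made up for illustration. The accepted values 1, 3 and 4 are taken
from the message above.

```python
import struct

SAMPLES_PER_PIXEL_TAG = 277  # tag number defined by the TIFF 6.0 specification


def read_samples_per_pixel(data: bytes) -> int:
    """Return SamplesPerPixel from the first IFD of a classic TIFF byte string.

    Hypothetical helper for illustration; it does not handle BigTIFF.
    """
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"  # little-endian file
    elif byte_order == b"MM":
        endian = ">"  # big-endian file
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")

    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF: magic number is not 42")

    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes long
        tag, _field_type, _count = struct.unpack(
            endian + "HHI", data[start:start + 8]
        )
        if tag == SAMPLES_PER_PIXEL_TAG:
            # A single SHORT value sits in the first 2 bytes of the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return 1  # TIFF 6.0 default when the tag is absent


def is_supported_spp(spp: int) -> bool:
    """True for the spp values named in the message above (1, 3 or 4)."""
    return spp in (1, 3, 4)
```

For a file on disk, pass its bytes, e.g. open(path, "rb").read(), where path
is whatever tif you generated. Reading the whole file is wasteful for a 300MB
tif, so a real check would read only the header and the first directory.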
OK, we're getting somewhere!
I figured out that the Ubuntu repo just doesn't work properly with tiffs,
and recompiled and installed tesseract and leptonica.
So now when I run tesseract -v, I get:
↪ tesseract -v
tesseract 3.02.02
leptonica-1.69
libjpeg 8b : libpng 1.2.46 : libtiff 3.9.5 :
Looks like I'm all set.
I had to remove -flatten from the command above, and all is working now.
Thanks so much for the help.
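
For anyone scripting the same check, the version report can be parsed instead
of read by eye. This is a rough sketch that assumes the output has the layout
quoted above (one "name version" or "name-version" item per line, or several
separated by colons); other builds may print it differently, and the function
name is made up for illustration.

```python
import re


def parse_tesseract_versions(text: str) -> dict:
    """Map component names to version strings from `tesseract -v` style text.

    Hypothetical helper; the accepted layout is inferred from the output
    quoted in this thread, not from any documented format.
    """
    versions = {}
    for line in text.splitlines():
        # The library line packs several items separated by colons.
        for chunk in line.split(":"):
            match = re.fullmatch(r"([A-Za-z]+)[\s-]+(\S+)", chunk.strip())
            if match:
                versions[match.group(1)] = match.group(2)
    return versions


sample = """tesseract 3.02.02
leptonica-1.69
libjpeg 8b : libpng 1.2.46 : libtiff 3.9.5 :"""

versions = parse_tesseract_versions(sample)
# The major libtiff version is the part before the first dot.
libtiff_major = versions["libtiff"].split(".")[0]
```

With the output quoted above, libtiff_major comes out as "3", i.e. the
libtiff 3.x series rather than libtiff 4.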
On Sun, Feb 3, 2013 at 2:18 PM, Mike Lissner mliss...@michaeljaylissner.com
wrote:
OK, we're getting somewhere!
I figured out that the Ubuntu repo just doesn't
Hi there,
I wish to add a new font to tesseract, but I also don't want to lose the
fonts already recognised by eng.traineddata.
My question is: what are the default fonts of tesseract, and should I
re-train Tesseract on those fonts as well as the new font?
Thanks
I recently did a personal project with Tesseract (Equation OCR) and the
final results turned out pretty well:
http://ayoungprogrammer.blogspot.ca/
On Friday, 1 February 2013 11:34:04 UTC-5, Jakub Jaroš wrote:
Hello,
in our project, we would like to decide whether to use Tesseract for it or
Hello all,
I was able to train some new fonts thanks to the help I've got here.
The Wiki is somewhat vague when it comes to dictionaries.
On the Wiki, a few dictionaries are mentioned, as well as concerns about
their licenses.
Looking at both aspell and ispell, there are different lists of
On Sun, Feb 3, 2013 at 1:08 PM, Michael Lissner
mliss...@michaeljaylissner.com wrote:
I have Ubuntu 12.04, which has tesseract 3.02 and leptonica version 1.69.
I've installed these, and also installed libtiff4 using apt-get.
libtiff4 is also known as bigtiff. [1] lists important backward