http://www.linux.com/archive/feed/57222
Also, it can generate output only in the US-ASCII character set, so
glyphs with accent marks or other unsupported attributes will probably
be reproduced incorrectly.
Which is the option to make it limit output to the ASCII charset only?
Some letters such as
I searched a lot and found this:
tesseract image.tif boxes batch.nochop makebox
If I invoke that, I get a boxes.txt file with what appear to be
coordinates. But they are too large. I read somewhere that tesseract
computes the coordinates from the bottom of the image and not from the
top left
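For reference, the flip between the two origins is simple arithmetic. The following is only a sketch, assuming each box line holds left, bottom, right and top values measured from the bottom-left corner of the image, and that the image height in pixels is known. The struct and function names are made up for illustration and are not part of tesseract.

```cpp
#include <cassert>

// One box as read from a box file, measured from the bottom-left
// corner of the image (hypothetical helper type, not a tesseract type).
struct BoxBottomLeft {
    int left, bottom, right, top;
};

// The same rectangle with y measured downward from the top-left corner.
struct RectTopLeft {
    int x, y, width, height;
};

// image_height is the height of the source image in pixels.
RectTopLeft ToTopLeft(const BoxBottomLeft& b, int image_height) {
    assert(image_height > 0 && b.top >= b.bottom && b.right >= b.left);
    return RectTopLeft{b.left,
                       image_height - b.top,  // flip the y axis
                       b.right - b.left,
                       b.top - b.bottom};
}
```

With an image 100 pixels high, a box of left=10, bottom=80, right=30, top=95 becomes x=10, y=5, width=20, height=15.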
(77yrsold)
On Wed, May 26, 2010 at 8:39 AM, nguyenq nguyen...@gmail.com wrote:
You can perform some text manipulations in post-processing steps to
strip out diacritical marks to leave only the base ASCII characters
behind.
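A rough sketch of such a post-processing step, assuming the OCR text is UTF-8 and that only the Latin-1 accented letters (U+00C0 to U+00FF) need mapping. StripToAscii and its lookup table are illustrative names, not part of tesseract; every other non-ASCII character, including combining marks, is simply dropped.

```cpp
#include <cstddef>
#include <string>

// ASCII stand-ins for U+00C0..U+00FF; "" means "drop the character".
static const char* const kLatin1Base[64] = {
    "A","A","A","A","A","A","AE","C","E","E","E","E","I","I","I","I",
    "D","N","O","O","O","O","O","","O","U","U","U","U","Y","Th","ss",
    "a","a","a","a","a","a","ae","c","e","e","e","e","i","i","i","i",
    "d","n","o","o","o","o","o","","o","u","u","u","u","y","th","y"};

// Returns a copy of utf8 that contains only ASCII characters.
std::string StripToAscii(const std::string& utf8) {
    std::string out;
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char c = static_cast<unsigned char>(utf8[i]);
        const bool two_byte =
            (c & 0xE0) == 0xC0 && i + 1 < utf8.size() &&
            (static_cast<unsigned char>(utf8[i + 1]) & 0xC0) == 0x80;
        if (c < 0x80) {
            out += static_cast<char>(c);  // already ASCII
            ++i;
        } else if (two_byte) {
            const unsigned char c2 = static_cast<unsigned char>(utf8[i + 1]);
            const unsigned cp = ((c & 0x1Fu) << 6) | (c2 & 0x3Fu);
            // Accented Latin-1 letters map to their base letter; other
            // two-byte code points (e.g. combining marks) are dropped.
            if (cp >= 0xC0 && cp <= 0xFF) out += kLatin1Base[cp - 0xC0];
            i += 2;
        } else {
            // Longer or malformed sequence: skip the lead byte and any
            // continuation bytes that follow it.
            ++i;
            while (i < utf8.size() &&
                   (static_cast<unsigned char>(utf8[i]) & 0xC0) == 0x80) {
                ++i;
            }
        }
    }
    return out;
}
```

A table lookup keeps the sketch dependency-free. A Unicode library with proper normalization would cover far more scripts.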
On May 25, 3:34 pm, haratron harat...@gmail.com wrote:
http
I'd like to know if there's an OCR forum and/or IRC channel where
people can ask/answer OCR related questions.
Does anyone know if something like that exists?
--
You received this message because you are subscribed to the Google Groups
tesseract-ocr group.
To post to this group, send email to
I'm also interested in this topic.
I have a couple of questions:
1. How can I calculate the ideal image size (300dpi?) to feed to
tesseract? I mean, how do I identify how much scaling the image needs,
before the OCR procedure.
2. I'm currently using ImageMagick's convert program for scaling and
I'm using tesseract 3.00 with hOCR output and I get the xocr_word
among other things.
Example:
<span class='xocr_word' id='xword_1_5' title="x_wconf -4">testing</span>
The x_wconf attribute gives the certainty of the result, which is
calculated through a certainty() function, from what I saw in
Thank you
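For anyone reading the value programmatically, the number after x_wconf in the title attribute of the example above can be pulled out with a regular expression. This is only a sketch: ParseWordConfidence is a made-up helper, not a tesseract function, and it assumes the title text has already been extracted from the hOCR markup.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Looks for "x_wconf <integer>" in an hOCR title attribute.
// On success stores the integer in *conf and returns true;
// returns false when the property is absent.
bool ParseWordConfidence(const std::string& title, int* conf) {
    assert(conf != nullptr);
    static const std::regex kWconf(R"(x_wconf\s+(-?\d+))");
    std::smatch m;
    if (!std::regex_search(title, m, kWconf)) return false;
    *conf = std::stoi(m[1].str());
    return true;
}
```

For the title "x_wconf -4" this stores -4. A title with no x_wconf property makes the function return false.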
On Wed, Oct 6, 2010 at 3:26 AM, Jimmy O'Regan jore...@gmail.com wrote:
On 5 October 2010 23:43, haratron harat...@gmail.com wrote:
I'm using tesseract 3.00 with hOCR output and I get the xocr_word
among other things.
Example:
<span class='xocr_word' id='xword_1_5' title="x_wconf
Thank you Jimmy.
batch and nobatch are empty and batch.nochop contains:
chop_enable 0
wordrec_enable_assoc 0
What do these do?
On Wed, Oct 6, 2010 at 3:10 AM, Jimmy O'Regan jore...@gmail.com wrote:
On 5 October 2010 23:17, haratron harat...@gmail.com wrote:
I'm trying to figure out the way
Hello Neo,
Which SWT implementation did you use? There are several out there
and I haven't found one that produces your result yet.
Thanks
On Thu, Dec 13, 2012 at 1:24 PM, Dmitri Silaev daemons2...@gmail.com wrote:
Neo Song,
There are two usual approaches to problems like yours. The
Is there a generic OCR or document image analysis forum or mailing list
somewhere? Something that's not limited to tesseract.
I'm using this snippet to crop an input image into textlines:
Boxa* boxes = api->GetComponentImages(tesseract::RIL_TEXTLINE, true, false,
                                      0, NULL, NULL, NULL);
for (int i = 0; i < boxes->n; i++) {
  BOX* box = boxaGetBox(boxes, i, L_CLONE);
  PIX* pixd = pixClipRectangle(image, box, NULL);
No, I want to dewarp the warped lines of a page of a book. The warping
is due to perspective distortion (the picture was taken with the camera of a
mobile phone) and the curvature of the book. RIL_WORD or RIL_SYMBOL wouldn't
help with that.
On Thu, Dec 22, 2016 at 6:50 AM, Junmock Lee
Does tesseract provide a way to dewarp warped text lines?
How can I find blocks of text (not paragraphs necessarily) with tesseract?
If not possible with tesseract, do you know of any other tool that can do
this?
I want to do OCR zoning.