Hi Luke,
On Mon, Aug 06, 2018 at 02:12:38PM -0700, Luke Brandl wrote:
> I've been working to understand Tesseract and looking through the C and Python
> API code and documentation. It looks like some of the code and documentation
> are up to date, while the rest refers to 3.0.2 at least in the
Hi Devon,
On Mon, Feb 22, 2016 at 10:43:33AM -0800, Devon Yoo wrote:
> I have a test set that only has "uppercase English alphabets" and "numbers". But
> the provided eng.traineddata returns symbols and lower case alphabets
> sometimes. Is there a way to modify the existing traineddata file so
Hi Łukasz,
> Is it possible to run tesseract without setting up
> LD_LIBRARY_PATH?
Why don't you want to just use LD_LIBRARY_PATH? I suspect, to be
honest, that it would be difficult to compile the leptonica library
into the tesseract executable. It would be fun and interesting (to
me) to
So this email prompted me to try something a little crazy, but it
worked; I just built a statically linked tesseract binary :)
A long time ago I wrote some plain makefiles which didn't rely on
any automake / cmake stuff. The main devs weren't interested,
understandably, but it was useful and
Hi Yizhen,
On Tue, Nov 24, 2015 at 07:08:24PM -0800, Yizhen Hai wrote:
> I am working on a volunteer project to digitize the Sutra and all related
> materials, most of them in Tibetan.
Sounds like a great project :)
> Therefore, I wonder how I can get help to use Tesseract for Tibetan. (I am
On Tue, Nov 10, 2015 at 08:59:19AM -0800, Ryan Baumann wrote:
> Thanks for this, Nick. I'm just getting around to looking into moving my Latin
> training into the tesstrain.sh system and this is very helpful.
Great, I was planning to do that myself with your Latin training -
let me know if you
> Dear Nick,
> Awaiting your valuable guidance. Kindly treat my request as SOS due to my
> overaged factor of 83+yrs old. I want to enjoy the program.
> With warmest regards,
> sriranga(83yrs)
>
> On Tue, Nov 3, 2015 at 12:19 AM, Nick White <nick.wh...@du
Hi Sriranga,
> I find there three files of '.sh - viz.
> 1) language-specific.sh. (My lang is "kan")
> 2)tesstrain.sh
> 3)tesstrain_utils.sh.
> Request for the valuable guidance how to use above .sh files ( step by step
I plan to write up some proper documentation on how to use these
scripts
Hi all,
I recently finally got around to organising and releasing some
(well, a lot of) ground truth files for the language I have been
training for ages now, Ancient Greek. By "ground truth" I mean real
page scans with the corresponding (hand-typed) correct text, which
is essential to be
Just a note, all the .git URLs listed below are git repositories,
and there isn't a web interface to them on my server, so just clone
them directly like this:
git clone http://ancientgreekocr.org/mignetools.git
Nick
On Thu, Oct 29, 2015 at 06:23:21PM +, Nick White wrote:
> Hi
Hi Alfred,
On Fri, Oct 23, 2015 at 01:11:55AM -0700, Alfred Puca wrote:
> I sent an attachment with the error using program from command line with psm
> option-4
Thanks for that. The first thing I notice is that you're using an
old version of tesseract (3.02). Can you update to the latest
Hi Alfred,
On Wed, Oct 21, 2015 at 01:16:22AM -0700, Alfred Puca wrote:
> I'm having problems with psm option 4 (Assume a single column of text of
> variable sizes).
> It seems as a bug in the application.
> How is it possible to use this option?
What problems are you having? Can you give an
Hi Avinash,
On Wed, Oct 21, 2015 at 01:40:35PM -0700, Avinash Mishra wrote:
> I don't have a VPS. Can anybody tell me how to install it on shared hosting?
The instructions for installing without root should be what you need:
On Wed, Sep 16, 2015 at 10:16:40PM +0530, ShreeDevi Kumar wrote:
> If you are having trouble using it with Java, Quan may be able to suggest a
> solution.
I agree, this sounds more like a Java issue to me. I don't know Java
at all, but if it's treating anything that sends output to stderr as
On Fri, Sep 11, 2015 at 12:13:02AM -0700, fsbo.cons...@gmail.com wrote:
> To anyone else who may run across this, it is because of the way C++ uses
> scope
> to optimize the code when it compiles. Things that are within the scope of the
> for loop will run faster than things that have larger
On Fri, Aug 21, 2015 at 02:13:17PM +0100, Allistair wrote:
This, I think, just illustrates there is no one-size-fits-all approach. All
methods should be enumerated for installing Tesseract for Mac.
I disagree. Mac OS X is a homogeneous enough system that we ought to
be able to do it right,
Hi all, long time since I last posted here.
This is just a little update about some training related tools I
wrote a while ago, the 'tesstrainingtools' collection. It has
largely been superseded by the training stuff that's included in
Tesseract now, but maybe someone will still find it
even with specific training?
Tom
On Wednesday, January 22, 2014 at 11:55:28 AM UTC-5, Nick White wrote:
Hi Epin,
On Sat, Jan 18, 2014 at 01:32:11AM -0800, Epin Dorsal wrote:
I've been looking for a soft means for recognition the
international phonetic transcription
On Fri, Aug 22, 2014 at 12:42:21PM -0700, Thomas Bruno wrote:
Is this common when training from text2image output?
APPLY_BOXES: boxfile line 5364/748 ((1488,893),(1532,6)): FAILURE! Couldn't
find a matching blob
FAIL!
Yes, there will be some of these. Check the proportion of failing to
On Wed, Aug 20, 2014 at 07:39:50PM -0700, SHEN Fei wrote:
hi Nick,
I'm trying to use tesseract in my mobile phone so the tessdata size is
critical.
Since I only care about very few fonts, it would be convenient if I could add/
remove a special font.
Maybe removing some dictionary files
On Thu, Aug 21, 2014 at 01:41:23PM +0530, Shree Devi Kumar wrote:
Hi Zdenko,
./ confusing for me :-)
:-) ./ is a common idiom for unix. '.' means 'current directory', so
./ means 'in the current directory'. You have to do it to run
programs in the current directory (or just do something
Hi Dovhani,
Does this happen with all images when using your training, or just
one?
Nick
On Thu, Aug 21, 2014 at 03:03:47AM -0700, Dovhani Foneworx wrote:
Hi guys, I have a problem: I have successfully trained tesseract 3.03 in Ubuntu
14.04 but when I run tesseract it is giving errors on an
On Thu, Aug 21, 2014 at 11:29:09AM -0700, shree wrote:
zdenko,
the current problem also seems related to strtok_r
please see
http://stackoverflow.com/questions/12973750/fatal-error-strtok-r-h-no-such-file-or-directory-while-compiling-tesseract-oc
Hi Shen,
On Wed, Aug 20, 2014 at 01:10:30AM -0700, SHEN Fei wrote:
Can I remove some fonts from an existing traineddata file?
For example, if I only need 2 or 3 common fonts of default eng.traineddata, is
there a way to extract them out of the original file?
No, I'm afraid not, not at the
Hi Thomas,
On Mon, Aug 18, 2014 at 02:17:19PM -0700, Thomas Bruno wrote:
Where can I find the box/tif combo for the eng.traineddata that Tesseract 3.02
provides for download?
The tif/box files used to create the eng.traineddata for 3.02 are
not available, and are very unlikely to be made so,
Hi Chris,
On Wed, Aug 20, 2014 at 11:12:50AM -0700, Chris Smeal wrote:
I've been doing some research on using Tesseract for both document scans and
text in scenery, and I was wondering what image processors are best? Given I
have a lot of images, I cannot process each batch by hand, so I will
On Wed, Aug 13, 2014 at 08:39:06AM -0700, Oliver Nicolini wrote:
A little up, I can't find any doc for this topic. If anyone can help that
would
be fantastic.
Did you read Paul's reply? Tesseract only does binarisation. If you
don't want it to do that, binarise your image before passing it
On Tue, Aug 12, 2014 at 12:58:23PM +0530, Shree Devi Kumar wrote:
On Tue, Aug 12, 2014 at 4:31 AM, testing1234
cory.hix...@gmail.com wrote:
Note.. Step 5 above the last command should be
sudo make install-langs
Nick, it may be helpful to add/update instructions in wiki.
Cory
Dear Wikisourcerers,
It's good to hear from you. Wikisource is awesome, as far as I am
concerned.
One
of the most serious issues was raised by the Belarusian community which uses 2
different scripts with no commercial OCR support. This means that the
volunteers have to type each word
On Thu, Jul 24, 2014 at 05:53:56AM -0700, Victoria A. wrote:
From my experience, seeing that Tesseract's English training data can
recognize
words that are NOT contained in the dictionary, I suppose Tesseract only uses
the custom dictionary for hints instead of only knowing the words in the
Hi David,
You're right, that would be useful. Tesseract has a basic version of
that, called patterns; search the manpage for a bit of information
on them.
However at present they can't be assigned per region, only as
possible patterns for the whole OCR job. Also they aren't
restrictive, but
On Tue, Jul 22, 2014 at 11:48:21PM +0200, zdenko podobny wrote:
If you want to have several versions of tesseract (e.g. you want to compare OCR
results) I would suggest you compile them from source (e.g. in /usr/src) and
not install them. If you want to test a particular version you can run
Hi Prashant,
On Wed, Aug 06, 2014 at 01:32:54AM -0700, Prashant Mahskey wrote:
I am using tesseract for my android app with arabic language. I've
copied all the files required from the language files download page. I've
tried
with gray scaling and cropping extra blank part from the
Hi Rara,
On Thu, Jul 31, 2014 at 08:29:51AM -0700, Rara wrote:
I'm searching for a detailed guide to development with Tesseract and a tutorial
explaining how to use and test this platform on Windows.
Looking forward to your answer !
There is an example program using the C API here:
Hi Albrecht,
Sorry for not replying sooner, I've been away.
Nevertheless I read a post from Ray where he says that he receives
millions of
emails and the last thing he likes to do is writing long texts (email
responses
or documentations). I think this is a fatal situation, because if he
Hi Fajar,
Looks like you should try binarising the image yourself prior to
handing it over to Tesseract.
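
As a rough illustration of what "binarising the image yourself" can mean, here is a minimal sketch of Otsu's global threshold in plain Python. It assumes the image is already an 8-bit greyscale grid (a list of rows of integers from 0 to 255); the function names are made up for this example and are not part of Tesseract or Leptonica. A real pipeline would more likely use an image library, and Otsu's method is only one of several reasonable choices.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that best separates dark from light pixels."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of grey levels of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximises this.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarise(rows):
    """Map every pixel to 0 (dark) or 255 (light) using Otsu's threshold."""
    flat = [p for row in rows for p in row]
    t = otsu_threshold(flat)
    return [[255 if p > t else 0 for p in row] for row in rows]


image = [[10, 10, 12], [200, 210, 205]]
print(binarise(image))
```

The binarised grid would then be saved as an image and passed to Tesseract in place of the original.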
Nick
--
You received this message because you are subscribed to the Google Groups
tesseract-ocr group.
To unsubscribe from this group and stop receiving emails from it, send an email
to
Hi Richard,
On Sun, Jul 20, 2014 at 01:51:32PM -0700, Richard Arnold wrote:
Stroke Width Transform looks very interesting. However, I have some questions
regarding its use in what I'm doing.
I'm writing a Desktop application and OpenOCR appears to use a web service
call??
Stroke Width
On Wed, Aug 06, 2014 at 08:50:27PM +0530, Shree Devi Kumar wrote:
My current plan for documentation is as follows:
- Rewrite and simplify TrainingTesseract3 on the wiki
- Write manpages for each tool in training/
- Document how each training file is used, and
On Wed, Jul 16, 2014 at 11:17:00PM -0700, Jing JC wrote:
I am going through Ray Smith's tutorial, and don't get it?
He means that as the co-ordinate system uses bottom left as the
origin, you will never get a minus number co-ordinate (as you could
if the origin was elsewhere).
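
To make the point about the origin concrete, here is a small sketch that converts one box from a bottom-left origin to the top-left origin most image tools use. It assumes the usual box file field order (left, bottom, right, top) and that you know the image height in pixels; the function name is invented for this example.

```python
def box_to_top_left(left, bottom, right, top, image_height):
    """Convert a bottom-left-origin box to top-left-origin (x0, y0, x1, y1)."""
    # x values are unchanged; y values are measured from the other edge.
    return (left, image_height - top, right, image_height - bottom)


# A box 20px above the bottom edge of a 100px-tall image.
print(box_to_top_left(10, 20, 50, 60, 100))  # (10, 40, 50, 80)
```

With a bottom-left origin every coordinate inside the image is zero or positive, which is why negative values never appear.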
On Thu, Jul 17, 2014 at 12:14:43AM -0700, Jing JC wrote:
The Ray's tutorial said the bounding box overlaps.
so when I modify the box inside JTessbox,
do I keep the overlapping boxes,
or
make the boxes non touching.
That's interesting, actually; I didn't realise Tesseract did
outlining
Hi,
The part you aren't reading closely enough from the manual page is:
properties
An integer mask of character properties, one per bit. From least
to most significant bit, these are: isalpha, islower, isupper,
isdigit, ispunctuation.
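
The bit layout quoted above can be decoded with a few lines of Python. This is only a sketch based on that manual page excerpt; the helper name is made up, and it looks at nothing beyond the five bits listed there.

```python
# Property names in order from least to most significant bit,
# as listed in the manual page excerpt.
PROPERTY_BITS = ("isalpha", "islower", "isupper", "isdigit", "ispunctuation")


def decode_properties(mask):
    """Return the names of the properties whose bit is set in mask."""
    return [name for bit, name in enumerate(PROPERTY_BITS) if mask & (1 << bit)]


print(decode_properties(16))  # ['ispunctuation']
print(decode_properties(3))   # ['isalpha', 'islower']
print(decode_properties(8))   # ['isdigit']
```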
So ; has ispunctuation set, but none of the
Hi Albrecht,
On Mon, Jul 14, 2014 at 01:10:07PM -0700, Albrecht Hilker wrote:
When I download the traineddata files and extract the unicharset file from
them
I notice that some are extremely different from the ones on SVN in the folder
training/langdata.
For example:
Bengali, Hebrew,
Hi again,
On Mon, Jul 14, 2014 at 09:38:26AM -0700, Albrecht Hilker wrote:
After some days I came back here and I'm very surprised about your lots of
posts.
Thanks for answering and taking the time.
As you may have noticed, there aren't too many people around here
who are comfortable
Sorry for the noise. I've looked into this more, and discovered more
:)
On Tue, Jul 15, 2014 at 10:54:06AM -0400, Nick White wrote:
On Mon, Jul 14, 2014 at 01:10:07PM -0700, Albrecht Hilker wrote:
When I download the traineddata files and extract the unicharset file from
them
I notice
Hi Mustak,
On Tue, Jul 15, 2014 at 03:14:35AM -0700, Mustak M wrote:
I am new to tesseract. I am using tesseract 3.2. I am able to retrieve the
text
from an image. And able to get the co-ordinates for each word with tesseract
source.jpg output hocr command. Is there any command to retrieve
Hi,
On Tue, Jul 15, 2014 at 10:04:24AM -0700, Jing JC wrote:
yep yep.
Thanks a lot Nick.
I tried to cancel my post last night.
but seems I can not get access to it after posted but before approved.
I tried to match the V2's example to V3's format.
I figured it out later.
No
On Mon, Jul 14, 2014 at 11:36:46AM -0700, Paul wrote:
Am Montag, 14. Juli 2014 10:07:59 UTC+2 schrieb sibi kanagaraj:
But , I feel that Tamil Training is not sufficient and it
could be
streamlined . Hence I went to see if there are sufficient training
documents for Tamil .
On Mon, Jul 14, 2014 at 07:38:19AM -0700, Christopher Smeenk wrote:
I found the source for v3.03 here: http://packages.ubuntu.com/trusty/tesseract-ocr
The version called 3.03 in Ubuntu is an -rc - there is no official
3.03 release yet. As I understand it Ray Jeff called it 3.03 so
that
I build the tesseract svn source code in win8, I used the
VS2013/Cygwin/MinGW to build this, all failed.
Hi, you need to give us more clues as to why it failed. What error
messages did you get?
what version of leptonica the newest svn use? 1.70 or 1.71?
Tesseract should work fine with
On Sun, Jul 13, 2014 at 06:38:11PM +0430, universal reseller wrote:
is google drive use tesseract 3.03 ?
It's -rc1, meaning release candidate 1. So it isn't an official
release, but rather a testing preview release, which should be close to
what the final 3.03 will be.
i checked one english pdf
On Fri, Jul 11, 2014 at 03:06:29PM -0700, Alex Ryan wrote:
I wrote some simple code to preprocess the image because I realized I will be
doing basically the same image every time so it's foolish to try and use
Tesseract's binarization technique which was designed for a different and more
Hi Alex,
One quick thought, if you're still using .uzn, it's only loaded with
certain psm levels (it is with -psm 6, but not -psm 3, the default).
And it's loaded from imagename_without_extension.uzn. So if you
have any .uzn files lying around, they will be being applied with
psm 6, but not
Hi,
I haven't tried it, but quickly grepping around the source code
suggests setting the config variable crunch_include_numerals to
true might do the job.
Please let us know if that works.
Nick
On Wed, Jul 09, 2014 at 11:15:10PM -0700, Damien D wrote:
Hi everyone,
tesseract seems to
I'm just going to go through your numbered points here.
On Fri, Jul 04, 2014 at 10:02:43AM -0700, Albrecht Hilker wrote:
1.)
The column other_case should contain the ID of the other-case letter.
For the lowercase letters they point correctly to the uppercase letters.
But the uppercase letters
On Sat, Jul 05, 2014 at 03:34:05PM -0700, Albrecht Hilker wrote:
Hello zdenop
It is clear that you are not the right person to answer this question.
If YOU would ever have looked into the source code you have seen that these
values ARE in use (in version 3.03).
You're being pretty unfair on
I have more thoughts to the unicharset metrics discussion.
So this example says that
the character 1 has a min_bottom value of 59 and
the character 9 has a min_bottom value of 18.
Weird ? ? ?
Both numbers are aligned to the baseline!
I am guessing now (I'll take a look at the code later),
On Tue, Jul 08, 2014 at 10:36:50PM -0700, Alex Ryan wrote:
In one of the links tho I saw something about -psm setting. When I run the OCR
with -psm 6 all of a sudden it worked perfectly! I'm really not sure what that
setting does. I've tried doing some searches, but I'm still unclear. Can you
V
£
4
9
Q
A
P
¢
]
3
2
©
8
/
X
é
j
;
7
€
O
¥
U
x
}
E
§
=
!
’
G
)
Z
q
{
“
—
Y
K
*
W
\
°
fi
‘
_
fl
/*
* Copyright 2014 Nick White nick.wh...@durham.ac.uk
*
* Licensed under the Apache License, Version 2.0 (the License);
* you may not use this file except in compliance with the License.
* http
On Wed, Jul 09, 2014 at 03:16:08AM -0700, Paul wrote:
How about using ImageJ (can be automated with macros) to create a better
binary
result of the image.
Thanks for mentioning this; I hadn't heard of it and it sounds very
useful. I added a link to the ImproveQuality wiki page.
Nick
On Wed, Jul 09, 2014 at 09:48:20AM -0700, Rani Yaroshinski wrote:
From the point of view of the performance measures of the OCR ?
I don't think anybody has figures on this. You could do some tests
yourself, and let us know the results.
I would guess that file size would be a bigger slowdown
On Wed, Jul 09, 2014 at 09:50:01AM -0700, Rani Yaroshinski wrote:
In order to improve the accuracy of the OCR results ?
Yes, it is, if you know more details about the images you'll be
using, so can do better than Tesseract's guesses.
See
On Tue, Jul 08, 2014 at 10:49:49PM -0700, shree wrote:
My information IS dated - I haven't followed the recent changes. Please see
this thread - almost a year old which talked of the upcoming changes for
training
Hi Albrecht,
On Thu, Jul 03, 2014 at 09:40:51PM -0700, Albrecht Hilker wrote:
Generally it is very sad that there is no detailed documentation about
Tesseract.
I agree. I do work on the documentation, but there is an awful lot
missing. I appreciate you taking the time to ask questions here
Hi Alex,
If you're up for some programming, you could recognise the squares
yourself, and pass each one separately to tesseract with the
PSM_SINGLE_CHAR segmentation type. That should help if Tesseract is
not segmenting each whole square separately.
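
As a sketch of the first half of that idea, the snippet below works out one rectangle per square for a board made of an evenly spaced grid. It assumes the board fills the whole image; the function name is made up. Each tuple is in (left, top, width, height) order so it could be handed to something like the API's SetRectangle before recognising that cell in single-character mode.

```python
def grid_cells(width, height, rows, cols):
    """Return (left, top, width, height) for each cell, row by row."""
    cells = []
    for r in range(rows):
        top = r * height // rows
        bottom = (r + 1) * height // rows
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            cells.append((left, top, right - left, bottom - top))
    return cells


# A 90x90 image holding a 3x3 board: the centre square.
print(grid_cells(90, 90, 3, 3)[4])  # (30, 30, 30, 30)
```

Integer division keeps the cells contiguous even when the image size is not an exact multiple of the grid size.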
If the board is always the same size, you
On Fri, Jul 04, 2014 at 02:08:46AM -0700, Meenal Goyal wrote:
If you're sure that all the words you will encounter will be in the
dictionary this should help somewhat:
https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_to_increase_the_trust_in/strength_of_the_dictionary?
On Fri, Jul 04, 2014 at 02:15:52AM -0700, Iskander Sharipov wrote:
I need to create new tessdata language, which is very similar to russian in
charset.
Every time I try to do so by training tesseract on a box containing needed
letters I get new traineddata,
which actually can recognize new
On Wed, Jul 02, 2014 at 10:26:16PM -0700, Meenal Goyal wrote:
The post about question about training tesseract only suggests some
pre-processing steps which include binarisation and I have already tried
them.
I wanted to know if anything can be done to improve output at later stage,
Hi Artur,
On Wed, Jul 02, 2014 at 10:18:55PM -0300, Artur Augusto wrote:
As many people ask about how to use tesseract to read 7 segments display, I
decided to publish an open source sample project.
If someone wanna check it: https://github.com/arturahttps://github.com/
Hi Elena,
Just a guess, but maybe this line:
api->SetSourceResolution(600);
is the source of your troubles? Tesseract from the command line
would have just been guessing it, and perhaps its guess, coupled
with its ideas about different sizes of fonts, were better than
yours?
Nick
. , their’ are some words which may not be
considered
fully as noise but they get filtered out after regex matching.
Also, Is there any way to retrain tesseract for improving results in such
cases? Any feedback mechanism which can help improve?
On Tuesday, July 1, 2014 8:52:35 PM UTC+5:30, Nick
On Mon, Jun 30, 2014 at 10:42:41PM -0700, nirali kanani wrote:
is there Tesseract - ocr v 3.03 exe available anywhere ?
Tesseract v3.03 hasn't been released yet (except as a pre-release
version in the latest ubuntu). The code is unlikely to change a lot
from what's currently in SVN, so you
Hi,
On Mon, Jun 30, 2014 at 09:25:23PM -0700, 韩煦深 wrote:
I'm a Chinese student and I want to use the tesseract-ocr in our linux system.
I have Ubuntu OS and I install tesseract in my ubuntu system.
But I don't know how to use C++ API in linux system because all the examples
are based on VC++
Hi Meena,
On Tue, Jul 01, 2014 at 02:04:36AM -0700, Meenal Goyal wrote:
When I try to ocr an image, it also produces some noise apart from the
meaningful words. An example output for an image is:
All women become
like their’ mqthers. _ ' 1’ '
- —T at-{rs their tragedy. ” R-‘»“T‘*'-.
Hi Meenal,
On Mon, Jun 30, 2014 at 01:40:10AM -0700, Meenal Goyal wrote:
When i run tesseract on my image, it produces some words not present in the
dictionary. Is there some way to directly get the list of these words and
prevent tesseract from showing them in the output.
Example of such
Hi Scott,
On Fri, Jun 27, 2014 at 09:39:21PM -0700, scott.ha...@gmail.com wrote:
Hi all. Firstly let me say I am totally blown away by Tesseract, it vastly
exceeded my expectations for an open source OCR project. I have an
application
(http://hackaday.io/project/1569-NSA-Away) that
On Fri, Jun 27, 2014 at 01:48:52AM -0700, thinker wrote:
reading image with multiple language (arabic and english) by using -l
ara+eng option gives garbage output.
There are currently a couple of bugs with combining Arabic and
English together, so it isn't working. I'd recommend you add any
Hi Mori,
On Fri, Jun 27, 2014 at 01:51:01AM -0700, morteza neishaboori wrote:
I want to use OCR to detect small words in images containing indoor signs and
etc
you can find some sample images in the link below to get the idea
Hi Sheeyam, sorry for not replying to your emails sooner.
On Sun, Jun 22, 2014 at 04:43:27AM -0700, sheeyam shellvacumar wrote:
Does Tesseract support sinhala. How do u guys train them ??? Actually i am
confused help me
It looks like some people have trained Tesseract for Sinhala; see
Hi Paulo,
On Mon, Jun 23, 2014 at 10:11:28AM -0700, Paulo Basilio wrote:
Good day, I am trying to develop a mobile app that can read cursive
handwriting
(doctor's handwriting to be exact). My question is, can tesseract-ocr read
cursive handwriting? If not, can someone give me suggestion for
Hi Raghavan,
On Tue, Jun 24, 2014 at 06:58:56AM -0700, Raghavan P wrote:
When i try to make use of tesseract classes like BLOCK_IT and BLOCK_LINE_IT, I
am getting the error it was not declared in this scope.
May i know what header should i bring in or what am i missing here?
Are you using the
On Fri, Jun 27, 2014 at 04:57:30PM -0400, Nick White wrote:
On Mon, Jun 23, 2014 at 10:11:28AM -0700, Paulo Basilio wrote:
Good day, I am trying to develop a mobile app that can read cursive
handwriting
(doctor's handwriting to be exact). My question is, can tesseract-ocr read
cursive
Hi Jack,
I replied privately, but the gist is that VietOCR is a graphical
program that makes Tesseract easier to use on a Mac (as well as
Linux and Windows).
Nick
On Thu, Jun 26, 2014 at 08:55:19AM -0700, Jack Kershaw wrote:
I am an ancient greek student currently studying A levels. I have been
On Mon, Jun 23, 2014 at 08:32:52AM -0700, Traun Leyden wrote:
One more thing that document should have is a mention of Stroke Width
Transform
to improve OCR recognition on images that have a lot of non-text content.
Oh cool, that looks great! I definitely will add that to the wiki
page
Hi Eddie,
I'd suggest contacting the author of the PHP wrapper, that isn't
something provided by the core Tesseract project, and it doesn't
look like any issue with Tesseract proper, just with the caller.
Nick
On Wed, Jun 25, 2014 at 12:36:59AM -0700, Eddie G wrote:
I'm using the PHP
Hi Amar,
If you can wait for the release of Tesseract 3.03 (or compile the
latest version from SVN), that has PDF output built in.
Nick
On Mon, Jun 23, 2014 at 12:19:52AM -0700, Amar wrote:
Hello dear friends, Is HOCR2PDF command line tool limited only to non-windows
platforms? I could not
Hi Traun,
Any tips on doing pre-processing on the images to improve the
recognition?
The place to start would be here:
https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality
Nick
Hi Ketut,
On Tue, Jun 10, 2014 at 11:30:39PM -0700, ketut ariasa wrote:
I have a very limited OCR application using tesseract, where I want to
recognize only 8 letters and numbers begin with the letter 'D'.
Is there a way to restrict tesseract to attempt to
recognize only 8 digits letters
On Wed, Jun 18, 2014 at 07:30:03AM -0700, Paul wrote:
That upper bound actually might be the root of your problem. If you've already
compiled Tesseract on your own,
try to use a greater number for kMaxUserDawgEdges. If you have not, you could
either reduce the number of
words in your
On Thu, Jun 05, 2014 at 01:51:24PM +0200, zdenko podobny wrote:
On Thu, Jun 5, 2014 at 12:10 PM, 'thakobyan' via tesseract-ocr
tesseract-ocr@googlegroups.com wrote:
Trying to OCR the portion of the image. For some reason if I
cut only one word (see Fail.png and Fail2.png attached)
/* Copyright 2014 Nick White
*
* Licensed under the Apache License, Version 2.0 (the License);
* you
Hi Debayan,
On Wed, Jun 04, 2014 at 01:53:54PM +0530, Debayan Banerjee wrote:
I am contemplating porting the classifier to a deep neural net, probably
https://github.com/BVLC/caffe. Anyone already working on this?
This should allow Tesseract to recognise some of the more complicated
work at Georgian.
Will revert later and share my little experience.
среда, 28 мая 2014 г., 19:26:55 UTC+4 пользователь Nick White написал:
Hi all,
Resurrecting an old thread: has anyone got anywhere training
tesseract for Georgian? Or tried
Hi Krijesh,
On Thu, May 29, 2014 at 03:09:21AM -0700, Krijesh PV wrote:
But as you said the command
1. tesseract 8531_001.3B.tif 8531_001.3B_uzn -psm 4 is generating
a 8531_001.3B_uzn.txt file. I am not able to get a uzn file. I need to
generate a template for my image contents.
Hi Krijesh,
On Thu, May 29, 2014 at 07:38:17AM -0700, Krijesh PV wrote:
I am a complete novice on this topic. Can you please explain the complete
process? How can I create these uzn files? Are there any tools for that?
There aren't any tools to create uzn files, that I know of. You can
see how
On Wed, May 28, 2014 at 06:09:00AM -0700, Lutz Wittenmayer wrote:
I made a copy of the eurotext.tif and inserted it into the directory where
the tesseract.exe is located. Same error message
Finally I also placed this tif into the directory tessdata.
I always got the message Cannot open
Hi Bernardo,
On Mon, May 26, 2014 at 03:58:22PM -0700, Bernardo Meurer wrote:
I'm in need of some help, I was wondering if it would be possible to use
tesseract to read number plates as the one in the image below. If that is
doable, if anyone could give me some directions of where to start it
Hi Bernardo,
On Tue, May 27, 2014 at 01:36:58PM -0700, Bernardo Meurer wrote:
Now, I found this sample code which I am trying to test on my plates to see if
it's successful. I am getting compile errors when I try to run it, but I'm not
sure this is the place to ask for help in such way. If it's
Hi Michael.
On Sat, May 24, 2014 at 10:38:57PM -0700, Michael Yang wrote:
I'm able to compile the text2image training tool, however, I can't seem to get
it to work. I've confirmed that the viewer works with the included tesseract
tests. I've included the output below. Any help is much
Hi Przemysław,
On Sat, May 24, 2014 at 04:11:32AM -0700, Przemysław Woźniak wrote:
The problem which I encountered is that hOCR output that I produce using C++
code isn't the same as what I get using tesseract.exe from Windows console.
I'm
speaking of course about the accuracy of words