the outer outline of the
big white blob is very long). One of Tess's notable features is that
it can handle inverted text. Though it should be able to get all
outlines, and you've helped him to achieve this.
Warm regards,
Dmitry Silaev
On Wed, Mar 16, 2011 at 10:57 AM, Ice Head iceh
Manuel,
I'm afraid just chaining command line tools won't help in this case.
I'm talking about programming.
And yes, I did solve many practical problems related to layout
analysis, and other fields of document image processing, and succeeded
in it ))
Warm regards,
Dmitry Silaev
On Mon, Mar
.
Warm regards,
Dmitry Silaev
On Mon, Mar 14, 2011 at 8:23 AM, David Hoffer dhoff...@gmail.com wrote:
Hi Vicky,
Can you tell me more about this paper? It looks like this is not a
free document so I can't just read it to see if it would solve the
problem I have.
My problem is that I have
Actually, there's more than just VietOCR. Check this:
http://en.wikipedia.org/wiki/Tesseract_(software)#User_interfaces
Warm regards,
Dmitry Silaev
On Mon, Mar 14, 2011 at 2:13 AM, Onion onionzwie...@gmail.com wrote:
Ok, thanks. That will be too complicated for me to use. Will have
You don't need to bother using *two together*. Tesseract is a basis
FreeOCR is built on, so these two are together already. FreeOCR's
graphic interface is quite user friendly. Just install and use. I
don't know what else needs to be said ))
Warm regards,
Dmitry Silaev
On Mon, Mar 14, 2011
Ehmm... I don't get it. If you've succeeded in using iterators, it's
at your full disposal to format the output in any way you want
programmatically, isn't it?
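
To illustrate the point, here is a minimal sketch of re-ordering results yourself once you have them. It assumes each word has already been pulled out of the iterator as a (text, left, top, right, bottom) record; that record layout, the sample data and the tolerance value are made up for the example and are not part of Tesseract's API.

```python
from typing import List, Tuple

# Hypothetical record: (text, left, top, right, bottom), standing in for
# whatever you collect from a word-level iterator.
Word = Tuple[str, int, int, int, int]


def group_into_lines(words: List[Word], tolerance: int = 10) -> List[str]:
    """Group words into text lines by top coordinate, then order left to right."""
    lines: List[List[Word]] = []
    for word in sorted(words, key=lambda w: w[2]):
        # Same line if the top edge is close to that of the line's first word.
        if lines and abs(word[2] - lines[-1][0][2]) <= tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    return [" ".join(w[0] for w in sorted(line, key=lambda w: w[1]))
            for line in lines]


words = [
    ("world", 120, 12, 200, 40),
    ("Hello", 10, 10, 100, 40),
    ("line", 90, 62, 150, 90),
    ("Second", 10, 60, 80, 90),
]
print(group_into_lines(words))
```

The same idea works for any other ordering: sort or filter the collected records however you like before printing them.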
Warm regards,
Dmitry Silaev
On Mon, Mar 14, 2011 at 1:56 PM, Jose diox...@gmail.com wrote:
*I only modify how the result is printed
Dave,
Yep, quality is relatively poor so don't expect high accuracy from Tess.
Do you need every table cell's contents? Or getting numbers is just
enough and in a next step you can restore [predefined] item names?
Warm regards,
Dmitry Silaev
On Mon, Mar 14, 2011 at 4:19 PM, David Hoffer
Dave,
What is the format and resolution in which you initially get your
images? For such poor quality every conversion makes an image even
worse...
Warm regards,
Dmitry Silaev
On Mon, Mar 14, 2011 at 5:29 PM, David Hoffer dhoff...@gmail.com wrote:
Dmitry,
Would using a loss-less format
#f98699a9caf36dbc
If you see no clues in these posts then you need to send your sample
images, there's no other way to help you.
Warm regards,
Dmitry Silaev
On Mon, Mar 14, 2011 at 5:22 PM, manuel...@gmail.com
manuel...@gmail.com wrote:
Thanks.
I need a GUI that tells to tesseract to recognize just
As far as I can see, your source data can be regarded as a 1-bit (binary),
losslessly compressed image. So a lossless conversion to any image
format (it makes no difference which) will do no harm.
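
A tiny sketch of what "lossless" means for 1-bit data: every pixel survives an encode/decode round trip. Run-length coding is used here only as a stand-in for a real lossless format; the function names are invented for the example.

```python
from typing import List, Tuple


def rle_encode(bits: List[int]) -> List[Tuple[int, int]]:
    """Encode a row of 0/1 pixels as (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            runs[-1] = (bit, runs[-1][1] + 1)
        else:
            runs.append((bit, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into a row of pixels."""
    out: List[int] = []
    for bit, count in runs:
        out.extend([bit] * count)
    return out


row = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
encoded = rle_encode(row)
# The decoded row is identical to the original: nothing was lost.
print(rle_decode(encoded) == row)
```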
Warm regards,
Dmitry Silaev
On Tue, Mar 15, 2011 at 8:31 AM, David Hoffer dhoff...@gmail.com wrote:
Dmitry
own opinion, and it does not necessarily coincide with
the views of other document image processing people.
Warm regards,
Dmitry Silaev
On Sun, Mar 13, 2011 at 12:52 AM, TP wing...@gmail.com wrote:
How about this technique mentioned in the Leptonica documentation (it's
even easier if you can
this was tested under Windows. I can probably try this
under Ubuntu, but I don't know when I'll have enough time to reboot, set
up a C++ compiler, build Tesseract and do some testing, sorry ))
Are you sure you downloaded the latest stable version of Tesseract?
Warm regards,
Dmitry Silaev
On Thu
-ocr/wiki/ReadMe#Windows
Warm regards,
Dmitry Silaev
On Sun, Mar 13, 2011 at 11:36 PM, Onion onionzwie...@gmail.com wrote:
I installed Tesseract 3.00 and the German and Czech languages as well as
English.
Now how do I run it? Are there directions somewhere?
When I click Start Tesseract OCR
as with it, and always
the result was satisfactory.
Let me know the details on your command line and OS.
Warm regards,
Dmitry Silaev
On Sun, Mar 13, 2011 at 11:18 PM, patrickq
patrick.questemb...@gmail.com wrote:
You expect way too much from Tesseract: it's not Tesseract's job to
slice and dice
Tesseract's layout
analysis. Then go PSM_SINGLE_LINE and PSM_SINGLE_BLOCK. However for
PSM_SINGLE_WORD or PSM_SINGLE_CHAR you'd need to do your own
segmentation. I don't know if you are ready to dive into such serious
development.
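
To give an idea of what "your own segmentation" involves at its very simplest, here is a sketch that splits a binary image into pieces wherever a completely empty column occurs (a vertical projection profile). It assumes a clean, deskewed image stored as a list of rows of 0/1 values, with 1 meaning ink; touching or broken characters need far more than this.

```python
from typing import List, Tuple


def column_segments(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    width = len(image[0]) if image else 0
    # Amount of ink in every column.
    profile = [sum(row[x] for row in image) for x in range(width)]
    segments: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x                      # a segment begins
        elif not ink and start is not None:
            segments.append((start, x))    # a blank column ends it
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


image = [
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
]
print(column_segments(image))
```

Each range could then be cropped and passed on as a separate single-character or single-word image.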
HTH
Warm regards,
Dmitry Silaev
On Sat, Mar 12, 2011 at 7:39 AM
. You need to remove
lines and borders and pass the cleaned image to Tesseract. Many issues
can arise in this process, but I think there's no need to go into them
now, unless you express some interest in it.
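
A rough sketch of the line-removal idea, assuming a binary image held as a list of rows of 0/1 values (1 is ink) with perfectly horizontal and vertical rules. Any run of ink at least min_length pixels long is treated as a rule and cleared; the function names and the threshold are invented for the example, and real scans (skew, broken lines, text touching a border) need more care.

```python
def long_run_mask(image, min_length):
    """Mark pixels belonging to horizontal ink runs of at least min_length."""
    mask = [[0] * len(row) for row in image]
    for y, row in enumerate(image):
        x = 0
        while x < len(row):
            if row[x]:
                end = x
                while end < len(row) and row[end]:
                    end += 1
                if end - x >= min_length:
                    for i in range(x, end):
                        mask[y][i] = 1
                x = end
            else:
                x += 1
    return mask


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def remove_lines(image, min_length):
    """Clear long horizontal and vertical runs, keep everything else."""
    horizontal = long_run_mask(image, min_length)
    # Vertical runs are horizontal runs of the transposed image.
    vertical = transpose(long_run_mask(transpose(image), min_length))
    return [
        [0 if horizontal[y][x] or vertical[y][x] else pixel
         for x, pixel in enumerate(row)]
        for y, row in enumerate(image)
    ]


table_cell = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
for row in remove_lines(table_cell, min_length=4):
    print(row)
```

Both masks are computed on the original image, so a border is still recognised as long even where it crosses another one.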
Warm regards,
Dmitry Silaev
On Fri, Mar 11, 2011 at 7:21 AM
Try textord_words_min_minspace, fraction of x-height
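
One way to try it is through a config file, which holds one "name value" pair per line and is named after the output base on the command line. The sketch below only writes such a file and assembles the command; the file names and the value 0.2 are purely illustrative assumptions, so tune the number on your own images.

```python
import os
import tempfile


def write_config(path, settings):
    """Write Tesseract-style config lines: one 'name value' pair per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for name, value in settings.items():
            handle.write(f"{name} {value}\n")


def build_command(image, output_base, config_path):
    # General shape: tesseract <image> <outputbase> [configfile]
    return ["tesseract", image, output_base, config_path]


# Hypothetical file names and an illustrative value, not a recommendation.
config_path = os.path.join(tempfile.mkdtemp(), "spacing")
write_config(config_path, {"textord_words_min_minspace": 0.2})
command = build_command("statement.tif", "statement", config_path)
print(" ".join(command))
```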
Warm regards,
Dmitry Silaev
On Mon, Mar 7, 2011 at 8:28 PM, JMW white.j...@gmail.com wrote:
I'm having some consistent problems with lack of white space between
words. I.e. Thisisyour statementthatshows theamount you owe
foryour
need to extend your pre-processing in order to feed Tess
with images that actually contain text. Decisions can be made based on
contrast estimation, distinctive color distribution, etc.
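
As a rough illustration of a contrast-based decision, the sketch below uses the standard deviation of grey levels in a region (RMS contrast) and a cut-off. The cut-off of 40 and the sample pixel values are assumptions made up for the example; a real filter would be tuned on your own images and probably combined with other cues.

```python
from statistics import pstdev


def contrast(gray_pixels):
    """RMS contrast: population standard deviation of grey levels (0-255)."""
    return pstdev(gray_pixels)


def looks_like_text(gray_pixels, min_contrast=40.0):
    """Crude filter: text regions tend to mix very dark and very light pixels."""
    return contrast(gray_pixels) >= min_contrast


flat_region = [120, 122, 119, 121, 120, 118]   # nearly uniform background
text_region = [250, 10, 245, 15, 240, 20]      # dark strokes on light paper

print(looks_like_text(flat_region))
print(looks_like_text(text_region))
```

Only regions that pass such a check would be handed over for recognition.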
HTH
Warm regards,
Dmitry Silaev
On Fri, Mar 4, 2011 at 5:25 PM, zdravco zdra...@gmail.com wrote:
Hello,
I am
method based on edge
detector.PDF
HTH
Warm regards,
Dmitry Silaev
On Sat, Mar 5, 2011 at 8:56 AM, Saurabh Gandhi saurabh...@gmail.com wrote:
Hey,
Any algorithm / whitepaper suggestions for text extraction, especially if
the text is not over-lay text but a part of the image itself
Sriranga,
Thanks for letting me know. You are the first one then, and I invented
the bicycle ))
However, an article might still be more useful than a verbose forum discussion...
Maybe you'd like to write it then?
Warm regards,
Dmitry Silaev
On Thu, Mar 3, 2011 at 3:55 PM, Sriranga(78yrsold
in programming
can make this traineddata file himself ))
Warm regards,
Dmitry Silaev
On Thu, Mar 3, 2011 at 5:08 PM, Sriranga(78yrsold)
withblessi...@gmail.com wrote:
Dmitry,
No, I am NOT the first to invent it; the credit actually goes to spohor...@sjm.com
-who helped me a great deal, including
Manuel,
Is the error message generated by version 2.xx? Did you try to run
version 3.xx with my por.traineddata file?
I don't get it - have you succeeded or not?
Please provide us with the image you are trying to recognize.
Warm regards,
Dmitry Silaev
On Thu, Mar 3, 2011 at 5:34 PM, manuel
Without any image samples, you can only get vague advice.
Provide the community with samples and you might get a satisfactory,
concrete response.
Warm regards,
Dmitry Silaev
On Wed, Mar 2, 2011 at 1:43 PM, Cong Nguyen congnguye...@gmail.com wrote:
Please be careful with the Otsu algorithm
,
Dmitry Silaev
On Thu, Feb 24, 2011 at 1:05 PM, Jose diox...@gmail.com wrote:
Hi, (as you know, Saurabh, because we talked in private the other day) I tried
the PSM_SINGLE_COLUMN and the accuracy drops dramatically... I can't afford
to lose that accuracy. Is it possible to change the way the output
,
Dmitry Silaev
On Thu, Feb 24, 2011 at 1:50 PM, Jose diox...@gmail.com wrote:
Dmitry, the recognition works; the only thing is the way it is parsing it...
:S I think segmentation of the images would be far too painful! I only
want to change the order that is displayed or the bounding boxes so I
The best way to explain everything would be just to send your source
image examples, describe what information you want to get from them
and provide the community with the code snippets you use to interface
with Tess. And please be as detailed as possible.
Warm regards,
Dmitry Silaev
On Thu
Interesting. I have been wondering about Cube since its traces began to
appear in the source code but haven't had enough time to investigate it
thoroughly
Zdenko, would you please kindly share your other findings on Cube?
Regards,
Dmitry
On Tue, Feb 22, 2011 at 11:13 AM, zdenko podobny zde...@gmail.com
I might not have understood you fully, but this is an obvious excerpt from
baseapi.h:
Each SetRectangle clears the recognition results so multiple
rectangles
can be recognized with the same image
Indeed, SetRectangle() calls ClearResults() which deletes the pageres
and
clears the block list ready for
Hi Zvezdoslav,
Check out the code of the Classify::EndAdaptiveClassifier() and
Classify::InitAdaptiveClassifier() methods.
Also search for classify_use_pre_adapted_templates and
classify_save_adapted_templates
HTH
Regards,
Dmitry
On Feb 16, 4:50 pm, Zvezdoslav Kunov z.ku...@gmail.com wrote:
Jon,
I don't know if it's intended, but all your links to images report
"We're sorry. The page you tried to access is not available." As it
stands, no advice can be given on your issue...
Warm regards,
Dmitry Silaev
On Mon, Feb 21, 2011 at 5:02 AM, Jon Andersen jande...@gmail.com wrote:
Hi,
My
.
Instead of rummaging in Tess's guts I'd rather use a pretty convenient and
high-level interface provided by ResultIterator (see GetIterator() in
baseapi.h and then read all comments in resultiterator.h and
pageiterator.h)
Warm regards,
Dmitry Silaev
On Wed, Feb 16, 2011 at 5:34 AM, devTess
no value ((
*** I'm still seeking somebody's help regarding this topic's subject.
***
Warm regards,
Dmitry Silaev
2011/2/8 Sriranga(78yrsold) withblessi...@gmail.com
Dmitry,
Congratulations !! successfully installed in winXP and tried using
phototest.tif
1st commandline tesseract
step-by-step debugging is also of use ))
Warm regards,
Dmitry Silaev
On Tue, Feb 8, 2011 at 6:44 PM, devTess jim...@googlemail.com wrote:
Hi Dimitry, with the guidelines provided from you, I prepared a strong
cup of coffee and started reading the top part of baseapi.h
Q1
Init(datapath
the reasonable forum post size here in Google Groups, I placed the
more verbose and overall nicer looking instructions in my blog at
http://rdaemons.blogspot.com/2011/02/tesseract-ocr-setting-up-interactive.html
Warm regards,
Dmitry Silaev
2011/2/6 Sriranga(78yrsold) withblessi...@gmail.com
Dear
appropriate people to do this job.
Warm regards,
Dmitry Silaev
--
You received this message because you are subscribed to the Google Groups
tesseract-ocr group.
To post to this group, send email to tesseract-ocr@googlegroups.com.
To unsubscribe from this group, send email to
tesseract-ocr
not much of a recent graduate already ((
Warm regards,
Dmitry Silaev
to the
entire glyph combination. Then during the post-processing you'll need to
replace this single code with a predefined dependent Unicode pair.
Hope I've managed to express myself clearly.
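
The post-processing step can be sketched like this. It assumes the combinations were trained under single placeholder code points (Private Use Area values are used here, which is my own choice) and that a table maps each placeholder back to its real Unicode pair. The table contents are invented and use Latin letters with combining accents only to keep the example readable.

```python
# Hypothetical table: one placeholder code point per trained glyph combination,
# mapped to the Unicode pair it really stands for.
PAIRS = {
    "\uE000": "a\u0301",  # 'a' + combining acute accent
    "\uE001": "e\u0301",  # 'e' + combining acute accent
}


def expand_placeholders(text, table):
    """Replace every placeholder character with its predefined Unicode pair."""
    return "".join(table.get(ch, ch) for ch in text)


recognized = "caf\uE001"   # what the engine would output under this scheme
restored = expand_placeholders(recognized, PAIRS)
print(len(recognized), len(restored))
```

Characters that are not in the table pass through unchanged, so ordinary text is not affected.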
Warm regards,
Dmitry Silaev
Dear Sochenda,
I've checked the Unicode table range you've sent and now I see what the
problem is. I'd agree that in such an algorithmic writing system (as opposed
to simpler positional systems like, say, Roman or Cyrillic) the stages of
pre-/post-processing are inevitable.
I'd suggest making
://code.google.com/p/tesseract-ocr/wiki/ReadMe). These are not quite
easily searchable documents but they contain all the info you might need.
Warm regards,
Dmitry Silaev
On Sun, Jan 16, 2011 at 10:42 AM, KHEM Sochenda khemsoche...@gmail.com wrote:
Dear Dmitry,
Thank you very much
,
Dmitry Silaev
On Fri, Jan 14, 2011 at 10:25 AM, KHEM Sochenda khemsoche...@gmail.com wrote:
Dear Tesseract Team,
In the step of training a new language, we have to assign a Unicode value to each
box.
I would like to know if a shape that is composed of *several unicode
characters?
Is there any way
On the plus side, it turns out that there are functions buried in the
code to serialise/deserialise the classifier state, so it might be
useful to run a whole corpus of short images through tess in one
batch, save the state, and load that at startup.
Could you please be more specific, what
there was a minor bug which prevented display of magnified textline images
in the viewport after save
now it's fixed
eh, development version as i said
On Tue, Jul 13, 2010 at 3:36 PM, Jimmy O'Regan jore...@gmail.com wrote:
On 13 July 2010 11:55, daemon-s daemons2...@gmail.com wrote:
Please