Hi Quan,
As you may know, there is more than one way to scale, and I have seen
OCR fail in some cases depending on how you scale. In the front end I
use for my software that calls tesseract, I ended up providing options
for scaling and options for converting from 24-bit color to gray or
black and white.
Let me start with some simple answers, though. Scaling with
interpolation seems to work best most of the time. Converting to
gray-scale seems to work most of the time. (I read that Ray Smith did
not design tesseract for color screen images, so I have not really
experimented with leaving things in color.) I do not think tesseract
pays attention to the Alpha channel, since transparency only matters
when an image is layered over something else, not when a single image
sits by itself. (Converting to gray-scale does not work in general if
the text is rendered with ClearType or other sub-pixel rendering. If
anybody figures out a good approach for OCR of ClearType, I would
appreciate getting an email since I don't read a lot of the posts.
Post your answer too.)
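The gray-scale and Alpha points above can be sketched in plain Python.
This is only an illustration under stated assumptions: pixels are rows
of (R, G, B) or (R, G, B, A) tuples, transparent pixels are composited
onto a white background, and the weights are the common BT.601 luma
weights. The function names are hypothetical, and none of this is
something tesseract itself requires.

```python
def flatten_alpha(pixel, background=(255, 255, 255)):
    """Composite an (R, G, B, A) pixel onto an opaque background.

    Plain (R, G, B) pixels are returned unchanged. The white default
    background is an assumption chosen for dark-on-light screen text.
    """
    if len(pixel) == 3:
        return pixel
    r, g, b, a = pixel
    alpha = a / 255.0
    return tuple(
        int(round(c * alpha + bg * (1.0 - alpha)))
        for c, bg in zip((r, g, b), background)
    )


def to_gray(pixel):
    """Convert one RGB(A) pixel to an 8-bit gray value (BT.601 weights)."""
    r, g, b = flatten_alpha(pixel)
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def image_to_gray(rows):
    """Convert a whole image, given as a list of rows of pixel tuples."""
    return [[to_gray(p) for p in row] for row in rows]
```

Note that an equal-weight or luma conversion like this mixes the
colored fringes of ClearType text into gray, which is consistent with
the caveat above about sub-pixel rendering.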
I think the scaling software at the leptonica web site is good. I
have had some trouble with the method in Windows that uses
createGraphics and drawImage. (Someone I worked with used the Windows
method on a blank image and got non-blank OCR results, apparently
because the method introduced a row of black along a couple of the
edges. That is how it appeared to me, but it is possible I did
something wrong.)
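One way to catch an edge artifact like the one described above is to
scale a known-blank image and inspect its outermost rows and columns
before sending anything to OCR. A rough sketch, assuming an 8-bit
gray-scale image stored as a list of rows; `has_dark_border` and its
threshold of 128 are hypothetical choices, not part of any library.

```python
def has_dark_border(gray, threshold=128):
    """Return True if any pixel on the outer rows or columns is dark.

    `gray` is a list of rows of 0-255 values. A pixel counts as dark
    when its value is below `threshold` (an assumed cut-off).
    """
    if not gray or not gray[0]:
        return False
    edge_pixels = list(gray[0]) + list(gray[-1])
    edge_pixels += [row[0] for row in gray] + [row[-1] for row in gray]
    return any(value < threshold for value in edge_pixels)
```

If a blank white image comes back from the scaler with
`has_dark_border` returning True, the scaling step is the likely
source of the spurious OCR output.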
On scaling, I made a post in August about using nearest-neighbor
scaling when the characters are close together. The reason is that
scaling with interpolation, without sharpening, tends to blur the
edges of text characters. Leptonica has code for sharpening, I
believe, but I have not used it yet. Scaling by a factor of 2 without
interpolation, and then by a variable factor with interpolation to
reach the needed size, is a simple way to get some sharpening and some
separation between characters.
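The two-stage idea above can be sketched in plain Python. This is an
illustration only, assuming an 8-bit gray-scale image stored as a list
of rows and using bilinear interpolation for the second stage (the
post does not say which interpolation was used). The function names
are hypothetical. For the 96 DPI to 300 DPI case from the question,
the overall factor is 300/96 = 3.125, so the second stage covers the
remaining 1.5625.

```python
def scale_nearest_2x(gray):
    """Double width and height by repeating each pixel (no interpolation)."""
    out = []
    for row in gray:
        doubled = [v for v in row for _ in range(2)]
        out.append(doubled)
        out.append(list(doubled))
    return out


def scale_bilinear(gray, new_w, new_h):
    """Resize to new_w x new_h using bilinear interpolation."""
    old_h, old_w = len(gray), len(gray[0])
    out = []
    for y in range(new_h):
        # Map the output pixel centre back into source coordinates,
        # clamped so edge pixels never sample outside the image.
        sy = min(max((y + 0.5) * old_h / new_h - 0.5, 0.0), old_h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, old_h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = min(max((x + 0.5) * old_w / new_w - 0.5, 0.0), old_w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, old_w - 1)
            fx = sx - x0
            top = gray[y0][x0] * (1 - fx) + gray[y0][x1] * fx
            bottom = gray[y1][x0] * (1 - fx) + gray[y1][x1] * fx
            row.append(int(round(top * (1 - fy) + bottom * fy)))
        out.append(row)
    return out


def scale_two_stage(gray, factor):
    """Scale by `factor` (intended for factor >= 2).

    Stage 1 doubles the image with nearest-neighbor; stage 2 uses
    bilinear interpolation to reach the final size.
    """
    target_w = int(round(len(gray[0]) * factor))
    target_h = int(round(len(gray) * factor))
    return scale_bilinear(scale_nearest_2x(gray), target_w, target_h)
```

For example, `scale_two_stage(image, 300 / 96)` would take a 96 DPI
capture to roughly 300 DPI. Because the sample coordinates are
clamped, a blank image stays blank at the edges.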
On Aug 31, 7:26 pm, Quan Nguyen <[email protected]> wrote:
> Hi Ian,
>
> I'm implementing a feature in my program to enable OCR of screenshots.
> The results have been generally better after the captured images were
> rescaled from 96 DPI to 300 DPI. I was wondering if other simple
> manipulations could be done programmatically to the images to produce
> even better results.
>
> The types of the screenshots are either 32bppArgb or 24bppRgb. Would
> changing to grayscale or stripping the Alpha help?
>
> Quan
>
> On Aug 31, 12:17 pm, "Ian Ozsvald (A.I. Cookbook)"
> <[email protected]> wrote:
> > Hi Quan.
>
> > I've used tesseract to OCR frames from 640x480 screencast videos,
> > generally it worked fine:
> > http://ianozsvald.com/2010/05/17/extracting-keyword-text-from-screenc...
>
> > What problems are you seeing when you try tesseract?
>
> > Ian.
>
> > On 30 August 2010 23:46, Quan Nguyen <[email protected]> wrote:
>
> > > I understand the resolutions of screenshots are typically inadequate
> > > for OCR, but besides rescaling to a higher resolution, say, 300 DPI,
> > > what other preprocessing operations may be needed on the images to
> > > yield optimal OCR results?
>
> > > Thanks.
>
> > > --
> > > You received this message because you are subscribed to the Google Groups
> > > "tesseract-ocr" group.
> > > To post to this group, send email to [email protected].
> > > To unsubscribe from this group, send email to
> > > [email protected].
> > > For more options, visit this group
> > > at http://groups.google.com/group/tesseract-ocr?hl=en.
>
> > --
> > Ian Ozsvald (A.I. researcher, screencaster)
> > [email protected]
>
> > http://IanOzsvald.com http://MorConsulting.com/ http://blog.AICookbook....