Re: OCR of Screenshots

Quan Nguyen Thu, 09 Sep 2010 19:22:25 -0700

Thanks, Steve, for all the valuable info. I've used bicubic
interpolation to scale the screenshots and have been able to achieve
acceptable results. The scale factor I used was 300 divided by the
image's resolution in DPI. If sharpening and keeping the bit depth at
8 bits per color improve the recognition rates further, then I will
definitely consider using them in future attempts.
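
For readers who want to try the scaling described above, here is a minimal
Python sketch. It assumes the Pillow library is installed; the function
names and the 96 DPI fallback are illustrative assumptions, not code from
the program discussed in this thread.

```python
TARGET_DPI = 300  # resolution the thread aims for before OCR


def scale_factor(source_dpi, target_dpi=TARGET_DPI):
    """Return target_dpi / source_dpi (300 divided by the image's resolution)."""
    if source_dpi <= 0:
        raise ValueError("source_dpi must be positive")
    return target_dpi / source_dpi


def scaled_size(width, height, source_dpi, target_dpi=TARGET_DPI):
    """Return the (width, height) in pixels after rescaling to target_dpi."""
    factor = scale_factor(source_dpi, target_dpi)
    return round(width * factor), round(height * factor)


def rescale_for_ocr(path_in, path_out, fallback_dpi=96):
    """Rescale a screenshot to about 300 DPI with bicubic interpolation.

    Requires Pillow. fallback_dpi is an assumption used when the file
    carries no usable resolution metadata.
    """
    from PIL import Image  # lazy import: the helpers above need no dependency

    with Image.open(path_in) as src:
        dpi = float(src.info.get("dpi", (fallback_dpi, fallback_dpi))[0])
        if dpi <= 0:
            dpi = float(fallback_dpi)
        new_size = scaled_size(src.width, src.height, dpi)
        resized = src.resize(new_size, Image.BICUBIC)
        resized.save(path_out, dpi=(TARGET_DPI, TARGET_DPI))
```

For a 640x480 capture at 96 DPI, scaled_size(640, 480, 96) gives a
2000x1500 target, since 300 / 96 = 3.125.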


Regards.

On Sep 9, 6:14 pm, SteveP <[email protected]> wrote:
> "Ideal" may be hard to define for image size.  The wiki (I believe)
> says the lower case letters (for English) should be at least 20 to 30
> pixels in height.  By default, I scale everything by a factor of 3.
> If your screen is set to 96 dpi resolution, 300 dpi would be about a
> factor of 3.  If your font size is large enough, then sometimes you
> can get better results without scaling, since scaling often blurs the
> image a little.
>
> What I said about leptonica is for software developers building a
> front end to tesseract.  If you are using ImageMagick, I suspect that
> is fine.
>
> I think 8 bits per color channel is standard for tesseract if you are
> not doing black and white.
>
> ClearType is an implementation of sub-pixel rendering, which is
> designed for an LCD screen with the red, green and blue sub-pixels in
> separate locations.  Printers and scanners and OCR typically are not
> oriented to sub-pixels.  I think OCR accuracy is better with sub-pixel
> rendering disabled.
>
> On Sep 8, 4:56 am, haratron <[email protected]> wrote:
>
> > I'm also interested in this topic.
>
> > I have a couple of questions:
> > 1. How can I calculate the ideal image size (300dpi?) to feed to
> > tesseract? I mean, how do I identify how much scaling the image needs,
> > before the OCR procedure.
> > 2. I'm currently using ImageMagick's convert program for scaling and
> > converting to grayscale. Would it make a difference if I used
> > leptonica instead?
> > 3. Do the bits of color matter? Is there an optimal color depth?
> > 4. Does the OCR work best when ClearType is enabled or disabled?
>
> > On Tue, Sep 7, 2010 at 11:30 PM, SteveP <[email protected]> wrote:
> > > Hi Quan,
> > >    There is more than one way to scale as you may know.  I have seen
> > > OCR fail in some cases depending on how you scale.  I have a front end
> > > I use for my software that calls tesseract.  I ended up providing
> > > options for scaling and options for converting from 24-bit color to
> > > gray or black and white.
>
> > > Let me start with some simple answers, though.  Scaling with
> > > interpolation seems to work best most of the time.  Converting to
> > > gray-scale seems to work most of the time.  (I read that Ray Smith did
> > > not design tesseract for color screen images, so I really have not
> > > experimented with leaving things in color.)  I do not think tesseract
> > > pays attention to the Alpha channel, since alpha does not matter when
> > > a single image sits by itself.  (Converting to gray-scale does not
> > > work in general if the text is rendered with ClearType or sub-pixel
> > > rendering.  If anybody figures out a good approach for OCR of
> > > ClearType, I would appreciate getting an email since I don't read a
> > > lot of the posts.  Post your answer too.)
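
The gray-scale conversion and alpha handling mentioned above could be
sketched in Python roughly as follows. This is only an illustration under
stated assumptions: it uses the Pillow library, flattens any alpha onto a
white background (an assumption; screenshots are usually fully opaque
anyway), and uses the ITU-R BT.601 luma weights that Pillow documents for
its "L" mode. The function names are hypothetical.

```python
def flatten_alpha(rgba, background=(255, 255, 255)):
    """Composite one (r, g, b, a) pixel over an opaque background colour."""
    r, g, b, a = rgba
    alpha = a / 255
    return tuple(
        round(c * alpha + bg * (1 - alpha))
        for c, bg in zip((r, g, b), background)
    )


def to_gray(rgb):
    """Convert one (r, g, b) pixel to gray using ITU-R BT.601 luma weights."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def screenshot_to_gray(path_in, path_out):
    """Drop the alpha channel (if any) and save an 8-bit gray image.

    Requires Pillow. The white background is an assumption.
    """
    from PIL import Image  # lazy import: the per-pixel helpers need no dependency

    with Image.open(path_in) as src:
        if src.mode == "RGBA":
            flat = Image.new("RGB", src.size, (255, 255, 255))
            flat.paste(src, mask=src.getchannel("A"))
        else:
            flat = src.convert("RGB")
        flat.convert("L").save(path_out)
```

As the quoted text notes, this kind of conversion is not expected to help
with ClearType or other sub-pixel rendered text, where the colored fringes
carry the edge information.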
>
> > > I think the scaling software at the leptonica web site is good.  I
> > > have had some trouble with the method in Windows that uses
> > > createGraphics and drawImage.  (Someone I worked with used the Windows
> > > method on a blank image and got non-blank OCR results because the
> > > Windows method seemed to me to introduce a row of black around a
> > > couple of the edges.  That's how it appeared to me, but it is possible
> > > I did something wrong.)
>
> > > Relative to scaling, I made a post in August about using nearest-
> > > neighbor scaling when the characters are close together.  This is
> > > because scaling with interpolation without sharpening tends to blur
> > > the edges of text characters.  Leptonica has code for sharpening, I
> > > believe, but I have not used it yet.  Scaling by a factor of 2 without
> > > interpolation and then by a variable factor with interpolation to the
> > > needed size is a simple way to get some sharpening and some separation
> > > between characters.
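
The two-stage scaling described above (a factor of 2 without interpolation,
then a variable factor with interpolation) might look like this in Python.
It is a sketch, not the front end mentioned in the thread: it assumes the
Pillow library, picks bicubic for the second stage (the thread only says
"with interpolation"), and all function names are hypothetical.

```python
def two_stage_factors(source_dpi, target_dpi=300, first_stage=2):
    """Split the total scale into a fixed first stage and a variable second stage."""
    total = target_dpi / source_dpi
    return first_stage, total / first_stage


def upscale_nearest(rows, factor):
    """Nearest-neighbor upscale of a 2-D list of pixel values by an integer factor."""
    out = []
    for row in rows:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def two_stage_resize(img, source_dpi=96, target_dpi=300):
    """Apply both stages to a Pillow image and return the result (requires Pillow)."""
    from PIL import Image  # lazy import: the helpers above need no dependency

    _, second = two_stage_factors(source_dpi, target_dpi)
    # Stage 1: double the size without interpolation, keeping edges hard.
    doubled = img.resize((img.width * 2, img.height * 2), Image.NEAREST)
    # Stage 2: interpolate the rest of the way to the needed size.
    final_size = (round(doubled.width * second), round(doubled.height * second))
    return doubled.resize(final_size, Image.BICUBIC)
```

For a 96 DPI screenshot the total factor is 3.125, so the second stage
scales by 1.5625. If the source resolution is already above half the
target, the second stage becomes a downscale.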
>
> > > On Aug 31, 7:26 pm, Quan Nguyen <[email protected]> wrote:
> > >> Hi Ian,
>
> > >> I'm implementing a feature in my program to enable OCR of screenshots.
> > >> The results have been generally better after the captured images were
> > >> rescaled from 96 DPI to 300 DPI. I was wondering if other simple
> > >> manipulations could be done programmatically to the images to produce
> > >> even better results.
>
> > >> The types of the screenshots are either 32bppArgb or 24bppRgb. Would
> > >> changing to grayscale or stripping the Alpha help?
>
> > >> Quan
>
> > > >> On Aug 31, 12:17 pm, "Ian Ozsvald (A.I. Cookbook)" <[email protected]> wrote:
> > >> > Hi Quan.
>
> > > >> > I've used tesseract to OCR frames from 640x480 screencast videos,
> > > >> > and generally it worked fine:
> > > >> > http://ianozsvald.com/2010/05/17/extracting-keyword-text-from-screenc...
>
> > >> > What problems are you seeing when you try tesseract?
>
> > >> > Ian.
>
> > >> > On 30 August 2010 23:46, Quan Nguyen <[email protected]> wrote:
>
> > >> > > I understand the resolutions of screenshots are typically inadequate
> > >> > > for OCR, but besides rescaling to a higher resolution, say, 300 DPI,
> > >> > > what other preprocessing operations may be needed on the images to
> > >> > > yield optimal OCR results?
>
> > >> > > Thanks.
>
> > >> > > --
> > >> > > You received this message because you are subscribed to the Google 
> > >> > > Groups "tesseract-ocr" group.
> > >> > > To post to this group, send email to [email protected].
> > >> > > To unsubscribe from this group, send email to 
> > >> > > [email protected].
> > >> > > For more options, visit this group 
> > >> > > athttp://groups.google.com/group/tesseract-ocr?hl=en.
>
> > >> > --
> > >> > Ian Ozsvald (A.I. researcher, screencaster)
> > >> > [email protected]
>
> > > >> > http://IanOzsvald.com http://MorConsulting.com/ http://blog.AICookbook....
>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/tesseract-ocr?hl=en.
