Check whether the image's DPI is about 300. Screenshots generally have a lower DPI.

On Friday, November 11, 2016 at 5:50:23 AM UTC+5:30, JF wrote:
>
> I have an app that needs to recognize text in screenshots.
>
> Does that matter? I think this image is clean enough for Tesseract to
> recognize?
>
> On Thursday, November 10, 2016 at 1:03:43 PM UTC-8, Allistair C wrote:
>>
>> What is it you are trying to achieve exactly?
>>
>> On 10 November 2016 at 18:02, JF <[email protected]> wrote:
>>
>>> I'm using Tesseract (3.04.01 with leptonica-1.73) on Mac OS 10.12 to
>>> segment a clean screenshot of a web page.
>>>
>>> Here is the command:
>>>
>>> tesseract screen.png output.txt
>>>
>>> screen.png:
>>>
>>> [image: screen.png]
>>> <https://camo.githubusercontent.com/c82fb95cab29d3a05e1694ee5cd2b2365b60bbdf/68747470733a2f2f692e737461636b2e696d6775722e636f6d2f77667745692e706e67>
>>>
>>> output.txt:
>>>
>>> a CSS Regwstratmnfi x
>>>
>>> C (D localnostr
>>>
>>> Accoum Dexans
>>>
>>> Eu a Pine: 5" a
>>>
>>> Fifi/(‘3’ 22pm; J. , km?“ ”9
>>>
>>> Persuna‘ Dexaus Funhev \muvmanun
>>>
>>> «m s , (35‘ m Was :6 ms
>>>
>>> FMS, Emms' (u v Jaruawy
>>>
>>> *1: \(uax y ,
>>>
>>> Chum
>>>
>>> Terms and Mamng
>>> m any ‘ ‘ Regwsley»
>>>
>>> w lc‘asehe :avicxflza \zh»,:\':\e
>>>
>>> Mm , (ism-ye I/Exzavheilédgémzéi
>>>
>>> The output is complete garbage except for a few words like "Terms and".
>>>
>>> I've read the "ImproveQuality
>>> <https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality>" wiki,
>>> but I don't think any case applies to this image.
>>>
>>> Could anyone please tell me which command line options I should set to
>>> make it work?
>>>
>>> Thanks in advance!
>>>
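One way to act on the DPI suggestion at the top of this reply is to enlarge the screenshot before running Tesseract. The following is only a rough sketch, not a tested recipe for this particular image. It assumes Pillow is installed, that a screenshot with no DPI metadata is effectively about 72 DPI (an assumption; the real value depends on the display), and the names `scale_factor`, `upscale_for_ocr` and `screen_300dpi.png` are made up for illustration.

```python
def scale_factor(current_dpi, target_dpi=300.0):
    """Return the factor by which to enlarge an image to reach target_dpi.

    Images already at or above the target are left at their size (factor 1.0).
    """
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return max(1.0, target_dpi / current_dpi)


def upscale_for_ocr(src, dst, assumed_dpi=72.0, target_dpi=300.0):
    """Enlarge src to roughly target_dpi and save it to dst.

    assumed_dpi is a guess used only when the file carries no DPI metadata.
    """
    from PIL import Image  # Pillow; imported here so scale_factor works without it

    with Image.open(src) as img:
        # Prefer the DPI stored in the file; fall back to the assumed value.
        stored = img.info.get("dpi", (assumed_dpi, assumed_dpi))
        dpi = float(stored[0]) or assumed_dpi
        factor = scale_factor(dpi, target_dpi)
        new_size = (round(img.width * factor), round(img.height * factor))
        # LANCZOS resampling usually keeps small text edges reasonably sharp.
        resized = img.resize(new_size, Image.LANCZOS)
        resized.save(dst, dpi=(target_dpi, target_dpi))
    return new_size


if __name__ == "__main__":
    print(upscale_for_ocr("screen.png", "screen_300dpi.png"))
```

After that, the same command as in the original post can be run on the enlarged file, for example `tesseract screen_300dpi.png output`. Whether this is enough for this screenshot is not something the sketch can promise.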
-- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/tesseract-ocr. To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/4b4d0698-aed4-4655-ba89-80da17e31e53%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.

