I have an app that needs to recognize text in screenshots. Does the purpose matter here? I would have thought this image is clean enough for Tesseract to recognize.
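For context, here is a minimal sketch of how such an app might call Tesseract on a screenshot. It is not a confirmed fix for the output quoted below. It assumes ImageMagick's `convert` is installed (ImageMagick 7 names the binary `magick`), and the 300% upscale and page segmentation mode 6 are guesses. The upscale follows the rescaling advice in the ImproveQuality wiki, since screenshots usually have far fewer pixels per character than a 300 dpi scan. The function names are hypothetical.

```python
import subprocess


def upscale_command(src, dst, percent=300):
    """Build an ImageMagick command that enlarges src by `percent` and writes dst.

    Assumption: the ImageMagick 6 binary name `convert` is on PATH.
    """
    return ["convert", src, "-resize", f"{percent}%", dst]


def tesseract_command(image, output_base, psm=6, lang="eng"):
    """Build a Tesseract 3.x command line.

    Tesseract appends ".txt" to output_base on its own, so pass "output"
    rather than "output.txt". Tesseract 3.x spells the flag "-psm";
    4.x and later spell it "--psm".
    """
    return ["tesseract", image, output_base, "-l", lang, "-psm", str(psm)]


def ocr_screenshot(src, upscaled, output_base="output", percent=300):
    """Upscale the screenshot, run Tesseract on it, and return the recognized text."""
    subprocess.run(upscale_command(src, upscaled, percent), check=True)
    subprocess.run(tesseract_command(upscaled, output_base), check=True)
    with open(output_base + ".txt", encoding="utf-8") as f:
        return f.read()
```

A call such as `ocr_screenshot("screen.png", "screen_big.png")` would then write `output.txt` and return its contents. Whether the larger image actually improves recognition here is untested.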
On Thursday, November 10, 2016 at 1:03:43 PM UTC-8, Allistair C wrote:
>
> What is it you are trying to achieve exactly?
>
> On 10 November 2016 at 18:02, JF <[email protected]> wrote:
>
>> I'm using Tesseract (3.04.01 with leptonica-1.73) on Mac OS 10.12 to
>> segment a clean screenshot of a web page.
>>
>> Here is the command:
>>
>> tesseract screen.png output.txt
>>
>> screen.png:
>>
>> [image: screen.png]
>> <https://camo.githubusercontent.com/c82fb95cab29d3a05e1694ee5cd2b2365b60bbdf/68747470733a2f2f692e737461636b2e696d6775722e636f6d2f77667745692e706e67>
>>
>> output.txt:
>>
>> a CSS Regwstratmnfi x
>>
>> C (D localnostr
>>
>> Accoum Dexans
>>
>> Eu a Pine: 5" a
>>
>> Fifi/(‘3’ 22pm; J. , km?“ ”9
>>
>> Persuna‘ Dexaus Funhev \muvmanun
>>
>> «m s , (35‘ m Was :6 ms
>>
>> FMS, Emms' (u v Jaruawy
>>
>> *1: \(uax y ,
>>
>> Chum
>>
>> Terms and Mamng
>> m any ‘ ‘ Regwsley»
>>
>> w lc‘asehe :avicxflza \zh»,:\':\e
>>
>> Mm , (ism-ye I/Exzavheilédgémzéi
>>
>> The output is complete garbage except for a few words like "Terms and".
>>
>> I've read the "ImproveQuality
>> <https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality>" wiki,
>> but I don't think any case applies to this image.
>>
>> Could anyone please tell me which command line options I should set to
>> make it work?
>>
>> Thanks in advance!
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/0aa8871c-393d-4bdf-bd73-673cfa10494d%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

