So if Tesseract were able to detect every piece of text perfectly, which parts 
of it would you actually use? It matters because you might not be framing the 
problem properly. For instance, people sometimes ask how to OCR a screen when 
what they really want is a portion of the screen, so there's usually a step 
before Tesseract to isolate the rectangles of input.
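
As a rough sketch of that isolation step, the snippet below builds the two 
commands you would run for one region: an ImageMagick crop, then Tesseract on 
the cropped file. Everything specific in it is an assumption for illustration: 
the helper names, the rectangle coordinates, the 1280x800 screen size, the use 
of ImageMagick's `convert` (called `magick` in newer releases), and the 300% 
upscale, which is only a guess at what small screen fonts might need.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, width, height) in pixels


def clamp_rect(rect: Rect, image_size: Tuple[int, int]) -> Rect:
    """Clip a rectangle so that it lies entirely inside the image."""
    left, top, width, height = rect
    img_w, img_h = image_size
    # Work out the far edges first, then pull every edge inside the image.
    right = min(left + width, img_w)
    bottom = min(top + height, img_h)
    left = max(left, 0)
    top = max(top, 0)
    if right <= left or bottom <= top:
        raise ValueError("rectangle does not overlap the image")
    return (left, top, right - left, bottom - top)


def ocr_region_commands(image: str, rect: Rect, image_size: Tuple[int, int],
                        out_base: str, scale_percent: int = 300) -> List[List[str]]:
    """Return [crop_command, ocr_command] as argv lists (hypothetical helper)."""
    left, top, width, height = clamp_rect(rect, image_size)
    cropped = out_base + "-crop.png"
    crop_cmd = [
        "convert", image,
        "-crop", f"{width}x{height}+{left}+{top}",  # WxH+X+Y geometry
        "+repage",                                   # drop the old canvas offset
        "-resize", f"{scale_percent}%",              # assumed upscale factor
        cropped,
    ]
    # Tesseract appends ".txt" to the output base name by itself.
    ocr_cmd = ["tesseract", cropped, out_base]
    return [crop_cmd, ocr_cmd]


# Made-up region of a 1280x800 screenshot; print the commands instead of running them.
for cmd in ocr_region_commands("screen.png", (0, 120, 800, 400), (1280, 800), "output"):
    print(" ".join(cmd))
```

To actually run them, pass each list to `subprocess.run(cmd, check=True)`. 
Which rectangles to pick depends entirely on what you want out of the screen, 
which is why I'm asking.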

> On 11 Nov 2016, at 00:20, JF <[email protected]> wrote:
> 
> I have an app that needs to recognize text in screenshots. 
> 
> Does that matter? I would think this image is clean enough for Tesseract to 
> recognize.
> 
>> On Thursday, November 10, 2016 at 1:03:43 PM UTC-8, Allistair C wrote:
>> What is it you are trying to achieve exactly?
>> 
>>> On 10 November 2016 at 18:02, JF <[email protected]> wrote:
>>> I'm using Tesseract (3.04.01 with leptonica-1.73) on Mac OS 10.12 to 
>>> segment a clean screenshot of a web page. 
>>> 
>>> Here is the command:
>>> 
>>>     tesseract screen.png output.txt
>>> 
>>> screen.png: [image not shown]
>>> 
>>> output.txt:
>>> 
>>> a CSS Regwstratmnfi x
>>> 
>>> C (D localnostr
>>> 
>>> Accoum Dexans
>>> 
>>> Eu a Pine: 5" a
>>> 
>>> Fifi/(‘3’ 22pm; J. , km?“ ”9
>>> 
>>> Persuna‘ Dexaus Funhev \muvmanun
>>> 
>>> «m s , (35‘ m Was :6 ms
>>> 
>>> FMS, Emms' (u v Jaruawy
>>> 
>>> *1: \(uax y ,
>>> 
>>> Chum
>>> 
>>> Terms and Mamng
>>> m any ‘ ‘ Regwsley»
>>> 
>>> w lc‘asehe :avicxflza \zh»,:\':\e
>>> 
>>> Mm , (ism-ye I/Exzavheilédgémzéi
>>> 
>>> 
>>> The output is complete garbage except for a few words like "Terms and". 
>>> 
>>> I've read the "ImproveQuality" wiki, but I don't think any case applies to 
>>> this image. 
>>> 
>>> Could anyone please tell me which command line options I should set to make 
>>> it work? 
>>> 
>>> 
>>> 
>>> Thanks in advance!
>>> 
>>> -- 
>>> You received this message because you are subscribed to the Google Groups 
>>> "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/tesseract-ocr/0aa8871c-393d-4bdf-bd73-673cfa10494d%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>> 
> 

