Check whether the image's DPI is about 300. Screenshots generally have a lower DPI.
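
One way to do that check is sketched below, using only the Python standard library. It reads the optional pHYs chunk of a PNG (which stores pixels per metre) and works out how much the image would need to be enlarged to reach roughly 300 DPI. The function names `png_dpi` and `upscale_factor` are made up for this illustration, and the 300 target is simply the figure mentioned above. Treat it as a rough sketch, not a definitive tool.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
METRES_PER_INCH = 0.0254


def png_dpi(data: bytes):
    """Return (x_dpi, y_dpi) from a PNG's pHYs chunk, or None if it has none.

    `data` is the raw content of the file. Chunk CRCs are not verified.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"pHYs" and len(body) == 9:
            x_ppu, y_ppu, unit = struct.unpack(">IIB", body)
            if unit != 1:
                return None  # unit not given in metres, so no absolute DPI
            return (x_ppu * METRES_PER_INCH, y_ppu * METRES_PER_INCH)
        if chunk_type in (b"IDAT", b"IEND"):
            return None  # pHYs, when present, comes before the image data
        pos += 12 + length
    return None


def upscale_factor(current_dpi: float, target_dpi: float = 300.0) -> float:
    """Factor by which to enlarge the image to reach target_dpi."""
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return target_dpi / current_dpi


if __name__ == "__main__":
    with open("screen.png", "rb") as f:
        dpi = png_dpi(f.read())
    if dpi is None:
        print("No DPI stored in the file; it is probably at screen resolution.")
    else:
        print("DPI:", dpi, "suggested upscale:", upscale_factor(dpi[0]))
```

Many screenshots carry no pHYs chunk at all, in which case the function returns None and you have to assume a typical screen resolution yourself before deciding how much to enlarge the image.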

On Friday, November 11, 2016 at 5:50:23 AM UTC+5:30, JF wrote:
>
> I have an app that needs to recognize text in screenshots. 
>
> Does that matter? I thought this image was clean enough for Tesseract to 
> recognize.
>
> On Thursday, November 10, 2016 at 1:03:43 PM UTC-8, Allistair C wrote:
>>
>> What is it you are trying to achieve exactly?
>>
>> On 10 November 2016 at 18:02, JF <[email protected]> wrote:
>>
>>> I'm using Tesseract (3.04.01 with leptonica-1.73) on Mac OS 10.12 to 
>>> segment a clean screenshot of a web page. 
>>>
>>> Here is the command:
>>>
>>>
>>>     tesseract screen.png output.txt
>>>
>>>
>>> screen.png:
>>>
>>>
>>> [image: screen.png] 
>>> <https://camo.githubusercontent.com/c82fb95cab29d3a05e1694ee5cd2b2365b60bbdf/68747470733a2f2f692e737461636b2e696d6775722e636f6d2f77667745692e706e67>
>>>
>>>
>>> output.txt:
>>>
>>>
>>> a CSS Regwstratmnfi x
>>>
>>> C (D localnostr
>>>
>>> Accoum Dexans
>>>
>>> Eu a Pine: 5" a
>>>
>>> Fifi/(‘3’ 22pm; J. , km?“ ”9
>>>
>>> Persuna‘ Dexaus Funhev \muvmanun
>>>
>>> «m s , (35‘ m Was :6 ms
>>>
>>> FMS, Emms' (u v Jaruawy
>>>
>>> *1: \(uax y ,
>>>
>>> Chum
>>>
>>> Terms and Mamng
>>> m any ‘ ‘ Regwsley»
>>>
>>> w lc‘asehe :avicxflza \zh»,:\':\e
>>>
>>> Mm , (ism-ye I/Exzavheilédgémzéi
>>>
>>>
>>> The output is complete garbage except for a few words like "Terms and". 
>>>
>>> I've read the "ImproveQuality 
>>> <https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality>" wiki, 
>>> but I don't think any case applies to this image. 
>>>
>>> Could anyone please tell me which command line options I should set to 
>>> make it work? 
>>>
>>>
>>> Thanks in advance!
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/tesseract-ocr/0aa8871c-393d-4bdf-bd73-673cfa10494d%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>

