It is the opposite, actually: the letters are detected, but more than once. For 
example, the third row of the original image shows "I I I", but tesseract 
identifies it as "TITITI".
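
A rough way to see which rows picked up duplicated letters is to compare row lengths, since every row of a word search grid should have the same number of letters. The sketch below is only an illustration: the function name `rows_off_mode` is made up here, and it assumes the most common row length is a usable stand-in for the real grid width, which may itself be inflated if most rows contain duplicates. The sample rows are the first three lines of the PSM 6 output quoted further down.

```python
from collections import Counter


def rows_off_mode(rows):
    """Return (row_number, length) for rows whose length differs from the
    most common row length. The most common length is only a proxy for the
    true grid width (an assumption, not something tesseract reports)."""
    lengths = [len(row) for row in rows]
    if not lengths:
        return []
    typical = Counter(lengths).most_common(1)[0][0]
    return [
        (number, length)
        for number, length in enumerate(lengths, start=1)
        if length != typical
    ]


# First three rows of the PSM 6 output from the original post.
ocr_rows = [
    "FATSITNEDNACIUREMATITLGTOAS",
    "JTUNSMARROZXOIFRRIHWKTILZZC",
    "SRNENDEPBAYWSEIJNSMDTITITISR",
]

for number, length in rows_off_mode(ocr_rows):
    print(f"row {number} has {length} letters")
```

If the real number of columns is known from the puzzle, comparing against that number directly would be more reliable than the most common length.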

I tried every PSM, and 6 gave the best results. PSM 11 doesn't understand 
that there are 23 rows and fails to recognize about 50% of the letters.

Example output from PSM 11:

FA

TS

N

AC

REM

LGT

O

NSM

...and so on. 
[email protected] wrote on Saturday, August 20, 2022 at 13:36:01 UTC+2:

> Hi,
>
> If some of the letters are not detected, you can run tesseract again on 
> the image after whitening out the detected letters using their coordinates.
>
> For such sparse text images, PSM 11 mostly works well.
>
> Regards,
> Nikhil
>
> On Sat, Aug 20, 2022, 4:37 PM Sabbasofa <[email protected]> wrote:
>
>> Hey all,
>>
>> I'm trying to extract the letters for a word search. 
>>
>> Here is my input image: https://i.imgur.com/7zEEx1b.jpg
>>
>> This is my best attempt so far:
>>
>> tesseract input.jpg out --psm 6 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ
>>
>> This is the output it generated:
>>
>> FATSITNEDNACIUREMATITLGTOAS
>> JTUNSMARROZXOIFRRIHWKTILZZC
>> SRNENDEPBAYWSEIJNSMDTITITISR
>> OEAAIRIARACHMVMIJIODTFUPSAIJE
>> UMLVFUAEOVMLUPYJKGAIJQHTIREW
>> TELRGIIWNPLPMEGGYVYAXIZSTTSAH
>> HDIGNNEBEIJDTIZEIFGRIMGAOODO
>> ARHEBTGWYIKNVURIMZOETIVMR
>> FQOAPODEDTIRTVEEMTILIJASOTFN
>> RRGOSWRDETINIWHTITETTIOGEREA
>> IEDNSNAULRREOATABEGQPNRTIN
>> CLNRAATTEATBRHSTNIGTIBTT
>> ALDOKTSNOCBSSOAERTFXATILIOOE
>> VIUVCUAPHPNINUNGPXXUOBL
>> YMMEAOBHAEDAEZATNEZETVEODO
>> OBARIJNTIUPREETILIKGAREEAZWSIBP
>> GAOTSOREDTTFPSBAKIKTSIBAENE
>> RSJEWEOLSAFQUUPZTAUIRIBZTILNII
>> QIYITYKEKHUMMPDETILOEUEIZBM
>> TJDYLSUHIPUCNUBDDVITLIKRST
>> JTKFEBIKNKMMOITFUMUEVDBUPGD
>> FADIUZALBUFIJLIFMMOLTWNZU
>> OOODMONMOPSREPMUIJDUOLTCTCP
>>
>>
>> It's not bad, but it sure isn't good enough.
>>
>> Any idea what I could try? Thank you.
>>
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/c69a3e99-d211-45f2-aecd-2259b413824en%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/tesseract-ocr/c69a3e99-d211-45f2-aecd-2259b413824en%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
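
For reference, the whitening idea quoted above needs the coordinates of the letters that were already detected. A minimal sketch of that first step follows, assuming the coordinates come from a tesseract box file, where each line reads `symbol left bottom right top page` and the origin is the bottom-left corner of the image. The function name `boxes_to_rectangles` and the sample box line are hypothetical and not taken from this thread; the function only converts each box to a top-left-origin rectangle, which is what most image libraries expect when filling a region with white.

```python
def boxes_to_rectangles(box_text, image_height):
    """Convert tesseract box-file lines into (symbol, (left, upper, right, lower))
    tuples measured from the top-left corner of the image.

    Box files measure y from the bottom of the image, so the y values are
    flipped using image_height."""
    rectangles = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        symbol = parts[0]
        left, bottom, right, top = (int(value) for value in parts[1:5])
        rectangles.append(
            (symbol, (left, image_height - top, right, image_height - bottom))
        )
    return rectangles


# Hypothetical box line for illustration only, with a made-up image height.
sample = "F 10 80 20 95 0\n"
for symbol, rect in boxes_to_rectangles(sample, image_height=100):
    print(symbol, rect)
```

Each rectangle could then be filled with white in an image library (for example with Pillow's `ImageDraw.Draw(img).rectangle(rect, fill="white")`) before running tesseract again with `--psm 11` on the result. Whether this helps with the duplicated letters described at the top of this message is untested.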
