It's the opposite, actually: the letters are detected, but more than once. For example, the third row of the original image shows "I I I", but tesseract identifies it as "TITITI".
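Since duplicated detections make a row longer than it should be, one quick way to find the affected rows is to compare row lengths. The snippet below is only a rough sketch: `flag_suspect_rows` and `expected_width` are names made up for illustration, the sample grid is hypothetical (not the real puzzle), and it assumes the grid is rectangular so that every row should have the same number of letters.

```python
from collections import Counter


def flag_suspect_rows(rows, expected_width=None):
    """Return (row_number, length, excess) for rows that differ from the expected width.

    If expected_width is None, the most common row length is used as a guess.
    That guess is only useful when most rows were recognized correctly.
    """
    lengths = [len(row) for row in rows]
    if expected_width is None:
        expected_width = Counter(lengths).most_common(1)[0][0]
    return [
        (number, length, length - expected_width)
        for number, length in enumerate(lengths, start=1)
        if length != expected_width
    ]


# Hypothetical 4-row sample: row 3 should contain "III" but was read as "TITITI".
sample = [
    "ABCDEFGH",
    "IJKLMNOP",
    "QRTITITIWXY",
    "ZABCDEFG",
]

print(flag_suspect_rows(sample))
```

If the true grid width is known, passing it as `expected_width` is more reliable than relying on the most common length, because many rows may be wrong at once.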
I tried every PSM, and 6 gave the best results. PSM 11 doesn't understand that there are 23 rows and fails to recognize about 50% of the letters. Example output from PSM 11: FA TS N AC REM LGT O NSM ... and so on.

[email protected] wrote on Saturday, 20 August 2022 at 13:36:01 UTC+2:

> Hi,
>
> If some of the letters are not detected, you can run tesseract again on the
> image after whitening out the detected letters using their coordinates.
>
> For such sparse-text images, PSM 11 mostly works well.
>
> Regards,
> Nikhil
>
> On Sat, Aug 20, 2022, 4:37 PM Sabbasofa <[email protected]> wrote:
>
>> Hey all,
>>
>> I'm trying to extract the letters for a word search.
>>
>> Here is my input image: https://i.imgur.com/7zEEx1b.jpg
>>
>> and this is my best try so far:
>>
>> tesseract input.jpg out --psm 6 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ
>>
>> this is the output it generated:
>>
>> FATSITNEDNACIUREMATITLGTOAS
>> JTUNSMARROZXOIFRRIHWKTILZZC
>> SRNENDEPBAYWSEIJNSMDTITITISR
>> OEAAIRIARACHMVMIJIODTFUPSAIJE
>> UMLVFUAEOVMLUPYJKGAIJQHTIREW
>> TELRGIIWNPLPMEGGYVYAXIZSTTSAH
>> HDIGNNEBEIJDTIZEIFGRIMGAOODO
>> ARHEBTGWYIKNVURIMZOETIVMR
>> FQOAPODEDTIRTVEEMTILIJASOTFN
>> RRGOSWRDETINIWHTITETTIOGEREA
>> IEDNSNAULRREOATABEGQPNRTIN
>> CLNRAATTEATBRHSTNIGTIBTT
>> ALDOKTSNOCBSSOAERTFXATILIOOE
>> VIUVCUAPHPNINUNGPXXUOBL
>> YMMEAOBHAEDAEZATNEZETVEODO
>> OBARIJNTIUPREETILIKGAREEAZWSIBP
>> GAOTSOREDTTFPSBAKIKTSIBAENE
>> RSJEWEOLSAFQUUPZTAUIRIBZTILNII
>> QIYITYKEKHUMMPDETILOEUEIZBM
>> TJDYLSUHIPUCNUBDDVITLIKRST
>> JTKFEBIKNKMMOITFUMUEVDBUPGD
>> FADIUZALBUFIJLIFMMOLTWNZU
>> OOODMONMOPSREPMUIJDUOLTCTCP
>>
>> It's not bad, but it sure isn't good enough.
>>
>> Any idea what I could try? Thank you.

