I am trying to identify the molecules from pathway images. This should be
relatively simple from clear, high-res images like the one attached, but my
attempts with Tesseract so far are pretty dismal...
It found 9 of 25 molecules. I even have the luxury of knowing in advance
all the words I'd like to extract, and I tried supplying these as
eng.user-words, but there was no improvement.
I suspect I need to find the magic combination of parameter settings or
perhaps image pre-processing. Any suggestions?
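
Since I know the target words in advance, one fallback I can imagine is
snapping each OCR token to its nearest known word after the fact. Below is
a minimal sketch of that idea using only Python's standard difflib module.
It does not touch Tesseract at all. The word list is an illustrative
subset (I am assuming "Follislatin" in the output should read
"Follistatin"), and the function name and the 0.8 cutoff are my own
placeholders, not anything from Tesseract.

```python
import difflib

# Illustrative subset of the words known in advance (assumption: the real
# list would hold all 25 molecule names from the pathway image).
KNOWN_WORDS = ["Myostatin", "Follistatin", "FLRG", "GATA", "Foxo"]


def snap_to_vocabulary(token, vocabulary=KNOWN_WORDS, cutoff=0.8):
    """Return the closest known word for an OCR token, or None if no
    known word is at least `cutoff` similar (case-insensitive)."""
    # Map lowercase forms back to the canonical spelling.
    lookup = {word.lower(): word for word in vocabulary}
    matches = difflib.get_close_matches(
        token.lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    for token in ["Myostetin", "Follislatin", "MVOSTATIN", "Nucleus"]:
        print(token, "->", snap_to_vocabulary(token))
```

This would only rescue near misses. It cannot recover words that Tesseract
never detected, so it would not replace better parameters or
pre-processing.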
Thanks!
- Alex
Latent Active
Myostatin Myostetin
. ’ . Follislatin
BMP1I'I'LD FLRG
t
Receptor II Receptor l R-Smad
Akl
- -F
l SmadA
GATA Elk MEF Foxo GSK-3
Nucleus
56K
Target genes
GROWTH MVOSTATIN SURVIVAL GROWTH METABOLISM