Hi all,

I thought I would start a new thread here.

Many OCR users are blind, and we need the best possible tools to get good quality out of the images we put through Tesseract, so we could use some help with this.

First, we can't see the image, so we need something that repairs it automatically, whether that means upscaling to around 300 dpi, converting to grayscale, or other steps. Tesseract currently doesn't seem to be able to do that by itself, so we need additional programs.
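
To make the idea concrete, here is a minimal Python sketch of such an automatic step. It only builds an ImageMagick command line that converts to grayscale and scales the image up toward 300 dpi. The function names, the default source resolution of 150 dpi, and the executable name "convert" are assumptions for illustration; on ImageMagick 7 the executable is usually "magick".

```python
def scale_percent(source_dpi, target_dpi=300):
    """Return the resize percentage needed to go from source_dpi to target_dpi."""
    if source_dpi <= 0:
        raise ValueError("source_dpi must be positive")
    return round(100 * target_dpi / source_dpi)


def preprocess_command(src, dst, source_dpi=150, target_dpi=300,
                       executable="convert"):
    """Build (but do not run) an ImageMagick command for basic OCR cleanup.

    source_dpi is an assumption about the scan; pass the real value if known.
    """
    return [
        executable,
        src,
        "-colorspace", "Gray",                               # drop colour information
        "-resize", f"{scale_percent(source_dpi, target_dpi)}%",  # upscale the pixels
        "-density", str(target_dpi),                         # record the new resolution
        "-units", "PixelsPerInch",
        dst,
    ]


if __name__ == "__main__":
    # Print the command so it can be checked or pasted into a shell.
    print(" ".join(preprocess_command("scan.png", "scan-clean.png")))
```

The list returned by preprocess_command can be passed to subprocess.run once ImageMagick is installed. Whether these particular steps actually improve the OCR result depends on the scan.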

I found that ImageMagick is pretty interesting and can do a lot of things.

I looked at the textcleaner script:
http://www.fmwconcepts.com/imagemagick/textcleaner/

It already looks pretty promising,
though it might not have everything on board yet.
I tried converting a multipage PDF, which the script doesn't seem to handle properly:
I always get just one page.
I also don't know how well the script does its job.
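
One possible workaround for the multipage problem is to split the PDF into one image per page first and then clean each page separately. The Python sketch below only builds the ImageMagick commands. It relies on ImageMagick's "file.pdf[n]" syntax for selecting a single page (counted from 0) and assumes that the page count is already known and that Ghostscript is installed so ImageMagick can read PDFs. The function name and the output file pattern are made up for illustration.

```python
def pdf_page_commands(pdf_path, page_count, out_pattern="page-{:03d}.png",
                      dpi=300, executable="convert"):
    """Build one ImageMagick command per PDF page (pages are indexed from 0)."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    commands = []
    for index in range(page_count):
        commands.append([
            executable,
            "-density", str(dpi),           # rasterize the page at this resolution
            f"{pdf_path}[{index}]",         # "[n]" selects a single page
            "-colorspace", "Gray",
            out_pattern.format(index + 1),  # page-001.png, page-002.png, ...
        ])
    return commands


if __name__ == "__main__":
    for command in pdf_page_commands("document.pdf", page_count=3):
        print(" ".join(command))
```

Each resulting page image could then be passed through a cleanup step and on to Tesseract one at a time.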

I am also looking for a solution for Windows.
ImageMagick works well under Windows, so we might be able to pull all the useful bits of this script into a plain ImageMagick command we could use.
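
For the Windows side, one thing to watch is that a bare "convert" on Windows can resolve to an unrelated system disk utility instead of ImageMagick. The Python sketch below is one way to handle that: it looks for the ImageMagick 7 "magick" executable first and only falls back to "convert" on other platforms. It also builds a basic Tesseract command line. The function names are made up for illustration, and the default language "eng" is an assumption.

```python
import shutil
import sys


def find_imagemagick(which=shutil.which, platform=sys.platform):
    """Return the path of an ImageMagick executable on PATH, or None."""
    # On Windows only "magick" is tried, because "convert" may be the
    # system disk utility rather than ImageMagick.
    names = ("magick",) if platform.startswith("win") else ("magick", "convert")
    for name in names:
        path = which(name)
        if path:
            return path
    return None


def tesseract_command(image, output_base, lang="eng"):
    """Build a Tesseract command; the text is written to output_base + '.txt'."""
    return ["tesseract", image, output_base, "-l", lang]


if __name__ == "__main__":
    print("ImageMagick:", find_imagemagick())
    print(" ".join(tesseract_command("page-001.png", "page-001")))
```

Because the same Python code runs on Windows and Linux, a small wrapper like this could stand in for the parts of the bash script that are really needed.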

Any idea whether this is the right program, or does anyone have a better idea?

Thanks.

Greetings,
Simon



--
Simon Eigeldinger
Follow me on Twitter: http://www.twitter.com/domasofan/
E-Mail: [email protected]
MSN: [email protected]
ICQ: 121823966
Jabber: [email protected]

