I'm trying to get started with OCRopus and am finding it very cumbersome. 
Which of you are actually using OCRopus productively, and how did you learn it?

This is where I currently stand:
After a bit of research and a bugfix I managed to install OCRopus 0.5 and 
actually do a test run that didn't return a fatal error. The result is 
unusable, though - so now I need to get into the details of training and 
configuration. The first thing I want to improve is the binarization, which 
came out far too light: there was basically nothing left to recognize in 
the binarized image. "ocropus-preproc -h" lists some 
parameters to tweak: Ground truth extension, zoom, character component 
size, halftone removal, deskewing, sigma and k value. The issue is: I don't 
really know what any of these parameters mean exactly, or how to sensibly 
use them. Sure, there is Google and Wikipedia, and I have actually watched 
all the YouTube videos available, but at the end of the day I could not 
find any concrete steps for improving my binarization results. I 
tried some guessed values for sigma and k, but that apparently had 
no effect whatsoever. What I - and apparently other new users around 
here - really need is a manual-like introduction to the whole system, like: 
"A ground truth is defined as abc, while a ground truth extension is xyz. 
... Parameter x needs to be a value between y and z, lower x means ... 
higher x means..."

I feel like there must be an OCRopus bootcamp somewhere, maybe a lecture or 
a manual that I just completely missed in my search and that enabled all 
the other users to actually make productive use of OCRopus. I'm a computer 
scientist and somewhat experienced software developer, so I can take 
technical language and am a quick learner. I'd even be willing to pay 
someone to teach me (within reasonable limits), or to write such a manual 
in return. Can anyone help me by pointing me to the 
right resources, or is personal training for OCRopus usage (maybe remote) 
available?
