I'm trying to get started with OCRopus and am finding it very cumbersome. Which of you is actually using OCRopus productively, and how did you learn it?
This is where I currently stand: after a bit of research and a bugfix, I managed to install OCRopus 0.5 and complete a test run that didn't end in a fatal error. The result is unusable, though, so now I need to get into the details of training and configuration.

The first thing I want to improve is the binarization, which came out far too light: there was basically nothing left to recognize in the binarized image. "ocropus-preproc -h" lists some parameters to tweak: ground truth extension, zoom, character component size, halftone removal, deskewing, sigma and k value. The issue is that I don't really know what any of these parameters mean exactly, or how to use them sensibly. Sure, there is Google and Wikipedia, and I have watched all the YouTube videos available, but at the end of the day I could not find any concrete steps for improving my binarization results. I tried some guessed values for sigma and k, but that apparently had no effect whatsoever.

What I, and apparently other new users around here, really need is a manual-like introduction to the whole system, along the lines of: "A ground truth is defined as abc, while a ground truth extension is xyz. ... Parameter x needs to be a value between y and z; lower x means ..., higher x means ..." I feel like there must be an OCRopus bootcamp somewhere, maybe a lecture or a manual that I completely missed in my search and that enabled all the other users to make productive use of OCRopus.

I'm a computer scientist and a somewhat experienced software developer, so I can handle technical language and I learn quickly. I'd even be willing to pay someone to teach me (within reasonable limits), or to write such a manual in return.

Can anyone point me to the right resources, or is personal training in OCRopus usage (maybe remote) available?
