BH Hello,
I am an experienced programmer, but an absolute newbie to OCR / document analysis / all computer optical recognition.

*Desired Effect* (workflow I'm trying to program)
- Dynamically generate a form intended to be printed and filled out IRL
- Scan the completed form and obtain its data

*Type of Data*
- *Highest Priority*: Check boxes, filled in by pen / pencil / marker etc., marked with a check-mark, X, diagonal strike, etc.
- Optional: Written numbers, circled options

*Theoretical Coding Solution*
- When generating a form, store layout / coordinate information of form elements
- Place recognizable anchors (rotated 'L's or '+' symbols) at the corners of the printed page to define a general known rectangular area
- Print a bar-code or numeric identifier at pre-defined coordinates in the rectangle area
- Obtain data out of form elements using the layout / format information & coordinates previously stored for this identified form

*Bottom line*: Is this possible? *How to do this*? What do I need to learn in order to get to a point where I know how to use OCRopus (or other libraries) to achieve these results?

------------------------------

Related Links (these describe some technical aspects & bits of theoretical solutions, but no practical road-map of how to actualize this)
- http://stackoverflow.com/questions/15227243/what-is-the-proper-way-to-test-if-checkbox-is-ticked-on-scanned-document
- https://groups.google.com/forum/#!searchin/tesseract-ocr/checkbox/tesseract-ocr/kvyILJMuuCI/iJeQc0ga-OkJ
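To make the last step of the theoretical solution concrete, here is a minimal sketch of how I imagine the check-box reading could work. It is not taken from OCRopus or any other library. It assumes the scan has already been deskewed and scaled so that pixel coordinates match the stored layout (that is what the corner anchors would be for), and that the image is available as a plain grid of grayscale values (0 = black, 255 = white). The names `ink_ratio`, `read_checkboxes`, `layout`, and the threshold values are all hypothetical and would need tuning on real scans.

```python
# Hypothetical sketch: decide which check boxes are marked on an aligned scan.
# Assumption: `image` is a list of rows of grayscale ints (0 = black, 255 = white)
# and `layout` maps a field name to its stored box (left, top, right, bottom).

def ink_ratio(image, box, dark_threshold=128, margin=2):
    """Return the fraction of dark pixels inside the box.

    `margin` shrinks the box on every side so that the printed border of the
    check box itself is not counted as ink.
    """
    left, top, right, bottom = box
    dark = total = 0
    for y in range(top + margin, bottom - margin):
        for x in range(left + margin, right - margin):
            total += 1
            if image[y][x] < dark_threshold:
                dark += 1
    return dark / total if total else 0.0


def read_checkboxes(image, layout, min_ratio=0.05):
    """Map each field name to True when enough ink was found inside its box."""
    return {name: ink_ratio(image, box) >= min_ratio
            for name, box in layout.items()}


def draw_border(image, box):
    """Draw a one-pixel black outline, imitating a printed check box."""
    left, top, right, bottom = box
    for x in range(left, right):
        image[top][x] = 0
        image[bottom - 1][x] = 0
    for y in range(top, bottom):
        image[y][left] = 0
        image[y][right - 1] = 0


# Synthetic "scan": a white page with two printed boxes.
page = [[255] * 45 for _ in range(25)]
layout = {"agree": (5, 5, 20, 20), "disagree": (25, 5, 40, 20)}
for stored_box in layout.values():
    draw_border(page, stored_box)

# Simulate a diagonal pen strike through the first box only.
for i in range(7, 18):
    page[i][i] = 0

result = read_checkboxes(page, layout)
print(result)  # {'agree': True, 'disagree': False}
```

Measuring the amount of ink instead of recognizing a particular shape is meant to cover check-marks, X's, and diagonal strikes with the same test. Whether a fixed `min_ratio` holds up for faint pencil marks or noisy scans is something I would have to test.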
