[ocropus] How to Read Prepared/Generated Forms (known layout) and Obtain Check-Box Data

kingIZZZY Tue, 07 Oct 2014 10:35:37 -0700

BH

Hello,


I am an experienced programmer, but an absolute newbie to OCR, document 
analysis, and computer optical recognition in general.

*Desired Effect* (workflow I'm trying to program)

   - Dynamically generate a form intended to be printed and filled out IRL
   - Scan the completed form and obtain its data


*Type of Data*

   - *Highest Priority*: Check boxes, filled in by pen / pencil / marker 
   etc., marked with check-mark, X, diagonal strike, etc.
   - Optional: written numbers, circled options
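
To make the check-box case concrete, here is a minimal sketch of one common approach: count the dark pixels inside the box and compare against a threshold. It assumes the scan is already a grayscale grid (0 = black, 255 = white) and that the box position is known. The function names, the `min_ratio` threshold, and the `margin` are hypothetical placeholders that would need tuning on real scans; this is not an OCRopus API.

```python
# Hypothetical helpers, not part of OCRopus or any other library.
# The image is a grayscale grid: a list of rows, 0 = black, 255 = white.

def dark_ratio(image, left, top, width, height, dark_below=128):
    """Return the fraction of pixels in the region darker than dark_below."""
    dark = 0
    for y in range(top, top + height):
        for x in range(left, left + width):
            if image[y][x] < dark_below:
                dark += 1
    return dark / (width * height)

def is_checked(image, box, min_ratio=0.05, margin=2):
    """box is (left, top, width, height) of the printed box outline.

    The margin shrinks the region so the printed outline is not counted.
    min_ratio is an assumed threshold and must be tuned on real scans.
    """
    left, top, width, height = box
    ratio = dark_ratio(image, left + margin, top + margin,
                       width - 2 * margin, height - 2 * margin)
    return ratio >= min_ratio

def blank_box(size=20):
    """Synthetic test image: a white square with a 1-pixel black outline."""
    img = [[255] * size for _ in range(size)]
    for i in range(size):
        img[0][i] = img[size - 1][i] = 0
        img[i][0] = img[i][size - 1] = 0
    return img

empty = blank_box()
crossed = blank_box()
for i in range(3, 17):          # draw an 'X' inside the box
    crossed[i][i] = 0
    crossed[i][19 - i] = 0

print(is_checked(empty, (0, 0, 20, 20)))
print(is_checked(crossed, (0, 0, 20, 20)))
```

A ratio test like this does not care whether the mark is a check-mark, an X, or a diagonal strike, which is why it suits the highest-priority case. Written numbers and circled options would need real recognition and are not covered by this sketch.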


*Theoretical Coding Solution*

   - When generating a form, store layout / coordinate information of form 
   elements
   - Place recognizable anchors (rotated 'L' s or '+' symbols) at the 
   corners of the printed page to define a general known rectangular area
   - Print a bar-code or numeric identifier at pre-defined coordinates in 
   the rectangle area
   - Obtain data out of form elements using layout/format information & 
   coordinates previously stored for this identified form
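
The steps above can be sketched roughly as follows. This assumes the layout is saved as JSON when the form is generated, with each field position stored as a fraction (0 to 1) of the rectangle spanned by the four corner anchors, and that some separate step has already found the anchor positions in the scan. The field names, the JSON structure, and the anchor values are all made up for illustration. The mapping uses plain bilinear interpolation between the four corners, which copes with shift, scale, rotation, and mild skew; a full perspective (homography) fit would be the more rigorous choice for distorted scans.

```python
import json

# Layout stored at form-generation time (hypothetical structure).
# x and y are fractions of the rectangle spanned by the corner anchors.
LAYOUT_JSON = """
{"form_id": "demo-001",
 "fields": [
   {"name": "agree",     "type": "checkbox", "x": 0.25, "y": 0.5},
   {"name": "subscribe", "type": "checkbox", "x": 0.75, "y": 0.5}
 ]}
"""

def to_pixels(u, v, anchors):
    """Map fractional form coordinates (u, v) to scan pixel coordinates.

    anchors holds the detected corner positions as (x, y) pixel pairs under
    the keys tl, tr, br, bl. Bilinear interpolation: walk along the top and
    bottom edges by u, then between those two points by v.
    """
    tl, tr, br, bl = anchors["tl"], anchors["tr"], anchors["br"], anchors["bl"]
    top = (tl[0] + u * (tr[0] - tl[0]), tl[1] + u * (tr[1] - tl[1]))
    bottom = (bl[0] + u * (br[0] - bl[0]), bl[1] + u * (br[1] - bl[1]))
    return (top[0] + v * (bottom[0] - top[0]),
            top[1] + v * (bottom[1] - top[1]))

def locate_fields(layout, anchors):
    """Return {field name: (x, y) pixel position} for every stored field."""
    return {f["name"]: to_pixels(f["x"], f["y"], anchors)
            for f in layout["fields"]}

layout = json.loads(LAYOUT_JSON)

# Assumed output of an anchor-detection step (not shown here).
anchors = {"tl": (100, 200), "tr": (900, 200),
           "br": (900, 1400), "bl": (100, 1400)}

positions = locate_fields(layout, anchors)
print(layout["form_id"], positions)
```

In a full pipeline the `form_id` would come from decoding the printed bar-code or numeric identifier, and would select which stored layout to load before the fields are located.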


*Bottom line*: Is this possible? *How to do this*? What do I need to learn 
in order to get to a point where I know how to use OCRopus (or other 
libraries) to achieve these results?


------------------------------

Related Links (describe some technical aspects & bits of theoretical 
solutions, but no practical road-map of how to actualize this)

   - http://stackoverflow.com/questions/15227243/what-is-the-proper-way-to-test-if-checkbox-is-ticked-on-scanned-document
   - https://groups.google.com/forum/#!searchin/tesseract-ocr/checkbox/tesseract-ocr/kvyILJMuuCI/iJeQc0ga-OkJ


