On Tue, 4 Nov 2025, VY wrote:
The problem that I am facing so far: as a test, I use a mediocre Android phone to scan the form and convert it into a PDF (with the Adobe Scan app). Then I tried tesseract and some Python libs to recognize the words (both the printed questions on the form and the handwritten words). Both tesseract and the Python libs can recognize the printed questions but handle the handwritten words very poorly. I suspect my phone camera is not "good enough" even though it is advertised as 50MP.
VY,

I've not used OCR in a very long time, yet reading the above brought a thought to mind, so I'm making it public. FWIW, I successfully used gocr for my scanned-to-text needs.

I've seen paper forms (e.g., the Aetna health insurance reimbursement request) where all user input goes in sets of boxes, one character per box. No cursive is allowed, so each character should be more easily identified by the OCR software.

HTH,

Rich
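
P.S. To make the one-character-per-box idea concrete, here is a rough sketch of how a row of equal-width boxes might be cut into per-character rectangles before OCR. Everything in it is an assumption for illustration: the function name box_crops, the pixel coordinates, and the inset used to trim the printed box borders are hypothetical, and it presumes the row's position on the scanned page is already known.

```python
def box_crops(row_left, row_top, row_width, row_height, n_boxes, inset=3):
    """Return one (left, top, right, bottom) rectangle per character box.

    Assumes the row is a strip of n_boxes equal-width boxes. The inset
    (hypothetical default of 3 pixels) trims each side so the printed
    box borders are less likely to be read as part of the character.
    """
    if n_boxes <= 0:
        raise ValueError("n_boxes must be positive")
    box_width = row_width / n_boxes
    crops = []
    for i in range(n_boxes):
        left = round(row_left + i * box_width) + inset
        right = round(row_left + (i + 1) * box_width) - inset
        top = row_top + inset
        bottom = row_top + row_height - inset
        crops.append((left, top, right, bottom))
    return crops


# Example: a row of 8 boxes, 400 px wide and 50 px tall, starting at (100, 200).
rects = box_crops(100, 200, 400, 50, 8, inset=4)
```

Each rectangle could then be cropped out of the page image and handed to the OCR engine one character at a time. With tesseract, the single-character page segmentation mode (--psm 10) may be a reasonable thing to try, though I have not tested how well it copes with hand-printed letters.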
