Thanks for all the input!
On Tue, Nov 4, 2025 at 8:52 PM Ted Mittelstaedt <[email protected]> wrote:

> I would not expect tesseract to recognize handwritten stuff with any
> accuracy.
>
> OCR technology has been around a long time as it was developed by the
> industry as a bridge technology to go from 100% paper office workflows to
> 100% electronic workflows. At one time it was very important but its
> importance has fallen drastically. This is why major vendors like HP and
> Google have abandoned projects like Tesseract, because OCR is becoming a
> niche tool.
>
> With IT technology, niche tools tend to increase in cost, and attract
> proprietary solutions that then tend to drive standards-based solutions out
> of the market, because the proprietary stuff is better. This then
> stimulates businesses (who are the source of funding of these tools) to
> look for alternative solutions. Many find them, which then further shrinks
> the niche market, driving prices up further, stimulating even more people
> to abandon them - well, you get the picture. This is why, for example,
> Tungsten was able to buy up all those companies. They recognized it would
> be possible to monopolize this hybrid paper/electronic workflow market
> because there WOULD be a small percentage of businesses who absolutely
> would not let go of hybrid paper workflows until you pried them from their
> cold, dead fingers.
>
> The only reason I could get away with using AMC is because when you are
> doing surveys you want to deliberately anonymize them, so I don't need to
> collect names, addresses, etc.
>
> If we HAD to collect names and addresses we would have used cheap tablets
> and a website with forms, then handed the person the tablet.
>
> Other than that, I can't really advise you other than to say it's not
> likely we are going to see open source handwriting recognition that is
> free, since it's going to require access to a pretty powerful AI engine to
> do it.
>
> Ted
>
> -----Original Message-----
> From: PLUG <[email protected]> On Behalf Of VY
> Sent: Tuesday, November 4, 2025 6:23 PM
> To: Portland Linux/Unix Group <[email protected]>
> Subject: Re: [PLUG] document scanner
>
> >Both tesseract and the python libs can recognize the printed questions but
> >do very poorly on the handwritten words. I suspect maybe my phone
> >camera is not "good enough" even though it is advertised to be 50MP.
>
> On Tue, Nov 4, 2025 at 6:02 AM Ted Mittelstaedt <[email protected]>
> wrote:
>
> > Followup on the "handwritten forms"
> >
> > If you are able to convert these forms into a multiple-choice form
> > that people fill in boxes by hand, instead of writing actual words on
> > them, I can tell you how to convert these into actual data output. At
> > my office we have this customer satisfaction survey thing that we do
> > periodically, and for a zillion unrelated reasons it has to be handed
> > out on paper. The department doing it was wringing their hands over
> > this as it would take hours and hours and hours for some poor soul to
> > go through all of the forms and input the results into a spreadsheet.
> >
> > I looked into commercial products that do this - and there's only 1
> > company out there that sells software nowadays for this - Tungsten
> > Automation. These turkeys have been spending the last decade buying
> > up every company in the "hybrid paper workflows" market and they now
> > have a complete monopoly on it - and literally they sell complete
> > systems, they no longer sell standalone software that does forms
> > conversions. Pricing is quote-only and it's in the low 5 figures.
> >
> > I found an open source software program for this and built a system
> > around it - so now, they just feed the 300 or so paper surveys into
> > the hopper in a scanner like what I just linked to, and all the
> > resulting PDFs get fed into the system and the data is then loaded
> > into a MariaDB database. I then took their Excel spreadsheet and
> > converted it into a front end using the ODBC drivers for MariaDB.
> > Works slick, saves hours of drudgery.
> >
> > Anyway, this is only good for multiple choice click box forms that
> > people fill out by hand. For OCR of cursive or handwritten printing -
> > good effing luck.
> >
> > Ted
> >
> > -----Original Message-----
> > From: PLUG <[email protected]> On Behalf Of VY
> > Sent: Tuesday, November 4, 2025 4:47 AM
> > To: General Linux/UNIX discussion and help, civil and on-topic
> > <[email protected]>
> > Subject: [PLUG] document scanner
> >
> > Dear All
> >
> > I am looking for a good document scanner that is Linux compatible.
> > Better yet if it is Raspberry Pi compatible.
> >
> > I have a bunch of forms that have handwriting on them. I will be
> > getting them on a regular basis and I'd like to scan them and convert
> > them to high-resolution PDFs.
> >
> > Any pointer for such a scanner is much appreciated.
> >
> > -Vincent
