Hanging below may be the best idea on this sub-thread. Hanging below, one
could BACK LIGHT the paper, which would make even lighting much
easier: a milky white piece of glass with several lights behind it, where
lights in front would block the camera. It would require
compensation for the green bar fanfold, but maybe greenish lights would
On 02/11/2018 02:10 PM, Timothe Litt wrote:
On 11-Feb-18 14:29, Davis Johnson wrote:
I think what you need is a wide carriage printer with the typical
feed up through a slot in the bottom, and a camera.
It's not that simple. You need to deal with at least 2 common
vertical pitches (6 & 8 LPI), and a number of page lengths (and
widths). These need to be set up per job; not all printers support all
these. Plus, misalignment (as Al noted, crossing the perforations at
the bottom of a page is quite common). The OP mentioned that his
listings have a hard crease; this will cause (at least) feed and
stacking problems. Form feed causes a high-speed slew; this becomes
less reliable as the distance moved increases. You're proposing an
entire page at a time - which means that the paper will jump off the
tractors frequently. Old paper is fragile. Over hundreds of pages,
dimensions may not be stable; it was not uncommon to have to re-adjust
TOF after a while. There's a fair bit of error detection and recovery
to work out.
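
To put rough numbers on the per-job setup, here is a minimal sketch of the lines-per-form arithmetic for the two pitches mentioned above. The 11 inch and 8.5 inch form lengths are only illustrative assumptions, not a list of what any particular printer supports, and `lines_per_page` is a made-up helper name.

```python
from fractions import Fraction

def lines_per_page(page_length_in, lpi):
    """Print lines on one form of the given length (inches) at the given pitch.

    Uses exact fractions so that lengths such as 8.5 inches do not pick up
    floating-point error. Raises if the form is not a whole number of lines.
    """
    lines = Fraction(page_length_in) * lpi
    if lines.denominator != 1:
        raise ValueError("form length is not a whole number of lines at this pitch")
    return int(lines)

# Assumed example form lengths (inches); real jobs may use others.
for length in ("11", "8.5"):
    for lpi in (6, 8):
        print(f"{length} in at {lpi} LPI: {lines_per_page(length, lpi)} lines")
```

The same form therefore holds a different number of lines depending on the pitch, which is one reason a capture setup would have to be told the pitch and form length for each job.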
The only working function needed from the printer is form feed.
Photograph the page that is hanging below the printer, form feed and
Anybody here ought to be able to handle the programming to automate
You would need to manually photograph the first page.
The camera would need good depth of field.
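
As a rough illustration of the loop being proposed here, a sketch might look like the following. `photograph` and `form_feed` are hypothetical callables standing in for whatever camera trigger and printer interface are actually available; the sketch assumes the first page has already been positioned by hand and does no error detection or recovery at all.

```python
def capture_listing(photograph, form_feed, page_count):
    """Photograph each page in turn, advancing the paper between shots.

    photograph: hypothetical callable returning image data for the page
                currently hanging below the printer.
    form_feed:  hypothetical callable that sends a form feed to the printer.
    page_count: number of pages to capture (assumed known in advance).
    """
    images = []
    for page in range(page_count):
        images.append(photograph())
        if page < page_count - 1:
            form_feed()  # bring the next page into view
    return images

if __name__ == "__main__":
    # Stand-in stubs so the sketch runs without any hardware attached.
    counter = iter(range(1, 100))
    pages = capture_listing(lambda: f"image-{next(counter)}", lambda: None, 3)
    print(pages)
```

Passing the two operations in as callables keeps the loop independent of any particular camera or printer.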
Lighting is an issue, as is compensating for keystoning and other
misalignments. Most cameras don't have a standard remote trigger
interface - one of the pointers I provided loads modified firmware
into cameras from one manufacturer to make this work. If you look at
digital camera reviews, you'll see that the lenses have varying
degrees of artifacts, especially at the edges. So you need to find
and zoom to an area that's relatively "flat" & doesn't need a lot of
correction. While depth of field will help, it also will result in
apparent font size changes as paper sways forward and back. If you
stop that, you simplify the OCR - and don't need as much depth of field.
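
To get a feel for the apparent size change, here is a back-of-envelope sketch using a simple pinhole camera model, where image size is inversely proportional to the distance from the camera to the paper. The 500 mm working distance and 10 mm of sway are assumed numbers chosen only for illustration; real lenses deviate from the pinhole model.

```python
def apparent_scale(nominal_distance_mm, sway_mm):
    """Relative image scale under a pinhole model.

    sway_mm > 0 means the paper has moved away from the camera,
    sway_mm < 0 means it has moved toward the camera.
    A result of 1.0 means no change in apparent size.
    """
    return nominal_distance_mm / (nominal_distance_mm + sway_mm)

# Assumed example: camera 500 mm from the paper, paper swaying 10 mm each way.
for sway in (-10, 0, 10):
    change = (apparent_scale(500, sway) - 1.0) * 100
    print(f"sway {sway:+d} mm: apparent size changes by {change:+.2f}%")
```

Under those assumptions the characters grow or shrink by roughly 2 percent between frames, which is the kind of variation the OCR would otherwise have to absorb.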
There are many backgrounds that need to be subtracted for OCR to
work. (Printer paper was notorious for institutional logos, as well
as bars and other aids to human readers.) Then there are the other
issues mentioned in my earlier note.
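
For the green bar case specifically, one simple approach (a sketch of colour dropout, not something anyone in this thread has tested) is to look only at the green channel: green ink reflects green light, so the bars look nearly as bright as blank paper there while black print stays dark. The pixel values and the threshold of 128 below are made up for illustration and would need tuning against real scans; logos in other colours are not handled.

```python
def drop_green_bars(rgb_rows, threshold=128):
    """Binarise an RGB image using only its green channel.

    rgb_rows:  list of rows, each a list of (r, g, b) tuples in 0..255.
    threshold: assumed cut-off; green values below it count as ink.
    Returns rows of booleans, True where a pixel is treated as print.
    """
    return [[g < threshold for (_r, g, _b) in row] for row in rgb_rows]

# Illustrative pixels: blank paper, pale green bar, black print on the bar.
sample = [[(255, 255, 255), (190, 230, 190), (20, 20, 20)]]
print(drop_green_bars(sample))
```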
It seems simple, but it is a P.roject. That's a capital P. With a lot
of roject to work out.
It's worthwhile, but it's not simple. It's a pretty interesting
hardware (and software) project. I don't mean to discourage anyone
who wants to work on it - but you need to go in with eyes open, or
you'll end up very, very frustrated.
Thunderscan tried to scan line by line & retrieve grayscale; the
challenges were piecing together the adjacent lines with pixel
resolution. The focal distance was constant because the camera was
on a carriage. The idea here is to capture a page per frame. So the
registration problems are quite different. One could try the
Thunderscan approach; it would trade one set of problems xxx
"challenges and opportunities" for another.
In my experience with many brands and models of tractor feed
printers over many years, paper handling is really difficult to get
Simh mailing list