Greetings Restorers,
  I think a number of us have wanted to restore software that's only available as a scanned listing from a line printer. The original printout probably wasn't of the best typographic quality, and scanning doesn't improve it. As a first pass, OCR with tools like Adobe Acrobat can easily produce a rough draft of the content in text form, but correcting the many "typos" takes almost as much work as simply re-typing the listing. With all the high-tech AI processing around, it seems it should be possible to exploit the limited character set, fixed-pitch font, and restricted grammar of a listing to resolve more of the ambiguities in character recognition. Does anyone have an approach that's more efficient than generic OCR followed by a long process of correcting typos on every line of code or comment?
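  To make the "restricted grammar" idea concrete, here is a rough sketch of the kind of post-processing I have in mind. It is only an illustration, not a tested tool: the confusion table and the keyword list are made up for the example (a real one would come from the language of the listing and from the errors a particular scan actually shows), and the names fix_token and fix_line are hypothetical. A token is changed only when exactly one reading is a known keyword or an all-digit number; anything else is left alone for a human to check.

```python
from itertools import product

# Assumed confusion groups for line-printer OCR; adjust to what the scan shows.
CONFUSABLE = {
    "0": "0O", "O": "O0",
    "1": "1lI", "l": "l1I", "I": "I1l",
    "5": "5S", "S": "S5",
    "8": "8B", "B": "B8",
}

# Assumed vocabulary; in practice the keywords of the listing's language.
VOCAB = {"GOTO", "IF", "DO", "CONTINUE", "DIMENSION", "SUBROUTINE", "RETURN", "END"}


def candidates(token):
    """Yield every spelling reachable by swapping confusable characters."""
    choices = [CONFUSABLE.get(ch, ch) for ch in token]
    for combo in product(*choices):
        yield "".join(combo)


def fix_token(token):
    """Return the single plausible reading of token, or token unchanged."""
    if token in VOCAB or token.isdigit():
        return token
    # Only consider a numeric reading if the OCR saw at least one digit,
    # so a variable such as I is not silently turned into 1.
    has_digit = any(ch.isdigit() for ch in token)
    hits = {
        c for c in candidates(token)
        if c in VOCAB or (has_digit and c.isdigit())
    }
    return hits.pop() if len(hits) == 1 else token


def fix_line(line):
    """Correct each space-separated token, preserving column spacing."""
    return " ".join(fix_token(t) for t in line.split(" "))


if __name__ == "__main__":
    print(fix_line("      G0T0 1O0"))  # prints "      GOTO 100"
```

  The candidate list grows exponentially with the number of confusable characters in a token, so this only makes sense for short tokens such as keywords and statement labels; identifiers and comments would need a different source of evidence.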
  Thanks
/guy

