At 10:29 pm +1000 19/8/06, [EMAIL PROTECTED] wrote:
I would also like to know what people are doing in this quarter. We are
looking at some research projects where we might be OCRing 100,000 pages
(already scanned), so any simple efficiencies will be worthwhile.
thanks
jon patrick
Quoting Ian Cheong <[EMAIL PROTECTED]>:

 I would like to know if anybody is scanning to OCR'd pdfs, or knows
 anyone in GP land who is doing so.

 I'd prefer to do this to scanning only images or only OCR, because
 the content is then text searchable with background image retained
 for "medicolegal" purposes.

 There seem to be several commercial products ranging from 100s to
 1000s of dollars that will do this - interested to know what products
 work well.


Several of the commercial programs can be downloaded as free, time-limited demo versions, so DIY testing it is for now. I am having trouble finding any good comparative reviews on the net.

My limited testing suggests you could be stuck with "horses for courses" - depending on quality of scanned images and need for recognition accuracy.

A friend who works for an accounting firm says OmniPage is best - but then they might have a bent for accurate numbers?

I scanned the same invoice (logotype, text fonts, tiny fonts) at more than one scan quality and ran the scans through a couple of OCR programs that produce layered image+text PDFs. File size differed by a factor of more than 10. Recognition accuracy varied with the quality of the original scan. There was no clear, consistent winner.


Anyone interested in the GP scanning challenge??

We could get some representative examples of anonymised scanned documents of varying quality to test OCR engines for:
recognition accuracy
speed
file size

They and the test results could have a permanent home on ozdocit.
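
As a rough sketch of how the three measures could be scored for one engine on one document: the Python below assumes each engine can be wrapped in a function (here the hypothetical run_ocr) that takes the scanned image bytes and returns the layered PDF bytes plus the recognised text, and that a hand-checked reference transcript exists for each test document. None of these names come from any real OCR product. Accuracy is approximated as character error rate (edit distance divided by reference length), which is only one of several reasonable choices.

```python
import time
from typing import Callable, Dict, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, recognised: str) -> float:
    """Edit distance as a fraction of the reference length (0.0 is perfect)."""
    if not reference:
        return 0.0 if not recognised else 1.0
    return edit_distance(reference, recognised) / len(reference)


def score_engine(
    run_ocr: Callable[[bytes], Tuple[bytes, str]],  # hypothetical engine wrapper
    scan_bytes: bytes,
    reference_text: str,
) -> Dict[str, float]:
    """Run one engine on one scan and report accuracy, speed and file size."""
    start = time.perf_counter()
    pdf_bytes, recognised_text = run_ocr(scan_bytes)
    elapsed = time.perf_counter() - start
    return {
        "char_error_rate": char_error_rate(reference_text, recognised_text),
        "seconds": elapsed,
        "pdf_bytes": len(pdf_bytes),
    }
```

Running score_engine over every engine and every test document would give a table that shows whether any engine wins consistently or whether it really is "horses for courses".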


Ian.


--
Dr Ian R Cheong, BMedSc, FRACGP, GradDipCompSc, MBA(Exec)
Health Informatics Consultant, Brisbane, Australia
Internet: [EMAIL PROTECTED]
(for urgent matters, please send a copy to my practice email as well: [EMAIL PROTECTED])

PRIVACY NOTE
I am happy for others to forward on email sent by me to public email lists.
Please ask my permission first if you wish to forward private email to other parties.
_______________________________________________
Gpcg_talk mailing list
[email protected]
http://ozdocit.org/cgi-bin/mailman/listinfo/gpcg_talk
