At 1:15 am -0400 1/8/06, Horst Herb wrote:
On Monday 31 July 2006 23:58, Ian Cheong wrote:
 I think David said they might have got those settings wrong to start.
 So he wants to convert them after the fact, which is no problem.

 I am seriously looking at using Acrobat for scanning, as it can do
 attempted OCR and still keep bitmaps of the bits it can't OCR
 properly. So the resulting pdfs are text searchable. OCR speed
 appears to be a minor problem. (Usual problem of compression CPUs vs
 storage MB tradeoff.)

And how would Acrobat know it failed with the OCR?
I really tried it a lot. Especially with distinguishing numbers from
characters there are still big problems, and depending on font some numbers
(e.g. 1 and 7) get frequently mixed up - a catastrophe waiting to happen if
you rely on such machine interpreted data.


The resulting page is an image equivalent to the original scanned image. Wherever there are recognisable text patterns, the pdf stores a textual representation of the image of each text character, which renders on screen just like the original image. Anything not recognised stays as a bitmap.

Really it's just a complex character-based form of image compression (using characters, rather than meaningless machine patterns, as the patterns).
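
To make the idea concrete, here is a rough sketch in Python of how a page could mix recognised text with leftover bitmaps. This is not how Acrobat actually works internally, which I don't know. It assumes the OCR engine reports a confidence score per word; the names Word, build_page, searchable_text and the 0.9 threshold are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str          # what the OCR engine guessed
    confidence: float  # hypothetical 0.0-1.0 score reported by the engine
    bitmap_id: str     # reference to the scanned pixels for this word


def build_page(words, threshold=0.9):
    """Keep text for confident words, fall back to the bitmap otherwise."""
    page = []
    for w in words:
        if w.confidence >= threshold:
            page.append(("text", w.text))
        else:
            page.append(("bitmap", w.bitmap_id))
    return page


def searchable_text(page):
    """Only the parts stored as text can be found by a text search."""
    return " ".join(value for kind, value in page if kind == "text")


words = [
    Word("Dose", 0.99, "img-0"),
    Word("17", 0.55, "img-1"),  # 1 vs 7 ambiguity: assumed low confidence
    Word("mg", 0.97, "img-2"),
]
page = build_page(words)
print(page)
print(searchable_text(page))
```

Note that this only protects against guesses the engine itself flags as doubtful. A wrong digit reported with high confidence would still end up in the text layer, although the rendered page would keep looking like the original scan.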

I can send an example if you like.


Ian.
--
Dr Ian R Cheong, BMedSc, FRACGP, GradDipCompSc, MBA(Exec)
Health Informatics Consultant, Brisbane, Australia
Internet: [EMAIL PROTECTED]
(for urgent matters, please send a copy to my practice email as well: [EMAIL PROTECTED])

PRIVACY NOTE
I am happy for others to forward on email sent by me to public email lists.
Please ask my permission first if you wish to forward private email to other parties.
_______________________________________________
Gpcg_talk mailing list
[email protected]
http://ozdocit.org/cgi-bin/mailman/listinfo/gpcg_talk
