Does the processed text need to be faultless (i.e. needing human checking)?

We scan about 6,000 pages per year with human checking of the OCR'd
text. Our process wouldn't scale very well, and even at this low end of
the scale it is very labour intensive.

For a batch of 100,000 you would need high-end software such as that used
by the large law firms to construct their knowledge bases.


Tony Eviston
G.P.
Boonah Medical Centre



As an aside - wouldn't they (the legal profession) love to get their
hands on 100,000 medical documents for trawling purposes! I think I'll
hand write my medico-legal reports from now on.
A quick search on our 58,000 documents for the strings "unfortunately" +
"hospital" returned 491 hits including the likes of the following:

Unfortunately this has resulted in confusion as to who to contact
regarding his care.

Unfortunately the waiting period for varicose vein surgery at xxxxxx
Hospital is several years

 1 note that his cochlear implant late last year has unfortunately not
been successful

Unfortunately the specimen has gone astray.

Unfortunately there was a modest amount of distal embolisation into the
distal obtuse marginal artery

Unfortunately this MRI was not compared to the previous MRI done in
November.

etc
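
For anyone wanting to try the same kind of search on their own OCR'd
text, here is a minimal sketch in Python. It is not the tool we used. It
assumes the OCR output has been saved as plain-text files under one
folder, and it treats "+" as "both strings occur somewhere in the same
document". The function names, the "*.txt" pattern and the folder layout
are all made up for illustration.

```python
from pathlib import Path


def documents_containing(root, terms, pattern="*.txt"):
    """Return sorted paths under root whose text contains every term.

    Matching is case-insensitive and at document level: the terms may
    appear in different sentences of the same file.
    """
    wanted = [term.lower() for term in terms]
    matches = []
    for path in sorted(Path(root).rglob(pattern)):
        # OCR output often contains stray bytes, so decode leniently.
        text = path.read_text(encoding="utf-8", errors="replace").lower()
        if all(term in text for term in wanted):
            matches.append(path)
    return matches


def first_line_with(path, term):
    """Return the first line of the file that mentions term, or None."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    for line in text.splitlines():
        if term.lower() in line.lower():
            return line.strip()
    return None
```

Calling documents_containing("ocr_text", ["unfortunately", "hospital"])
on a hypothetical "ocr_text" folder would list the matching files, and
first_line_with() can then pull out a line of context for each one. Note
that uncorrected OCR errors in the text will cause missed hits, which is
one more argument for human checking.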


[EMAIL PROTECTED] wrote:
> I would also like to know what people are doing in this quarter. We are
> looking at some research projects where we might be OCRing 100,000 pages
> (already scanned), so any simple efficiencies will be worthwhile.
> thanks
> jon patrick
> Quoting Ian Cheong <[EMAIL PROTECTED]>:
> 
>> I would like to know if anybody is scanning to OCR'd pdfs, or knows
>> anyone in GP land who is doing so.
>>
>> I'd prefer to do this to scanning only images or only OCR, because
>> the content is then text searchable with background image retained
>> for "medicolegal" purposes.
>>
>> There seem to be several commercial products ranging from 100s to
>> 1000s of dollars that will do this - interested to know what products
>> work well.
>>
>>
>> Ian.
>>
>> --
>> Dr Ian R Cheong, BMedSc, FRACGP, GradDipCompSc, MBA(Exec)
>> Health Informatics Consultant, Brisbane, Australia
>> Internet: [EMAIL PROTECTED]
>> (for urgent matters, please send a copy to my practice email as well:
>> [EMAIL PROTECTED])
>>
>> PRIVACY NOTE
>> I am happy for others to forward on email sent by me to public email
>> lists.
>> Please ask my permission first if you wish to forward private email
>> to other parties.
>> _______________________________________________
>> Gpcg_talk mailing list
>> [email protected]
>> http://ozdocit.org/cgi-bin/mailman/listinfo/gpcg_talk
>>
> 
> 
> 
> 
