No, we are not going to human-check everything, but we will do a trial run
to "tune" the OCR before doing the full batch.
thanks
jon
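
For anyone planning a similar trial run, here is a minimal sketch of one way to score it: compare the OCR output for a small hand-checked sample against the corrected text and report a character error rate. This is not a description of the process discussed in this thread. The function names and the choice of plain Levenshtein edit distance are assumptions made for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character insertions,
    deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ocr_text: str, reference_text: str) -> float:
    """Edit distance divided by the length of the hand-checked reference.
    Both names are hypothetical helpers, not part of any OCR product."""
    if not reference_text:
        raise ValueError("reference_text must not be empty")
    return edit_distance(ocr_text, reference_text) / len(reference_text)


if __name__ == "__main__":
    # One wrong character in an eight-character word.
    print(character_error_rate("hospitel", "hospital"))
```

Running this on the same sample pages after each change of OCR settings gives a single number to compare, so the full batch can be run with whichever settings scored lowest on the sample.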
Quoting Tony Eviston <[EMAIL PROTECTED]>:

>
>
> Does the processed text need to be faultless (i.e. needing human
> checking)?
>
> We scan about 6,000 pages per year with human checking of the OCR'd
> text. Our process wouldn't scale very well, and even at this low end of
> the scale it is very labour intensive.
>
> For a batch of 100,000 you would need high-end software such as that used
> by the large law firms to construct their knowledgebases.
>
>
> Tony Eviston
> G.P.
> Boonah Medical Centre
>
>
>
> As an aside - wouldn't they (the legal profession) love to get their
> hands on 100,000 medical documents for trawling purposes! I think I'll
> hand write my medico-legal reports from now on.
> A quick search on our 58,000 documents for the strings "unfortunately" +
> "hospital" returned 491 hits including the likes of the following:
>
> Unfortunately this has resulted in confusion as to who to contact
> regarding his care.
>
> Unfortunately the waiting period for varicose vein surgery at xxxxxx
> Hospital is several years
>
>  1 note that his cochlear implant late last year has unfortunately not
> been successful
>
> Unfortunately the specimen has gone astray.
>
> Unfortunately there was a modest amount of distal embolisation into the
> distal obtuse marginal artery
>
> Unfortunately this MRI was not compared to the previous MRI done in
> November.
>
> etc
>
>
> [EMAIL PROTECTED] wrote:
> > I would also like to know what people are doing in this quarter. We are
> > looking at some research projects where we might be OCRing 100,000 pages
> > (already scanned), so any simple efficiencies will be worthwhile.
> > thanks
> > jon patrick
> > Quoting Ian Cheong <[EMAIL PROTECTED]>:
> >
> >> I would like to know if anybody is scanning to OCR'd pdfs, or knows
> >> anyone in GP land who is doing so.
> >>
> >> I'd prefer to do this to scanning only images or only OCR, because
> >> the content is then text searchable with background image retained
> >> for "medicolegal" purposes.
> >>
> >> There seem to be several commercial products ranging from 100s to
> >> 1000s of dollars that will do this - interested to know what products
> >> work well.
> >>
> >>
> >> Ian.
> >>
> >> --
> >> Dr Ian R Cheong, BMedSc, FRACGP, GradDipCompSc, MBA(Exec)
> >> Health Informatics Consultant, Brisbane, Australia
> >> Internet: [EMAIL PROTECTED]
> >> (for urgent matters, please send a copy to my practice email as well:
> >> [EMAIL PROTECTED])
> >>
> >> PRIVACY NOTE
> >> I am happy for others to forward on email sent by me to public email
> >> lists.
> >> Please ask my permission first if you wish to forward private email
> >> to other parties.
> >> _______________________________________________
> >> Gpcg_talk mailing list
> >> [email protected]
> >> http://ozdocit.org/cgi-bin/mailman/listinfo/gpcg_talk
> >>
> >
> >
> >
> >
> > ----------------------------------------------------------------
> > This message was sent using IMP, the Internet Messaging Program.
> >
> >
>
>



