Can't help you there, I'm afraid. I have no experience with tesseract.js.

/René


On Sat, 31 Aug 2019 at 17:28, Clint William Theron <
[email protected]> wrote:

> Thanks for your response. I already tried your suggestions, and now and
> then I get the desired result. What I'm looking to do now is train Tesseract,
> but I can't get Tesseract to use my traineddata language. My app is a
> browser web app served by an Apache HTTP server. I would appreciate it if
> you could answer my SO question:
>
>
> https://stackoverflow.com/questions/57715343/how-do-i-specify-traineddata-language-path-and-language-code-when-using-tesser
>
> Thanks
>
> On Friday, August 30, 2019, René Hansen <[email protected]> wrote:
> > A few config params won't do the trick. You need to preprocess the image.
> > Make sure you read this:
> > https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality
> >
> > Ideally, I think you need to cook down the image you give Tesseract to
> > something like this:
> >
> > [inline image]
> >
> > Even this isn't quite good enough, though. I get "NG: 1020452" as a
> > result from https://tesseract.projectnaptha.com
> >
> > You might need to train on this specific font to get better results, or
> > do further preprocessing to increase accuracy.
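
As one example of the kind of preprocessing mentioned above, here is a minimal Python sketch of binarization with Otsu's threshold. It is only an illustration under stated assumptions: the function names `otsu_threshold` and `binarize` are hypothetical, the image is assumed to be grayscale already and given as rows of 0-255 integers, and this is not necessarily how the image in this thread was produced.

```python
from typing import List


def otsu_threshold(pixels: List[List[int]]) -> int:
    """Return the 0-255 threshold that maximizes between-class variance."""
    # Histogram of gray levels.
    hist = [0] * 256
    for row in pixels:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(256):
        # Background class: pixels with value <= t.
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue
        # Foreground class: pixels with value > t.
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_threshold, best_variance = t, variance
    return best_threshold


def binarize(pixels: List[List[int]], threshold: int) -> List[List[int]]:
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if value > threshold else 0 for value in row] for row in pixels]


if __name__ == "__main__":
    # Tiny made-up image: dark values near 10, bright values near 200.
    sample = [[10, 12, 200], [11, 210, 205]]
    t = otsu_threshold(sample)
    print(t, binarize(sample, t))
```

In practice the pixel rows would come from an image library, and the binarized result would be written back to a file before handing it to Tesseract; rescaling, deskewing and border removal are separate steps not shown here.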
> >
> > /René
> >
> > On Fri, 30 Aug 2019 at 21:19, Clint William Theron <
> [email protected]> wrote:
> >>
> >> Consider the following image and output:
> >>
> >>
> >> [inline image]
> >>
> >> Tesseract's recognition output:
> >> LUHO: R54 MILLION GTD
> >> LOTTO PLUS 1: R6,! MILLION est
> >> LOTTO PLUS 2: R7,4 MILLION est
> >> NIN YOUR SHARE OF R1,! MILLION!!!
> >> Buy any NATIONAL LOTTERY t1cket ther
> >> SMS :ID,#PLAY,TICKET CODE TO 34909.
> >> Cash Prizes to be won!!! T’s and C’
> >> apply vtsit National Lottery website
> >> PLEASE RETAIN YOUR ENTRY TICKET!
> >> First Draw: Saturday 20/07/19
> >> VALID RECEIPT FOR 1 Oraw(S)
> >> FROM DRAW 1937 To 1937
> >> LOTTO PLUS 1: ND
> >> LUTTU PLUS 2: ND
> >> ‘TotaT:R5.00
> >> _‘,{gxt, Inc! 152 VA
> >>
> >> I'm a newbie when it comes to Tesseract.js. I know there is a way to
> >> include config parameters to increase OCR accuracy. In the above
> >> image I'm interested in getting the numbers between the two horizontal
> >> dashed stripes. Would you suggest a few config parameters to include
> >> in the recognize method to see if they improve the OCR accuracy?
> >> Thank you in advance. P.S. Anything would be helpful.
> >>
> >> --
> >> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> >> To unsubscribe from this group and stop receiving emails from it, send
> an email to [email protected].
> >> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/2a937bf3-8c97-466d-a9bb-26a277e02522%40googlegroups.com
> .
> >
> >
> > --
> > Never fear, Linux is here.
> >
> >
>
>


-- 
Never fear, Linux is here.
