Can't help you there, I'm afraid. I have no experience with tesseract.js.
/René

On Sat, 31 Aug 2019 at 17:28, Clint William Theron <[email protected]> wrote:
> Thanks for your response. I already tried your suggestions, and now and
> then I get the desired result. What I'm looking to do now is train
> Tesseract, but I can't get Tesseract to use my traineddata language. My
> app is a browser web app that runs on an Apache HTTP server. I would
> appreciate it if you could answer my SO question:
>
> https://stackoverflow.com/questions/57715343/how-do-i-specify-traineddata-language-path-and-language-code-when-using-tesser
>
> Thanks
>
> On Friday, August 30, 2019, René Hansen <[email protected]> wrote:
> > A few config params won't do the trick. You need to preprocess the image.
> > Make sure you read this:
> > https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality
> >
> > Ideally, I think you need to cook down the image you give Tesseract to
> > something like this:
> >
> > [inline image]
> >
> > Even this isn't quite good enough, though. I get "NG: 1020452" as a
> > result from https://tesseract.projectnaptha.com
> >
> > You might need to train on this specific font to get better results, or
> > do further preprocessing to increase accuracy.
> >
> > /René
> >
> > On Fri, 30 Aug 2019 at 21:19, Clint William Theron <[email protected]> wrote:
> >>
> >> Consider the following image and output:
> >>
> >> [inline image]
> >>
> >> Tesseract's recognition output:
> >> LUHO: R54 MILLION GTD
> >> LOTTO PLUS 1: R6,! MILLION est
> >> LOTTO PLUS 2: R7,4 MILLION est
> >> NIN YOUR SHARE OF R1,! MILLION!!!
> >> Buy any NATIONAL LOTTERY t1cket ther
> >> SMS :ID,#PLAY,TICKET CODE TO 34909.
> >> Cash Prizes to be won!!! T’s and C’
> >> apply vtsit National Lottery website
> >> PLEASE RETAIN YOUR ENTRY TICKET!
> >> First Draw: Saturday 20/07/19
> >> VALID RECEIPT FOR 1 Oraw(S)
> >> FROM DRAW 1937 To 1937
> >> LOTTO PLUS 1: ND
> >> LUTTU PLUS 2: ND
> >> ‘TotaT:R5.00
> >> _‘,{gxt, Inc! 152 VA
> >>
> >> I'm a newbie when it comes to Tesseract.js. I know there is a way to
> >> include config parameters to increase OCR accuracy. In the image above,
> >> I'm interested in getting the numbers between the two horizontal dashed
> >> stripes. Could you suggest a few config parameters to include in the
> >> recognize method to see if they improve the OCR accuracy?
> >>
> >> Thank you in advance. PS: Anything would be helpful.
--
Never fear, Linux is here.

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAB-60nijObav%2BvivzzAUr7sgs%3D97iFLn1UN7mNdC1%2BSQWcNfiQ%40mail.gmail.com.

