Do you perhaps have an answer for this one:
https://groups.google.com/forum/#!topic/tesseract-ocr/W0e9iusQmi4



On Saturday, August 31, 2019 at 11:18:12 PM UTC+2, Clint William Theron 
wrote:
>
> Thanks. I understand. Which Tesseract do you have experience with? On 
> Windows 10 I'm able to replace the eng.traineddata file with my own, and 
> then Tesseract uses my language. That is what I'm looking for, but it has to 
> be something online (not local).
>
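For the "online, not local" case above, a minimal sketch of where a browser would have to fetch a custom traineddata file from. The helper, the `my` language code, and the `http://localhost/tessdata/` folder are all assumptions for illustration. As far as I know, tesseract.js 2.x takes `langPath` and `gzip` options in `createWorker` and requests the language file along these lines, but check this against the version you actually use.

```typescript
// Hypothetical helper: these names are illustrative, not part of tesseract.js.
interface LangSource {
  langPath: string; // base URL of the folder that holds the traineddata file
  lang: string;     // language code, i.e. the file name without ".traineddata"
  gzip: boolean;    // true if the hosted file is gzip-compressed (".gz" suffix)
}

// Build the URL the browser would need to reach for the given language file.
function trainedDataUrl(src: LangSource): string {
  const base = src.langPath.replace(/\/+$/, ""); // drop trailing slashes
  const suffix = src.gzip ? ".gz" : "";
  return `${base}/${src.lang}.traineddata${suffix}`;
}

// Assumed layout: my.traineddata served by the same Apache server under /tessdata.
const source: LangSource = {
  langPath: "http://localhost/tessdata/",
  lang: "my",
  gzip: false,
};

console.log(trainedDataUrl(source)); // http://localhost/tessdata/my.traineddata
```

If that URL does not open in the browser directly, the OCR library will not be able to load the language either, so it is a cheap first check.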
> On Saturday, August 31, 2019 at 8:17:25 PM UTC+2, René Hansen wrote:
>>
>> Can't help you there I'm afraid. I have no experience with tesseract.js.
>>
>>
>> /René
>>
>>
>> On Sat, 31 Aug 2019 at 17:28, Clint William Theron <
>> [email protected]> wrote:
>>
>>> Thanks for your response. I already tried your suggestions, and now and 
>>> then I get the desired result. What I'm looking to do now is train Tesseract, 
>>> but I can't get Tesseract to use my traineddata language. My app is a 
>>> browser web app served over HTTP by an Apache server. I would appreciate it 
>>> if you could answer my SO question:
>>>
>>>
>>> https://stackoverflow.com/questions/57715343/how-do-i-specify-traineddata-language-path-and-language-code-when-using-tesser
>>>
>>> Thanks
>>>
>>> On Friday, August 30, 2019, René Hansen <[email protected]> wrote:
>>> > A few config params won't do the trick. You need to preprocess the 
>>> image. Make sure you read this 
>>> https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality
>>> >
>>> > Ideally I think you need to cook down the image you give tesseract to 
>>> something like this:
>>> > 
>>> [inline image]
>>> >
>>> > Even this isn't quite good enough though. I get "NG: 1020452" as a 
>>> result from https://tesseract.projectnaptha.com
>>> >
>>> > You might need to train on this specific font to get better results, 
>>> or do further preprocessing to increase accuracy.
>>> >
>>> > /René
>>> >
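The preprocessing advice above can be sketched as one simple step: convert the image to grayscale and binarize it with a fixed threshold. This is only an illustration of the general idea, not the exact processing used for the image in this thread. The function name and the threshold of 128 are assumptions, and the input is assumed to be RGBA bytes such as those in a canvas `ImageData.data` array.

```typescript
// Convert RGBA pixels to pure black/white.
// threshold is an assumed starting value; tune it per image.
function binarize(rgba: Uint8ClampedArray, threshold = 128): Uint8ClampedArray {
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    // Luma approximation using ITU-R BT.601 weights.
    const gray = 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
    const v = gray >= threshold ? 255 : 0;
    out[i] = v;       // red
    out[i + 1] = v;   // green
    out[i + 2] = v;   // blue
    out[i + 3] = 255; // fully opaque
  }
  return out;
}

// Two pixels: one light gray, one dark gray.
const pixels = new Uint8ClampedArray([200, 200, 200, 255, 30, 30, 30, 255]);
console.log(Array.from(binarize(pixels))); // [255, 255, 255, 255, 0, 0, 0, 255]
```

A fixed threshold tends to struggle with uneven lighting, so an adaptive threshold may be worth trying if this is not enough.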
>>> > On Fri, 30 Aug 2019 at 21:19, Clint William Theron <
>>> [email protected]> wrote:
>>> >>
>>> >> Consider the following image and output:
>>> >>
>>> >> 
>>> [inline image]
>>> >>
>>> >> Tesseract's recognition output:
>>> >> LUHO: R54 MILLION GTD
>>> >> LOTTO PLUS 1: R6,! MILLION est
>>> >> LOTTO PLUS 2: R7,4 MILLION est
>>> >> NIN YOUR SHARE OF R1,! MILLION!!!
>>> >> Buy any NATIONAL LOTTERY t1cket ther
>>> >> SMS :ID,#PLAY,TICKET CODE TO 34909.
>>> >> Cash Prizes to be won!!! T’s and C’
>>> >> apply vtsit National Lottery website
>>> >> PLEASE RETAIN YOUR ENTRY TICKET!
>>> >> First Draw: Saturday 20/07/19
>>> >> VALID RECEIPT FOR 1 Oraw(S)
>>> >> FROM DRAW 1937 To 1937
>>> >> LOTTO PLUS 1: ND
>>> >> LUTTU PLUS 2: ND
>>> >> ‘TotaT:R5.00
>>> >> _‘,{gxt, Inc! 152 VA
>>> >>
>>> >> I'm a newbie when it comes to Tesseract.js. I know there is a way to 
>>> include config parameters to increase OCR accuracy. In the above 
>>> image I'm interested in getting the numbers between the two horizontal 
>>> dashed stripes. Could you suggest a few config parameters to 
>>> include in the recognize method to see whether they improve the OCR accuracy?
>>> >> Thank you in advance. PS: Anything would be helpful.
>>> >>
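On the config-parameter question above, a hedged sketch of restricting output to digits. `tessedit_char_whitelist` is a standard Tesseract variable, but how it is passed depends on the tesseract.js version (an options argument to `recognize` in 1.x and `worker.setParameters` in 2.x, as far as I know), and whether it is honored depends on the Tesseract version and engine mode. The `digitLines` helper is a hypothetical fallback that filters the recognized text afterwards, and the sample text is invented, not taken from the ticket.

```typescript
// Standard Tesseract variable; the value limits output to digits and spaces.
const params: Record<string, string> = {
  tessedit_char_whitelist: "0123456789 ",
};

// Hypothetical fallback: keep only OCR lines made of digits and whitespace.
function digitLines(ocrText: string): string[] {
  return ocrText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => /^\d[\d\s]*$/.test(line));
}

// Invented sample text for illustration only.
const sample = "LOTTO PLUS 1: ND\n05 12 23 31 44 49\nTotal:R5.00";
console.log(params.tessedit_char_whitelist, digitLines(sample));
```

The post-filter works regardless of whether the whitelist takes effect, so the two can be combined.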
>>> >> --
>>> >> You received this message because you are subscribed to the Google 
>>> Groups "tesseract-ocr" group.
>>> >> To unsubscribe from this group and stop receiving emails from it, 
>>> send an email to [email protected].
>>> >> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/tesseract-ocr/2a937bf3-8c97-466d-a9bb-26a277e02522%40googlegroups.com
>>> .
>>> >
>>> >
>>> > --
>>> > Never fear, Linux is here.
>>> >
>>>
>>>
>>
>>
>>
>

