Thanks Shree

On Monday, June 15, 2020 at 6:46:56 AM UTC+5:30, shree wrote:
>
> See https://github.com/tesseract-ocr/tesstrain/wiki for links regarding
> tesseract training for handwriting
>
> On Sun, Jun 14, 2020, 23:20 mit <[email protected]> wrote:
>
>> Hi Shree,
>>
>> Can we train tesseract for handwritten date?
>>
>> TIA
>>
>> On Saturday, February 1, 2020 at 5:13:10 PM UTC+5:30, shree wrote:
>>>
>>> https://github.com/impactcentre/ocrevalUAtion
>>>
>>> https://github.com/eddieantonio/ocreval
>>>
>>> https://github.com/tesseract-ocr/tesstrain/wiki/German-Konzilsprotokolle
>>>
>>> On Sat, Feb 1, 2020 at 4:31 PM manu pranay <[email protected]> wrote:
>>>
>>>> Thank you, Shree.
>>>> I have finished retraining the top layer, with a good accuracy rate.
>>>> But I wanted to know: how can I find the accuracy as a percentage?
>>>> And can you please help with how I can train on a handwritten PDF?
>>>> Thank you very much for your help.
>>>>
>>>> On Sat, Feb 1, 2020 at 12:33 PM Shree Devi Kumar <[email protected]>
>>>> wrote:
>>>>
>>>>> lstmtraining \
>>>>>   --debug_interval -1 \
>>>>>   --traineddata data/modi/modi.traineddata \
>>>>>   --append_index 5 --net_spec "[Lfx128 O1c1]" \
>>>>>   --continue_from data/mar/modi.lstm \
>>>>>   --model_output data/modi/checkpoints/modiLayer \
>>>>>   --train_listfile data/modi/list.train \
>>>>>   --eval_listfile data/modi/list.eval \
>>>>>   --max_iterations 999999
>>>>>
>>>>> On Sat, Feb 1, 2020 at 11:33 AM manu pranay <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Thank you so much for your help, Shree.
>>>>>> The links you provided were very helpful.
>>>>>>
>>>>>> Now I am trying LSTM training by retraining the top layer.
>>>>>> Can you please provide the commands for retraining the top layer?
>>>>>>
>>>>>> Thank you very much.
>>>>>>
>>>>>> On Tue, Jan 28, 2020 at 12:36 PM Shree Devi Kumar <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Please see https://github.com/Shreeshrii/tesstrain-ckb It uses a
>>>>>>> modified training text based on what you sent and earlier text
>>>>>>> that I had from Pewan and other corpora.
>>>>>>>
>>>>>>> Currently the training data includes
>>>>>>> * AWN 0-9
>>>>>>> * AEN - Arabic numbers
>>>>>>> * No Persian numbers since some shapes are similar to Arabic numbers
>>>>>>>
>>>>>>> Fonts do not include those which convert 0-9 to either Arabic or
>>>>>>> Persian numbers.
>>>>>>>
>>>>>>> The replace layer training is still ongoing. The eval results look
>>>>>>> much better than the official ara or script/Arabic, however I do
>>>>>>> not have any real world images for testing.
>>>>>>>
>>>>>>> Model                        Metric           Arial  Arial Bold  Tahoma  Tahoma Bold
>>>>>>> tessdata_fast/ara            Accuracy         62.74  63.49       61.56   61.71
>>>>>>> tessdata_fast/ara            Basic Arabic     95.68  95.22       95.76   94.10
>>>>>>> tessdata_fast/ara            Arabic Extended  0.31   1.13        0.41    1.32
>>>>>>> tessdata_fast/script/Arabic  Accuracy         80.99  80.83       83.02   77.17
>>>>>>> tessdata_fast/script/Arabic  Basic Arabic     96.68  96.34       96.05   93.87
>>>>>>> tessdata_fast/script/Arabic  Arabic Extended  57.20  58.23       63.76   54.72
>>>>>>> ckbLayer_1.661_152089_296500
>>>>>>> ckbLayer_fast                Accuracy         98.20  97.78       98.06   96.13
>>>>>>> ckbLayer_fast                Basic Arabic     99.10  99.15       98.54   98.44
>>>>>>> ckbLayer_fast                Arabic Extended  98.30  98.70       99.10   96.27
>>>>>>>
>>>>>>> On Mon, Jan 13, 2020 at 7:17 PM Ayub Rauf wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I have attached the full training text, with forbidden_characters
>>>>>>>> in it. In practice both number types will be used, and I see both
>>>>>>>> types of numbers written in books, but the Kurdish institute has
>>>>>>>> confirmed that Arabic numbers will be used from now on.
>>>>>>>> Persian numbers are written by Iranian Kurds and Arabic numbers
>>>>>>>> are used by Iraqi Kurds. As I said, numbers in ckb should be
>>>>>>>> written in the Arabic type, but the OCR has to recognize both
>>>>>>>> types, just like the two forms "ك" and "ک" that appear in books
>>>>>>>> even though we now only use "ک".
>>>>>>>> I think these similarities won't be a problem, because afterwards
>>>>>>>> we can correct the letters with a spell checker.
>>>>>>>> As I said before, Arial and Tahoma are the fonts most commonly
>>>>>>>> used in books.
-- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/0a9cafe6-7cb9-4532-b405-d7cb565dcb9co%40googlegroups.com.
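
Note on the accuracy question raised in the thread: the tools linked in reply (ocrevalUAtion, ocreval) report accuracy by comparing OCR output against ground-truth text. The sketch below illustrates one common definition, character accuracy = (reference length - edit distance) / reference length, expressed as a percentage. It is only an illustration and not the implementation used by those tools; the function names `edit_distance` and `character_accuracy` and the sample strings are made up for this example.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # reference character missing in OCR output
                current[j - 1] + 1,      # extra character in OCR output
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Character accuracy in percent, relative to the ground-truth length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    errors = edit_distance(reference, hypothesis)
    return 100.0 * (len(reference) - errors) / len(reference)


if __name__ == "__main__":
    ground_truth = "15 June 2020"   # hypothetical ground-truth line
    ocr_output = "15 Jume 2020"     # hypothetical OCR result with one wrong character
    print(f"{character_accuracy(ground_truth, ocr_output):.2f}%")  # prints 91.67%
```

With this definition the value can drop below zero when the OCR output contains more errors than the reference has characters. Python compares strings by code point, so for scripts with combining marks the result may depend on how both texts are normalized beforehand.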

