Thanks Shree

On Monday, June 15, 2020 at 6:46:56 AM UTC+5:30, shree wrote:
>
> See https://github.com/tesseract-ocr/tesstrain/wiki for links regarding 
> tesseract training for handwriting
>
> On Sun, Jun 14, 2020, 23:20 mit <[email protected]> wrote:
>
>> Hi Shree,
>>
>> Can we train Tesseract to recognize handwritten dates?
>>
>> TIA
>>
>> On Saturday, February 1, 2020 at 5:13:10 PM UTC+5:30, shree wrote:
>>>
>>> https://github.com/impactcentre/ocrevalUAtion 
>>>
>>> https://github.com/eddieantonio/ocreval
>>>
>>> https://github.com/tesseract-ocr/tesstrain/wiki/German-Konzilsprotokolle
>>>  
>>>
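As a rough illustration of how the accuracy percentage asked about in this thread can be computed, here is a minimal Python sketch. It assumes accuracy is defined as the number of reference characters minus the edit-distance errors, divided by the number of reference characters. The function names `edit_distance` and `char_accuracy` are hypothetical and are not part of ocreval or ocrevalUAtion, which produce much fuller reports.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop a reference character
                            curr[j - 1] + 1,      # extra character in the OCR output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """Character accuracy in percent, relative to the ground-truth length.

    Assumed definition: 100 * (len(ref) - errors) / len(ref).
    The value can be negative if the OCR output is much longer than the reference.
    """
    if not ref:
        raise ValueError("reference text must not be empty")
    return 100.0 * (len(ref) - edit_distance(ref, hyp)) / len(ref)


if __name__ == "__main__":
    ground_truth = "abcd"   # placeholder ground-truth text
    ocr_output = "abed"     # placeholder OCR result with one substitution
    print(f"{char_accuracy(ground_truth, ocr_output):.2f}")
```

In practice the ground truth and OCR output would be read from text files, one pair per evaluated image, and the errors and character counts summed over all pairs before dividing.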
>>> On Sat, Feb 1, 2020 at 4:31 PM manu pranay <[email protected]> wrote:
>>>
>>>> Thank you, Shree.
>>>> I have finished retraining the top layer, with a good accuracy rate.
>>>> But I wanted to know: how can I find the accuracy as a percentage?
>>>> Also, can you please help me with training on a handwritten PDF?
>>>> Thank you very much for your help.
>>>>
>>>>
>>>> On Sat, Feb 1, 2020 at 12:33 PM Shree Devi Kumar <[email protected]> 
>>>> wrote:
>>>>
>>>>> lstmtraining \
>>>>>   --debug_interval -1 \
>>>>>   --traineddata data/modi/modi.traineddata \
>>>>>   --append_index 5 --net_spec "[Lfx128 O1c1]" \
>>>>>   --continue_from data/mar/modi.lstm \
>>>>>   --model_output data/modi/checkpoints/modiLayer \
>>>>>   --train_listfile data/modi/list.train \
>>>>>   --eval_listfile data/modi/list.eval \
>>>>>   --max_iterations 999999
>>>>>
>>>>> On Sat, Feb 1, 2020 at 11:33 AM manu pranay <[email protected]> 
>>>>> wrote:
>>>>>
>>>>>> Thank you so much for your help, Shree.
>>>>>> The links you provided were very helpful.
>>>>>>
>>>>>> Now I am trying LSTM training by retraining the top layer.
>>>>>> Can you please provide the commands for retraining the top layer?
>>>>>>
>>>>>> Thank you very much.
>>>>>>  
>>>>>>
>>>>>> On Tue, Jan 28, 2020 at 12:36 PM Shree Devi Kumar <[email protected]> 
>>>>>> wrote:
>>>>>>
>>>>>>> Please see https://github.com/Shreeshrii/tesstrain-ckb. It uses a 
>>>>>>> modified training text based on what you sent and earlier text that I 
>>>>>>> had from Pewan and other corpora.
>>>>>>>
>>>>>>> Currently the training data includes:
>>>>>>> * AWN 0-9
>>>>>>> * AEN - Arabic numbers
>>>>>>> * No Persian numbers, since some shapes are similar to Arabic numbers
>>>>>>>
>>>>>>> Fonts do not include those which convert 0-9 to either Arabic or 
>>>>>>> Persian numbers.
>>>>>>>
>>>>>>> The replace layer training is still ongoing. The eval results look 
>>>>>>> much better than the official ara or script/Arabic models; however, I 
>>>>>>> do not have any real-world images for testing.
>>>>>>>
>>>>>>> Model                        Metric           Arial  Arial Bold  Tahoma  Tahoma Bold
>>>>>>> tessdata_fast/ara            Accuracy         62.74  63.49       61.56   61.71
>>>>>>> tessdata_fast/ara            Basic Arabic     95.68  95.22       95.76   94.10
>>>>>>> tessdata_fast/ara            Arabic Extended  0.31   1.13        0.41    1.32
>>>>>>> tessdata_fast/script/Arabic  Accuracy         80.99  80.83       83.02   77.17
>>>>>>> tessdata_fast/script/Arabic  Basic Arabic     96.68  96.34       96.05   93.87
>>>>>>> tessdata_fast/script/Arabic  Arabic Extended  57.20  58.23       63.76   54.72
>>>>>>> ckbLayer_1.661_152089_296500
>>>>>>> ckbLayer_fast                Accuracy         98.20  97.78       98.06   96.13
>>>>>>> ckbLayer_fast                Basic Arabic     99.10  99.15       98.54   98.44
>>>>>>> ckbLayer_fast                Arabic Extended  98.30  98.70       99.10   96.27
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jan 13, 2020 at 7:17 PM Ayub Rauf wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I have attached the full training text, with forbidden_characters in it.
>>>>>>>> In practice both number types are used, and I see both written in 
>>>>>>>> books, but the Kurdish institute has confirmed that Arabic numbers will 
>>>>>>>> be used from now on. Persian numbers are written by Iranian Kurds and 
>>>>>>>> Arabic numbers by Iraqi Kurds. As I said, numbers in ckb should be 
>>>>>>>> written in the Arabic form, but the OCR has to recognize both types, 
>>>>>>>> just like the two forms "ك" and "ک", which both appear in books although 
>>>>>>>> we now use only "ک".
>>>>>>>> I think these similarities won't be a problem, since afterwards we can 
>>>>>>>> correct the letters in a spell checker.
>>>>>>>> As I said before, Arial and Tahoma are the fonts most often used in 
>>>>>>>> books.
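The post-OCR correction described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: that "Arabic numbers" means the Arabic-Indic digits U+0660 to U+0669, that "Persian numbers" means the Extended Arabic-Indic digits U+06F0 to U+06F9, and that every "ك" (U+0643) should become "ک" (U+06A9). The names `CKB_NORMALIZE` and `normalize_ckb` are hypothetical.

```python
# Assumed mapping, built from Unicode code points:
#   U+0643 ARABIC LETTER KAF             -> U+06A9 ARABIC LETTER KEHEH
#   U+06F0..U+06F9 (Persian-style digits) -> U+0660..U+0669 (Arabic-Indic digits)
CKB_NORMALIZE = {0x0643: 0x06A9}
CKB_NORMALIZE.update({0x06F0 + d: 0x0660 + d for d in range(10)})


def normalize_ckb(text: str) -> str:
    """Map the alternative kaf and Persian-style digits to the preferred forms.

    All other characters, including Western digits 0-9, are left unchanged.
    """
    return text.translate(CKB_NORMALIZE)


if __name__ == "__main__":
    ocr_line = "\u0643\u06F1\u06F9"  # kaf followed by Persian-style digits 1 and 9
    print(normalize_ckb(ocr_line))   # keheh followed by Arabic-Indic digits 1 and 9
```

Doing this as a separate step after recognition keeps the model free to recognize both forms, as the message asks, while the final text uses a single form.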
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "tesseract-ocr" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web visit 
>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduWm%3DXQaxBergf5-OUE-C8jB3u12dSOPUPchRZT4w21Z-g%40mail.gmail.com
>>>>>>>  
>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduWm%3DXQaxBergf5-OUE-C8jB3u12dSOPUPchRZT4w21Z-g%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>>>>> .
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>>
>>>>> ____________________________________________________________
>>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>

