https://github.com/impactcentre/ocrevalUAtion

https://github.com/eddieantonio/ocreval

https://github.com/tesseract-ocr/tesstrain/wiki/German-Konzilsprotokolle
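
If you only want a quick percentage before installing either tool, here is a rough sketch. It assumes the common definition of character accuracy: (characters in the ground truth minus edit errors) divided by characters in the ground truth. This is an illustration only, not the code used by ocreval or ocrevalUAtion, and the names `levenshtein` and `char_accuracy` are made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def char_accuracy(ground_truth: str, ocr_output: str) -> float:
    """Character accuracy in percent, relative to the ground truth length."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    errors = levenshtein(ground_truth, ocr_output)
    return 100.0 * (len(ground_truth) - errors) / len(ground_truth)


if __name__ == "__main__":
    # One wrong character out of nine.
    print(f"{char_accuracy('tesseract', 'tesserac7'):.2f}")  # prints 88.89
```

Run it on the ground-truth text and the Tesseract output for the same image. Note that the value can go below zero when the OCR output is much longer than the ground truth. The tools linked above also give per-character and per-class breakdowns, which this sketch does not.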


On Sat, Feb 1, 2020 at 4:31 PM manu pranay <[email protected]> wrote:

> Thank you, Shree.
> I have finished retraining the top layer, with a good accuracy rate.
> But I wanted to know: how can I find the accuracy as a percentage?
> And can you please help me with training on handwritten PDFs?
> Thank you very much for your help.
>
>
> On Sat, Feb 1, 2020 at 12:33 PM Shree Devi Kumar <[email protected]>
> wrote:
>
>> lstmtraining \
>>   --debug_interval -1 \
>>   --traineddata data/modi/modi.traineddata \
>>   --append_index 5 --net_spec "[Lfx128 O1c1]" \
>>   --continue_from data/mar/modi.lstm \
>>   --model_output data/modi/checkpoints/modiLayer \
>>   --train_listfile data/modi/list.train \
>>   --eval_listfile data/modi/list.eval \
>>   --max_iterations 999999
>>
>> On Sat, Feb 1, 2020 at 11:33 AM manu pranay <[email protected]>
>> wrote:
>>
>>> Thank you so much for your help, Shree.
>>> The links you provided were very helpful.
>>>
>>> Now I am trying LSTM training by retraining the top layer.
>>> Can you please provide the commands for retraining the top layer?
>>>
>>> Thank you very much.
>>>
>>>
>>> On Tue, Jan 28, 2020 at 12:36 PM Shree Devi Kumar <[email protected]>
>>> wrote:
>>>
>>>> Please see https://github.com/Shreeshrii/tesstrain-ckb. It uses a
>>>> modified training text based on what you sent and on earlier text that
>>>> I had from Pewan and other corpora.
>>>>
>>>> Currently the training data includes
>>>> * AWN 0-9
>>>> * AEN - Arabic numbers
>>>> * No Persian numbers since some shapes are similar to Arabic Numbers
>>>>
>>>> Fonts do not include those which convert 0-9 to either Arabic or
>>>> Persian numbers.
>>>>
>>>> The replace-layer training is still ongoing. The eval results look much
>>>> better than the official ara or script/Arabic; however, I do not have
>>>> any real-world images for testing.
>>>>
>>>> Model                        Measure          Arial  Arial Bold  Tahoma  Tahoma Bold
>>>> tessdata_fast/ara            Accuracy         62.74       63.49   61.56        61.71
>>>> tessdata_fast/ara            Basic Arabic     95.68       95.22   95.76        94.10
>>>> tessdata_fast/ara            Arabic Extended   0.31        1.13    0.41         1.32
>>>> tessdata_fast/script/Arabic  Accuracy         80.99       80.83   83.02        77.17
>>>> tessdata_fast/script/Arabic  Basic Arabic     96.68       96.34   96.05        93.87
>>>> tessdata_fast/script/Arabic  Arabic Extended  57.20       58.23   63.76        54.72
>>>> ckbLayer_1.661_152089_296500
>>>> ckbLayer_fast                Accuracy         98.20       97.78   98.06        96.13
>>>> ckbLayer_fast                Basic Arabic     99.10       99.15   98.54        98.44
>>>> ckbLayer_fast                Arabic Extended  98.30       98.70   99.10        96.27
>>>>
>>>>
>>>> On Mon, Jan 13, 2020 at 7:17 PM Ayub Rauf wrote:
>>>>
>>>>> Hi,
>>>>> I have attached the full training text, with forbidden_characters in it.
>>>>> In practice both number types are in use, and I see both in books,
>>>>> but the Kurdish institute has confirmed that Arabic numbers will be
>>>>> used from now on. Persian numbers are written by Iranian Kurds and
>>>>> Arabic numbers by Iraqi Kurds. As I said, numbers in ckb should be
>>>>> written in the Arabic style, but the OCR has to recognize both types.
>>>>> It is just like the two forms "ك" and "ک": both appear in books, but
>>>>> now we only use "ک".
>>>>> I think these similarities won't turn into a problem, since we can
>>>>> correct the letters afterwards with a spell checker.
>>>>> As I said before, Arial and Tahoma are the fonts most often used in
>>>>> books.
>>>>>
>>>>>
>>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduWm%3DXQaxBergf5-OUE-C8jB3u12dSOPUPchRZT4w21Z-g%40mail.gmail.com
>>>> <https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduWm%3DXQaxBergf5-OUE-C8jB3u12dSOPUPchRZT4w21Z-g%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>
>>
>> --
>>
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>>


