Please check github.com/shreeshrii/tesstrain-akan
The data folder also has the fine-tuned traineddata file.
Since Akan is written in Latin script, this was easy to do.
On Sat, Apr 25, 2020, 08:40 Shree Devi Kumar wrote:
> On Sat, Apr 25, 2020 at 2:13 AM Peyi Oyelo wrote:
>
>> @shree hello
On Sat, Apr 25, 2020 at 2:13 AM Peyi Oyelo wrote:
> @shree hello sir/maam?
>
Maam :-)
>
> On Wednesday, April 22, 2020 at 7:23:28 AM UTC-7, Peyi Oyelo wrote:
>>
>> I created the akan.traineddata using the typical tesseract 3 legacy
>> workflow.
>>
>
OK. The box/tiff pairs work for creating
@shree hello sir/maam?
On Wednesday, April 22, 2020 at 7:23:28 AM UTC-7, Peyi Oyelo wrote:
>
> I created the akan.traineddata using the typical tesseract 3 legacy
> workflow. I do not have word/freq/punc lists. As of now I would like to
> train using lstm to support as many fonts i.e. 45000
I created the akan.traineddata using the typical Tesseract 3 legacy
workflow. I do not have word/freq/punc lists. For now I would like to
train using LSTM to support as many fonts as possible, i.e. 45000 fonts.
The existing akan.traineddata was only trained to work with DejaVu Sans
New
For evaluating the OCR accuracy of Tesseract models, you can use the following:
https://github.com/impactcentre/ocrevalUAtion
or
https://github.com/eddieantonio/ocreval
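For a quick sanity check before setting up either tool, the character error rate (CER) can be computed by hand. The following is only a minimal sketch and is not how ocrevalUAtion or ocreval work internally; the function names are hypothetical, and it assumes the ground truth and the OCR output are already available as plain strings.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference (ground truth) text."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)
```

A lower value is better; 0.0 means the OCR output matches the ground truth exactly.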
How did you create akan.traineddata?
Do you need to train it only for one font?
On Tue, Apr 21, 2020 at 11:06 PM Peyi Oyelo
Please share a couple of image files and their corresponding text versions so
that I can see what will work best.
On Tue, Apr 21, 2020, 20:17 Peyi Oyelo wrote:
> Hello Shree and sorry for reviving an old dead thread. I am currently
> trying to train Tesseract to recognize the Akan language. I have
Hello Shree and sorry for reviving an old dead thread. I am currently
trying to train Tesseract to recognize the Akan language. I have been able
to create a trained data file that can recognize Akan; however, this does
not use Tesseract's LSTM network. I am now trying to perform LSTM training
Hello Shree,
On Friday, January 6, 2017 at 12:09:15 PM UTC+1, shree wrote:
>
> Does anyone know of any utilities to convert a box file to ground truth
> text file?
>
> I am using tesstrain.sh which uses text2image for trying out LSTM
> training. However, because unrenderable words are not
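One possible approach to the box-to-text question is sketched below. It is not an existing Tesseract utility; the function name `box_to_text` is hypothetical. It assumes each box line has the form `<glyph> <left> <bottom> <right> <top> <page>`, and that a tab glyph marks the end of a text line, as in LSTM-style box files. Legacy box files usually carry no space entries, so word spacing cannot be recovered from them this way.

```python
def box_to_text(box_lines):
    """Rebuild text from box-file lines (assumed format: glyph + 5 numeric fields)."""
    chars = []
    for line in box_lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        # Split from the right so a glyph that is itself a space or tab survives.
        parts = line.rsplit(" ", 5)
        if len(parts) != 6:
            continue  # skip lines that do not match the assumed format
        glyph = parts[0]
        # Assumption: a tab glyph marks the end of a text line.
        chars.append("\n" if glyph == "\t" else glyph)
    return "".join(chars)
```

With a file on disk, this could be called as `box_to_text(open(path, encoding="utf-8"))`, where `path` points to the box file.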