Take a look at
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00 for
an overview of training for 4.0. Follow the tutorials to get a feel for the
training process. You can try them for English as well as Malayalam.

As for a trainer GUI, I think it would probably work for `fine tune`
training.

One area where you could contribute to 4.0 training is creating box
files in the 4.0 format from scanned images.
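
As a starting point for that kind of tool, here is a minimal sketch in
Python that only builds and runs a tesseract command line for each scanned
image. It assumes a tesseract 4.x binary on the PATH and that the `lstmbox`
config file is present in tessdata/configs (it ships with later 4.x
releases and may not be in every build). The helper name `lstmbox_command`
and the default page segmentation mode are my own choices for illustration.
The generated boxes come from Tesseract's own recognition, so they would
still need manual correction before being used for training.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def lstmbox_command(image, lang="eng", psm=6):
    """Build the argument list for a tesseract run that writes a .box file.

    Hypothetical helper: the output base is the image path without its
    extension, and tesseract appends ".box" to it.
    """
    image = Path(image)
    out_base = image.with_suffix("")
    return [
        "tesseract", str(image), str(out_base),
        "-l", lang,
        "--psm", str(psm),
        "lstmbox",  # config name; assumed to exist in tessdata/configs
    ]


if __name__ == "__main__":
    # Usage: python make_boxes.py page1.tif page2.tif ...
    for name in sys.argv[1:]:
        cmd = lstmbox_command(name, lang="mal")  # "mal" = Malayalam
        print(shlex.join(cmd))
        subprocess.run(cmd, check=True)
```

A GUI could wrap the same call and then let the user correct the resulting
boxes and text line by line.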

Also look at jTessBoxEditor, which offers a Tesseract training GUI, though
not for 4.0.

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Sat, Jun 24, 2017 at 7:25 PM, Nalin Linux <[email protected]>
wrote:

>
>
> On Saturday, June 24, 2017 at 7:07:32 PM UTC+5:30, shree wrote:
>>
>> You can update it for 3.05.01
>>
> I am quite impressed with Tesseract 4.0, and it's working fine for my
> language (Malayalam). Is the trained data for version 4.0 listed in
> https://github.com/tesseract-ocr/tessdata
> created from the old language data itself
> (https://github.com/tesseract-ocr/langdata)? What about creating a
> training GUI for version 4.0? I have two months at my disposal for
> developing such a GUI. Please let me know whether this project is
> relevant; otherwise I will switch to another relevant free and
> open-source project.
>
> Thanking you,
> Nalin.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduWLr%2Be8cT%2B%2BWerrz0%2BS1FH1TFHAJnePLJk17LrdrbULgA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
