Yes, I made a training text containing only digits.

This reduces the unicharset in the traineddata to just the digits.
Training fine-tunes the existing best model on the chosen subset of characters
and does not require many iterations.
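
As a rough illustration of what a digits-only training text could look like, here is a minimal Python sketch. It is not the script I used at the time; the function names, the choice of signs (+ - . ,) and the token lengths are all assumptions to adapt to your own images.

```python
import random

# Assumed character set: digits plus a few signs. Adjust to your images.
DIGITS = "0123456789"
LEADING_SIGNS = ["", "+", "-"]
SEPARATORS = ".,"


def make_token(rng, max_len=8):
    """Return one number-like token made of digits and optional signs."""
    length = rng.randint(1, max_len)
    body = "".join(rng.choice(DIGITS) for _ in range(length))
    # Sometimes place a separator strictly inside the digit run.
    if length > 2 and rng.random() < 0.5:
        pos = rng.randint(1, length - 1)
        body = body[:pos] + rng.choice(SEPARATORS) + body[pos:]
    return rng.choice(LEADING_SIGNS) + body


def make_training_text(num_lines=100, tokens_per_line=8, seed=0):
    """Build a multi-line text containing only digits, signs and spaces."""
    rng = random.Random(seed)
    lines = [
        " ".join(make_token(rng) for _ in range(tokens_per_line))
        for _ in range(num_lines)
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a short sample; redirect stdout to a file to keep it.
    print(make_training_text(num_lines=5), end="")
```

Redirecting the output to a file gives a text that can stand in for the training text in the fine-tuning tutorial. Tokens that follow the number formats in your real images are likely to work better than purely random ones.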

On 04-Jan-2018 7:23 PM, "Thomas Menguy" <[email protected]> wrote:

> Thanks a lot. I had seen the tutorial but was a bit confused: it is written
> to "remove" characters so that only the digits remain, and I was not sure
> which characters to remove (the whole of Unicode minus the digits?).
> Anyway, thanks again for the answer. It would be awesome if you could find
> the command line again ;)
> BR
>
> Sent from my iPhone
>
> On 4 Jan 2018, at 10:08, ShreeDevi Kumar <[email protected]> wrote:
>
> I will have to look for the exact commands and training text I used at
> that time.
>
> You should be able to recreate the training by following the instructions
> given at
> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#fine-tuning-for--a-few-characters
>
> I modified the English langdata files and, once training had completed,
> renamed the resulting traineddata to digits.
>
> Create a training text that contains digits and signs.
>
> Replace the word list with one that matches the number patterns you expect,
> or don't use a word list at all.
>
>
>
> On 04-Jan-2018 12:04 PM, "Thomas Menguy" <[email protected]> wrote:
>
> Hi Shree,
>
> I tried your data for digits and it works really well!
> I need to build a training set with numbers and signs, for example. Could you
> point me to how you created your own training data? (Sorry, I am fairly new
> to Tesseract and have never trained it before.)
>
> Thanks for your help!
> BR
>
> On Tuesday, October 3, 2017 at 6:39:30 PM UTC+2, shree wrote:
>>
>> You can try the plus-minus type of training if you just want a digits
>> type of traineddata.
>>
>> Your training_text can contain numbers in the format you need and you can
>> train with a font matching your images.
>>
>> For proof of concept you can try my experimental version at
>>
>> https://github.com/Shreeshrii/tessdata4alpha/blob/master/fast/digits.traineddata
>>
>> On Friday, September 29, 2017 at 12:32:41 PM UTC+5:30, John Miller wrote:
>>>
>>> Today, I found that the problem had been posted at
>>> https://github.com/tesseract-ocr/tesseract/issues/751
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/5f98dc8f-55e9-46dc-84b2-4ee1c7adc868%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

