To clarify: by "text", I meant languages.

On Monday, July 6, 2015 at 3:07:36 PM UTC+2, Brennan Nunamaker wrote:
>
> I need to use my own trained data, because in the future we will be using 
> it on text that has no trained data, so we will have to generate it 
> ourselves. If I don't understand what I am doing wrong, I won't be able 
> to... 
>
> Thank you anyway
>
> On Monday, July 6, 2015 at 3:03:20 PM UTC+2, shree wrote:
>>
>> Did you try with the Latin traineddata?
>>
>>
>> https://github.com/tesseract-ocr/tessdata/blob/master/lat.traineddata?raw=true
>>
>>
>>
>> ShreeDevi
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>> On Mon, Jul 6, 2015 at 5:46 PM, Brennan Nunamaker <[email protected]> 
>> wrote:
>>
>>> Hello,
>>>
>>> I just generated the traineddata file for an old historical version of 
>>> Latin text, but when I run tesseract on the .tif that I used to train 
>>> tesseract for the language (as well as on other sample images), it 
>>> returns an empty result. However, when I use the English language for 
>>> classification, it generates text with a few errors due to a lack of 
>>> recognition for some specific characters. (That means the fault lies with 
>>> the traineddata and not with the samples I am running it on.)
>>>
>>> Why could this be? I have been struggling to even generate the 
>>> traineddata, and ended up using a fairly short training text (see 
>>> attachment). Do I need to use a longer training text/tif?
>>>
>>> If anyone could point me in the right direction I would be extremely 
>>> grateful.
>>>
>>> Thanks in advance!
>>> -Brennan
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/tesseract-ocr/29355c0a-deeb-4f65-a176-9abae60bcb9c%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/tesseract-ocr/29355c0a-deeb-4f65-a176-9abae60bcb9c%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>

