I'm running tesseract .01. My OS is Windows 7. I added the files as you
said, but when I run the command tesseract input output bazaar it says
it can't find the file eng.user-words, even though the file is there.

Thanks!
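
One way to narrow this down is to check which path the config file actually implies. The sketch below assumes the layout described on the tesseract(1) page linked in this thread: a config file under tessdata/configs/ that sets user_words_suffix, and a word list named <lang>.<suffix> stored next to the traineddata file. The function name check_user_words and its return shape are hypothetical and not part of Tesseract. It also lists near-miss file names, on the assumption (not confirmed in this thread) that an editor on Windows may have saved the list as eng.user-words.txt with the extension hidden.

```python
from pathlib import Path


def check_user_words(tessdata_dir, lang="eng", config="bazaar"):
    """Report where the user-words file is expected, given a config file.

    Hypothetical diagnostic helper (not part of Tesseract).
    Returns (expected_path, exists, near_misses); expected_path is None
    when the config is missing or does not set user_words_suffix.
    """
    tessdata = Path(tessdata_dir)
    config_path = tessdata / "configs" / config

    # Config files are lines of whitespace-separated "variable value" pairs.
    suffix = None
    if config_path.is_file():
        for line in config_path.read_text().splitlines():
            parts = line.split()
            if len(parts) >= 2 and parts[0] == "user_words_suffix":
                suffix = parts[1]

    if suffix is None:
        return None, False, []

    expected = tessdata / f"{lang}.{suffix}"

    # Names that merely start with the expected name, for example
    # "eng.user-words.txt" left behind by an editor that adds an extension.
    near_misses = sorted(
        p.name
        for p in tessdata.glob(f"{lang}.{suffix}*")
        if p.name != expected.name
    )
    return expected, expected.is_file(), near_misses
```

Calling check_user_words with the tessdata directory that tesseract really uses shows whether eng.user-words exists under exactly that name. If exists is False and near_misses is not empty, renaming the listed file is the first thing to try. If the expected path itself is not where the file was put, the tessdata location tesseract is reading from is the more likely culprit.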

On Sun, Aug 12, 2012 at 4:37 PM, zdenko podobny <[email protected]> wrote:

> please post details (OS, tesseract version, exact error message...)
>
> --
> Zdenko
>
> On Sun, Aug 12, 2012 at 7:32 AM, Chathuri Gunawardhana <
> [email protected]> wrote:
>
>> I followed
>> http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html#_config_files_and_augmenting_with_user_data
>> but I'm getting an error saying it could not open user-data. The user
>> data file is actually in the correct location, but it says the file is
>> not there. Any suggestions?
>>
>> Thanks!
>>
>> On Sat, Aug 11, 2012 at 6:48 PM, Chathuri Gunawardhana <
>> [email protected]> wrote:
>>
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: zdenko podobny <[email protected]>
>>> Date: Sat, Aug 11, 2012 at 6:38 PM
>>> Subject: Re: Having traindata files uncombined
>>> To: [email protected]
>>>
>>>
>>> Yeah - it is much better ;-)
>>> Unfortunately at the moment I do not have time for deep testing so here
>>> are my suggestions:
>>>
>>>    - if you are using tesseract via api, try to set rectangles (instead
>>>    of whole image) with coords of city names to avoid "noise" (e.g.
>>>    contours) from map. tesseract is "noise sensitive" and noise can
>>>    decrease ocr quality
>>>    - if you are using tesseract executable try to extract city names to
>>>    individual images
>>>    - after this you can start to play with dictionaries ;-)
>>>    - you can use user_words "outside" of traineddata file see [1]
>>>    - try to play with page segmentation parameter (psm)
>>>    - I am not aware how to increase (or decrease) strength of
>>>    dictionaries in tesseract 3.02 (e.g. to force tesseract to output only
>>>    words from dictionaries...)
>>>
>>> I believe after this you can at least evaluate if tesseract is suitable
>>> for your task...
>>>
>>> [1]
>>> http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html#_config_files_and_augmenting_with_user_data
>>>
>>> --
>>> Zdenko
>>>
>>> On Sat, Aug 11, 2012 at 2:23 PM, Chathuri Gunawardhana <
>>> [email protected]> wrote:
>>>
>>>> Actually, you can use this image instead:
>>>> http://www.taprobanetravels.com/images/map-of-sri-lanka.jpg. It is
>>>> higher quality than the one above.
>>>>
>>>>
>>>> On Sat, Aug 11, 2012 at 4:40 PM, zdenko podobny <[email protected]> wrote:
>>>>
>>>>>
>>>>> On Sat, Aug 11, 2012 at 12:58 PM, Chathuri Gunawardhana <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> The image that I'm trying to recognize is attached. Most words in it
>>>>>> are not identified correctly. I added these words to user words and
>>>>>> combined, but still didn't get the expected output.
>>>>>>
>>>>>>
>>>>> your attached image has insufficient quality - I get no output for
>>>>> it...
>>>>>
>>>>> --
>>>>> Zdenko
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To post to this group, send email to [email protected]
>>>>> To unsubscribe from this group, send email to
>>>>> [email protected]
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Chathuri Gunawardhana
>>> Undergraduate at University of Moratuwa
>>> Sri Lanka
>>>
>>
>>
>>
>> --
>> Chathuri Gunawardhana
>> Undergraduate at University of Moratuwa
>> Sri Lanka
>>
>>
>
>
>
>



-- 
Chathuri Gunawardhana
Undergraduate at University of Moratuwa
Sri Lanka
