Still not working. I tried attaching the config, but it won't let me 
because it's binary.

I made a workaround by converting all instances of the ﬁ ligature (U+FB01) 
into plain fi in the output, but obviously it would be better to strip the 
Unicode ligatures in tesseract first.
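
For reference, a minimal sketch of that post-processing workaround, assuming 
the OCR output is already available as a Python string. The function name 
expand_ligatures and the mapping table are illustrative only, not something 
tesseract provides:

```python
# Hypothetical helper: expand Latin f-ligatures in OCR output to plain letters.
# The table covers U+FB00..U+FB04 from the Alphabetic Presentation Forms block.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}

# str.maketrans accepts a dict that maps single characters to replacement strings.
_TABLE = str.maketrans(LIGATURES)


def expand_ligatures(text):
    """Return text with each listed ligature replaced by its plain letters."""
    return text.translate(_TABLE)
```

unicodedata.normalize("NFKC", text) would also expand these ligatures, but it 
rewrites many other compatibility characters as well, so an explicit table 
keeps the change narrower.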

On a related note, why is tesseract even generating these characters in the 
first place given the fact that I chose English as the training data?

On Monday, April 29, 2013 9:21:16 AM UTC-4, klo wrote:
>
> Michael,
>
> for example add this line in your config file:
>
> tessedit_char_blacklist    ﬁﬂ
>
> I don't know how gmail will represent these characters, but make sure the 
> file is in UTF-8, I guess.
>
>
> On Mon, Apr 29, 2013 at 9:45 AM, Michael Sander <[email protected]> wrote:
>
>> How did you format your config file? I tried adding the following line 
>> and it doesn't seem to work:
>>
>> tessedit_char_blacklist ﬁ
>>
>>
>> On Sunday, April 1, 2012 5:16:59 AM UTC-4, klo wrote:
>>>
>>> Thanks. I added it to my tesseract configuration file and it works great
>>>
>>> Cheers
>>>
>>>
>>> On Saturday, March 31, 2012 10:12:50 PM UTC+2, zdpo wrote:
>>>>
>>>>  
>>>> On 31.03.2012 16:17, klo wrote:
>>>>
>>>> In my simple testing, I find this the most common problem. Is there a 
>>>> way to instruct tesseract not to use those glyphs without limiting it 
>>>> to ASCII?
>>>>
>>>> I use tesseract 3.01 BTW
>>>>
>>>>
>>>>  Put them on the blacklist with the variable tessedit_char_blacklist 
>>>> (search the forum if you do not know how).
>>>>
>>>> Zdenko
>>>>
>>>>   -- 
>> -- 
>> You received this message because you are subscribed to the Google
>> Groups "tesseract-ocr" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en
>>  
>> --- 
>> You received this message because you are subscribed to a topic in the 
>> Google Groups "tesseract-ocr" group.
>> To unsubscribe from this topic, visit 
>> https://groups.google.com/d/topic/tesseract-ocr/jO_4ZMMK9xw/unsubscribe?hl=en
>> .
>> To unsubscribe from this group and all its topics, send an email to 
>> [email protected].
>> For more options, visit https://groups.google.com/groups/opt_out.
>>  
>>  
>>
>
>
