Yes, I'm doing something similar in Python. Do you know of a list of
ligatures so I can convert them to ASCII? I know fi and fl are the most
common, but there are probably many more.
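
For reference, a minimal sketch of what such a conversion might look like in Python. The table covers only the seven Latin ligatures in the Unicode "Alphabetic Presentation Forms" block (U+FB00 to U+FB06); whether tesseract emits others is not something this thread establishes. The function names expand_ligatures and expand_compat are hypothetical, chosen only for illustration.

```python
import unicodedata

# Latin ligatures from the Unicode "Alphabetic Presentation Forms" block.
# Assumption: these are the ligatures of interest; extend the table as needed.
LIGATURES = {
    "\uFB00": "ff",
    "\uFB01": "fi",
    "\uFB02": "fl",
    "\uFB03": "ffi",
    "\uFB04": "ffl",
    "\uFB05": "st",  # ligature of long s and t
    "\uFB06": "st",
}
_TABLE = str.maketrans(LIGATURES)


def expand_ligatures(text):
    """Replace only the ligatures listed in LIGATURES; leave everything else alone."""
    return text.translate(_TABLE)


def expand_compat(text):
    """Broader alternative: Unicode compatibility normalization (NFKC).

    This also rewrites other compatibility characters (superscripts,
    full-width forms, and so on), so it changes more than ligatures.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    sample = "\uFB01nd the \uFB02oor of the o\uFB03ce"
    print(expand_ligatures(sample))
```

The explicit table is the more conservative choice for post-processing OCR output, since it touches nothing except the listed code points. The NFKC variant needs no table but may alter characters you wanted to keep.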


Michael Sander
[email protected]
607-227-9859


On Mon, Apr 29, 2013 at 7:48 PM, Greg Dunkel <[email protected]> wrote:

> I couldn't get the config to work on Ubuntu so I wrote a post-processing
> sed script to convert the ligatures to two characters.
>
>
> On Mon, Apr 29, 2013 at 3:45 AM, Michael Sander <[email protected]> wrote:
>
>> How did you format your config file? I tried adding the following line
>> and it doesn't seem to work:
>>
>> tessedit_char_blacklist fi
>>
>>
>> On Sunday, April 1, 2012 5:16:59 AM UTC-4, klo wrote:
>>>
>>> Thanks. I added it to my tesseract configuration file and it works great
>>>
>>> Cheers
>>>
>>>
>>> On Saturday, March 31, 2012 10:12:50 PM UTC+2, zdpo wrote:
>>>>
>>>>
>>>> On 31.03.2012 16:17, klo wrote:
>>>>
>>>> In my simple testing, this is the most common problem I find. Is there a way to
>>>> instruct tesseract not to use those glyphs without limiting it to ASCII?
>>>>
>>>> I use tesseract 3.01 BTW
>>>>
>>>>
>>>>  Put them on the blacklist with the variable tessedit_char_blacklist (search
>>>> the forum if you do not know how).
>>>>
>>>> Zdenko
>>>>
>>>>   --
>> --
>> You received this message because you are subscribed to the Google
>> Groups "tesseract-ocr" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>>
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>>
>>
>
>
>
> --
> /greg
>


