You appear to be a fellow Ithacan! (I no longer live there, but remember it
fondly.)

Anyway, other common ligatures include ff, ffi, ffl, fb, fy, and ft. See:
http://ilovetypography.com/2007/09/09/decline-and-fall-of-the-ligature/
Sven
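
For the Python route Michael mentions, a minimal sketch of the conversion
might look like the following. It covers only the seven Latin ligatures that
Unicode encodes as single characters (U+FB00 through U+FB06); fb, fy, and ft
are font-level ligatures without code points of their own, so they are not in
the table. The names LIGATURES and expand_ligatures are made up for
illustration and are not part of Tesseract or any library.

```python
# Latin ligatures from the Unicode "Alphabetic Presentation Forms" block,
# mapped to plain ASCII letter sequences.
LIGATURES = {
    "\ufb00": "ff",   # LATIN SMALL LIGATURE FF
    "\ufb01": "fi",   # LATIN SMALL LIGATURE FI
    "\ufb02": "fl",   # LATIN SMALL LIGATURE FL
    "\ufb03": "ffi",  # LATIN SMALL LIGATURE FFI
    "\ufb04": "ffl",  # LATIN SMALL LIGATURE FFL
    "\ufb05": "st",   # LATIN SMALL LIGATURE LONG S T (long s flattened to "s")
    "\ufb06": "st",   # LATIN SMALL LIGATURE ST
}

# str.maketrans accepts a dict of single-character keys to replacement strings.
_TABLE = str.maketrans(LIGATURES)


def expand_ligatures(text):
    """Return text with each ligature character replaced by its letters."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(expand_ligatures("of\ufb01ce"))  # office
```

Running the OCR output through expand_ligatures leaves every other
character, including non-ASCII letters, untouched.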

On Monday, April 29, 2013, Michael Sander wrote:

> Yes, I'm doing something similar in Python. Do you know of a list of
> ligatures so I can convert them to ASCII? I know fi and fl are the most
> popular, but there are probably many more.
>
>
> Michael Sander
> [email protected]
> 607-227-9859
>
>
> On Mon, Apr 29, 2013 at 7:48 PM, Greg Dunkel <[email protected]> wrote:
>
>> I couldn't get the config to work on Ubuntu so I wrote a post-processing
>> sed script to convert the ligatures to two characters.
>>
>>
>> On Mon, Apr 29, 2013 at 3:45 AM, Michael Sander <[email protected]> wrote:
>>
>>> How did you format your config file? I tried adding the following line
>>> and it doesn't seem to work:
>>>
>>> tessedit_char_blacklist fi
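
A note on the line above: if it held the two ASCII letters f and i rather
than the single ligature character, it would presumably blacklist those
letters individually instead of the ligature. As a rough sketch, assuming the
blacklist needs the ligature code points themselves and that a Tesseract
config file takes one "variable value" pair per line, such a file could be
generated from Python. The helper names and the file name noligs are
hypothetical.

```python
from pathlib import Path

# The Unicode ligature characters U+FB00..U+FB06 (ff, fi, fl, ffi, ffl, long s-t, st).
LIGATURE_CHARS = "\ufb00\ufb01\ufb02\ufb03\ufb04\ufb05\ufb06"


def blacklist_config_line(chars=LIGATURE_CHARS):
    """Build one config line: the variable name, a space, then the characters."""
    return "tessedit_char_blacklist " + chars + "\n"


def write_blacklist_config(path, chars=LIGATURE_CHARS):
    """Write the config line as UTF-8 so the ligature characters survive."""
    Path(path).write_text(blacklist_config_line(chars), encoding="utf-8")


if __name__ == "__main__":
    # "noligs" is a hypothetical file name; where Tesseract looks for it
    # depends on the installation.
    write_blacklist_config("noligs")
```

Writing the file as UTF-8 matters here, since an editor or shell that
silently converts the ligatures to ASCII would defeat the purpose.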
>>>
>>>
>>> On Sunday, April 1, 2012 5:16:59 AM UTC-4, klo wrote:
>>>>
>>>> Thanks. I added it to my tesseract configuration file and it works great.
>>>>
>>>> Cheers
>>>>
>>>>
>>>> On Saturday, March 31, 2012 10:12:50 PM UTC+2, zdpo wrote:
>>>>>
>>>>>
>>>>> On 31.03.2012 16:17, klo wrote:
>>>>>
>>>>> In my simple testing, I find this to be the most common problem. Is
>>>>> there a way to instruct tesseract not to use those glyphs without
>>>>> limiting it to ASCII?
>>>>>
>>>>> I use tesseract 3.01 BTW
>>>>>
>>>>>
>>>>> Put them on the blacklist with the variable tessedit_char_blacklist
>>>>> (search the forum if you do not know how).
>>>>>
>>>>> Zdenko
>>>>>
>>>>>   --
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>>
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>>
>>>
>>
>>
>>
>> --
>> /greg
>>
>


-- 
“All that is gold does not glitter,
  not all those who wander are lost;
the old that is strong does not wither,
  deep roots are not reached by the frost.
From the ashes a fire shall be woken,
  a light from the shadows shall spring;
renewed shall be blade that was broken,
  the crownless again shall be king.”


