Thanks Sven. The problem is that no legend appears to have been 
provided. I am looking to *automatically* produce a word-to-phoneme 
mapping or list (using a program), and it's hard to determine where one 
phoneme ends and the next starts. A character such as "a", for 
instance, might be used in more than one phoneme, e.g. "ax", "ae", so simple 
greedy matching won't work.
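
To make the ambiguity concrete, here is a minimal Python sketch. The symbol 
inventory is made up for illustration (it is not the dictionary's actual 
symbol set), and the function names are hypothetical. It contrasts 
longest-match-first greedy splitting with a backtracking search that returns 
every valid split.

```python
from functools import lru_cache

# Hypothetical inventory, for illustration only; the real symbol set
# used by the dictionary is not known here.
INVENTORY = ("a", "ax", "ae", "e", "xt")


def greedy(text, inventory=INVENTORY):
    """Longest-match-first split; returns None if it gets stuck."""
    by_length = sorted(set(inventory), key=len, reverse=True)
    out, pos = [], 0
    while pos < len(text):
        for sym in by_length:
            if text.startswith(sym, pos):
                out.append(sym)
                pos += len(sym)
                break
        else:
            # No symbol matches at this position: greedy cannot recover.
            return None
    return out


def segmentations(text, inventory=INVENTORY):
    """Return every way to split text into symbols from the inventory."""
    symbols = sorted(set(inventory))

    @lru_cache(maxsize=None)
    def split_from(start):
        if start == len(text):
            return [[]]  # exactly one way to split the empty remainder
        results = []
        for sym in symbols:
            if text.startswith(sym, start):
                for rest in split_from(start + len(sym)):
                    results.append([sym] + rest)
        return results

    return split_from(0)


print(greedy("axt"))         # greedy takes "ax" first and then gets stuck
print(segmentations("axt"))  # backtracking still finds a valid split
print(segmentations("ae"))   # more than one valid split: truly ambiguous
```

When the search returns more than one split, the string is ambiguous under 
that inventory and something extra (a legend, or delimiters in the source) 
is needed to choose between them.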

Now if only I could get my hands on the Abbyy Fine Reader project file ... 
I'd start by representing each phoneme with a unique character and go from 
there.

On Wednesday, January 16, 2013 3:20:04 PM UTC, sventech wrote:
>
> That particular dictionary has already been OCRed with Abbyy Fine Reader:
>
> http://archive.org/stream/everymansenglish00jone/everymansenglish00jone_djvu.txt
>
> Although not perfect, a little cleanup would render that text quite usable.
> --Sven
>
>
> On Wed, Jan 16, 2013 at 8:44 AM, Sven Pedersen <[email protected]> wrote:
>
>> You would need to train tesseract to recognize those symbols. The web 
>> page outlines how to do that.
>> --Sven
>>
>>
>> On Tue, Jan 15, 2013 at 6:43 PM, <[email protected]> wrote:
>>
>>> Is Tesseract-OCR capable of recognizing phonetic symbols? I would like 
>>> to extract the phonetic transcriptions of the following (out of copyright) 
>>> document
>>> http://archive.org/stream/everymansenglish00jone#page/2/mode/2up
>>>
>>> Regards,
>>>
>>> - Olumide
>>>
>>>  -- 
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>
>>
>>
>>
>> -- 
>> ``All that is gold does not glitter,
>>   not all those who wander are lost;
>> the old that is strong does not wither,
>>   deep roots are not reached by the frost.
>> From the ashes a fire shall be woken,
>>   a light from the shadows shall spring;
>> renewed shall be blade that was broken,
>>   the crownless again shall be king.” 
>>
>
