Is this in the general case, or only for specific kinds of speech? For example, it should be possible to build an HMM that splits medical jargon into syllables, based on the work done on splitting Simplified Chinese text. The average Simplified Chinese "word" is 1.5 ideograms, and you need a well-trained HMM (or something similar) to split Simplified Chinese well. The language is very context-specific, with both prefixes and suffixes that alter the meaning of "interior" words.
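For illustration, here is a minimal sketch of the Viterbi decode behind such an HMM, tagging each character as B/M/E/S (begin/middle/end/single unit), the usual formulation for Chinese segmenters. The transition and emission numbers are made-up placeholders, not a trained model; a real segmenter would estimate them from an annotated corpus, and the "hemoglobin" example is just a stand-in for medical jargon.

public class BmesViterbi {

    static final String[] STATES = {"B", "M", "E", "S"};

    // trans[i][j]: log P(next state j | current state i). Illustrative values
    // only; impossible B/M/E/S transitions are hard-blocked with -infinity.
    static final double[][] TRANS = {
        // to:  B                          M              E              S
        {Double.NEGATIVE_INFINITY, Math.log(0.4), Math.log(0.6), Double.NEGATIVE_INFINITY}, // from B
        {Double.NEGATIVE_INFINITY, Math.log(0.3), Math.log(0.7), Double.NEGATIVE_INFINITY}, // from M
        {Math.log(0.6), Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY, Math.log(0.4)}, // from E
        {Math.log(0.5), Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY, Math.log(0.5)}  // from S
    };

    // Emission stub: log P(char | state). A trained model would look the
    // character up in per-state frequency tables; here every character is
    // equally likely, so the transition structure alone drives the decode.
    static double emit(int state, char c) {
        return Math.log(0.25);
    }

    // Returns the most likely B/M/E/S tag sequence for the input characters.
    static String[] decode(String text) {
        int n = text.length(), k = STATES.length;
        double[][] score = new double[n][k];
        int[][] back = new int[n][k];

        for (int s = 0; s < k; s++) {
            // A unit can only start with B or S.
            double start = (s == 0 || s == 3) ? Math.log(0.5) : Double.NEGATIVE_INFINITY;
            score[0][s] = start + emit(s, text.charAt(0));
        }
        for (int t = 1; t < n; t++) {
            for (int s = 0; s < k; s++) {
                double best = Double.NEGATIVE_INFINITY;
                int arg = 0;
                for (int p = 0; p < k; p++) {
                    double cand = score[t - 1][p] + TRANS[p][s];
                    if (cand > best) { best = cand; arg = p; }
                }
                score[t][s] = best + emit(s, text.charAt(t));
                back[t][s] = arg;
            }
        }
        // A valid segmentation must end in E or S; trace the best path back.
        int cur = (score[n - 1][3] > score[n - 1][2]) ? 3 : 2;
        String[] tags = new String[n];
        for (int t = n - 1; t >= 0; t--) {
            tags[t] = STATES[cur];
            cur = back[t][cur];
        }
        return tags;
    }

    public static void main(String[] args) {
        String input = "hemoglobin";
        String[] tags = decode(input);
        for (int i = 0; i < input.length(); i++) {
            System.out.println(input.charAt(i) + "\t" + tags[i]);
        }
    }
}

With real per-state character statistics (or, for syllables, per-state letter/phoneme statistics trained on something like CELEX), the same decode gives you boundaries wherever an E or S tag is followed by a B or S tag.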
On Mon, Jul 9, 2012 at 4:39 PM, John Stewart <[email protected]> wrote:
> That's right, better to use a lexical database. CELEX2, available fairly
> inexpensively from the Linguistic Data Consortium, has syllable
> boundaries in its phonological representations.
>
> http://www.ldc.upenn.edu/Catalog/readme_files/celex.readme.html#overview
>
> jds
>
> On Mon, Jul 9, 2012 at 6:37 PM, James Kosin <[email protected]> wrote:
>> Adam,
>>
>> Sorry, OpenNLP doesn't detect syllables. What you probably need is more
>> of a dictionary with pronunciation syllables.
>> It could maybe be trained to do it, but that would be very language specific
>> and not very useful. The dictionary approach would be best, though
>> OpenNLP could help parse the words/tokens for you to use in the dictionary.
>>
>> James
>>
>> On 7/9/2012 5:26 PM, Adam Goodkind wrote:
>>> Hi all,
>>>
>>> Does OpenNLP have the ability to detect syllables? If not, could you point
>>> me to a java toolkit that can do this?
>>>
>>> Thanks,
>>> Adam
>>>

--
Lance Norskog
[email protected]
