It would be for typed English.

On Tue, Jul 10, 2012 at 11:25 PM, Lance Norskog <[email protected]> wrote:
> Is this in the general case or for specific speech? For example, it
> should be possible to create an HMM that breaks medical jargon, based
> on work in splitting Simplified Chinese language text. The average
> Simplified Chinese "word" is 1.5 ideograms, and you need a
> well-trained HMM (or similar) to split Simplified Chinese well. The
> language is very context-specific with both prefixes and suffixes that
> alter the meaning of "interior" words.
>
> On Mon, Jul 9, 2012 at 4:39 PM, John Stewart <[email protected]> wrote:
> > That's right, better use a lexical database. CELEX2, available fairly
> > inexpensively from the Linguistic Data Consortium, has syllable
> > boundaries in its phonological representations.
> >
> > http://www.ldc.upenn.edu/Catalog/readme_files/celex.readme.html#overview
> >
> > jds
> >
> > On Mon, Jul 9, 2012 at 6:37 PM, James Kosin <[email protected]> wrote:
> >> Adam,
> >>
> >> Sorry, OpenNLP doesn't detect syllables. What you probably need is more
> >> of a dictionary with pronunciation syllables.
> >> It could be trained to do it maybe; but, would be very language specific
> >> and not very useful. The dictionary approach would be best. Though
> >> OpenNLP could help parse the words/tokens for you to use in the dictionary.
> >>
> >> James
> >>
> >> On 7/9/2012 5:26 PM, Adam Goodkind wrote:
> >>> Hi all,
> >>>
> >>> Does OpenNLP have the ability to detect syllables? If not, could you point
> >>> me to a java toolkit that can do this?
> >>>
> >>> Thanks,
> >>> Adam
> >>>
> >>
> >
> --
> Lance Norskog
> [email protected]

--
*Adam Goodkind*
*w* adamgoodkind.com <http://www.adamgoodkind.com>
*t* @adamgreatkind <https://twitter.com/#%21/adamgreatkind>
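
For anyone finding this thread later, here is a minimal Java sketch of the dictionary approach James and John describe. Everything in it is an assumption made for illustration: the class name SyllableLexicon, the hyphen-delimited entry format ("syl-la-ble"), and the vowel-run fallback are hypothetical and are not part of OpenNLP or CELEX2. A real lexicon such as CELEX2 has its own file format that would need its own loader, and the whitespace split in main only stands in for a proper tokenizer.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: syllable lookup backed by a hand-loaded lexicon. */
class SyllableLexicon {
    // Lower-cased word -> its syllables, e.g. "syllable" -> [syl, la, ble].
    private final Map<String, List<String>> entries = new HashMap<>();

    /** Registers a word given as a hyphen-delimited syllabification (assumed format). */
    public void add(String hyphenated) {
        List<String> syllables =
                Arrays.asList(hyphenated.toLowerCase(Locale.ROOT).split("-"));
        entries.put(String.join("", syllables), syllables);
    }

    /** Returns the stored syllables, or empty if the word is not in the lexicon. */
    public Optional<List<String>> syllables(String word) {
        return Optional.ofNullable(entries.get(word.toLowerCase(Locale.ROOT)));
    }

    /** Crude fallback: counts runs of vowel letters. Unreliable for English spelling. */
    static int estimateCount(String word) {
        int count = 0;
        boolean inVowelRun = false;
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            boolean vowel = "aeiouy".indexOf(c) >= 0;
            if (vowel && !inVowelRun) {
                count++;
            }
            inVowelRun = vowel;
        }
        return Math.max(count, 1);
    }

    /** Uses the lexicon count when the word is known, the estimate otherwise. */
    public int count(String word) {
        return syllables(word).map(List::size).orElseGet(() -> estimateCount(word));
    }

    public static void main(String[] args) {
        SyllableLexicon lexicon = new SyllableLexicon();
        lexicon.add("syl-la-ble");
        lexicon.add("de-tect");
        // Whitespace splitting stands in for a real tokenizer here.
        for (String token : "Detect every syllable".split("\\s+")) {
            System.out.println(token + " -> " + lexicon.count(token));
        }
    }
}
```

The lexicon is consulted first because, as noted in the thread, a trained or rule-based splitter would be language specific; the vowel-run estimate is only a stopgap for words the lexicon does not cover.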
