I need to determine syllable stress for the top 60,000+ lemmas in a 
14-billion-word web-based corpus that I'm creating. This will give users 
another way to search the corpus, in addition to word, lemma, PoS, synonyms, 
customized wordlists, etc.

-------

Using the Carnegie Mellon Pronouncing Dictionary and 3-4 online dictionaries, 
I'm able to get the data for about 47,000 of these 60,000 lemmas, e.g.

http://www.speech.cs.cmu.edu/cgi-bin/cmudict?in=mechanical&stress=-s
M AH0 K AE1 N IH0 K AH0 L

https://www.merriam-webster.com/dictionary/mechanical
mi-'ka-ni-kəl
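
For illustration, the stress digits in a CMU-style transcription like the one above can be read off mechanically: each vowel phoneme ends in 0 (no stress), 1 (primary stress), or 2 (secondary stress). Below is a minimal sketch in Python, assuming the pronunciation is already available as a space-separated string; the function names are hypothetical and not part of any existing tool.

```python
def stress_pattern(pron):
    """Return one stress digit per syllable from a CMUdict-style string."""
    # Only vowel phonemes carry a trailing digit, so each digit marks
    # one syllable: 0 = unstressed, 1 = primary, 2 = secondary.
    return [int(p[-1]) for p in pron.split() if p[-1].isdigit()]


def primary_stress_syllable(pron):
    """Return the 1-based index of the primary-stressed syllable, or None."""
    pattern = stress_pattern(pron)
    return pattern.index(1) + 1 if 1 in pattern else None


if __name__ == "__main__":
    mechanical = "M AH0 K AE1 N IH0 K AH0 L"
    print(stress_pattern(mechanical))
    print(primary_stress_syllable(mechanical))
```

For "mechanical" this gives four syllables with primary stress on the second, which agrees with the Merriam-Webster entry.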

But this still leaves about 13,000 (mostly lower-frequency) lemmas with no 
information on word stress. I suppose I could go through these one by one and 
indicate stress myself, but I'm wondering if anyone is aware of another tool 
that could do this.

(BTW, I've also tried http://www.speech.cs.cmu.edu/tools/lextool.html, but it 
doesn't show syllable stress).

Thanks in advance.

============================================
Mark Davies
Professor of Linguistics / Brigham Young University
http://davies-linguistics.byu.edu/

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================
