Dear Andras, I apologise for the long delay in responding to your email. It has been a very busy month.
I checked out the two links you provided and was intrigued to know how you would propose controlling the Kelly-Lochbaum tube model to produce connected speech, including consonants. I also wonder which/whose demonstrations of gnuspeech you have listened to, and what audio equipment you used for the reproduction. The fidelity of the gnuspeech TRM-based speech is completely vitiated by the typical computer sound system. This is the URL for a .wav file comparing male, female and child speech, and I wonder if you have heard it before:

http://pages.cpsc.ucalgary.ca/~hill/helloComparison/helloComparison.wav

Also, I wonder if you have read any of the background papers on gnuspeech, especially:

http://pages.cpsc.ucalgary.ca/~hill/papers/avios95/index.htm

or some of the papers documenting the work we did on rhythm and intonation, and the subjective testing that was part of that. The AVIOS 95 paper provides some of the background references we used.

A series of sixteen tube sections of equal length simply fails to provide the degree of control needed to emulate the human vocal tract speaking. In fact we use only eight regions, but there are ten underlying sections that allow a fair approximation to the unequal section lengths needed for the task. The length was restricted by the need to compute in real time. These days you can get a perfect representation of the 8 unequal-length sections needed to provide completely independent control of the human formants by using 32 sections and combining them appropriately to meet the boundaries determined by Fant and Pauli at KTH, Stockholm (FANT, G. & PAULI, S. (1974) Spatial characteristics of vocal tract resonance models. Proceedings of the Stockholm Speech Communication Seminar, KTH, Stockholm, Sweden).

There is a fair amount of stuff at:

http://pages.cpsc.ucalgary.ca/~hill/gnuspeech/gnuspeech-index.htm

and a more complete listing in section F.
of: http://pages.cpsc.ucalgary.ca/~hill/papers/index.htm

with an overview of gnuspeech at:

http://www.gnu.org/software/gnuspeech/

and a recent paper on the history of the work at:

http://pages.cpsc.ucalgary.ca/~hill/papers/creating-n-applying-rhythm-n-intonation.pdf

The Kelly and Lochbaum work is quite old, of course.

I hope all this helps. I'll be interested in your response.

All good wishes.

david

On Oct 9, 2012, at 9:19 AM, Andras Kadinger wrote:

> I listened to the gnuspeech demos a number of times over the years. I was
> always fascinated by the pleasant intonation and formant trajectories - but
> was quite put off by the unnatural, formant-synthesizer-like timbre.
>
> To rekindle the love, I hacked up a Kelly-Lochbaum TRM in Java:
> http://www.youtube.com/watch?v=tzAnkDki8SU
>
> The toy UI I have on top of it is the "Four Tube Vocal Tract Models of
> Vowels" from
> http://clas.mq.edu.au/acoustics/frequency/vocal_tract_resonance.html just to
> have a simple yet phonetically somewhat meaningful way of playing with it to
> get a rough idea of the vowel quality to be expected from such a model.
>
> Andras
>
> _______________________________________________
> gnuspeech-contact mailing list
> [email protected]
> https://lists.gnu.org/mailman/listinfo/gnuspeech-contact
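[Editor's note: the idea David describes above, of approximating eight unequal-length vocal-tract regions by grouping 32 equal-length Kelly-Lochbaum sections, can be sketched roughly as follows. This is an illustration only, not the gnuspeech code; the region lengths in the example and both function names are invented for the sketch. The reflection-coefficient formula is the standard Kelly-Lochbaum junction relation between adjacent tube cross-sectional areas.]

```python
def region_to_sections(region_lengths_cm, n_sections=32):
    """Distribute n_sections equal-length unit tubes over unequal-length
    regions, proportionally to each region's length, using
    largest-remainder rounding so the counts sum exactly to n_sections."""
    total = sum(region_lengths_cm)
    raw = [length / total * n_sections for length in region_lengths_cm]
    counts = [int(r) for r in raw]
    # Hand the leftover sections to the regions with the largest
    # fractional parts, so rounding error is spread fairly.
    by_fraction = sorted(range(len(raw)),
                         key=lambda i: raw[i] - counts[i], reverse=True)
    for i in by_fraction[:n_sections - sum(counts)]:
        counts[i] += 1
    return counts

def reflection_coefficients(areas):
    """Kelly-Lochbaum reflection coefficient at each junction between
    adjacent tube sections: k = (A_i - A_{i+1}) / (A_i + A_{i+1}).
    Equal areas give k = 0 (no reflection); a narrowing gives k > 0."""
    return [(a0 - a1) / (a0 + a1) for a0, a1 in zip(areas, areas[1:])]

# Example: eight made-up region lengths (cm) mapped onto 32 unit sections.
example_regions = [1.5, 2.0, 3.0, 1.0, 2.5, 2.0, 3.0, 2.0]
print(region_to_sections(example_regions))
```

Each unequal region then simply spans an integer run of identical unit sections, with area changes (and hence non-zero reflection coefficients) only at the region boundaries.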
