I'm currently working on a free software replacement for the nonfree MBROLA.
The hardest part of building a speech synthesis system is creating a voice library. I decided to use human speech recordings instead of formant synthesis.

For me it began in 2011, when I was looking for a singing synthesizer. I found many nonfree programs, such as Myriad Virtual Singer, OGI Flinger, Vocaloid and UTAU. As I was unable to find a free replacement, I decided to write one. In the meantime I found out that some plugins for UTAU are free software, but I still had to replace the nonfree GUI, which is also tied to Windows.

One existing GPLv3 UTAU plugin is v.Connect-STAND [1], which is based on WORLD [2]. v.Connect-STAND sounds more natural [3] than eCantorix [4], but it is limited to the Japanese language. I was able to compile it, but I do not know how to use it.

My free program will be based on WORLD, and it will allow speech/singing synthesis by Collaborative Creation. The algorithms used in WORLD are described in [5]. I chose a design that allows the program to be multilingual.


[1] http://hal-the-cat.music.coocan.jp/ritsu_e.html
[2] http://ml.cs.yamanashi.ac.jp/world/english/
[3] https://www.youtube.com/watch?v=to28rvoNYfY
[4] https://github.com/divVerent/ecantorix/wiki/Songs
[5] http://iwk.mdw.ac.at/lit_db_iwk/download.php?id=18114
