Re: [Mscore-developer] (GSOC 2016) Regarding the Virtual Singer project idea...

David Cuny Sat, 19 Mar 2016 10:49:53 -0700

Non-developer jumping in again.

*Sinsy* supports English, and can be accessed via a web service. Send a
MusicXML file in, and get a .wav file back. For implementation details, see:


https://pypi.python.org/pypi/sinsy-cli/

Since Sinsy says it works well with MuseScore's MusicXML (it says so on the
Sinsy page), this is probably the simplest approach. Of course, it requires
an internet connection.
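
Before sending a score to any MusicXML-based singing service, it may help to check that every sung note actually carries a lyric. The following is only a rough sketch of that check, not part of Sinsy or sinsy-cli; the function name lyric_coverage and the SAMPLE score are made up for illustration, and it assumes the usual MusicXML layout of <lyric><text> inside <note>.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written MusicXML fragment (illustrative, not a MuseScore export).
SAMPLE = """<score-partwise version="3.0">
  <part id="P1">
    <measure number="1">
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>1</duration>
        <lyric><syllabic>begin</syllabic><text>Twin</text></lyric>
      </note>
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>1</duration>
        <lyric><syllabic>end</syllabic><text>kle</text></lyric>
      </note>
      <note>
        <rest/>
        <duration>2</duration>
      </note>
    </measure>
  </part>
</score-partwise>"""


def lyric_coverage(musicxml_text):
    """Return (syllables, missing): lyric texts in score order, and the
    number of pitched notes that have no lyric text at all."""
    root = ET.fromstring(musicxml_text)
    syllables = []
    missing = 0
    for note in root.iter("note"):
        if note.find("rest") is not None:
            continue  # rests are not sung, so they need no lyric
        text = note.findtext("lyric/text")
        if text:
            syllables.append(text)
        else:
            missing += 1
    return syllables, missing


if __name__ == "__main__":
    print(lyric_coverage(SAMPLE))
```

A non-zero "missing" count would suggest the score is not ready to be sung, whichever synthesizer ends up being used.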

Here's a demo I recently did using the Sinsy English male voice:

https://soundcloud.com/dcuny/twinkle-twinkle-little-star-sinsy

Clearly, it would have been nice had they used a native English speaker -
all the Sinsy English singers have heavy accents.


Many of the other "traditional" voice synthesis projects that support
singing seem to be suffering from bitrot.

There's a lot of work done to create a Vocaloid clone, and UTAU is pretty
much the leader there. One of the biggest problems is that good English
vocal synthesis requires a large database of recorded transitions, and a
lot of manual effort. I don't think you'd want to expose that sort of
editing complexity in MuseScore.


I've been working on a singing program that uses formant synthesis, but it
hasn't been released because it's not nearly as natural as the HMM
approach, and doesn't handle rapid articulations well:

https://soundcloud.com/dcuny/twinkle-twinkle-little-star-12
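
For readers unfamiliar with formant synthesis, the general idea can be sketched in a few lines. This is not the program mentioned above, just a minimal textbook-style illustration: a pulse train at the sung pitch is passed through a cascade of two-pole resonators, one per formant. The function names and the formant and bandwidth values for "ah" are rough assumptions chosen for illustration.

```python
import math

# Approximate (frequency Hz, bandwidth Hz) pairs for an "ah" vowel.
# Illustrative values only, not taken from any particular program.
AH_FORMANTS = [(730.0, 90.0), (1090.0, 110.0), (2440.0, 170.0)]


def pulse_train(f0_hz, duration_s, sample_rate):
    """Glottal source stand-in: one unit impulse per pitch period."""
    n = int(duration_s * sample_rate)
    out = [0.0] * n
    period = sample_rate / f0_hz
    t = 0.0
    while t < n:
        out[int(t)] = 1.0
        t += period
    return out


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    b = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    c = -r * r
    a = 1.0 - b - c  # normalizes the gain to 1 at 0 Hz
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def vowel(f0_hz, formants, duration_s=0.5, sample_rate=16000):
    """Run the pulse train through each formant resonator in series."""
    samples = pulse_train(f0_hz, duration_s, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        samples = resonator(samples, freq_hz, bandwidth_hz, sample_rate)
    return samples


if __name__ == "__main__":
    samples = vowel(110.0, AH_FORMANTS)
    print(len(samples), "samples")
```

A steady vowel like this is the easy case. Consonants and fast transitions need the formant targets to move over time, which is where rapid articulations become difficult.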

-- David


On Sat, Mar 19, 2016 at 12:51 AM, syrma <[email protected]> wrote:

> Hello!
>
> I have been researching the possibility of using a Virtual Singer for
> MuseScore.
>
> I downloaded and compiled some of the following software from source (and
> tested the rest directly from installed packages). I will go through all
> the software I have looked at or tested before talking about the ones I
> consider promising. As I lack the experience and insight to give a
> definitive judgement, I would be grateful for any input.
>
> - E-Cantorix (https://github.com/divVerent/ecantorix):
>
> A Perl singing synthesizer built on espeak. Unfortunately, judging by its
> headache-inducing robotic voice, it doesn't look like something that can be
> used directly. There could be some good ideas to take from it, although I
> have nothing specific in mind yet.
>
> - Festival Speech Synthesis System's singing mode
> (http://www.festvox.org/festival/ ):
>
> Festival's singing mode came across as much more usable than e-cantorix
> (in my own experience, which might not be representative), although the
> output still lacks quality. The input for this mode is a special XML file
> that specifies the notes and their durations for each word (Festival being
> first and foremost a speech synthesis system).
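
To make the input format concrete, here is a hedged sketch that writes such a file from a list of (word, note, beats) tuples. The element and attribute names (SINGING, BPM, PITCH, NOTE, DURATION, BEATS) and the DOCTYPE line are recalled from the example songs bundled with Festival and should be checked against the installed version; the function name festival_song is made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def festival_song(notes, bpm=60):
    """Build singing-mode XML from (word, note_name, beats) tuples.

    Assumption: the tag names below follow Festival's bundled example
    songs; verify them against your Festival installation.
    """
    lines = [
        '<?xml version="1.0"?>',
        '<!DOCTYPE SINGING PUBLIC "-//SINGING//DTD SINGING mark up//EN" '
        '"Singing.v0_1.dtd" []>',
        "<SINGING BPM=%s>" % quoteattr(str(bpm)),
    ]
    for word, note_name, beats in notes:
        lines.append(
            "<PITCH NOTE=%s><DURATION BEATS=%s>%s</DURATION></PITCH>"
            % (quoteattr(note_name), quoteattr(str(beats)), escape(word))
        )
    lines.append("</SINGING>")
    return "\n".join(lines)


if __name__ == "__main__":
    song = [("twinkle", "C4", 1.0), ("twinkle", "G4", 1.0),
            ("little", "A4", 1.0), ("star", "G4", 2.0)]
    print(festival_song(song, bpm=90))
```

Note that each entry is a whole word on a single note. Splitting a word across several notes is exactly the kind of thing a speech-first system makes awkward.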
>
> As for the singing mode output, aside from the robotic voice (which is
> still far more decent than e-cantorix's), I have to say it sounded pretty
> random. The British voice would pronounce some words faster than the
> American one, completely messing up the rhythm, and sometimes the tone is
> off. The main cause seems to be the difference between how we speak and how
> we sing, which can have a huge effect.
>
> - Sinsy (http://sinsy.sourceforge.net/):
>
> Aside from the pretty impressive (non-open-source) version presented on
> their website (singing synthesis from a MusicXML file in Japanese (3
> voices), dubious English (2 voices), and Chinese (1 voice)), the open
> source version only supports Japanese, and only one voice is available
> (which is clearly of lesser quality than the ones on the website). It uses
> the hts_engine API (http://hts-engine.sourceforge.net/).
>
> Pros:
> - Quite easy to use; it compiled and ran with only minor trouble.
> - Supports Japanese well.
> - It is straightforward to get results, as it directly converts from
> MusicXML files (as generated from MuseScore) to audio.
> - The free voice can sound pretty decent.
>
> Cons:
> - Depending on what kind of project is preferred, integration into Mscore
> could be a problem. The software takes a descriptive file and a voice and
> converts them into audio. That could be fine for an external tool, but I
> am not sure how the audio could be used for real-time playback inside
> MuseScore.
> - Only supports Japanese. (there might be a possibility to add other
> languages through espeak)
> - Has only one voice available. (aside from the fact that it is for
> Japanese, the lack of choice might be hindering)
> - The free voice sounds horrible with long notes. (Really.)
>
> - World (https://github.com/mmorise/World):
>
> World is an open source speech synthesis system, although very unlike
> anything else I've looked at: it can both analyse and synthesize voice.
> I must admit that the result is impressive and very natural sounding, or at
> least far from robot-like (even if we play with unrealistic parameters).
> However, it has no notion of language, so something needs to be built on
> top of it. (vConnect-STAND is a possible option. It is built upon World and
> sounds nice according to YouTube demos, but I haven't tried it yet. The
> documentation I've come across is in Japanese, so I am slowly going through
> it.)
>
> Pros:
> - Very good results.
> - Can be used in real time; it might be possible to integrate it into
> Mscore.
>
> Cons:
> - Very low level.
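
As one small example of what "something built on top" would have to provide, a vocoder of this kind works on a per-frame pitch (F0) track rather than on notes. The sketch below turns a note list into such a track. It does not call World itself; the function names are hypothetical, and the 5 ms frame period and the use of 0 Hz for unvoiced frames are assumptions to check against World's documentation.

```python
# Semitone offsets of the natural notes within an octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def midi_number(step, octave, alter=0):
    """MIDI note number, with C4 = 60 and A4 = 69."""
    return 12 * (octave + 1) + NOTE_OFFSETS[step] + alter


def f0_hz(midi, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number."""
    return a4_hz * 2.0 ** ((midi - 69) / 12.0)


def f0_contour(notes, bpm=120, frame_period_ms=5.0):
    """Flat per-frame F0 track from (step, octave, beats) tuples.

    A step of None marks a rest, written here as 0.0 Hz (assumed to
    mean "unvoiced" for the vocoder).
    """
    ms_per_beat = 60000.0 / bpm
    frames = []
    for step, octave, beats in notes:
        count = int(round(beats * ms_per_beat / frame_period_ms))
        hz = 0.0 if step is None else f0_hz(midi_number(step, octave))
        frames.extend([hz] * count)
    return frames


if __name__ == "__main__":
    track = f0_contour([("C", 4, 1), ("G", 4, 1), (None, 0, 1)])
    print(len(track), track[0], track[-1])
```

A perfectly flat contour like this would sound mechanical. Vibrato, pitch glides between notes, and of course the phonemes themselves all still have to come from the layer above the vocoder.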
>
> - QTau (https://notabug.org/isengaara/qtau) and Cadencii
> (https://github.com/cadencii/cadencii-nt):
>
> Two free software editors written in C++ and Qt. Although neither of them
> is a voice synthesis technology, they both make use of vConnect-STAND (in
> addition to e-cantorix for QTau, and UTAU + Vocaloid for Cadencii). I think
> the way they do things may be interesting, but I have yet to study them in
> depth. I would like to do so after figuring vConnect-STAND out.
>
> The ideas page stated that an external tool would be good to practice along
> with, but I am not sure what kind of project would be best to consider.
> Depending on this, some tools may or may not be suitable, so I would really
> like to discuss this project idea.
> I would greatly appreciate any kind of input or guidance. Please let me know
> what I am missing, whether I have overlooked an interesting possibility, or
> whether I should keep going on this path.
>
> Thank you!
>
>
>
> --
> View this message in context:
> http://dev-list.musescore.org/GSOC-2016-Regarding-the-Virtual-Singer-project-idea-tp7579698.html
> Sent from the MuseScore Developer mailing list archive at Nabble.com.
>
>
> ------------------------------------------------------------------------------
> Transform Data into Opportunity.
> Accelerate data analysis in your applications with
> Intel Data Analytics Acceleration Library.
> Click to learn more.
> http://pubads.g.doubleclick.net/gampad/clk?id=278785231&iu=/4140
> _______________________________________________
> Mscore-developer mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/mscore-developer
>

