Hi all,

Jonathan kindly generated some basic Norwegian voice files for eSpeak so I could start testing and giving feedback. He and I have exchanged a few emails about these files, but I'll take it to the open list now so that others can follow the process of figuring out how to do this.

While working on the Norwegian voice I'm also trying to work out a more streamlined procedure, so that it will be easier for others to contribute to other languages later.

First, I think it's good to have a standard reference text for each language. I've selected the Wikipedia article on language (http://en.wikipedia.org/wiki/Language), which is itself available in many languages. It's far from identical across languages, but it only needs to be an internal reference for each language. Ideally, once a text file is made from that page it should be frozen so that it is a reliable reference for discussion.

I first cleaned up the text a bit, removing some mark-up, the table of contents and bullet points (it even helps to check the spelling in the original text ...). I then split the article up into individual files, one per paragraph, and stored them in my home directory in the following structure: ~/espeak/text/no/no-lang01.txt, -02, etc.

I then generated .wav files of each text file using the initial voice files. I first made files at the standard speed of -s160, but found that slower files were easier to analyse and settled on -s100. It may be different for other languages or listeners though (and most people will set a much higher speed when actually using the voices). While I was at it I did the same for Spanish, Polish and Swedish (it's good to have some competition among neighbours!). The .wav files are rather large, so I ran 'oggenc *' to compress them to ogg.

I should also say a few words about setting up eSpeak at this point. I downloaded the latest version, which Jonathan provided here: http://espeak.sourceforge.net/test/espeak-1.17k.zip

eSpeak was already installed on my system and I didn't want to play too much with that, so I just put in a quick hack to use the old application with the new data files.

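For anyone who wants to reproduce the splitting and .wav generation steps described above, here is a rough sketch in Python. It is not the script from my tarball, just an illustration. The function names are made up for this example, the voice name "no" is an assumption about how the Norwegian voice is registered, and it assumes the cleaned-up article separates paragraphs with blank lines. The espeak options used are -v (voice), -s (speed), -f (input text file) and -w (output .wav file).

```python
import re
import subprocess
from pathlib import Path


def split_paragraphs(text):
    """Split cleaned article text into paragraphs on blank lines."""
    parts = re.split(r"\n\s*\n", text)
    return [p.strip() for p in parts if p.strip()]


def write_paragraph_files(text, out_dir, lang):
    """Write each paragraph to <lang>-langNN.txt and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, para in enumerate(split_paragraphs(text), start=1):
        path = out_dir / f"{lang}-lang{i:02d}.txt"
        path.write_text(para + "\n", encoding="utf-8")
        paths.append(path)
    return paths


def espeak_command(txt_path, voice, speed=100):
    """Build the espeak argument list that renders one text file to .wav."""
    wav_path = Path(txt_path).with_suffix(".wav")
    return ["espeak", "-v", voice, "-s", str(speed),
            "-f", str(txt_path), "-w", str(wav_path)]


def generate_wavs(paths, voice, speed=100):
    """Run espeak once per paragraph file (requires espeak on the PATH)."""
    for path in paths:
        subprocess.run(espeak_command(path, voice, speed), check=True)
```

Typical use would be to read the frozen reference text, call write_paragraph_files(text, "text/no", "no"), pass the returned paths to generate_wavs(paths, "no"), and then run oggenc on the resulting .wav files as before.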
I unzipped the new espeak in my home directory and placed the data files in ~/espeak/. Then, in /usr/share, I did:

    sudo mv espeak-data/ espeak-data-orig
    sudo ln -s /home/henrik/espeak/espeak-data/ espeak-data

I'm sure someone can come up with a better way to do this :)

So, after generating the .ogg files it's time to start debugging them. I found pre-generated sound files quite handy because you can pause and rewind (unfortunately seeking is degraded by the ogg compression step). We could also get native speakers who are not yet using Linux to listen to the files and report back.

Which raises the next question: what is the most useful form for me to provide feedback in? I've made some comments on individual words below, mostly about vowel sounds, but I suspect more informed comments about the phonemes might be better. I guess having the native listener tweak the language files directly would be ideal, but I'll need to grok more of the eSpeak toolchain to do that.

I've tarred up the directory I was working in and uploaded it here: http://people.ubuntu.com/~henrik/espeak/espeak-files-heno.tar.gz

It doesn't include the ogg files, which I've placed separately here: http://people.ubuntu.com/~henrik/espeak/ogg/

There is also a simple python script in there to help with the .wav generation, though it could be much improved.

The results from my first listening test:

--------------
Språk (ubestemt) betegner menneskenes[1] måter[2] å kommunisere[3] på. Bevisst[4] kommunikasjon skjer[5] ved hjelp av lydspråk[6], tegnspråk og skriftspråk, ubevisst kommunikasjon for eksempel ved kroppsspråk. Språkvitenskap[7] betegnes som lingvistikk[8].

[1] The 3rd 'e' is too long.
[2] The 'r' needs to be more pronounced.
[3] The last 'e' has the wrong tone/flavour. It sounds like an 'æ', but should be like 'Long E' on [*].
[4] The 'e' is too long/has too much emphasis, and the 'i' should be very short (double consonant rule).
[5] Needs a longer 'e', more like 'Long E' on [*].
[6] The 'y' sounds like the 'ee' in Leeds, but should be like 'Long Y' on [*].
[7] The 'å' needs to be longer, like 'Long Å' on [*].
[8] Needs a shorter 'i'.

[*] http://frodo.bruderhof.com/norskklassen/sounds-g.htm
-------------

Please try this approach if there is some basic support for your native language in espeak, so we can streamline the process further.

Thanks!
Henrik

--
Ubuntu-accessibility mailing list
https://lists.ubuntu.com/mailman/listinfo/ubuntu-accessibility
