[Sugar-devel] Gsoc proposal: Speech Synthesis
Hi, I have submitted a proposal for speech synthesis for GSoC 09. My proposal can be viewed at: http://wiki.sugarlabs.org/go/speech-synthesis

As a first phase of development, I have implemented speech and karaoke-style coloring of the text. A basic speech configuration manager has also been implemented to alter the volume, pitch and rate of the speech.

It would be great if you could test the activity. Please download speech-synthesis.zip from: http://code.google.com/p/speech-synthesis/downloads/list I have also included detailed documentation of the activity. It would be great if you could send me some feedback so that I can improve this activity.

Regards
--
Chirag Jain
Undergraduate Student
Netaji Subash Institute of Technology
New Delhi
___
Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
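For readers who want to experiment with the three settings mentioned above, here is a minimal sketch of how volume, pitch and rate could be passed to the espeak command-line tool. It is not taken from speech-synthesis.zip; the function names and the illustrative default values are assumptions. The flags themselves (-a amplitude, -p pitch, -s speed in words per minute, -v voice) are standard espeak options.

```python
import shutil
import subprocess


def build_espeak_command(text, volume=100, pitch=50, rate=170, voice="en"):
    """Return an argv list for the espeak command-line tool.

    -a is amplitude (0-200), -p is pitch (0-99), -s is speed in words
    per minute, and -v selects the voice. The default values here are
    illustrative, not necessarily espeak's own defaults.
    """
    if not 0 <= volume <= 200:
        raise ValueError("volume must be between 0 and 200")
    if not 0 <= pitch <= 99:
        raise ValueError("pitch must be between 0 and 99")
    if rate <= 0:
        raise ValueError("rate must be positive")
    return ["espeak", "-a", str(volume), "-p", str(pitch),
            "-s", str(rate), "-v", voice, text]


def speak(text, **settings):
    """Speak text if espeak is installed; return True when it was run."""
    if shutil.which("espeak") is None:
        return False
    subprocess.run(build_espeak_command(text, **settings), check=True)
    return True
```

Calling speak("hello", pitch=70, rate=120) would then speak more slowly and at a higher pitch, provided espeak is on the PATH.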
Re: [Sugar-devel] Gsoc proposal: Speech Synthesis
Chirag,

I won't be able to try out your code for a while, but I did look at it and noticed that, while you refer to it as an activity, it is not in fact packaged as an Activity. Even if you intend for this code to wind up being part of Sugar itself, there is no reason you couldn't make it an Activity now, and there would be advantages to doing that. For one thing, it would be easier to try out. The easier something is to test, the more testing is done, and the better the quality of that testing. Plus, the Activity could be used later by those unwilling to update their XOs to the latest Sugar. Other than creating an SVG icon with Inkscape, it wouldn't take much work to make this a real Activity.

James Simmons

On Thu, 11 Jun 2009 12:33:26 +0530, chirag jain chiragjain1...@gmail.com wrote:

> It would be great if you can test the activity. [...]
Re: [Sugar-devel] Gsoc proposal: Speech Synthesis
On Thu, Jun 11, 2009 at 12:03 AM, chirag jain chiragjain1...@gmail.com wrote:

> Hi, I proposed for the speech-synthesis in Gsoc 09. My proposal can be viewed at: http://wiki.sugarlabs.org/go/speech-synthesis As a first phase of my development, I have implemented the speech and karaoke style coloring of the text.

Thank you. I have long been waiting for this. http://www.olpcnews.com/content/ebooks/effective_adult_literacy_program.html

> A basic speech configuration manager has also been implemented to alter the volume, pitch and rate of the speech.

Will it be able to handle tonal languages such as Vietnamese or Yoruba (Nigeria)? How would this system handle creation of voices for different languages? Are we at the point where we can request recordings of phoneme samples for the target languages in Pootle?

--
Silent Thunder (默雷/धर्ममेघशब्दगर्ज/دھرممیگھشبدگر ج) is my name
And Children are my nation.
The Cosmos is my dwelling place,
The Truth my destination.
http://earthtreasury.org/worknet (Edward Mokurai Cherlin)
Re: [Sugar-devel] Gsoc proposal: Speech Synthesis
On Thu, Jun 11, 2009 at 12:46:22PM -0700, Edward Cherlin wrote:

> On Thu, Jun 11, 2009 at 12:03 AM, chirag jain chiragjain1...@gmail.com wrote:
>> A basic speech configuration manager has also been implemented to alter the volume, pitch and rate of the speech.
> Will it be able to handle tonal languages such as Vietnamese or Yoruba (Nigeria)? How would this system handle creation of voices for different languages? Are we at the point where we can request recordings of phoneme samples for the target languages in Pootle?

The project seems to be a _frontend_ for speech synthesis; it does not invent a whole new method of speech synthesis (which would be far too much for a GSoC project).

The code uses espeak synthesis as its backend. The supported languages are listed here (the page links to instructions on contributing additional languages): http://espeak.sourceforge.net/languages.html

Kind regards,

- Jonas

--
* Jonas Smedegaard - idealist and Internet architect
* Tel.: +45 40843136 Website: http://dr.jones.dk/
[x] quote me freely [ ] ask before reusing [ ] keep private
Re: [Sugar-devel] Gsoc proposal: Speech Synthesis
On Thu, Jun 11, 2009 at 1:40 PM, Jonas Smedegaard d...@jones.dk wrote:

> The project seems to be a _frontend_ for speech synthesis, not inventing a whole new method of speech synthesis itself (which would be far too much for a GSoC project).

Right. So my questions translate to:

o Do we know whether espeak can handle tonal languages?
o Is this project far enough advanced that we should ask Sugar Labs for a speech repository, and start recruiting linguists (to give us the phoneme data) and native speakers (to make the recordings)?

> The code uses espeak synthesis as backend. Languages supported are here (linking to a page on how to contribute additional languages): http://espeak.sourceforge.net/languages.html

I see. Lots still to do. I'll ask jonsd, the project contact.

--
Edward Mokurai Cherlin
http://earthtreasury.org/worknet
Re: [Sugar-devel] Gsoc proposal: Speech Synthesis
Chirag,

I still haven't run your code, but I did take a good look at it. I expected it to look quite a bit like the code Aleksey Lim and I came up with for Read Etexts. I was surprised to find that it didn't, but parts of it did look familiar, because it looks like you're trying an approach that I tried and was forced to give up on.

What it looks like is that you're sending the words to espeak one at a time, after highlighting them in the text viewer. If that's what you're doing, then you're launching espeak for each and every word, creating a .WAV file for that word, and then using aplay to play the word. On a sufficiently powerful machine that works but sounds awful. On an XO it doesn't work at all.

If this is what you are doing, then have a look at the code for Read Etexts. In that code I make a version of the text that has markup to indicate the beginning of each word. Originally speech-dispatcher used that markup to do callbacks into my code, telling it which word to highlight. Aleksey Lim wrote a gstreamer plugin for espeak that replaced speech-dispatcher but did the same thing. His plugin works better and requires no configuration.

I didn't come up with this myself; someone from the speech-dispatcher mailing list suggested it. It isn't perfect, but I'm pretty sure it works better than what you are attempting.

James Simmons
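The word-markup idea described in this message can be sketched roughly as follows. This is not the Read Etexts code, which is not shown in this thread; it assumes SSML-style <mark/> tags, and the function name add_word_marks is hypothetical. The idea is to send the whole text to the engine once, with a numbered mark before each word, and to keep a table that maps each mark number back to the character range to highlight.

```python
import re
from xml.sax.saxutils import escape


def add_word_marks(text):
    """Wrap text in SSML with a <mark/> before every word.

    Returns (ssml, spans), where spans[i] is the (start, end) character
    range in the original text of the word that follows mark i.
    """
    parts = []
    spans = []
    last = 0
    for index, match in enumerate(re.finditer(r"\S+", text)):
        # Keep the whitespace between words, escaped for XML.
        parts.append(escape(text[last:match.start()]))
        parts.append('<mark name="%d"/>' % index)
        parts.append(escape(match.group()))
        spans.append((match.start(), match.end()))
        last = match.end()
    parts.append(escape(text[last:]))
    return "<speak>" + "".join(parts) + "</speak>", spans
```

When the speech engine reports that it has reached mark N (how it reports this depends on the backend, so treat that part as an assumption), the viewer can highlight the characters in spans[N]. espeak's command-line tool interprets SSML when given the -m option, so the generated string can at least be spoken that way for a quick check.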
[Sugar-devel] GSoC proposal: Speech Synthesis
Hi! I have responded to the comments left on my proposal in the Google web app, both on my Sugar wiki page and on its discussion page. Please visit http://wiki.sugarlabs.org/go/speech-synthesis

Regards
Chirag Jain
[Sugar-devel] GSoC proposal: Speech Synthesis
Hi! I have implemented a small application that works as a system-side keyboard speaker in Sugar. To test it, please download keboard_speaker.zip from the following link: http://code.google.com/p/speech-synthesis/downloads/list I would appreciate reviews of it.

Regards
--
Chirag Jain
[Sugar-devel] GSoC proposal: Speech Synthesis
Keyboard Speaker

My proposal mentions a keyboard speaker. I would like to elaborate on it here and ask for your comments. I am thinking of implementing it in two ways:

1. A single-key speaker. In this mode, the keyboard speaker simply speaks each key as a small child presses it. This can help the child learn the alphabet and the names of some special characters. For example, pressing the * key makes the speaker say "asterisk", pressing the # key makes it say "hash", and so on.

2. A single-word speaker. This mode can be implemented with the Write activity, with the help of a pykeylogger running in the background. As the user types in the Write activity, the pykeylogger stores all the characters typed. When the user presses the space bar, the stored characters that make up the word just typed are sent to the TTS engine, so the speaker speaks each whole word as soon as it is finished. This will help children learn the exact pronunciation of the words they type, and make newly typed words easier to memorize.

Both options can be offered through the speech configuration GUI that I have already proposed in my application.

I would like comments on this idea.

Regards
Chirag
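The two modes proposed above can be sketched in a few lines. This is only an illustration of the logic, not the keyboard speaker implementation: the names KEY_NAMES, key_to_speech and WordSpeaker are hypothetical, the key capture (pykeylogger in the proposal) is left out, and the speak callback stands in for whatever sends text to the TTS engine.

```python
# Spoken names for special characters; only the two examples given in
# the proposal are listed here.
KEY_NAMES = {"*": "asterisk", "#": "hash"}


def key_to_speech(char):
    """Mode 1: return the text to speak for a single key press."""
    return KEY_NAMES.get(char, char)


class WordSpeaker:
    """Mode 2: collect typed characters and speak each finished word."""

    def __init__(self, speak):
        self._speak = speak      # callback that sends text to the TTS engine
        self._letters = []

    def key_pressed(self, char):
        if char.isspace():
            # A space (or other whitespace) ends the current word.
            if self._letters:
                self._speak("".join(self._letters))
                self._letters = []
        else:
            self._letters.append(char)
```

For example, feeding the characters of "red ball " into WordSpeaker(print) one at a time prints "red" and then "ball". A real version would also need to handle backspace and punctuation.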
Re: [Sugar-devel] GSoC proposal: Speech Synthesis
I like the idea. I would suggest that for (2), when the word is pronounced, it should first be sounded out using the grapheme-phoneme correspondences for the language, preferably highlighting the graphemes as they are spoken. Thus "bb ah ll - ball", highlighting b, a, ll in turn. This helps the child improve their decoding skills. Cf. http://synphony.wiki.sourceforge.net/goog_1239290206416

On Thu, Apr 9, 2009 at 10:10 AM, chirag jain chiragjain1...@gmail.com wrote:

> 2. A single word speaker [...]

--
It is difficult to get a man to understand something, when his salary depends upon his not understanding it. -- Upton Sinclair
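To make the "b, a, ll" suggestion concrete, here is a minimal sketch of splitting a word into graphemes so that each one can be highlighted and sounded out in turn. It assumes a simple greedy longest-match rule against a hand-supplied set of multi-letter graphemes; the function name split_graphemes and the example set are hypothetical, and real grapheme-phoneme data for a language would be far richer than this.

```python
def split_graphemes(word, graphemes):
    """Greedy longest-match split of word into known graphemes.

    Any letter that does not start a known multi-letter grapheme is
    returned as a one-letter grapheme.
    """
    longest = max([len(g) for g in graphemes] + [1])
    result = []
    i = 0
    while i < len(word):
        # Try the longest candidate first; a single letter always matches.
        for size in range(min(longest, len(word) - i), 0, -1):
            piece = word[i:i + size]
            if size == 1 or piece in graphemes:
                result.append(piece)
                i += size
                break
    return result
```

With the set {"ll", "sh", "ch"}, the word "ball" splits into b, a, ll, which matches the example in this message. Mapping each grapheme to the sound to speak ("bb", "ah", "ll") would need a separate, language-specific table.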
[Sugar-devel] GSoC proposal: Speech Synthesis
I have improved my proposal: http://wiki.sugarlabs.org/go/speech-synthesis Please review it.