An Update about Speech Synthesis for Sugar

2008-02-18 Thread Hemant Goyal
Hi,

It's great to see many other developers sharing the idea we have been trying
to implement right within the Sugar environment.

We have been working on integrating speech-synthesis into Sugar for quite
some time now. You can check out our ideas here:
http://wiki.laptop.org/go/Screen_Reader

We are also documenting all our ideas and requirements with respect to
speech synthesis in this Requirements Analysis Document:
http://www.nsitonline.in/hemant/stuff/Speech%20Synthesis%20on%20XO%20-%20Requirements%20Analysis%20v0.3.5.pdf

It outlines some of our immediate as well as long term goals wrt
speech-synthesis on the XO. Your ideas, comments and suggestions are
welcome.

I'd like to update the list about our progress:

   1. speech-dispatcher has been selected as the speech synthesis server
   that will accept all incoming speech synthesis requests from any Sugar
   activity (for example, Talk N Type, Speak, etc.).
   2. speech-dispatcher provides a very simple-to-use API and
   client-specific configuration management.
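
For readers who want a feel for how an activity might hand text to the
server, here is a minimal sketch. It assumes the daemon speaks SSIP (the
Speech Synthesis Interface Protocol described in the speech-dispatcher
manual) over TCP on port 6560, and it uses a made-up client name; the helper
names build_chunks, encode, read_reply and speak are hypothetical, not part
of any library. Treat it as an illustration to check against the manual, not
as the project's actual client code.

```python
import socket

CRLF = "\r\n"


def build_chunks(client_name, text, language="en"):
    """Return the SSIP lines for one message, grouped so that each
    group is followed by exactly one reply from the server."""
    body = []
    for line in text.splitlines():
        # A text line starting with "." is escaped by doubling the dot,
        # because a lone "." marks the end of the message.
        body.append("." + line if line.startswith(".") else line)
    body.append(".")
    return [
        [f"SET SELF CLIENT_NAME {client_name}"],  # user:client:component
        [f"SET SELF LANGUAGE {language}"],
        ["SPEAK"],
        body,
        ["QUIT"],
    ]


def encode(lines):
    """Join lines with CR LF and encode them as UTF-8 bytes."""
    return (CRLF.join(lines) + CRLF).encode("utf-8")


def read_reply(reader):
    """Read one reply; the final line has a space after the 3-digit code."""
    while True:
        line = reader.readline().decode("utf-8").rstrip("\r\n")
        if not line or line[3:4] == " ":
            return line


def speak(text, host="127.0.0.1", port=6560, client_name="child:demo:main"):
    """Send one message to a running daemon (host and port are assumptions)."""
    replies = []
    with socket.create_connection((host, port), timeout=5) as conn:
        reader = conn.makefile("rb")
        for chunk in build_chunks(client_name, text):
            conn.sendall(encode(chunk))
            replies.append(read_reply(reader))
    return replies
```

With a daemon running, speak("Hello from Sugar") would return the server's
reply lines; without one, build_chunks and encode can still be used to
inspect what would be sent.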

So what's causing the delays?

   1. speech-dispatcher is not packaged as an RPM for Fedora, so at
   present I am mostly working on an RPM package so that it can be accepted
   by the Fedora community and ultimately be dropped into the OLPC builds.
   You can track the progress here:
   https://bugzilla.redhat.com/show_bug.cgi?id=432259 I am not an expert
   at RPM packaging, and hence it's taking some time at my end. I'd welcome
   anyone who can assist me and help speed up the process.
   2. dotconf, a package which speech-dispatcher depends on, is being
   packaged by my teammate Assim. You can check its progress here:
   https://bugzilla.redhat.com/show_bug.cgi?id=433253

Some immediate tasks that we plan to carry out once speech-dispatcher is
packaged and dropped into the OLPC builds are:

   1. Provide the much-needed play button, with text highlight features
   as discussed by Edward.
   2. Port an AI chatbot to the XO and hack it enough to make it speak to
   the child :).
   3. Encourage other developers to make use of speech synthesis to make
   their activities as lively and child-friendly as possible :)
   4. Explore Orca and other issues to make the XO more friendly for
   blind/low-vision students.

@James: We envision that speech synthesis will surely get integrated with
Read in due time. I think it would be great if Gutenberg texts could be
loaded right from within Read itself?

> I was not planning on anything so fancy.  Basically, I was frustrated
> that I had a device that would be wonderfully suited to reading
> Gutenberg etexts and no suitable program to do it with.  I have written
> such an Activity and am putting the finishing touches on it.  As I see
> it, the selling points of the Activity will be that it can display
> etexts one page at a time in a readable proportional font and remember
> what page you were on when you resume the activity.  The child can find
> his book using the Gutenberg site, save the Zip file version to the
> Journal, rename it, resume it, and start reading.  It will also be good
> sample code for new Activity developers to look at, even children,
> because it is easy to understand yet it does something that is actually
> useful.  I have written another Activity which lets you browse through a
> bunch of image files stored in a Zip file, and it also would be good
> sample code for a new developer, as well as being useful.
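
The behaviour James describes (read an etext out of a Zip file, show it one
page at a time, remember the page) could be sketched roughly as below. This
is not James's Activity; the function names, the JSON bookmark file, and the
page dimensions are all assumptions made for illustration, and it leaves out
the Sugar Journal and any GUI.

```python
import json
import textwrap
import zipfile


def load_etext(zip_path):
    """Return the text of the first .txt member found in the Zip file."""
    with zipfile.ZipFile(zip_path) as archive:
        names = [n for n in archive.namelist() if n.lower().endswith(".txt")]
        if not names:
            raise ValueError("no .txt file found in archive")
        return archive.read(names[0]).decode("utf-8", errors="replace")


def paginate(text, width=60, lines_per_page=20):
    """Wrap the text to a fixed width and split it into pages of lines."""
    lines = []
    for paragraph in text.splitlines():
        # textwrap.wrap returns [] for an empty line; keep it as a blank line.
        lines.extend(textwrap.wrap(paragraph, width) or [""])
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]


def save_bookmark(path, page):
    """Remember the current page number (hypothetical JSON bookmark file)."""
    with open(path, "w") as handle:
        json.dump({"page": page}, handle)


def load_bookmark(path):
    """Return the saved page number, or 0 if there is no usable bookmark."""
    try:
        with open(path) as handle:
            return json.load(handle).get("page", 0)
    except (OSError, ValueError):
        return 0
```

A reader loop would call load_bookmark on resume, display the matching entry
of paginate(load_etext(...)), and call save_bookmark whenever the page
changes.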


Warm Regards,
Hemant


Re: [sugar] An Update about Speech Synthesis for Sugar

2008-02-18 Thread Edward Cherlin
On Feb 18, 2008 6:22 AM, Hemant Goyal [EMAIL PROTECTED] wrote:
> Hi,
>
> It's great to see many other developers sharing the idea we have been trying
> to implement right within the Sugar environment.

Yes, thanks to all.

> We have been working on integrating speech-synthesis into Sugar for quite
> some time now. You can check out our ideas here:
> http://wiki.laptop.org/go/Screen_Reader
>
> We are also documenting all our ideas and requirements with respect to
> speech synthesis in this Requirements Analysis Document:
> http://www.nsitonline.in/hemant/stuff/Speech%20Synthesis%20on%20XO%20-%20Requirements%20Analysis%20v0.3.5.pdf
>
> It outlines some of our immediate as well as long term goals wrt
> speech-synthesis on the XO. Your ideas, comments and suggestions are
> welcome.
>
> I'd like to update the list about our progress:
>
> 1. speech-dispatcher has been selected as the speech synthesis server that
> will accept all incoming speech synthesis requests from any Sugar activity
> (for example, Talk N Type, Speak, etc.).
> 2. speech-dispatcher provides a very simple-to-use API and client-specific
> configuration management.
>
> So what's causing the delays?

I have a few questions. Let's see what the InterWebs tell us.

* How many languages does speech-dispatcher support?

http://www.freebsoft.org/doc/speechd/speech-dispatcher_5.html#SEC8

SD works with Festival http://www.cstr.ed.ac.uk/projects/festival/
English, Czech, Italian, Spanish, Russian, Polish...

What is the mechanism for adding additional languages? Phoneset
recording, a dictionary, and what else?

http://www.freebsoft.org/doc/speechd/speech-dispatcher_23.html#SEC82
Develop new voices and language definitions for Festival: In the world
of Free Software, currently Festival is the most promising interface
for Text-to-Speech processing and speech synthesis. It's an extensible
and highly configurable platform for developing synthetic voices. If
there is a lack of synthetic voices or no voices at all for some
language, we believe the wisest solution is to try to develop a voice
in Festival. It's certainly not advisable to develop your own
synthesizer if the goal is producing a quality voice system in a
reasonable time. Festival developers provide nice documentation about
how to develop a voice and a lot of tools that help doing this. We
found that some language definitions can be constructed by
cannibalizing the already existing definitions and can be tuned later.
As for the voice samples, one can temporarily use the MBROLA project
voices. But please note that, although they are downloadable for free
(as price), they are not Free Software and it would be wonderful if we
could replace them by Free Software alternatives as soon as possible.
See http://www.cstr.ed.ac.uk/projects/festival/.

Which in turn says:

Externally configurable language independent modules:

* phonesets
* lexicons
* letter-to-sound rules
* tokenizing
* part of speech tagging
* intonation and duration

That answers most of my technical questions, including how (in
principle, anyway) we are going to support tonal languages such as
Yoruba. Now for organization.

Where should we put TTS projects for language support? Can we create
http://dev.laptop.org/tts? Who should be in charge? What sort of
process should we have for creating projects? Should we just
automatically create a TTS project for every translate project?

> speech-dispatcher is not packaged as an RPM for Fedora,

I see Debian packages. Is there a converter?

> so at present I am
> mostly working on an RPM package so that it can be accepted by the Fedora
> community and ultimately be dropped into the OLPC builds. You can track the
> progress here: https://bugzilla.redhat.com/show_bug.cgi?id=432259 I am not
> an expert at RPM packaging, and hence it's taking some time at my end. I'd
> welcome anyone who can assist me and help speed up the process.
> dotconf, a package which speech-dispatcher depends on, is being packaged by
> my teammate Assim. You can check its progress here:
> https://bugzilla.redhat.com/show_bug.cgi?id=433253
>
> Some immediate tasks that we plan to carry out once speech-dispatcher is
> packaged and dropped into the OLPC builds are:
>
> Provide the much-needed play button, with text highlight features as
> discussed by Edward.

Thank you.

> Port an AI Chatbot to the XO and hack it enough to make it speak to the
> child :).
> Encourage other developers to make use of speech-synthesis to make their
> activities as lively and child friendly as possible :)
> Explore orca and other issues to make the XO more friendly for
> blind/low-vision students

Have you looked at Oralux, the Linux distro for the blind and visually-impaired?
http://oralux.net/
We should invite them to join our efforts.

> @James: We envision that speech synthesis will surely get integrated with
> Read in due time. I think it would be great if Gutenberg texts could be
> loaded right from within Read itself?
>
> > I was not planning on anything so fancy.  Basically, I was frustrated
> > that I had a device