Dave Donovan,
Thank you for the very helpful explanation.
Thank you again.
Lloyd
On Sat, Sep 20, 2008 at 11:10 AM, Dave Donovan <[EMAIL PROTECTED]> wrote:
> Lloyd,
>
> I'm not an expert in these things, and the last time I looked at it
> was over a year ago but I'll tell you what I learned and hope that it
> saves you some time.
>
> You're not going to find a program that just does this conversion,
> like: wav2txt outfile.txt infile.wav.
>
> There are packages like CMU Sphinx and its successors. I used the
> Java one and it was pretty cool what it could do, but it's not a
> trivial thing to understand it and tune it well for your application.
> It's doable, but it's not something you're going to download and run
> like zipping a file. The reason you were pointed to ZoIP is that it is
> a good example of using Asterisk and speech recognition. I think it
> started using Sphinx and moved to Cepstral, but I could be wrong.
>
> It's a pretty complicated requirement you have. The reason is that
> you're talking about unrestricted speech. That is, the speaker could
> say any word, not just {yes, no} or {1,2,3,4,5,6,7,8,9,0,o}. The larger
> the set of words being used, the higher the rate of misdetection and
> the higher the load on the system. Load can translate to delay on
> a packed system, but since your application isn't interactive, you
> don't have to worry about that too much.
>
> Dictionaries and Grammars are two things critical to Sphinx (and
> others I guess).
>
> A dictionary is a list of all the words that could be spoken in a
> particular context and their associated sounds (phonemes).
> Tomato is one word with two pronunciations, one with a long 'a' and
> one with a short 'a'. (You say tomaeto, I say tomaato). Their and
> There are two words with the same pronunciation.
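
A minimal sketch of what such a dictionary might look like, in Python. The phoneme symbols are ARPAbet-style, in the spirit of the CMU pronouncing dictionary but with stress marks left out, and the names PRONUNCIATIONS and homophones are made up for illustration. This is not the file format Sphinx actually loads.

```python
from collections import defaultdict

# Toy pronunciation dictionary: word -> list of phoneme sequences.
# One word may have several pronunciations; several words may share one.
PRONUNCIATIONS = {
    "tomato": ["T AH M EY T OW", "T AH M AA T OW"],
    "their": ["DH EH R"],
    "there": ["DH EH R"],
    "billing": ["B IH L IH NG"],
}


def homophones(pronunciations):
    """Return phoneme sequences that map to more than one word."""
    by_sound = defaultdict(set)
    for word, variants in pronunciations.items():
        for phones in variants:
            by_sound[phones].add(word)
    return {
        phones: sorted(words)
        for phones, words in by_sound.items()
        if len(words) > 1
    }
```

Calling homophones(PRONUNCIATIONS) picks out "their" and "there" as words the recognizer cannot tell apart by sound alone, which is the gap a grammar is meant to fill.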
>
> Ideally, an IVR would ask a question with a limited domain of answers
> like "Which department would you like to speak with? You can say
> things like Billing, Customer Service, Technical Support." and then
> you would have a dictionary with things like {Billing, Customer
> Service, Technical Support, AP, AR, Helpdesk}. Note that this is an
> extremely small set of possibilities relative to the number of words
> that could be spoken in a voicemail. The bigger the dictionary the
> slower the system. It's like doing a SQL SELECT: if there are 10
> rows, you're going to get a quick response, but if you start looking
> for patterns of characters in millions of rows, expect to wait a second
> or two unless you've got big horsepower.
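
To make the limited-domain idea concrete, here is a rough Python sketch that snaps whatever text the recognizer produced onto a small list of allowed answers. It uses difflib from the standard library; the DEPARTMENTS list, the match_department name and the 0.8 cutoff are assumptions for illustration. A real recognizer restricts the vocabulary while decoding the audio rather than cleaning up afterwards, so treat this only as a picture of why a small set is easier.

```python
import difflib

# The small set of answers the IVR prompt allows (taken from the example above).
DEPARTMENTS = [
    "billing",
    "customer service",
    "technical support",
    "ap",
    "ar",
    "helpdesk",
]


def match_department(heard, choices=DEPARTMENTS, cutoff=0.8):
    """Return the closest allowed phrase, or None if nothing is close enough."""
    cleaned = heard.lower().strip()
    matches = difflib.get_close_matches(cleaned, choices, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With six choices, a slightly wrong hypothesis such as "technical suport" still lands on the right department, and something unrelated returns None so the IVR can ask again. With the vocabulary of a whole voicemail there is no such short list to fall back on.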
>
> You also have to consider grammars. If I remember correctly, grammars
> tell the system what words can come in what order based on words
> around them. This is so that when your user says "My IP Address is
> 192.168.0.1" you don't end up with a text file that says "My eye pea
> address is won nine to dot one six ate dot oh dot won." All of those
> sounds were correctly converted into text but this is not very useful
> as output. The grammar would tell the system that numbers following
> the word address are to be recorded as digits and periods. It can
> also help the system distinguish between homophones (words that sound
> the same) like 'there' and 'their' and 'they're'. If the sound is
> followed by a noun like 'chair' then use 'their'. If it's followed by
> a verb like 'running' then use 'they're'. If it's preceded by a verb
> or preposition then use 'there'. That's just a simple example; you
> can see how complicated this could get.
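
The "numbers after the word address" rule can be sketched as a post-processing step in Python. This is only a toy: a real grammar (JSGF in the Sphinx world, as far as I know) constrains recognition itself instead of rewriting the text afterwards, and the SPOKEN table and normalize_address name are invented for this example.

```python
# Words (and their sound-alikes) that should become digits or a period
# once we know we are inside an address.
SPOKEN = {
    "zero": "0", "oh": "0",
    "one": "1", "won": "1",
    "two": "2", "to": "2", "too": "2",
    "three": "3",
    "four": "4", "for": "4",
    "five": "5", "six": "6", "seven": "7",
    "eight": "8", "ate": "8",
    "nine": "9",
    "dot": ".",
}


def normalize_address(text):
    """Rewrite digit-like words as digits, but only after the word 'address'."""
    out, number, armed = [], [], False
    for word in text.split():
        key = word.lower().strip(".,")
        if armed and key in SPOKEN:
            number.append(SPOKEN[key])  # still inside the spoken number
            continue
        if number:  # the spoken number just ended
            out.append("".join(number))
            number, armed = [], False
        out.append(word)
        if key == "address":
            armed = True  # context seen: start treating sounds as digits
    if number:
        out.append("".join(number))
    return " ".join(out)
```

Feeding it the garbled sentence from the example turns the tail into 192.168.0.1, while a sentence like "I won one too" is left alone because the word "address" never appears. It does not fix "eye pea", which would need a dictionary entry for "IP".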
>
> Fortunately many dictionaries and grammars already exist. Chances are
> you will need to understand them though and do some work to fit them
> to your application. They usually don't contain jargon. If your
> client is a cellular company and a customer can call for support on
> their 'iphone' then you're going to need to configure the system for
> that.
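
One cheap way to find the jargon you are missing is to run some existing transcripts or support emails against the word list and see what falls out. A small Python sketch, where out_of_vocabulary and the known_words set are hypothetical names and the word list is whatever your dictionary file contains:

```python
import re


def out_of_vocabulary(text, known_words):
    """List words in text that are not in known_words, in first-seen order."""
    missing = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in known_words and word not in missing:
            missing.append(word)
    return missing
```

Anything it returns ("iphone", product names, local place names) is a candidate for a new dictionary entry with its own pronunciation.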
>
> Once you get a system up and running, a fun test phrase to use is
> "recognize speech". Depending on how you say it, this is often
> detected as "wreck a nice beach". It depends on your accent. Try to
> say it like someone from the southern US.
>
> In short: googling for wav2txt.gz is not going to get it done. You're
> going to have to put in some substantial work before your application
> is ready to wreck a nice beach.
>
> Good luck,
>
> Dave
>
>
> On Sat, Sep 20, 2008 at 9:09 AM, Aloysius Thevarajah Lloyd
> <[EMAIL PROTECTED]> wrote:
> > Thank you Duane.
> >
> > I did not get enough information from the Link.
> >
> > I am looking for an application to convert wav -> text. Is there any open
> > source application available to do this task?
> >
> > Thank you
> > Lloyd
> >
> >
> >
> > On Fri, Sep 19, 2008 at 9:27 PM, Duane at e164 dot org <[EMAIL PROTECTED]> wrote:
> >
> >> Aloysius Thevarajah Lloyd wrote:
> >> > Hello,
> >> >
> >> > Is there any open source tool to automate the translation or
> >> > conversion of voicemail files into text?
> >>
> >> Have a look at ZoIP
> >>
> >> http://www.uc.org/read/ZoIP
> >>
> >> Not what you want but lays the ground work for it.
> >>
> >> --
> >>
> >> Best regards,
> >> Duane
> >>
> >> http://www.freeauth.org - Enterprise Two Factor Authentication
> >> http://www.nodedb.com - Think globally, network locally
> >> http://www.sydneywireless.com - Telecommunications Freedom
> >> http://e164.org - Global Communication for the 21st Century
> >>
> >> "In the long run the pessimist may be proved right,
> >> but the optimist has a better time on the trip."
> >>
> >>
> >
> >
> > --
> > Thanks
> >
> > LLoyd
> >
> > Tel : 416-628-6090
> > Fax : 416-628-6095
> > Cell : 416-500-8014
> > Toll Free : 1-888-401-3735
> >
>
>
>
--
Thanks
LLoyd
Tel : 416-500-8014