Re: [asterisk-users] Speech recognition in asterisk using google voice API
Hey Zaf,

I was just checking the Google Speech Recognition package again and I can't see the WolframAlpha.agi file. I checked all of your projects on GitHub but can't find wolframalpha.agi. Please let us know what the URL is.

Thanks,
Bruce

On Thu, Jan 12, 2012 at 2:49 PM, Lefteris Zafiris zaf@gmail.com wrote:
> Updated code can be found here:
> https://github.com/zaf/asterisk-speech-recog/tarball/master

--
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
http://www.asterisk.org/hello

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Two more offerings:

#1 - Add a DTMF parameter so the function can be stopped by pressing a digit or digits other than * or #.
#2 - Add an option to silence the beep. If you were using this in an IVR and wanted to say "press 1 or say help for help", silencing the beep before recording would (IMO) make the rendering sound more professional/less mechanical.

-----Original Message-----
From: asterisk-users-boun...@lists.digium.com [mailto:asterisk-users-boun...@lists.digium.com] On Behalf Of Lefteris Zafiris
Sent: Saturday, January 07, 2012 6:22 AM
To: Asterisk Users Mailing List - Non-Commercial Discussion
Subject: Re: [asterisk-users] Speech recognition in asterisk using google voice API

> Both features added in the script. Timeout can now be set by the user, also -1 means no timeout and the recording keeps going till # is pressed.
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On 01/12/2012 05:50 PM, Danny Nicholas wrote:
> Two more offerings:
> #1 - add DTMF parameter so function can be stopped by pressing a digit or digits other than * or #
> #2 - add an option to silence the beep. If you were using this in an IVR and wanted to say press 1 or say help for help, silencing the beep before recording would (IMO) make the rendering sound more professional/less mechanical.

Both features added:

Usage:
agi(speech-recog.agi,[lang],[timeout],[intkey],[NOBEEP])

Records from the current channel until the timeout (10 seconds by default, -1 for no timeout) is reached or the interrupt key (# by default) is pressed. If NOBEEP is set, no beep is played back to the user to indicate the start of the recording.

There is now also the option to enable SSL for encrypted communication between your PBX and the Google voice server.

Updated code can be found here:
https://github.com/zaf/asterisk-speech-recog/tarball/master

Lefteris Zafiris
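For readers who want to see how the optional arguments and their defaults fit together, here is a minimal Python sketch. It is not taken from speech-recog.agi (which is a separate script); the names `RecogOptions` and `parse_args` are hypothetical, and the default language `en-US` is an assumption, since the message above only states the defaults for the timeout (10 seconds) and the interrupt key (#).

```python
from typing import NamedTuple, Optional, Sequence


class RecogOptions(NamedTuple):
    lang: str
    timeout: Optional[int]  # seconds; None means "no timeout"
    intkey: str
    beep: bool


def parse_args(args: Sequence[str]) -> RecogOptions:
    """Apply defaults for agi(speech-recog.agi,[lang],[timeout],[intkey],[NOBEEP])."""

    def arg(index: int, default: str) -> str:
        # A missing or empty positional argument falls back to its default.
        return args[index] if len(args) > index and args[index] != "" else default

    lang = arg(0, "en-US")        # assumed default, not stated in the thread
    timeout = int(arg(1, "10"))   # 10 seconds by default
    intkey = arg(2, "#")          # '#' by default
    beep = arg(3, "").upper() != "NOBEEP"
    # -1 is the documented way to disable the timeout.
    return RecogOptions(lang, None if timeout == -1 else timeout, intkey, beep)
```

With no arguments this yields a 10 second timeout, `#` as the interrupt key and the beep enabled; passing `-1` as the second argument disables the timeout.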
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On 01/07/2012 09:34 AM, Bruce B wrote:
> Added two new features to the script: Timeout value and speechdata type.
>
> exten = s,n,agi(speech-recog.agi,en-US,3000,phoneNumb)
> - Will listen for 3 seconds and sanitize return as a single number without any spaces in between. This helps when one reads phone number in format 415-554-2323 and google returns, 415 554 2323 as result which is not very usable.
>
> exten = s,n,agi(speech-recog.agi,en-US,2,string)
> - Will listen for 20 second and return result as provided by Google untouched.
>
> It would be great to see them in future versions as I seem to need them dearly in a real life scenario.
>
> Updated script attached.
>
> -Bruce

Thank you, Bruce, for the testing and the suggestions. Both features have been added to the script. The timeout can now be set by the user; -1 means no timeout, and the recording keeps going until # is pressed. Spaces between digits get stripped. This is now the default behavior, so there is no need to specify a 'speechdata' type.

The updated code can be found here:
https://github.com/zaf/asterisk-speech-recog/tarball/master

Next on my TODO list is to make use of the Asterisk speech recognition API (https://wiki.asterisk.org/wiki/display/AST/Speech+Recognition+API). This will make the application actually usable in real-world scenarios rather than the proof of concept it is now.

Lefteris Zafiris
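The digit handling described above can be illustrated with a short Python sketch. This is only an illustration of the idea (removing whitespace that sits between two digits), not the code used in speech-recog.agi; the function name `join_digits` is hypothetical.

```python
import re


def join_digits(text: str) -> str:
    """Remove whitespace that sits directly between two digits."""
    # Lookbehind/lookahead keep the digits themselves out of the match,
    # so only the whitespace between them is deleted.
    return re.sub(r"(?<=\d)\s+(?=\d)", "", text)


print(join_digits("415 554 2323"))
```

Spaces between words, or between a word and a number, are left alone, so a result such as "press 1 for help" passes through unchanged.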
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Does sox have more features on a Debian system than on RHEL? Is that why it won't work on RHEL?

Cheers,

On Wed, Jan 4, 2012 at 6:42 PM, Lefteris Zafiris zaf@gmail.com wrote:
> Fresh code is out!
>
> The use of sox can now be optionally enabled by the user if the system has a recent version of the program (won't work in RHEL/CentOS 5). This is done by editing the script and setting the variable 'use_sox'.
>
> When sox is used the audio gets normalized, low frequency noise (100Hz) is removed and also possible DC offset is corrected. Those are supposed to improve the recognition results(?). The settings are still a bit experimental, feel free to play with them and report what settings improved your results.
>
> Get the new version here:
> https://github.com/downloads/zaf/asterisk-speech-recog/asterisk-speech-recog-0.3.tar.gz
>
> Lefteris Zafiris
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On Fri, 6 Jan 2012 20:46:14 -0500 Bruce B bruceb...@gmail.com wrote:
> Does sox have more features on a Debian system than RHEL? Is that why it won't work on RHEL?

RHEL 5's version of sox is really old and outdated. The command syntax and the switches are totally different from those of recent versions of sox.

Anyway, I'm not sure that audio normalization and the rest of what we use sox for is really needed. My tests so far didn't show any improvement in detection rates. Keep in mind that all this is still WIP and the option to use sox is more for testing than for serious use.

Lefteris Zafiris
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Thanks. I have been testing Aastra phones with SIP and had great results. I am testing my cell phone now and sometimes get -1 for id, status, utterance, and confidence. What does that mean?

Cheers

On Fri, Jan 6, 2012 at 9:40 PM, Lefteris Zafiris zaf@gmail.com wrote:
> Keep in mind that all this is still WIP and the option to use sox is more for testing than for serious use.
Re: [asterisk-users] Speech recognition in asterisk using google voice API
NVM. I explored the code and see the logic. I had sox = 1, so it was failing on RHEL.

To report: my cell phone, coming in over a PRI, gets the same confidence level as SIP.

Building my control app now. It should make my life much easier while driving.

Thanks again :-)

-Bruce

On Fri, Jan 6, 2012 at 10:50 PM, Bruce B bruceb...@gmail.com wrote:
> I am testing my cell phone now and sometimes get -1 for id, status, utterance, and confidence. What does that mean?
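Since the -1 values reported above showed up when the script could not produce a result (here because sox was failing), a control application probably wants to check for them before acting on what was heard. A minimal Python sketch of such a check follows. It assumes the utterance and confidence arrive as strings, and the function name `usable` and the 0.5 threshold are hypothetical choices, not part of speech-recog.agi.

```python
def usable(utterance: str, confidence: str, min_confidence: float = 0.5) -> bool:
    """Return True only if a result came back and its confidence is high enough."""
    # "-1" is treated as "no result", matching the failure case seen in the thread.
    if utterance == "-1" or confidence == "-1":
        return False
    return float(confidence) >= min_confidence
```

A caller could then replay the prompt or fall back to DTMF input whenever `usable(...)` is false.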
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Added two new features to the script: a timeout value and a speechdata type.

exten = s,n,agi(speech-recog.agi,en-US,3000,phoneNumb)
- Will listen for 3 seconds and sanitize the return as a single number without any spaces in between. This helps when one reads a phone number in the format 415-554-2323 and Google returns "415 554 2323" as the result, which is not very usable.

exten = s,n,agi(speech-recog.agi,en-US,2,string)
- Will listen for 20 second and return the result as provided by Google, untouched.

It would be great to see them in future versions, as I seem to need them dearly in a real-life scenario.

Updated script attached.

-Bruce

On Fri, Jan 6, 2012 at 11:03 PM, Bruce B bruceb...@gmail.com wrote:
> Building my control app now. Should make my life much easier while driving.

speech-recog.agi
Description: Binary data
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On 01/04/2012 07:51 AM, Bruce B wrote:
> And with recent version 14.3.2 I get:
>
> /usr/local/bin/sox FAIL formats: no handler for file extension `flac'
> -- speech-recog.agi: /usr/local/bin/sox failed: 512
> -- SIP/-002eAGI Script speech-recog.agi completed, returning 0
>
> Regards,
>
> On Wed, Jan 4, 2012 at 12:43 AM, Bruce B bruceb...@gmail.com wrote:
>> Very interesting. I just tried to get it to work but it complains about sox. Probably you used a different version of sox?
>>
>> PBX-*CLI
>> /usr/bin/sox: invalid option -- -
>> /usr/bin/sox: invalid option -- n
>> /usr/bin/sox: invalid option -- o
>> /usr/bin/sox: -r must be given a positive integer
>> -- speech-recog.agi: /usr/bin/sox failed: 512
>>
>> I am using: Package sox-12.18.1-1.el5_5.1.i386
>>
>> Thanks,

Note to self: Never release anything asterisk related without testing on RHEL/Centos 5

Thank you for reporting this. I have replaced sox with flac and it seems to work now on older platforms too (tested on Centos 5 with asterisk 1.4).

You can get the updated code here:
https://github.com/zaf/asterisk-speech-recog/tarball/master

Lefteris Zafiris
Re: [asterisk-users] Speech recognition in asterisk using google voice API
This looks great. Is there any chance of converting googletts.agi to use flac as well?

Julian

On 4 January 2012 09:06, Lefteris Zafiris zaf@gmail.com wrote:
> I have replaced sox with flac and it seems to work now on older platforms too (tested on Centos 5 with asterisk 1.4).

--
Julian Lyndon-Smith
IT Director, Dot R Limited

"I don't care if it works on your machine! We are not shipping your machine!"

The kangaroo dances: http://www.youtube.com/watch?v=MAWl5iYOaUg
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On 01/04/2012 04:07 PM, Julian Lyndon-Smith wrote:
> this looks great - is there any chance of coverting the googletts.agi to use flac as well ?
>
> Julian

In googletts.agi we get the voice data from Google as mp3 and convert it to a format that Asterisk can read and play back (slin). If we stored it as flac, Asterisk wouldn't be able to read it natively and we would have to convert it each time we wanted to play it back to the user.

In the speech recognition script we have to convert the voice data to flac before sending it to Google because that's the accepted format.

Is there some particular reason you want the googletts.agi data in flac?

Lefteris Zafiris
Re: [asterisk-users] Speech recognition in asterisk using google voice API
The only reason is that I didn't want to have to install sox. Lazy, that's all ;) It's just another piece of software to find and install.

I'm running on Amazon EC2. Is the best thing to download the source and compile sox?

Thanks,
Julian

On 4 January 2012 14:18, Lefteris Zafiris zaf@gmail.com wrote:
> Is there some particular reason you want the googletts.agi data in flac?
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On 01/04/2012 04:24 PM, Julian Lyndon-Smith wrote:
> the only reason is that I didn't want to have to install sox. Lazy. that's all ;) Just another piece of software to find and install
>
> running on amazon ec2, is the best thing to download the source and compile sox ?
>
> Thanks

It should be in your distro's repos already.

Lefteris Zafiris
Re: [asterisk-users] Speech recognition in asterisk using google voice API
nope :(

On 4 January 2012 14:29, Lefteris Zafiris zaf@gmail.com wrote:
> It should be on your distro repos already.
Re: [asterisk-users] Speech recognition in asterisk using google voice API
> Note to self: Never release anything asterisk related without testing on RHEL/Centos 5
>
> Thank you for reporting this. I have replaced sox with flac and it seems to work now on older platforms too (tested on Centos 5 with asterisk 1.4).
>
> You can get the updated code here:
> https://github.com/zaf/asterisk-speech-recog/tarball/master
>
> Lefteris Zafiris

Works beautifully. Amazing job, Lefteris. Thanks.

The best confidence score I got was 0.9725632, by saying "hello". I think there is some non-phonetic logic built in as well: when I tried "1, 2" I got 0.86534226, while "1, 2, 3, 4, 5" gave 0.97256315. Probably Google sees the pattern?!

What are some other tricks (if any) or considerations to keep in mind when building a robust speech-recognition-enabled IVR?

Best,
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Does anyone know what languages are supported?

-----Original Message-----
From: Bruce B bruceb...@gmail.com
Sender: asterisk-users-boun...@lists.digium.com
Date: Wed, 4 Jan 2012 13:25:18
To: Asterisk Users Mailing List - Non-Commercial Discussion asterisk-users@lists.digium.com
Reply-To: Asterisk Users Mailing List - Non-Commercial Discussion asterisk-users@lists.digium.com
Subject: Re: [asterisk-users] Speech recognition in asterisk using google voice API
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Wow - nice! A few quick questions:

1. How long can the recording be for translation?
2. Any limitation on how much text the return (transcribed) variable can hold?
3. Any commercial / terms of use limitations?

From: asterisk-users-boun...@lists.digium.com [asterisk-users-boun...@lists.digium.com] On Behalf Of Bruce B [bruceb...@gmail.com]
Sent: Wednesday, January 04, 2012 1:25 PM
To: Asterisk Users List
Subject: Re: [asterisk-users] Speech recognition in asterisk using google voice API

> Works beautifully. Amazing job Lefteris. Thanks.
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On Wed, Jan 4, 2012 at 8:47 PM, Michelle Dupuis mdup...@ocg.ca wrote:
> Wow - nice! A few quick questions:
>
> 1. How long can the recording be for translation?

At the moment the recording timeout is set at 15 seconds. I haven't yet tested the maximum length of voice data that Google accepts (all this voice recognition stuff is undocumented). I have read that it is between 10 and 20 seconds, but I haven't actually tested this yet. On my TODO list is an option to cut the sound data into smaller chunks before sending them to Google, which would get rid of the recording length limitation.

> 2. Any limitation on how much text the return (transcribed) variable can hold?

This would be better answered by the Asterisk devs, but empirically speaking I have loaded really big chunks of text (like the complete GPL license) into dialplan variables without any problems.

> 3. Any commercial / terms of use limitations?

This is a gray area at the moment. Voice recognition is undocumented in Google's API and, I guess, not officially supported yet. I hope it gets covered by the general TOS of Google services:
http://www.google.com/accounts/TOS

Lefteris Zafiris
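The chunking idea mentioned in the answer to question 1 could be sketched roughly as follows in Python. This is only an illustration of splitting a recording into fixed-length pieces. It is not code from speech-recog.agi, the function name `split_recording` is hypothetical, and the samples are assumed to be a plain sequence of PCM values. A real implementation would likely prefer to cut at pauses so that words are not split across chunks.

```python
from typing import List, Sequence


def split_recording(samples: Sequence[int], samplerate: int,
                    max_seconds: float) -> List[Sequence[int]]:
    """Split PCM samples into consecutive chunks of at most max_seconds each."""
    chunk_len = int(samplerate * max_seconds)
    if chunk_len <= 0:
        raise ValueError("samplerate * max_seconds must be at least one sample")
    # The last chunk is simply shorter when the recording is not an exact multiple.
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]
```

For example, a 25 second recording at 8 kHz split with `max_seconds=10` gives two 10 second chunks followed by one 5 second chunk.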
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On Wed, Jan 4, 2012 at 8:27 PM, isr...@gmail.com wrote:
> Does anyone know what languages are supported?

English and Spanish for sure. Since it's undocumented, I don't have a complete list yet.

Lefteris Zafiris
Re: [asterisk-users] Speech recognition in asterisk using google voice API
> Works beautifully. Amazing job Lefteris. Thanks.
>
> The best result I got in probability was 0.9725632 by saying, hello. I think there is some non-phonetic logic built-in as well. I tried, 1, 2 and I got 0.86534226 in accuracy. While I tried 1, 2, 3, 4, 5 I got, 0.97256315. Probably Google sees the pattern?!
>
> What are some of the other tricks (if any) or consideration that one should make while creating a strong speech recognition enabled IVR?

Google accepts sound files at any sampling rate (up to 44.1kHz), so if you can use a wideband codec (e.g. g722) it can greatly improve the sound quality and the detection rates. For now the script supports 8kHz and 16kHz sampling rates for recording; this can be set by editing the script's user-defined parameters (the variable $samplerate).

Anything that improves the clarity of the recording will help: a good phone, a low background noise level, etc. I have also read that normalizing the recording and setting the gain to -5 dB improves detection rates. I'm experimenting with this at the moment and there will be some new code soon (as soon as I get sox working in RHEL/Centos 5 :P ).

Lefteris Zafiris
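To make the normalization idea above concrete, here is a small Python sketch that removes a DC offset and scales the peak to -5 dB relative to full scale. It is an illustration under assumptions, not what sox or speech-recog.agi actually does: samples are assumed to be floats in the range -1.0 to 1.0, and the function name `normalize` is hypothetical.

```python
from typing import List, Sequence


def normalize(samples: Sequence[float], target_db: float = -5.0) -> List[float]:
    """Remove the DC offset, then scale so the peak sits at target_db dBFS."""
    if not samples:
        return []
    # DC offset correction: subtract the mean so the signal is centered on zero.
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    peak = max(abs(s) for s in centered)
    if peak == 0:
        return centered  # pure silence, nothing to scale
    # Convert the dB target to a linear amplitude: -5 dB is about 0.562.
    scale = 10 ** (target_db / 20) / peak
    return [s * scale for s in centered]
```

Whether this kind of processing actually improves recognition is an open question in this thread, so treat it as something to experiment with rather than a recommendation.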
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On 1/4/2012 2:26 PM, Lefteris Zafiris wrote:
> I have also read that normalizing the recording and setting the gain to -5 dB improves detection rates. I'm experimenting with this at the moment and there will be some new code soon (as soon as I get sox working on RHEL/CentOS 5 :P).

This is really spectacular. Thanks. I'm running Fedora 15, so I can use flac or sox. Any reason to prefer one over the other?

sean
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Wow, I just tried it in Hebrew and I'll say just one word: WOW.
Re: [asterisk-users] Speech recognition in asterisk using google voice API
On Wed, 04 Jan 2012 14:48:22 -0500, sean darcy seandar...@gmail.com wrote:
> This is really spectacular. Thanks. I'm running Fedora 15, so I can use flac or sox. Any reason to prefer one over the other?
>
> sean

The voice data has to be converted to FLAC format before it is sent to Google, and this can be done with either sox or the flac encoder. For now the script uses the flac encoder for compatibility with older distros (mainly RHEL 5). Sox is a bit more flexible and also gives you the option to edit the sound data (normalizing, changing levels, etc.).

Lefteris Zafiris
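The choice between the two encoders can be sketched as a small helper that builds the conversion command line. This is an illustration in Python rather than the script's own code: the function name and the `prefer_sox` switch are hypothetical, and the only tool options assumed are flac's `-f` (overwrite) and `-o` (output file) and sox's plain `sox infile outfile` form.

```python
import shutil


def flac_convert_command(wav_path, flac_path, prefer_sox=False, which=shutil.which):
    """Return an argv list that converts wav_path to flac_path, or None.

    which is injectable so the lookup can be replaced in tests.
    """
    sox = which("sox")
    flac = which("flac")
    if prefer_sox and sox:
        # sox picks the output format from the .flac extension,
        # provided it was built with FLAC support.
        return [sox, wav_path, flac_path]
    if flac:
        # -f overwrites an existing file, -o names the output file.
        return [flac, "-f", "-o", flac_path, wav_path]
    if sox:
        return [sox, wav_path, flac_path]
    return None
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; a `None` result means neither tool was found on the PATH.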
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Fresh code is out!

The use of sox can now be optionally enabled by the user if the system has a recent version of the program (it won't work on RHEL/CentOS 5). This is done by editing the script and setting the variable 'use_sox'. When sox is used, the audio gets normalized, low-frequency noise (100 Hz) is removed and any DC offset is corrected. These steps are supposed to improve the recognition results(?). The settings are still a bit experimental, so feel free to play with them and report which settings improved your results.

Get the new version here: https://github.com/downloads/zaf/asterisk-speech-recog/asterisk-speech-recog-0.3.tar.gz

Lefteris Zafiris
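For anyone who wants to experiment with the settings outside the script, the processing described above can be approximated with two standard sox effects. This Python sketch only builds the argument list; it is not necessarily the effect chain the script itself uses, the function names are hypothetical, and the -5 dB target is the value mentioned earlier in the thread.

```python
def sox_cleanup_effects(highpass_hz=100, peak_db=-5):
    """Build the effect arguments that follow the output file on a sox command line."""
    return [
        # High-pass filter: attenuates rumble below the cutoff and, as a
        # side effect, removes any constant (DC) offset.
        "highpass", str(highpass_hz),
        # gain -n normalizes the peak level; the dB value sets the target.
        "gain", "-n", str(peak_db),
    ]


def sox_command(wav_path, flac_path, sox="sox"):
    """Full argv: input file, output file, then the effect chain."""
    return [sox, wav_path, flac_path] + sox_cleanup_effects()
```

Changing `highpass_hz` or `peak_db` and comparing the confidence returned for the same recording is one way to report which settings help.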
[asterisk-users] Speech recognition in asterisk using google voice API
Hello,

I have written an AGI script that uses the Google voice API for voice recognition. The script records from the current channel until the pound key (#) is pressed or the timeout (15 seconds) is reached. The recording is sent to the Google speech recognition service and the returned text string is assigned to a channel variable.

More info and dialplan examples can be found in the README file: https://raw.github.com/zaf/asterisk-speech-recog/master/README

The script is available here: https://github.com/zaf/asterisk-speech-recog

The code is still young and not thoroughly tested, so comments, suggestions and bug reports are more than welcome.

Lefteris Zafiris
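The record-then-assign flow described above maps onto two standard AGI commands, RECORD FILE and SET VARIABLE, which an AGI script writes to standard output. The Python sketch below only formats those command strings; the helper names, the file path and the variable name `utterance` are assumptions for illustration, and the README linked above is the reference for what the script actually sets.

```python
def record_file_command(path, fmt="wav", escape_digits="#", timeout_ms=15000, beep=True):
    """Format the AGI 'RECORD FILE' command (the timeout is in milliseconds)."""
    parts = ["RECORD FILE", path, fmt, '"%s"' % escape_digits, str(timeout_ms)]
    if beep:
        # Optional keyword: play a beep before recording starts.
        parts.append("BEEP")
    return " ".join(parts)


def set_variable_command(name, value):
    """Format the AGI 'SET VARIABLE' command, escaping embedded double quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return 'SET VARIABLE %s "%s"' % (name, escaped)
```

For example, `record_file_command("/tmp/rec")` covers the pound-key and 15-second defaults from the announcement, and `set_variable_command("utterance", text)` hands the recognized text back to the dialplan.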
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Very interesting. I just tried to get it to work but it complains about sox. Probably you used a different version of sox?

PBX-*CLI
/usr/bin/sox: invalid option -- -
/usr/bin/sox: invalid option -- n
/usr/bin/sox: invalid option -- o
/usr/bin/sox: -r must be given a positive integer
 -- speech-recog.agi: /usr/bin/sox failed: 512

I am using: Package sox-12.18.1-1.el5_5.1.i386

Thanks,

On Tue, Jan 3, 2012 at 9:42 PM, Lefteris Zafiris zaf@gmail.com wrote:
> I have written an AGI script that uses the Google voice API for voice recognition.
Re: [asterisk-users] Speech recognition in asterisk using google voice API
And with the recent version 14.3.2 I get:

/usr/local/bin/sox FAIL formats: no handler for file extension `flac'
 -- speech-recog.agi: /usr/local/bin/sox failed: 512
 -- SIP/-002eAGI Script speech-recog.agi completed, returning 0

Regards,

On Wed, Jan 4, 2012 at 12:43 AM, Bruce B bruceb...@gmail.com wrote:
> Very interesting. I just tried to get it to work but it complains about sox. Probably you used a different version of sox?
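The two failures reported in this thread point to two separate checks: whether the installed sox is recent enough to accept the options the script passes (12.18.1 rejected them), and whether the build includes a FLAC handler (this 14.3.2 build did not). A rough Python sketch of both checks follows. It assumes the version text contains a dotted number and that `sox -h` prints a line beginning with "AUDIO FILE FORMATS:", as 14.x builds do; other builds may format this differently, and both function names are hypothetical.

```python
import re


def sox_major_version(version_text):
    """Extract the major version from text such as a package name or 'sox --version' output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_text)
    return int(match.group(1)) if match else None


def lists_flac_format(help_text):
    """Return True if the formats line of the sox help output mentions flac."""
    for line in help_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("AUDIO FILE FORMATS:"):
            return "flac" in stripped.split(":", 1)[1].split()
    return False
```

If the second check fails, sox has to be rebuilt with the FLAC development library installed, or the flac encoder used instead.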
Re: [asterisk-users] Speech recognition in asterisk using google voice API
Hi there,

I developed an AGI script a while ago that uses Google speech recognition, and at the time I used http://legroom.net/files/software/convtoflac.sh to convert files from WAV to FLAC. You can then use the command:

/usr/local/bin/convtoflac.sh -o /var/lib/asterisk/sounds/myfile.wav

It will create a FLAC file in the same directory as the source file. I hope it helps.

Regards,
Lobito

On 1/4/2012 5:51 AM, Bruce B wrote:
> And with recent version 14.3.2 I get:
> /usr/local/bin/sox FAIL formats: no handler for file extension `flac'