Hi Ben, I'm really excited to see this and wanted to bring up a couple of use cases we have implemented, to see if you think they align with the goals of the Asterisk implementation. We primarily do real-time speech-to-text for calls and conferences; however, a big limitation with Asterisk and speech APIs is speaker diarization. We have gotten around this for phone calls by assigning each party of the call to the left and right channels of a stereo recording, which some speech APIs support, but that approach is inadequate for multi-party conferences. Being able to use this new speech-to-text feature on a particular channel, and then including that channel ID in the protocol, would be helpful. Or, perhaps even better, support a generic user_data JSON property that lets us pass custom application-specific data to the external application. A rough sketch of what that might look like is below.
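
To make the user_data idea concrete, here is a purely hypothetical sketch of what a setup message carrying it could look like. The field names (request, codec, channel_id, user_data) are my own guesses for illustration and are not taken from the wiki page:

    import json

    # Hypothetical sketch only: "channel_id" and "user_data" are names I'm
    # making up to illustrate the idea, not fields from the proposed protocol.
    setup_message = {
        "request": "setup",                    # placeholder for whatever verb the protocol uses
        "codec": "ulaw",
        "channel_id": "PJSIP/alice-00000001",  # Asterisk channel this audio stream belongs to
        "user_data": {                         # opaque, application-specific payload
            "conference_id": "sales-standup",
            "participant": "alice@example.com",
        },
    }

    print(json.dumps(setup_message, indent=2))

If the external application simply echoed channel_id and user_data back with each transcription result, our application could attribute text to the right participant without needing the stereo-channel workaround.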
I think another interesting use case would be real-time translation of a phone call. For example, if the external application received audio from Asterisk and sent back audio translated into another language, that would be very powerful. The translated audio could be sent to a separate channel so that speakers of different languages could hear what was being said without a human translator.

Lastly, I'd love some clarification on the intended use cases of this versus the AudioSocket application and EAGI; perhaps those are the more appropriate tools for these use cases.

Benjamin Fitzgerald

On Mon, Mar 22, 2021 at 12:14 PM Ben Ford <bf...@digium.com> wrote:

> Hello everyone,
>
> The Asterisk team has been working on planning better text-to-speech and
> speech-to-text functionality for Asterisk. We'll be using a speech service
> in conjunction with an external application that connects it to Asterisk.
> More information on the protocol used for this and the overall project can
> be found here:
>
> https://wiki.asterisk.org/wiki/pages/viewpage.action?pageId=45482453
>
> After reading the wiki page, if there is anything you feel could be
> improved, we'd love to hear about it. The goal for the protocol is to make
> it generic enough to where we would be able to use it for other things
> besides text-to-speech and speech-to-text in the future. This means it
> should remain as simple as possible. We tried to come up with basic
> scenarios and give examples of what it might look like, but this may not
> cover all bases. If you see a case that the protocol would not be able to
> handle, we want to hear about that, too!
>
> --
> Benjamin Ford
> Software Engineer
> 256-428-6147
> Check us out at www.sangoma.com and www.asterisk.org