Well, I also want someone to take pocketsphinx and flite and build
an open-source speech server, and maybe gain some momentum to improve
it. BTW, pocketsphinx supports JSGF. I need to update mod_pocketsphinx
to do that, but I want to work with DHD to figure out how to load the
dictionary once and then just load grammar files going forward.
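A rough sketch of the "load the dictionary once, swap grammars" idea,
in plain Python. Everything here is an assumption made for
illustration: the class and function names are hypothetical and are
not the pocketsphinx or mod_pocketsphinx API, and a grammar is reduced
to a bare word list rather than a parsed JSGF file.

```python
# Sketch only: these names are hypothetical and are NOT part of the
# pocketsphinx or mod_pocketsphinx API.
from typing import Dict, List


def parse_dictionary(text: str) -> Dict[str, List[str]]:
    """Parse CMU-style dictionary lines of the form 'WORD PH1 PH2 ...'."""
    entries: Dict[str, List[str]] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            entries[parts[0].lower()] = parts[1:]
    return entries


class SharedDictionaryRecognizer:
    """Holds one parsed dictionary and swaps only the (cheap) grammar."""

    def __init__(self, dictionary_text: str) -> None:
        # The expensive step happens exactly once, at construction time.
        self.dictionary = parse_dictionary(dictionary_text)
        self.dictionary_loads = 1
        self.grammar_words: List[str] = []

    def load_grammar(self, words: List[str]) -> List[str]:
        """Activate a new word list; return words missing from the dictionary."""
        self.grammar_words = [w.lower() for w in words]
        return [w for w in self.grammar_words if w not in self.dictionary]


# Toy dictionary text, made up for this example.
DICT_TEXT = "PIZZA P IY T S AH\nLARGE L AA R JH\nSMALL S M AO L\n"

rec = SharedDictionaryRecognizer(DICT_TEXT)
missing_first = rec.load_grammar(["large", "pizza"])     # all words known
missing_second = rec.load_grammar(["small", "calzone"])  # one word unknown
```

The second load_grammar call reuses the dictionary parsed in the
constructor, so only the per-call grammar changes; words the
dictionary lacks are reported back instead of forcing a reload.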
/b
On Jan 13, 2009, at 12:31 PM, mszla...@aol.com wrote:
Hi Paul,
If you mean fixing up pocketsphinx (ps) for telephony instead of, or
in addition to, working on unimrcp, then this is the site of the
person who created ps, and he may have some advice.
http://www.cs.cmu.edu/~dhuggins/
Also, this was a post from the sphinx forums for adapting
pocketsphinx for telephony.
http://sourceforge.net/forum/message.php?msg_id=5621913
I don't know how accurate it is, but here is the post to give you an
idea of the issues involved:
-----------------
Well, there are issues in both the decoder and the interface with the
telephony application.
First, about the decoder: pocketsphinx is currently the best supported
and most feature-rich decoder of the family, but in general it is
still oriented toward embedded devices. For telephony applications you
will probably need to extend it a lot. The features that are currently
missing are probably:
* Out-of-box support for multiple recognizers (probably more a
freeswitch issue and a model training issue; for example, we have no
free male/female model).
* Speaker clustering.
* Automatic VTLN estimation from pitch (This looks simple).
* Good endpointer.
* Discriminative training support in SphinxTrain (Huge task).
* Good and clean support for a garbage model to be able to filter out
out-of-grammar words.
* Embedded RASTA extraction and RASTA model training.
* Advanced feature extraction.
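To make the "good endpointer" item concrete, here is a minimal
energy-threshold endpointer in plain Python. It is only a sketch under
stated assumptions, not how pocketsphinx does it: the function names
are made up, and the 160-sample frame (20 ms at 8 kHz), the energy
threshold, and the hangover of 3 silent frames are arbitrary values
chosen for the example. A production endpointer would need adaptive
thresholds and noise tracking.

```python
# Sketch only: a fixed-threshold endpointer with hypothetical names and
# arbitrary parameter values; not taken from pocketsphinx.
from typing import List, Optional, Tuple


def frame_energy(frame: List[int]) -> float:
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in frame) / len(frame)


def find_endpoints(
    samples: List[int],
    frame_len: int = 160,      # assumed: 20 ms at 8 kHz telephony audio
    threshold: float = 1.0e6,  # assumed energy threshold
    hangover: int = 3,         # silent frames that end the utterance
) -> Optional[Tuple[int, int]]:
    """Return (start_frame, end_frame) of the first utterance, or None.

    end_frame is exclusive. The utterance ends once `hangover`
    consecutive frames fall below the threshold.
    """
    start: Optional[int] = None
    last_speech = 0
    silent_run = 0
    for i in range(len(samples) // frame_len):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if frame_energy(frame) >= threshold:
            if start is None:
                start = i
            last_speech = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= hangover:
                break
    if start is None:
        return None
    return (start, last_speech + 1)


# Synthetic audio: 2 silent frames, 3 loud frames, 4 silent frames.
silence_head = [0] * (160 * 2)
speech = [2000, -2000] * (80 * 3)
silence_tail = [0] * (160 * 4)
endpoints = find_endpoints(silence_head + speech + silence_tail)
no_speech = find_endpoints([0] * (160 * 5))
```

The hangover keeps a short pause inside a word from cutting the
utterance off early, which matters on telephone audio where energy
dips are common.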
Another issue is dialog tracking and understanding. CMU folks are
doing work on dialog systems; for example, RavenClaw is available:
http://www.ravenclaw-olympus.org/systems_overview.html
It would be worth looking at it and trying to integrate it into
freepbx. The decoder will need to support a combined language model,
and you'll also need a component for postprocessing. The
postprocessing includes disfluency removal, text normalization, and
text boundary detection. Integration with nltk would probably be
useful for sense extraction.
If you need more details on any of the above, feel free to ask.
-------------------
-----Original Message-----
From: Paul Herring <pa...@instruments.com>
To: freeswitch-users@lists.freeswitch.org
Sent: Tue, 13 Jan 2009 8:18 am
Subject: [Freeswitch-users] FreeSWITCH, MRCP and Perl
What would it take to put a budget together for this project?
Date: Tue, 13 Jan 2009 01:55:36 -0500
From: mszla...@aol.com
Subject: Re: [Freeswitch-users] FreeSWITCH, MRCP and Perl
To: freeswitch-users@lists.freeswitch.org
Message-ID: <8cb436312a08329-80c-1...@mblk-m24.sysops.aol.com>
Content-Type: text/plain; charset="us-ascii"
"My god" I would LOVE it if this is really the case and would praise
pocketsphinx (PS) and FS to no end. But my experience has been
different.
First, I tried the pizza demo with a soft phone and later with outside
phone calls to my Linksys 3102 PSTN-to-VoIP gateway.
Second, I tried these two setups again but with Voxeo's Prophecy ASR.
Both were used as is: there was no training of PocketSphinx, just
running the pizza demo, and with Prophecy there is no training because
it can't be trained.
Prophecy is quite good, but the FS/PocketSphinx pizza demo isn't, and
I couldn't use it at a pizza joint. Also, I get a much better
experience when calling LumenVox and trying their pizza demo.
Now, maybe Prophecy is the type of ASR that doesn't require hours of
training to make it speaker independent. I know that the Sphinx
family is the type of ASR that does need this.
So, if there are settings for adapting PocketSphinx for speaker
independence, are they turned on?
How many hours of calls to a business should an owner expect before
PocketSphinx gets good enough not to scare customers away?
If many hours are needed, then I could see using another ASR in the
meantime, recording the calls and feeding the audio to PocketSphinx
for training, then switching to PocketSphinx once it's "tuned up." At
least this way a business doesn't have to deal with a "virgin"
PocketSphinx.
Mark
_______________________________________________
Freeswitch-users mailing list
Freeswitch-users@lists.freeswitch.org
http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
http://www.freeswitch.org