On 2020-08-30 18:10, Christian Gollwitzer wrote:
On 30.08.20 at 17:25, MRAB wrote:
On 2020-08-30 07:23, Muskan Sanghai wrote:
On Sunday, August 30, 2020 at 11:46:15 AM UTC+5:30, Chris Angelico wrote:
I recommend looking into CMU Sphinx then. I've used that from Python.
The results are highly entertaining.
ChrisA
Okay I will try it, thank you.

Speech recognition works best when there's a single voice, speaking clearly, with little or no background noise. Movies tend not to be like that.

Which is why the results are "highly entertaining"...


Well, with enough effort it is possible to build a system that is more
useful than "entertaining". Google did that: English YouTube videos can
be annotated with subtitles from speech recognition. For example, try
this video:
https://www.youtube.com/watch?v=lYVLpC_8SQE

Go to the settings thing (the little gear icon in the nav bar) and
switch on subtitles, English autogenerated. You'll see a word-by-word
transcription of the text, and most of it is accurate.
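
One way to put a number on "most of it is accurate" is the word error rate (WER): the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The following is only a minimal sketch in plain Python; the function name and the sample sentences are made up for illustration and do not come from YouTube or CMU Sphinx.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, purely for illustration.
    spoken = "speech recognition works best with a single voice"
    recognized = "speech recognition works best with a single choice"
    print(word_error_rate(spoken, recognized))
```

A lower value is better; an exact transcript gives 0.0, and the result can exceed 1.0 if the recognizer inserts many extra words.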

There's not much background noise there; it takes place in a quiet room.

There are strong arguments that anything one can build with open source
tools will be inferior. 1) Google probably has a bunch of highly
qualified AI experts working on this thing 2) They have an enormous
corpus of training data. Many videos already have user-provided
subtitles. They can feed all of this into the training.

I'm waiting to be disproven on this point ;)

--
https://mail.python.org/mailman/listinfo/python-list
