On Tue, Jan 11, 2022 at 12:11 PM Ken Fallon <k...@fallon.ie> wrote:
> In the past it has been argued that the more natural voices are
> difficult to understand when sped up. So I took the two most natural
> voices from the list and posted a side by side comparison to espeak at
> 150%, 200%, 250%,  300%, 350%, 400%, 450%, and 500%. In my opinion the
> coqui-tts_en_en_ljspeech is more understandable than espeak at every speed.
>
> Can everyone have a listen to this and tell me your preference
> https://hackerpublicradio.org/tts-espeak-ljspeech-vctk-normal-150-200-250-300-350-400-450-500-percent.ogg

I rarely listen faster than 2x (I prefer 1x but will speed up if I
have a lot of content to get through), so I can imagine someone who
deals with audio navigation day after day would have a much more
nuanced (and, I think, valuable) opinion.
That said, here's my feedback:
- I found voice #2 the most pleasant of the 3, particularly at 1x
- Both voices #2 and #3 were more pleasant than #1 at all speeds
- All the voices were intelligible at 1x
- At higher speeds, I had the easiest time understanding voice #3, but
this could just be due to my own American accent
- I'd like to hear from folks like Mike who routinely listen at high speeds

_______________________________________________
Hpr mailing list
Hpr@hackerpublicradio.org
http://hackerpublicradio.org/mailman/listinfo/hpr_hackerpublicradio.org