Aaron Chantrill, on Sun, 16 Nov 2025 18:25:44 -0500, wrote:
> On 11/12/25 19:08, Jason J.G. White wrote:
> >
> > On 12/11/25 10:17, Aaron Chantrill wrote:
> > > I'm working on an article for Linux Magazine. For this article, I'm
> > > interested in talking about setting up speech dispatcher with
> > > different text to speech engines, like Piper TTS or Coqui TTS. This
> > > is based on a question from this mailing list a couple of months
> > > ago. I'm hoping to start a series on accessibility issues while
> > > deepening my own understanding.
> >
> > For screen reader users, minimizing audio latency is important.
> > Unfortunately, the neural network-based TTS systems, including Coqui
> > and Piper, have a reputation for producing high latency. This is an
> > important reason why screen reader users tend not to use them.
> >
> > I don't know whether this is improved if you have appropriate GPU
> > processing for the neural network models. Piper was unusably slow on my
> > machine, but I didn't investigate deeply enough to find out whether it
> > was using the GPU.
> >
> Piper when run as a command line program is unusably slow because it has to
> load the full onnx model every time you call it. My goal is to use piper's
> built-in http server.

Note that there is a module with native support:

https://github.com/brailcom/speechd/blob/master/src/modules/cxxpiper.cpp

This would provide much more flexibility than going through an HTTP server.

Samuel

