Package: release.debian.org
Severity: normal
Tags: bullseye
User: release.debian....@packages.debian.org
Usertags: pu

Hello,

I'd like to have version 0.10.2-2+deb11u2 of speech-dispatcher uploaded
to bullseye.

[ Reason ]
As reported on

https://github.com/espeak-ng/espeak-ng/issues/1554

there is a longstanding buffer overflow issue in espeak-ng. It
appears to users as artifacts in the speech that sometimes make it
less intelligible. But since these are due to a buffer overflow, the
consequences could actually be much worse: when a screen reader uses
espeak-ng for speech synthesis, the data written to the buffer comes
from e.g. whatever webpage the user is browsing, hence the potential
for security issues.

This is not a regression from oldstable, since the bug has basically
always been there.

[ Impact ]
If it is not included, users will still hear artifacts, and will also
potentially be exposed to security issues due to the buffer overflows.

[ Tests ]
We made manual tests that confirmed that the change fixes the issue.
We also confirmed the buffer overflow in espeak-ng's valgrind/asan CI.

[ Risks ]
The change is trivial. It makes the speech synthesis processing a bit
more expensive, since audio is handled in smaller chunks, but the extra
cost will be very small.
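
As a rough illustration of the extra cost, here is a minimal sketch. It
assumes that each espeak callback delivers exactly one chunk and that
the overhead scales with the number of callbacks; both are simplifying
assumptions, not measurements. The function name is hypothetical.

```python
def callbacks_per_second(chunk_ms: int) -> float:
    """Callbacks needed per second of audio, assuming one chunk per callback."""
    return 1000 / chunk_ms


old_rate = callbacks_per_second(3000)  # previous EspeakAudioChunkSize
new_rate = callbacks_per_second(300)   # proposed EspeakAudioChunkSize

print(f"3000 ms chunks: {old_rate:.2f} callbacks per second of audio")
print(f" 300 ms chunks: {new_rate:.2f} callbacks per second of audio")
print(f"increase: {new_rate / old_rate:.0f}x")
```

Under these assumptions the callback count rises tenfold, but it still
stays at only a few callbacks per second of synthesized audio.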

[ Checklist ]
  [X] *all* changes are documented in the d/changelog
  [X] I reviewed all changes and I approve them
  [X] attach debdiff against the package in (old)stable
  [X] the issue is verified as fixed in unstable

[ Changes ]
The patch simply lowers the audio buffer size from 3s to 300ms.

3s was originally chosen so that large chunks of audio are processed
at a time, but this is what is making espeak-ng overflow its synthesis
buffers.

The exact overflows should ideally be fixed in espeak-ng of course, but
apparently this will be very involved, and it is much simpler to just
make speech-dispatcher not request such large buffers. espeak-ng itself
defaults to 60ms, and on Windows NVDA uses 300ms and does not suffer
from buffer overflows. In practice, the overflows that we observed in

https://github.com/espeak-ng/espeak-ng/issues/1554

happen with buffers of around 900ms. 300ms seems a fairly safe value,
while still being large enough to process reasonably sized chunks of
audio at a time.
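
To give an idea of the sizes involved, here is a minimal sketch that
converts the durations mentioned above into sample and byte counts. The
22050 Hz sample rate and 16-bit mono format are assumptions about a
typical espeak-ng setup, not values taken from the patch, and the helper
name is hypothetical.

```python
SAMPLE_RATE_HZ = 22050  # assumption: typical espeak-ng output rate
BYTES_PER_SAMPLE = 2    # assumption: 16-bit mono PCM


def chunk_samples(chunk_ms: int, rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of samples in one audio chunk of chunk_ms milliseconds."""
    return rate_hz * chunk_ms // 1000


durations = [
    ("espeak-ng default", 60),
    ("new value, as used by NVDA", 300),
    ("overflows observed around", 900),
    ("old value", 3000),
]

for label, ms in durations:
    samples = chunk_samples(ms)
    size = samples * BYTES_PER_SAMPLE
    print(f"{label:28s} {ms:5d} ms  {samples:6d} samples  {size:7d} bytes")
```

With these assumptions, the new 300ms value requests a third of the
900ms size at which overflows were observed, and a tenth of the old
3000ms size.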

Samuel
diff --git a/debian/changelog b/debian/changelog
index 6fea892..b6120a2 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+speech-dispatcher (0.10.2-2+deb11u2) bullseye; urgency=medium
+
+  * patches/buffer_size: Reduce espeak buffer size to avoid synth artifacts.
+
+ -- Samuel Thibault <sthiba...@debian.org>  Wed, 30 Nov 2022 21:10:17 +0100
+
 speech-dispatcher (0.10.2-2+deb11u1) bullseye; urgency=medium
 
   * patches/generic-set-voice-name: Fix setting voice name for the generic
diff --git a/debian/patches/buffer_size b/debian/patches/buffer_size
new file mode 100644
index 0000000..2b13443
--- /dev/null
+++ b/debian/patches/buffer_size
@@ -0,0 +1,35 @@
+commit be4c3585ead45716b8f49300b50c30fdb6eee266
+Author: Alexander Epaneshnikov <aarnaa...@gmail.com>
+Date:   Thu Nov 24 01:41:40 2022 +0300
+
+    espeak: set buffer size to 300
+    
+    this fixes #793
+    ref for buffer size: https://github.com/nvaccess/nvda/blob/a6fb2392083b5fa1bae926102135ad452746ad3c/source/synthDrivers/_espeak.py#L338
+
+diff --git a/config/modules/espeak-ng.conf b/config/modules/espeak-ng.conf
+index d02704e9..9677bc99 100644
+--- a/config/modules/espeak-ng.conf
++++ b/config/modules/espeak-ng.conf
+@@ -41,7 +41,7 @@ EspeakMaxRate 449
+ # -- Internal parameters --
+ 
+ # Number of ms of audio returned by the espeak callback function.
+-EspeakAudioChunkSize 3000
++EspeakAudioChunkSize 300
+ 
+ # Maximum number of samples to buffer in playback queue.
+ EspeakAudioQueueMaxSize 441000
+diff --git a/config/modules/espeak.conf b/config/modules/espeak.conf
+index 3abc422b..cbd453b7 100644
+--- a/config/modules/espeak.conf
++++ b/config/modules/espeak.conf
+@@ -41,7 +41,7 @@ EspeakMaxRate 449
+ # -- Internal parameters --
+ 
+ # Number of ms of audio returned by the espeak callback function.
+-EspeakAudioChunkSize 3000
++EspeakAudioChunkSize 300
+ 
+ # Maximum number of samples to buffer in playback queue.
+ EspeakAudioQueueMaxSize 441000
diff --git a/debian/patches/series b/debian/patches/series
index 1f09fb7..5619c8a 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -2,3 +2,4 @@ doc-figures
 systemd-debian
 mbrola-paths
 generic-set-voice-name
+buffer_size
