branch: externals/greader
commit 49acab1718b78ecd2f6a7ef03ba3e8ffc8b9f705
Author: Michelangelo Rodriguez <[email protected]>
Commit: Michelangelo Rodriguez <[email protected]>
feat(audiobook): Add WAV size validation against expected size
* greader-audiobook.el (greader-audiobook-expected-sample-rate): New
defcustom; sample rate assumed for the expected-size estimate (default
22050 Hz).
(greader-audiobook-size-check-tolerance): New defcustom; lower-bound
tolerance percentage. A WAV smaller than expected*(100-tol)/100 is
flagged; WAVs larger than expected are never flagged.
(greader-audiobook-size-check-min-words): New defcustom; blocks with
fewer words than this threshold are exempt from the check (default 10).
(greader-audiobook-size-mismatch-max-retries): New defcustom; max
retry attempts when on-size-mismatch is 'retry (default 2).
(greader-audiobook-on-size-mismatch): New defcustom; action on size
deviation: ignore, warn (default), error, retry, ask, or a function
called with (filename wav-size expected-size).
(greader-audiobook--size-warnings): New defvar; counter reset at the
start of each greader-audiobook-buffer call and incremented on each
warn/ask-continue event.
(greader-audiobook--expected-wav-size): New function; computes the
expected WAV size from word count, WPM, and sample rate. Accepts an
optional pre-computed WORD-COUNT to avoid double-splitting.
(greader-audiobook-convert-block): Wrap the audio-write call in a
retry loop controlled by done/will-retry flags. After the minimum-size
check, run the expected-size deviation check when tolerance > 0 and
word count >= min-words threshold.
(greader-audiobook-get-block-by-number): New function; return the raw
text of block N (1-based) using save-excursion from point-min.
(greader-audiobook-buffer): Reset greader-audiobook--size-warnings to
0 before the conversion loop; append "; size warnings: N" to the
completion message when the size check is active.
* readme.md: Document the expected-size deviation check subsection and
add the new variables to the customization table.
* greader.texi: Same additions, plus new menu entry
"Expected-size deviation check".
* CLAUDE.md: Update the greader-audiobook-convert-block note to list
the new validation variables.
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
---
CLAUDE.md | 10 ++-
greader-audiobook.el | 188 ++++++++++++++++++++++++++++++++++++++++++++++-----
greader.texi | 73 +++++++++++++++++---
readme.md | 32 ++++++++-
4 files changed, 273 insertions(+), 30 deletions(-)
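
Note on the estimate: the expected-size formula and lower bound described in the commit message can be sketched as standalone Emacs Lisp. This is a minimal illustration, not the code from the patch. It assumes 16-bit mono PCM with a 44-byte header, as the patch does, and takes WPM as an argument instead of calling `greader-get-rate` so that it runs without greader loaded. The names `my/expected-wav-size`, `my/size-lower-bound`, and `my/size-mismatch-policy` are hypothetical.

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical standalone restatement of the estimate: word count and
;; WPM give a duration, and duration times bytes-per-second gives a size.
(defun my/expected-wav-size (words wpm &optional sample-rate)
  "Estimate the WAV size in bytes for WORDS spoken at WPM.
Assumes 16-bit mono PCM at SAMPLE-RATE Hz (default 22050)."
  (let* ((rate (or sample-rate 22050))
         (duration-seconds (/ (* words 60.0) (max 1 wpm)))
         (bytes-per-second (* rate 2)))  ; 2 bytes per 16-bit sample
    ;; 44 bytes is the size of a canonical WAV header.
    (+ 44 (round (* duration-seconds bytes-per-second)))))

(defun my/size-lower-bound (expected tolerance)
  "Return the size in bytes below which a WAV of EXPECTED bytes is flagged.
TOLERANCE is a percentage, as in `greader-audiobook-size-check-tolerance'."
  (round (* expected (/ (- 100.0 tolerance) 100.0))))

;; Hypothetical custom policy: retry when the file is under a quarter
;; of the estimate, otherwise only warn.  It takes the three arguments
;; the commit message lists: FILENAME, WAV-SIZE, EXPECTED-SIZE.
(defun my/size-mismatch-policy (_filename wav-size expected-size)
  "Return `retry' if WAV-SIZE is under 25% of EXPECTED-SIZE, else `warn'."
  (if (< (* 4 wav-size) expected-size) 'retry 'warn))

;; 350 words at 175 WPM is 120 seconds of audio:
;; (my/expected-wav-size 350 175)   ; => 5292044
;; (my/size-lower-bound 5292044 50) ; => 2646022
```

With the patch applied, such a policy would be installed with `(setq greader-audiobook-on-size-mismatch #'my/size-mismatch-policy)`.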
diff --git a/CLAUDE.md b/CLAUDE.md
index 8415784d61..19f0e309f9 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -225,5 +225,11 @@ Tests are in `greader-dict-tests.el` (covers dict functionality).
- `greader-tired-mode` internals: the wakeup intercept uses `greader--tired-intercept-mode`, a transient buffer-local minor mode with a `[t]` catch-all keymap. Any key press calls `greader--tired-wakeup` (swallowing the command) and resumes reading. With `greader-soft-timer` on, the idle timer is armed in `greader--default-action` (after the last sentence finishes) via `greader--tired-pending`, not in `greader-stop-timer-callback`. `greader-stop` clears `greader--tired-pending` to preven [...]
- `greader-audiobook-convert-block` uses `greader-call-backend 'audio-write` — do not
hardcode espeak-ng there. Error handling: non-zero exit code + WAV size check
- (`greader-audiobook-min-wav-size`). Policy controlled by `greader-audiobook-on-error`
- (stop/skip/ask). Error buffer name derived via `greader-audiobook--backend-error-buffer`.
+ (`greader-audiobook-min-wav-size`). Hard-error policy: `greader-audiobook-on-error`
+ (stop/skip/ask). Expected-size deviation check: `greader-audiobook-on-size-mismatch`
+ (ignore/warn/error/retry/ask/function); tolerance via
+ `greader-audiobook-size-check-tolerance`; min-words threshold via
+ `greader-audiobook-size-check-min-words` (blocks below threshold are skipped
+ silently); retry limit via `greader-audiobook-size-mismatch-max-retries`.
+ Error buffer name derived via `greader-audiobook--backend-error-buffer`.
diff --git a/greader-audiobook.el b/greader-audiobook.el
index ec3aefc9c2..a6c85c7e7c 100644
--- a/greader-audiobook.el
+++ b/greader-audiobook.el
@@ -37,8 +37,19 @@
;; conversion will take more time, but as sayd first, the quality is
;; definitely better.
;;
-;;
-;;
+;; WAV size validation
+;; After each block is converted, greader-audiobook compares the actual
+;; WAV file size against an estimate derived from the word count and the
+;; current TTS rate (WPM). The tolerance is controlled by
+;; `greader-audiobook-size-check-tolerance' (default 50%). When the
+;; actual size falls below the lower bound, the action is governed by
+;; `greader-audiobook-on-size-mismatch': ignore, warn (default), error,
+;; retry, ask, or a custom function. With `retry', the block is
+;; re-converted up to `greader-audiobook-size-mismatch-max-retries'
+;; times before signalling an error. Set the tolerance to 0 to
+;; disable the check entirely.
+;;
+;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;;; Change Log:
@@ -238,6 +249,62 @@ A WAV header alone is 44 bytes; a silent one-second clip at 22050 Hz
is about 44 KB, so 1000 bytes is a conservative lower bound."
:type 'natnum
:group 'greader-audiobook)
+
+(defcustom greader-audiobook-expected-sample-rate 22050
+ "Sample rate in Hz assumed when estimating expected WAV file size.
+All built-in backends (espeak, piper, mac) produce 22050 Hz 16-bit mono PCM.
+Adjust only if using a custom backend with a different sample rate."
+ :type 'natnum
+ :group 'greader-audiobook)
+
+(defcustom greader-audiobook-size-check-tolerance 50
+ "Lower-bound tolerance percentage for WAV size against the expected value.
+If the actual WAV size is less than (expected × (100 - tolerance) / 100),
+the block is considered suspiciously short and
+`greader-audiobook-on-size-mismatch' is consulted. A WAV larger than
+expected is normal (TTS backends add pauses and expand abbreviations) and
+is never flagged. Set to 0 to disable the expected-size check entirely."
+ :type 'natnum
+ :group 'greader-audiobook)
+
+(defcustom greader-audiobook-size-check-min-words 10
+ "Minimum word count for a block to undergo the expected-size deviation check.
+Blocks with fewer words than this threshold are skipped silently: short
+blocks produce unreliable size estimates because TTS backends add
+proportionally more silence and overhead relative to the text content.
+Set to 0 to check all blocks regardless of length."
+ :type 'natnum
+ :group 'greader-audiobook)
+
+(defcustom greader-audiobook-size-mismatch-max-retries 2
+ "Maximum number of re-conversion attempts after a size mismatch.
+Used when `greader-audiobook-on-size-mismatch' is \\='retry. After this
+many retries the block is treated as failed and an error is signalled."
+ :type 'natnum
+ :group 'greader-audiobook)
+
+(defcustom greader-audiobook-on-size-mismatch 'warn
+ "Action when a WAV block size deviates significantly from the expected size.
+
+\\='ignore — do nothing; continue normally.
+\\='warn — log a message to *Messages* and continue (default).
+\\='error — signal an error; `greader-audiobook-on-error' policy applies.
+\\='retry — repeat the TTS conversion up to
+ `greader-audiobook-size-mismatch-max-retries' times; if the
+ size is still outside the tolerance after all retries, signal
+ an error.
+\\='ask — prompt the user; answering no signals an error, yes continues.
+FUNCTION — a function called with three arguments: FILENAME, WAV-SIZE,
+ EXPECTED-SIZE. It must return one of the symbols above
+ (\\='ignore, \\='warn, \\='error, \\='retry, \\='ask) or nil (treated
+ as \\='ignore). Use this to implement custom policies."
+ :type '(choice (const :tag "Ignore" ignore)
+ (const :tag "Warn and continue" warn)
+ (const :tag "Signal an error" error)
+ (const :tag "Retry conversion" retry)
+ (const :tag "Ask each time" ask)
+ (function :tag "Custom function"))
+ :group 'greader-audiobook)
;; functions
(defun greader-audiobook--backend-error-buffer ()
@@ -248,6 +315,23 @@ each backend's `audio-write' handler writes to."
(file-name-base
(or (greader-call-backend 'executable) "greader-backend"))))
+(defvar greader-audiobook--size-warnings 0
+ "Number of size-mismatch warnings issued during the current conversion.
+Reset to 0 at the start of each `greader-audiobook-buffer' call.")
+
+(defun greader-audiobook--expected-wav-size (text &optional word-count)
+ "Return the expected WAV file size in bytes for TEXT.
+Uses the current backend rate (WPM) and the sample rate from
+`greader-audiobook-expected-sample-rate'. Assumes 16-bit mono PCM.
+The estimate is approximate: TTS engines expand abbreviations and numbers
+and add silence for punctuation.
+Optional WORD-COUNT avoids recomputing the word count when already available."
+ (let* ((words (or word-count (length (split-string text nil t))))
+ (wpm (max 1 (greader-get-rate)))
+ (duration-seconds (/ (* words 60.0) wpm))
+ (bytes-per-second (* greader-audiobook-expected-sample-rate 2)))
+ (+ 44 (round (* duration-seconds bytes-per-second)))))
+
(defun greader-audiobook--percentage ()
"Return the percentage read of the buffer."
(let ((unit (/ (point-max) 100)) result)
@@ -329,19 +413,65 @@ Return the generated file name, or nil if at end of the buffer."
(setq text (greader-dict-check-and-replace text)))
(when greader-audiobook-pause-at-end-of-track
(setq text (concat text greader-audiobook-pause-string)))
- (let ((result (greader-call-backend 'audio-write (list text filename))))
- (when (eq result 'not-implemented)
- (error "Current TTS backend does not support audiobook generation; \
+ (let ((attempt 0)
+ (max-retries greader-audiobook-size-mismatch-max-retries)
+ (done nil))
+ (while (not done)
+ (let ((result (greader-call-backend 'audio-write (list text filename))))
+ (when (eq result 'not-implemented)
+ (error "Current TTS backend does not support audiobook generation; \
switch to espeak, piper, or mac"))
- (unless (= result 0)
- (error "TTS backend error (exit code %d); see buffer %s"
- result (greader-audiobook--backend-error-buffer)))
- (let ((wav-size (file-attribute-size (file-attributes filename))))
- (when (or (null wav-size)
- (< wav-size greader-audiobook-min-wav-size))
- (error "Block %s appears empty or corrupt (WAV: %s bytes)"
- (file-name-nondirectory filename)
- (or wav-size 0)))))
+ (unless (= result 0)
+ (error "TTS backend error (exit code %d); see buffer %s"
+ result (greader-audiobook--backend-error-buffer)))
+ (let* ((wav-size (file-attribute-size (file-attributes filename)))
+ (will-retry nil))
+ (when (or (null wav-size)
+ (< wav-size greader-audiobook-min-wav-size))
+ (error "Block %s appears empty or corrupt (WAV: %s bytes)"
+ (file-name-nondirectory filename)
+ (or wav-size 0)))
+ (let ((word-count (length (split-string text nil t))))
+ (when (and (> greader-audiobook-size-check-tolerance 0)
+ (or (= greader-audiobook-size-check-min-words 0)
+ (>= word-count
+ greader-audiobook-size-check-min-words)))
+ (let* ((expected (greader-audiobook--expected-wav-size
+ text word-count))
+ (tol greader-audiobook-size-check-tolerance)
+ (lower (round (* expected (/ (- 100.0 tol) 100.0)))))
+ (when (< wav-size lower)
+ (let* ((policy greader-audiobook-on-size-mismatch)
+ (action (if (and (functionp policy) (not (memq policy '(ignore warn error retry ask))))
+ (or (funcall policy filename wav-size expected)
+ 'ignore)
+ policy))
+ (msg (format "Block %s: WAV is %d bytes, expected ~%d (-%d%%)"
+ (file-name-nondirectory filename)
+ wav-size expected tol)))
+ (pcase action
+ ('ignore)
+ ('warn
+ (cl-incf greader-audiobook--size-warnings)
+ (message "greader-audiobook: %s" msg))
+ ('error (error "%s" msg))
+ ('ask
+ (if (yes-or-no-p (format "%s — continue? " msg))
+ (cl-incf greader-audiobook--size-warnings)
+ (error "%s" msg)))
+ ('retry
+ (if (< attempt max-retries)
+ (progn
+ (setq attempt (1+ attempt))
+ (message "greader-audiobook: %s — retrying (%d/%d)"
+ msg attempt max-retries)
+ (setq will-retry t))
+ (error "Block %s: size mismatch after %d %s"
+ (file-name-nondirectory filename)
+ max-retries
+ (if (= max-retries 1) "retry" "retries"))))))))))
+ (unless will-retry
+ (setq done t))))))
(goto-char (cdr block))
filename)
nil)))
@@ -366,6 +496,27 @@ return the value returned by the associated function."
blocks)))
+(defun greader-audiobook-get-block-by-number (n)
+ "Return the text of block number N in the current buffer.
+Blocks are numbered starting from 1. Return nil if N is out of range
+or if the buffer has no blocks.
+The block boundaries are determined by `greader-audiobook--get-block',
+respecting `greader-audiobook-block-size', `greader-audiobook-modes',
+and all other block-size criteria. The returned string is the raw
+buffer text before any transformations (dict substitution, dehyphenation,
+etc.)."
+ (when (> n 0)
+ (save-excursion
+ (goto-char (point-min))
+ (let ((block (greader-audiobook--get-block))
+ (current 1))
+ (while (and block (< current n))
+ (goto-char (cdr block))
+ (setq block (greader-audiobook--get-block))
+ (setq current (1+ current)))
+ (when (and block (= current n))
+ (buffer-substring (car block) (cdr block)))))))
+
(defun greader-audiobook-transcode-file (filename)
"Transcode FILENAME using ffmpeg.
You have certain control of how this happens by configuring
@@ -563,6 +714,7 @@ buffer without the extension, if any."
(unless greader-audiobook-buffer-quietly
(message "Starting conversion of %s ."
book-directory))
+ (setq greader-audiobook--size-warnings 0)
(let ((failed-blocks nil))
(while (greader-audiobook--get-block)
(setq output-file-name
@@ -625,9 +777,13 @@ buffer without the extension, if any."
(setq book-directory (concat (string-remove-suffix "/"
book-directory)
".zip"))))
- (message "conversion terminated and saved in %s"
+ (message "conversion terminated and saved in %s%s"
(concat greader-audiobook-base-directory
- book-directory)))))))
+ book-directory)
+ (if (> greader-audiobook-size-check-tolerance 0)
+ (format "; size warnings: %d"
+ greader-audiobook--size-warnings)
+ "")))))))
(defvar greader-audiobook-transcode-history nil)
diff --git a/greader.texi b/greader.texi
index 15e7c3f44b..299e0caa45 100644
--- a/greader.texi
+++ b/greader.texi
@@ -811,6 +811,7 @@ MP3/M4A/FLAC and bundling them into a ZIP or M4B audiobook container.
* Backend support::
* Piper and audiobook generation::
* Error handling::
+* Expected-size deviation check::
* Audiobook customization::
@end menu
@@ -911,45 +912,95 @@ Value
When blocks are skipped a summary message lists their numbers at the end
of conversion.
+@node Expected-size deviation check
+@subsection Expected-size deviation check
+In addition to the minimum-size floor, Greader estimates the expected
+WAV size from the word count of the block and the current TTS rate
+(WPM). If the actual file size is less than
+@code{expected × (100 - tolerance) / 100}, the block is considered
+suspiciously short and the action is controlled by
+@code{greader-audiobook-on-size-mismatch}. A WAV larger than expected
+is never flagged --- TTS backends normally produce more audio than the
+word count alone predicts (pauses, abbreviation expansion, etc.).
+
+@multitable {@code{function}} {Re-convert the block up to @code{greader-audiobook-size-mismatch-max-retries} times; signal an error if still mismatched.}
+@headitem Value @tab Behaviour
+@item @code{ignore} @tab Do nothing; continue normally.
+@item @code{warn} @tab Log a message to @code{*Messages*} and continue (default).
+@item @code{error} @tab Signal an error; @code{greader-audiobook-on-error} policy applies.
+@item @code{retry} @tab Re-convert the block up to @code{greader-audiobook-size-mismatch-max-retries} times; signal an error if still mismatched. A @samp{retrying (X/N)} message is shown during each attempt.
+@item @code{ask} @tab Prompt the user; answering no signals an error, yes continues.
+@item @code{function} @tab A function called with @code{(filename wav-size expected-size)}; must return one of the symbols above, or @code{nil} (treated as @code{ignore}).
+@end multitable
+
+Blocks shorter than @code{greader-audiobook-size-check-min-words} words
+(default 10) are exempt from the check: short blocks such as chapter
+headings, footnotes, and section endings produce unreliable estimates
+because TTS backends add proportionally more silence and overhead
+relative to the text content. Set to @code{0} to check all blocks.
+
+Set @code{greader-audiobook-size-check-tolerance} to @code{0} to
+disable the check entirely.
+
@node Audiobook customization
@subsection Audiobook customization
Use @code{M-x customize-group RET greader-audiobook RET} for the full
list of options. Key variables:
-@multitable @columnfractions 0.33 0.33 0.33
-@headitem
+@multitable @columnfractions 0.33 0.33 0.33
+@headitem
Variable
@tab Default
@tab Description
-@item
+@item
@code{greader-audiobook-base-directory}
@tab @code{~/.emacs.d/audiobooks/}
@tab Root directory for generated audiobooks.
-@item
+@item
@code{greader-audiobook-block-size}
@tab @code{"15"}
@tab Block size: a string means minutes, a number means characters.
-@item
+@item
@code{greader-audiobook-transcode-wave-files}
@tab @code{nil}
@tab Transcode WAV blocks via ffmpeg.
-@item
+@item
@code{greader-audiobook-transcode-format}
@tab @code{"mp3"}
@tab Target format for transcoding.
-@item
+@item
@code{greader-audiobook-on-error}
@tab @code{stop}
- @tab Error policy: @code{stop}, @code{skip}, or @code{ask}.
-@item
+ @tab Hard-error policy: @code{stop}, @code{skip}, or @code{ask}.
+@item
@code{greader-audiobook-min-wav-size}
@tab @code{1000}
@tab Minimum WAV size in bytes; smaller files are treated as corrupt.
-@item
+@item
+@code{greader-audiobook-expected-sample-rate}
+ @tab @code{22050}
+ @tab Sample rate (Hz) used to estimate expected WAV size.
+@item
+@code{greader-audiobook-size-check-tolerance}
+ @tab @code{50}
+ @tab Lower-bound tolerance %: flag if WAV < expected×(100-tol)/100; @code{0} disables.
+@item
+@code{greader-audiobook-size-check-min-words}
+ @tab @code{10}
+ @tab Min word count to run the size check; @code{0} checks all blocks.
+@item
+@code{greader-audiobook-on-size-mismatch}
+ @tab @code{warn}
+ @tab Action on size deviation: @code{ignore}, @code{warn}, @code{error}, @code{retry}, @code{ask}, or a function.
+@item
+@code{greader-audiobook-size-mismatch-max-retries}
+ @tab @code{2}
+ @tab Max re-conversion attempts when @code{greader-audiobook-on-size-mismatch} is @code{retry}.
+@item
@code{greader-audiobook-create-m4b}
@tab @code{nil}
@tab Bundle all blocks into a single M4B audiobook file.
-@item
+@item
@code{greader-audiobook-compress}
@tab @code{t}
@tab Compress the audiobook directory into a ZIP file.
diff --git a/readme.md b/readme.md
index a966dd4eb2..98d23d3e72 100644
--- a/readme.md
+++ b/readme.md
@@ -394,6 +394,31 @@ The variable `greader-audiobook-on-error` controls what happens when a block fai
When blocks are skipped a summary message lists their numbers at the end of conversion.
+#### Expected-size deviation check
+
+In addition to the minimum-size floor, Greader estimates the expected WAV size from the
+word count of the block and the current TTS rate (WPM). If the actual file size is less
+than `(expected × (100 - tolerance) / 100)`, the block is considered suspiciously short
+and the action is controlled by `greader-audiobook-on-size-mismatch`. A WAV larger than
+expected is never flagged — TTS backends normally produce more audio than the word count
+alone would predict (pauses, abbreviation expansion, etc.).
+
+| Value | Behaviour |
+|---|---|
+| `ignore` | Do nothing; continue normally. |
+| `warn` | Log a message to `*Messages*` and continue (default). |
+| `error` | Signal an error; `greader-audiobook-on-error` policy applies. |
+| `retry` | Re-convert the block up to `greader-audiobook-size-mismatch-max-retries` times; signal an error if still mismatched. During each retry a `retrying (X/N)` message is shown. |
+| `ask` | Prompt the user; answering no signals an error, yes continues. |
+| _function_ | A function called with `(filename wav-size expected-size)`; must return one of the symbols above, or `nil` (treated as `ignore`). |
+
+Blocks shorter than `greader-audiobook-size-check-min-words` words (default 10) are
+exempt from the check: short blocks such as chapter headings, footnotes, and section
+endings produce unreliable estimates because TTS backends add proportionally more
+silence and overhead relative to the text content. Set to `0` to check all blocks.
+
+Set `greader-audiobook-size-check-tolerance` to `0` to disable the check entirely.
+
### Audiobook customization
Use `M-x customize-group RET greader-audiobook RET` for the full list of options. Key
@@ -405,8 +430,13 @@ variables:
| `greader-audiobook-block-size` | `"15"` | Block size: a string means minutes, a number means characters. |
| `greader-audiobook-transcode-wave-files` | `nil` | Transcode WAV blocks via ffmpeg. |
| `greader-audiobook-transcode-format` | `"mp3"` | Target format for transcoding. |
-| `greader-audiobook-on-error` | `stop` | Error policy: `stop`, `skip`, or `ask`. |
+| `greader-audiobook-on-error` | `stop` | Hard-error policy: `stop`, `skip`, or `ask`. |
| `greader-audiobook-min-wav-size` | `1000` | Minimum WAV size in bytes; smaller files are treated as corrupt. |
+| `greader-audiobook-expected-sample-rate` | `22050` | Sample rate (Hz) used to estimate expected WAV size. |
+| `greader-audiobook-size-check-tolerance` | `50` | Lower-bound tolerance %: flag if WAV < expected×(100-tol)/100; `0` disables. |
+| `greader-audiobook-size-check-min-words` | `10` | Min word count to run the size check; `0` checks all blocks. |
+| `greader-audiobook-on-size-mismatch` | `warn` | Action on size deviation: `ignore`, `warn`, `error`, `retry`, `ask`, or a function. |
+| `greader-audiobook-size-mismatch-max-retries` | `2` | Max re-conversion attempts when `greader-audiobook-on-size-mismatch` is `retry`. |
| `greader-audiobook-create-m4b` | `nil` | Bundle all blocks into a single M4B audiobook file. |
| `greader-audiobook-compress` | `t` | Compress the audiobook directory into a ZIP file. |