A couple of years ago, I conceived the idea of burning my morning's
new email, as synthesized speech, onto a CD; this way, I could listen
to it as I drove to work or at other times when my eyes and/or hands
were busy.
So I finally did it last night. I ran an mbox file containing the
mail through this program to split it into one file per (shortened)
message:
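#!/usr/bin/perl -w
# Split an mbox (named on the command line, or read from stdin) into one
# text file per message, keeping only the From, To, and Subject headers
# and dropping signatures and quoted originals.
use strict;
my $outputdir = "mailoutput";
mkdir $outputdir, 0777;
my $file = "0000";
# Open the next output file; string increment keeps the zero padding.
sub nextfile {
    $file++;
    my $filename = "$outputdir/msg-$file";
    open OUTFILE, ">$filename" or die "Can't open $filename: $!";
}
# Discard anything that precedes the first "From " line.
open OUTFILE, ">/dev/null" or die "Can't open /dev/null: $!";
my $state = "body";
while (<>) {
    if (/^From /) {
        nextfile;
        print OUTFILE "New mail message.\n\n";
        $state = "hdr";
    } elsif ($state eq "hdr" or $state eq "wantedhdr") {
        if (/^(From|To|Subject):/) {
            print OUTFILE;
            $state = "wantedhdr";
        } elsif (/^[ \t]/ and $state eq "wantedhdr") {
            # Continuation of a wanted header. [ \t] rather than \s, so
            # the blank line ending the headers isn't taken for one.
            print OUTFILE;
        } else {
            if (/^$/) {
                print OUTFILE "Message body follows.\n\n";
                $state = "body";
            } else {
                $state = "hdr";
            }
        }
    } elsif ($state eq "body") {
        if (/^-- / or /____/ or /----- Original Message -----/) {
            # Signature, separator rule, or quoted original: skip the rest.
            $state = "crap";
        } else {
            print OUTFILE;
        }
    }
}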
Having done this, I used Festival 1.4.1 with the "kallpc16k" voice to
convert the text files into .wav files. (On Debian, apt-get install
festival festvox-kallpc16k.) I understand that Festival (especially
more recent versions) may have an easier way to do this, but here's
how I did it. I wrote the following (somewhat fragile) shell script
and called it "txt2wav":
#!/bin/sh
set -e
festival <<eoscript
(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Required_Rate 16000)
(Parameter.set 'Audio_Command
"cat \$FILE >> sound.$$.tmp")
(tts "$1" nil)
(audio_mode 'close)
(exit)
eoscript
sox -t raw -w -s -c 1 -r 16000 sound.$$.tmp -r 44100 -w -s -c 2 -t wav "$2" rate
rm sound.$$.tmp
Once this was in place, I ran the following script to turn the
directory full of text files into a CD:
#!/bin/sh
for x in mailoutput/msg-????; do ./txt2wav "$x" "$x.wav"; echo "did $x"; done
# my cdrecord doesn't do -dao
#cdrecord -dao -v speed=2 dev=0,0 -audio -pad mailoutput/*.wav
# but this works:
cdrecord -v speed=2 dev=0,0 -audio -pad mailoutput/*.wav
# I tried using cdrdao to do that in order to eliminate silences between
# tracks, but never quite made it.
This puts one message per track on the CD-R, so you can skip messages
or replay them conveniently.
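
Before burning, it may be worth checking that the .wav files will fit
on the disc. Here is a rough sketch, assuming the files are 16-bit
stereo at 44100 Hz (176400 bytes per second, which is what txt2wav asks
sox for) and ignoring the small .wav headers and whatever cdrecord's
-pad adds. The function names and the 74-minute figure in the usage
note are my own choices for illustration, not part of the scripts above.

```sh
#!/bin/sh
# CD audio: 44100 samples/s * 2 channels * 2 bytes/sample.
BYTES_PER_SECOND=176400

# audio_seconds BYTES: print the whole seconds of CD audio in BYTES bytes.
audio_seconds() {
    echo $(( $1 / BYTES_PER_SECOND ))
}

# fits_on_cd BYTES MINUTES: succeed if BYTES of audio fit in MINUTES minutes.
fits_on_cd() {
    [ "$(audio_seconds "$1")" -le $(( $2 * 60 )) ]
}

# Add up the sizes of the .wav files, if there are any yet.
total=0
for f in mailoutput/*.wav; do
    [ -f "$f" ] || continue
    total=$(( total + $(wc -c < "$f") ))
done
echo "total: $total bytes, about $(( $(audio_seconds "$total") / 60 )) minutes"
```

Something like "fits_on_cd $total 74 || echo too big" before the
cdrecord line would then catch an overfull disc before a blank is
wasted.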
This worked reasonably well, but had the following problems:
- MIME-encoded messages were a problem; they took a long time to encode
as speech and took up a lot of space on the CD. HTML messages had the
same problem.
- spam messages were a much more serious problem than usual. I manually
removed the spam from this CD.
- CDs are small. The 32 messages I put on the disc totaled 6559 words; the
.wav files were 565 megabytes, 83% of a normal CD-ROM and 70% of a large
CD-ROM.
- it was really slow. Unfortunately, I didn't time it, but I think the
encoding step took about 45 minutes on my K6-2/500, while the burning step
took about 30 minutes. This might be reasonable as a scheduled
morning job in order to have your previous day's email on a CD-R by
the time you get up, but it is obviously not a good way to speed up your
consumption of email.
- Festival is not so smart about some email conventions. The following text:
> Me too!
>
Me too!
gets rendered as "greater than me too! greater than me too!" Some things
sound much worse, such as "%-%-%-%-%-%-%-%", "%%-%%-%%-%%",
"%%%-%%%-%%%", and
"http://www.inside.com/product/product.asp?entity=CableWorld&pf_ID=7A2ACA71-FAAD-41FC-A100-0B8A11C30373".
Similarly, diff output is incomprehensible, because the line
boundaries that are so important to its semantics get lost; embedded
quotes and parenthetical remarks of any kind are difficult to
understand, because you can't tell what's inside the quotes or
parentheses and what's outside. Also, ":-)" simply disappears.
- Festival is also not brilliant about pronunciation. Words it mispronounced
in this sample follow:
III Murch .org aol Kragen Sitaker tstonramp Re: shit humour
bytesforall iicd 20020405 cqure eeye Ocx resumes politechbot
Alcatraz rsasecurity Qaeda .edu concatenated Thu /usr ii LA
However, many words I thought it had mispronounced were actually
misspelled in the input, and it did an admirable job with many words
I'd expect to be very difficult for text-to-speech systems.
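
Some of the email-convention problems above could probably be reduced
by filtering each message before it reaches Festival. Here is a minimal
sketch, not something I ran for this CD: a hypothetical shell function
called "speakable" that strips leading "> " quote markers, replaces
http and https URLs with the word "link", says "smiley" for ":-)", and
collapses runs of four or more decorative characters into a space. The
function name, the replacement words, and the set of decorative
characters are all arbitrary assumptions.

```sh
#!/bin/sh
# speakable: filter stdin to stdout, rewriting things that read badly aloud.
speakable() {
    # 1. drop leading quote markers such as "> " or ">> "
    # 2. replace http/https URLs with the word "link"
    # 3. spell out the ":-)" smiley
    # 4. collapse runs of 4 or more decorative characters into one space
    sed -e 's/^\(> *\)\{1,\}//' \
        -e 's|https\{0,1\}://[^[:space:]"<>]*|link|g' \
        -e 's/:-)/smiley/g' \
        -e 's/[-%=_*#~]\{4,\}/ /g'
}
```

It could be run over each file before txt2wav, for example
"speakable < $x > $x.clean". It does nothing for diff output or for
nested quotes and parentheses, which need more than line-by-line
substitution.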
--
/* By Kragen Sitaker, http://pobox.com/~kragen/puzzle2.html */
char a[99]=" KJ",d[999][16];main(){int s=socket(2,1,0),n=0,z,l,i;*(short*)a=2;
if(!bind(s,a,16))for(;;){z=16;if((l=recvfrom(s,a,99,0,d[n],&z))>0){for(i=0;i<n;
i++){z=(memcmp(d[i],d[n],8))?z:0;while(sendto(s,a,l,0,d[i],16)&0);}z?n++:0;}}}