reading email with a CD player

Kragen Sitaker Sat, 04 May 2002 00:21:15 -0700

A couple of years ago, I conceived the idea of burning my morning's
new email, as synthesized speech, onto a CD; this way, I could listen
to it as I drove to work or at other times when my eyes and/or hands
were busy.


So I finally did it last night.  I ran an mbox file containing the
mail through this program to split it into one file per (shortened)
message:

#!/usr/bin/perl -w
use strict;

my $outputdir = "mailoutput";
mkdir $outputdir, 0777;
my $file = "0000";

sub nextfile {
  $file++;
  my $filename = "$outputdir/msg-$file";
  open OUTFILE, ">$filename" or die "Can't open $filename: $!";
}

open OUTFILE, ">/dev/null";

my $state = "body";

while (<>) {
  if (/^From /) {
    nextfile;
    print OUTFILE "New mail message.\n\n";
    $state = "hdr";
  } elsif ($state eq "hdr" or $state eq "wantedhdr") {
    if (/^(From|To|Subject):/) {
      print OUTFILE;
      $state = "wantedhdr";
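    } elsif (/^[ \t]/ and $state eq "wantedhdr") {
      # continuation of a wanted header; [ \t] rather than \s, because \s
      # would also match the bare newline that ends the headers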
      print OUTFILE;
    } else {
      if (/^$/) {
        print OUTFILE "Message body follows.\n\n";
        $state = "body";
      } else {
        $state = "hdr";
      }
    }
  } elsif ($state eq "body") {
    if (/^-- / or /____/ or /----- Original Message -----/) {
      $state = "crap";
    } else {
      print OUTFILE;
    }
  }
}


Having done this, I used Festival 1.4.1 with the "kallpc16k" voice to
convert the text files into .wav files.  (On Debian, apt-get install
festival festvox-kallpc16k.)  I understand that Festival (especially
more recent versions) may have an easier way to do this, but here's
how I did it.  I wrote the following (somewhat fragile) shell script
and called it "txt2wav":

#!/bin/sh

set -e

festival <<eoscript
(Parameter.set 'Audio_Method 'Audio_Command) 
(Parameter.set 'Audio_Required_Rate 16000)
(Parameter.set 'Audio_Command 
               "cat \$FILE >> sound.$$.tmp")
(tts "$1" nil) 
(audio_mode 'close)
(exit)
eoscript

sox -t raw -w -s -c 1 -r 16000 sound.$$.tmp -r 44100 -w -s -c 2 -t wav "$2" rate

rm sound.$$.tmp
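
For a rough sense of what the sox line in that script costs in disc
space, the arithmetic can be sketched in shell.  The rates come from
the sox flags (16000 Hz, 16-bit, mono in; 44100 Hz, 16-bit, stereo
out); the function names are made up for illustration.

```shell
#!/bin/sh
# Bytes per second of uncompressed audio:
#   sample rate * bytes per sample * channels
bytes_per_second() {
  # $1 = sample rate in Hz, $2 = bits per sample, $3 = channels
  echo $(( $1 * ($2 / 8) * $3 ))
}

festival_rate=$(bytes_per_second 16000 16 1)   # raw output from Festival
cd_rate=$(bytes_per_second 44100 16 2)         # what sox writes for the CD

echo "Festival raw: $festival_rate bytes/s"
echo "CD audio:     $cd_rate bytes/s"

# Whole minutes of playing time for a given number of bytes of CD audio.
cd_minutes() {
  echo $(( $1 / $cd_rate / 60 ))
}
```

So the conversion makes each second of speech about five and a half
times larger.  Taking the 565 megabytes reported further down as
565,000,000 bytes (an assumption; the post doesn't say which kind of
megabyte), "cd_minutes 565000000" comes out at 53 minutes of audio.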


Once this was in place, I ran the following script to turn the
directory full of text files into a CD:

#!/bin/sh
for x in mailoutput/msg-????; do ./txt2wav $x $x.wav; echo did $x; done
# my cdrecord doesn't do -dao
#cdrecord -dao -v speed=2 dev=0,0 -audio -pad mailoutput/*.wav
# but this works:
cdrecord -v speed=2 dev=0,0 -audio -pad mailoutput/*.wav
# I tried using cdrdao to do that in order to eliminate silences between
# tracks, but never quite made it.


This puts one message per track on the CD-R, so you can skip messages
or replay them conveniently.
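
Since one track per message means the disc fills up quickly, it may
be worth checking the total size before burning.  The following is
only a sketch: the function names are hypothetical, it assumes CD
audio at 176400 bytes per second, and it ignores the small .wav
headers and whatever padding cdrecord's -pad adds.

```shell
#!/bin/sh
# total_wav_bytes DIR: sum of the sizes of DIR/*.wav, in bytes
total_wav_bytes() {
  total=0
  for f in "$1"/*.wav; do
    [ -f "$f" ] || continue          # skip the unexpanded glob if no files
    size=$(wc -c < "$f")
    total=$(( total + size ))
  done
  echo "$total"
}

# fits_on_cd DIR MINUTES: succeeds if the .wav files in DIR fit on a
# disc that holds MINUTES minutes of audio
fits_on_cd() {
  capacity=$(( $2 * 60 * 176400 ))
  [ "$(total_wav_bytes "$1")" -le "$capacity" ]
}
```

Something like "fits_on_cd mailoutput 74 || echo too big" before the
cdrecord line would catch an oversized batch, assuming a 74-minute
disc.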

This worked reasonably well, but had the following problems:
- MIME-encoded messages were a problem; they took a long time to convert
  to speech and a lot of space on the CD.  Likewise HTML messages.
- spam messages were a much more serious problem than usual.  I manually
  removed the spam from this CD.
- CDs are small.  The 32 messages I put on the disc totaled 6559 words; the
  .wav files were 565 megabytes, 83% of a normal CD-ROM and 70% of a large
  CD-ROM.
- it was really slow.  Unfortunately, I didn't time it, but I think the 
  encoding step took about 45 minutes on my K6-2/500, while the burning step
  took about 30 minutes.  This might be reasonable as a scheduled
  morning job in order to have your previous day's email on a CD-R by
  the time you get up, but it is obviously not a good way to speed up your
  consumption of email.
- Festival is not so smart about some email conventions.  The following text:

    > Me too!
    > 

    Me too!

  gets rendered as "greater than me too! greater than me too!"  Some things 
  sound much worse, such as "%-%-%-%-%-%-%-%", "%%-%%-%%-%%",
  "%%%-%%%-%%%", and
  "http://www.inside.com/product/product.asp?entity=CableWorld&pf_ID=7A2ACA71-FAAD-41FC-A100-0B8A11C30373".
  Similarly, diff output is incomprehensible, because the line
  boundaries that are so important to its semantics get lost; embedded
  quotes and parenthetical remarks of any kind are difficult to
  understand, because you can't tell what's inside the quotes or
  parentheses and what's outside.  Also, ":-)" simply disappears.

- Festival is also not brilliant about pronunciation.  Words it mispronounced
  in this sample follow:

    III Murch .org aol Kragen Sitaker tstonramp Re: shit humour
    bytesforall iicd 20020405 cqure eeye Ocx resumes politechbot
    Alcatraz rsasecurity Qaeda .edu concatenated Thu /usr ii LA

  However, many words I thought it had mispronounced were actually
  misspelled in the input, and it did an admirable job with many words
  I'd expect to be very difficult for text-to-speech systems.
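
Some of the email conventions listed above could probably be cleaned
up before the text reaches Festival.  What follows is a minimal
sketch, not something from the original setup: the name
"speechclean" and the replacement words are invented for
illustration, and it assumes a sed that accepts -E (GNU and BSD sed
both do).

```shell
#!/bin/sh
# speechclean: hypothetical filter, stdin to stdout, meant to run on
# each message file before it is handed to txt2wav.
speechclean() {
  sed -E \
    -e 's/^([[:space:]]*>)+[[:space:]]*//' \
    -e 's#https?://[^[:space:]">]+#web link#g' \
    -e 's/:-\)/smiley/g' \
    -e 's/[-%=_*#~]{4,}/ /g'
}
# 1st rule: drop leading "> " quote markers, including nested ones
# 2nd rule: replace a URL with the words "web link"
# 3rd rule: say "smiley" instead of silently dropping ":-)"
# 4th rule: collapse runs of 4 or more decoration characters to a space
```

For example, "speechclean < mailoutput/msg-0001 > msg-0001.clean".
Dropping the quote markers also drops the distinction between quoted
and new text, so this trades one problem for a smaller one; it does
nothing for diff output or for nested quotes and parentheses.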

-- 
/* By Kragen Sitaker, http://pobox.com/~kragen/puzzle2.html */
char a[99]="  KJ",d[999][16];main(){int s=socket(2,1,0),n=0,z,l,i;*(short*)a=2;
if(!bind(s,a,16))for(;;){z=16;if((l=recvfrom(s,a,99,0,d[n],&z))>0){for(i=0;i<n;
i++){z=(memcmp(d[i],d[n],8))?z:0;while(sendto(s,a,l,0,d[i],16)&0);}z?n++:0;}}}
