I found a simpler (i.e. scriptable) way: I wrote a small Python script
that uses ProjectX (CVS version) to extract the subtitles from the .ts
file, BDSup2Sub to convert the subtitles to PNG images, ImageMagick's
convert to clean up the resulting images and finally gocr to convert the
images to text.
It seems to work acceptably with recordings from bbchd/bbcone-hd.
It's available here:

Both ProjectX and BDSup2Sub will spit out a million warnings, but the
final result is OK. As always, YMMV.
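
For anyone who wants to try the last two stages on their own, here is a rough sketch of the image cleanup and OCR steps in Python. It is not the script mentioned above: the function names are made up for illustration, and the convert options (grayscale, 300% upscale, 50% threshold) are only a guess at what helps gocr, so expect to tune them for your recordings. It assumes the ImageMagick `convert` and `gocr` binaries are on the PATH.

```python
import subprocess
from pathlib import Path


def cleanup_command(src, dst):
    """Build an ImageMagick command that prepares a subtitle image for OCR.

    The options are illustrative assumptions, not the original script's.
    """
    return [
        "convert", str(src),
        "-colorspace", "Gray",   # drop colour information
        "-resize", "300%",       # enlarge the small subtitle glyphs
        "-threshold", "50%",     # force pure black and white
        str(dst),
    ]


def ocr_command(image):
    """Build a gocr command that prints the recognised text to stdout."""
    return ["gocr", "-i", str(image)]


def ocr_image(image):
    """Clean up one PNG and return the text gocr recognises in it."""
    image = Path(image)
    cleaned = image.with_name(image.stem + "_clean.png")
    subprocess.run(cleanup_command(image, cleaned), check=True)
    result = subprocess.run(
        ocr_command(cleaned), check=True, capture_output=True, text=True
    )
    return result.stdout.strip()
```

Calling `ocr_image("sub_0001.png")` (a hypothetical file name) would write `sub_0001_clean.png` next to the original and return the recognised text.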

Yesterday I noticed that the timing offset is not fixed but depends on the recording (e.g., in my previous experiments I hardcoded a 4-second delay for the BBC channels, while yesterday I needed a 2-minute delay). I don't know what causes this offset (ProjectX or the crappy media player of my TV), but the problem can easily be solved with Subtitles:
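
If you would rather not install another tool, the same correction can be sketched in a few lines of Python. This is not Subtitles, just a minimal stand-in that assumes the final subtitles are in SRT format (timing lines like `00:00:01,500 --> 00:00:03,000`); `shift_srt` and its parameters are hypothetical names. A positive offset delays the subtitles, a negative one advances them, and times are clamped at zero.

```python
import re
from datetime import timedelta

# Matches one SRT timestamp such as 00:02:01,500
TIMESTAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")
ONE_MS = timedelta(milliseconds=1)


def _shift(match, offset):
    """Return the matched timestamp moved by `offset` (a timedelta)."""
    h, m, s, ms = (int(g) for g in match.groups())
    moved = timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms) + offset
    total_ms = max(0, moved // ONE_MS)  # clamp so times never go negative
    h, rest = divmod(total_ms, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def shift_srt(text, offset_seconds):
    """Shift every timing line of an SRT document by `offset_seconds`."""
    offset = timedelta(seconds=offset_seconds)
    out = []
    for line in text.splitlines(keepends=True):
        if "-->" in line:  # only touch timing lines, never dialogue text
            line = TIMESTAMP.sub(lambda mt: _shift(mt, offset), line)
        out.append(line)
    return "".join(out)
```

For the two cases above you would call `shift_srt(text, 4)` or `shift_srt(text, 120)`, flipping the sign if the subtitles need to move the other way.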


